Introduction

Transmission Control Protocol (TCP): core transport layer protocol in Internet protocol suite. Provides reliable, ordered, and error-checked delivery of data between applications running on hosts across IP networks. Enables connection-oriented communication with flow and congestion control mechanisms. Essential for web browsing, email, file transfers, and many other network services.

"TCP is the backbone of reliable communication on the Internet, ensuring data integrity and delivery despite the unreliable nature of underlying networks." -- W. Richard Stevens

TCP Overview

Protocol Characteristics

Connection-oriented: establishes virtual circuit before data transfer. Reliable: guarantees data delivery, order, and integrity. Full-duplex: simultaneous bidirectional communication. Stream-oriented: data viewed as continuous byte stream, not discrete messages. Flow control: prevents sender from overwhelming receiver. Congestion control: prevents network overload. Multiplexing: uses ports to support multiple applications.

Role in Transport Layer

Provides logical communication between processes on hosts. Works atop IP (Internet Protocol), which provides best-effort datagram service. Adds reliability, sequencing, error detection, and retransmission. Differentiates from UDP (User Datagram Protocol) by reliability and connection orientation.

Applications Using TCP

HTTP/HTTPS: web browsing. FTP: file transfers. SMTP/POP3/IMAP: email services. Telnet/SSH: remote terminal access. Database communication. Any application requiring reliable data delivery.

TCP Header Structure

Header Fields Overview

Fixed 20-byte minimum header; up to 40 bytes of options bring the maximum header length to 60 bytes. Contains source/destination ports, sequence and acknowledgment numbers, flags, window size, checksum, urgent pointer, options.

Key Header Fields

Source Port (16 bits): identifies sender application. Destination Port (16 bits): identifies receiver application. Sequence Number (32 bits): position of first data byte in segment. Acknowledgment Number (32 bits): next byte expected from the other side, valid when ACK flag set. Data Offset (4 bits): header length in 32-bit words. Reserved (6 bits): reserved for future use in RFC 793 layout; RFC 3168 later assigned two of these bits to ECN flags (CWR, ECE). Flags (6 bits): control bits (URG, ACK, PSH, RST, SYN, FIN). Window Size (16 bits): number of bytes receiver is currently willing to accept. Checksum (16 bits): error-checking of header and data. Urgent Pointer (16 bits): offset to end of urgent data, valid when URG flag set.

Options Field

Variable length. Common options: Maximum Segment Size (MSS), Window Scale, Timestamp, Selective Acknowledgment (SACK). Enhances performance, reliability, and scalability.

Field                   Size (bits)   Description
Source Port             16            Sender application identifier
Destination Port        16            Receiver application identifier
Sequence Number         32            First data byte number
Acknowledgment Number   32            Next expected byte number
Flags                   6             Control bits (SYN, ACK, FIN, etc.)
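
The field layout above can be decoded with a few lines of Python. This is a minimal sketch, assuming the RFC 793 layout (6 flag bits) and ignoring options; the names TcpHeader, parse_tcp_header, and the sample segment values are illustrative, not part of any library.

```python
import struct
from typing import NamedTuple

# Flag masks within the low 6 bits of the 16-bit offset/flags word (RFC 793 layout).
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    header_len: int   # in bytes, derived from Data Offset
    flags: frozenset
    window: int
    checksum: int
    urgent: int


def parse_tcp_header(raw: bytes) -> TcpHeader:
    """Decode the fixed 20-byte part of a TCP header (options are not parsed)."""
    if len(raw) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    # "!" selects network byte order; H = 16-bit field, I = 32-bit field.
    src, dst, seq, ack, off_flags, window, checksum, urgent = struct.unpack(
        "!HHIIHHHH", raw[:20]
    )
    data_offset = off_flags >> 12  # top 4 bits, counted in 32-bit words
    flags = frozenset(name for name, bit in FLAG_BITS.items() if off_flags & bit)
    return TcpHeader(src, dst, seq, ack, data_offset * 4, flags, window, checksum, urgent)


# Hypothetical SYN segment: port 50000 -> 80, seq=1000, no options (Data Offset = 5).
sample = struct.pack(
    "!HHIIHHHH", 50000, 80, 1000, 0, (5 << 12) | FLAG_BITS["SYN"], 65535, 0, 0
)
header = parse_tcp_header(sample)
print(header)
```

The checksum in the sample is left at zero, so the sketch demonstrates field extraction only, not validation.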

Connection Establishment

Three-Way Handshake

Purpose: synchronize sequence numbers, establish connection parameters. Steps: 1. SYN from client with initial sequence number (ISN). 2. SYN-ACK from server with its ISN, acknowledging client's SYN. 3. ACK from client acknowledging server's SYN.

Sequence Number Initialization

Each side selects ISN pseudo-randomly. Prevents old connection confusion. Ensures ordered byte stream.

State Transitions

Client: CLOSED → SYN-SENT → ESTABLISHED. Server: LISTEN → SYN-RECEIVED → ESTABLISHED. Connection ready for data transfer after handshake.

Client                          Server
  |                               |
  |  SYN (seq=x)                  |
  |------------------------------>|
  |                               |
  |  SYN-ACK (seq=y, ack=x+1)     |
  |<------------------------------|
  |                               |
  |  ACK (ack=y+1)                |
  |------------------------------>|
  |                               |
        Connection established
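
The sequence and acknowledgment arithmetic of the handshake can be sketched in Python. This is an illustration only: three_way_handshake is a hypothetical helper, and the ISNs are drawn with random.randrange for simplicity, whereas real stacks use a keyed, clock-driven scheme.

```python
import random


def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three handshake segments as (sender, flags, seq, ack) tuples."""
    mod = 2 ** 32  # sequence numbers are 32-bit and wrap around
    syn = ("client", "SYN", client_isn, None)
    # The SYN consumes one sequence number, so the server acknowledges x + 1.
    syn_ack = ("server", "SYN-ACK", server_isn, (client_isn + 1) % mod)
    # The final ACK carries seq = x + 1 and acknowledges the server's SYN with y + 1.
    ack = ("client", "ACK", (client_isn + 1) % mod, (server_isn + 1) % mod)
    return [syn, syn_ack, ack]


# Illustrative ISNs only; not how an operating system picks them.
x = random.randrange(2 ** 32)
y = random.randrange(2 ** 32)
for sender, flags, seq, ack in three_way_handshake(x, y):
    print(f"{sender:6s} {flags:7s} seq={seq} ack={ack}")
```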

Connection Termination

Four-Way Handshake

Initiated by one side sending FIN to signal end of data. Other side acknowledges FIN, sends its own FIN. Initiator acknowledges FIN. Half-close state allows one-way data flow. Full close after both FINs acknowledged.

States Involved

FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT, CLOSED. TIME-WAIT ensures delayed packets discarded.

TIME-WAIT Importance

Duration: 2 * Maximum Segment Lifetime (MSL). Prevents confusion from delayed segments of old connections. Ensures reliable connection release.

Side A                          Side B
  |                               |
  |  FIN                          |
  |------------------------------>|
  |                               |
  |  ACK                          |
  |<------------------------------|
  |                               |
  |  FIN                          |
  |<------------------------------|
  |                               |
  |  ACK                          |
  |------------------------------>|
  |                               |
          Connection closed
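
The termination states can be walked through with a small transition table. This is a simplified sketch: the event names (close, recv_fin, recv_ack, timeout_2msl) and the run helper are made up for illustration, and the case where FIN and ACK arrive in one segment is omitted.

```python
# (current state, event) -> next state, for the closing half of the TCP state machine.
TRANSITIONS = {
    ("ESTABLISHED", "close"): "FIN-WAIT-1",      # active closer sends FIN
    ("FIN-WAIT-1", "recv_ack"): "FIN-WAIT-2",    # its FIN has been acknowledged
    ("FIN-WAIT-1", "recv_fin"): "CLOSING",       # simultaneous close
    ("FIN-WAIT-2", "recv_fin"): "TIME-WAIT",     # peer's FIN arrives, ACK is sent
    ("CLOSING", "recv_ack"): "TIME-WAIT",
    ("TIME-WAIT", "timeout_2msl"): "CLOSED",     # wait 2 * MSL before releasing
    ("ESTABLISHED", "recv_fin"): "CLOSE-WAIT",   # passive closer ACKs the FIN
    ("CLOSE-WAIT", "close"): "LAST-ACK",         # passive closer sends its own FIN
    ("LAST-ACK", "recv_ack"): "CLOSED",
}


def run(state: str, events: list) -> list:
    """Apply events in order and return every state visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


active = run("ESTABLISHED", ["close", "recv_ack", "recv_fin", "timeout_2msl"])
passive = run("ESTABLISHED", ["recv_fin", "close", "recv_ack"])
print(" -> ".join(active))
print(" -> ".join(passive))
```

Only the active closer passes through TIME-WAIT; the passive closer goes straight from LAST-ACK to CLOSED.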

Flow Control

Purpose

Prevents sender from overwhelming receiver buffer. Ensures receiver can process incoming data without loss.

Window Size

Advertised by receiver in TCP header. Indicates available buffer space. Sender limits data sent according to window size.

Sliding Window Mechanism

Dynamic window adjusts as data acknowledged and buffer space freed. Sender advances window base on acknowledgments. Enables efficient, continuous transmission.

Term                Description
Window Size         Receiver buffer capacity in bytes
Advertised Window   Value sent in ACK to control sender
Sliding Window      Dynamic window shifting as ACKs received
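
The sender-side bookkeeping behind the sliding window comes down to one subtraction. A minimal sketch, assuming byte counters that do not wrap; usable_window and its parameter names are illustrative.

```python
def usable_window(last_byte_acked: int, last_byte_sent: int, advertised_window: int) -> int:
    """Bytes the sender may still transmit without exceeding the receiver's window."""
    in_flight = last_byte_sent - last_byte_acked  # sent but not yet acknowledged
    return max(0, advertised_window - in_flight)


# Hypothetical trace with a 4000-byte advertised window.
before_ack = usable_window(0, 3000, 4000)     # 3000 bytes in flight
after_ack = usable_window(2000, 3000, 4000)   # ACK for 2000 bytes slides the window
zero_window = usable_window(2000, 3000, 0)    # receiver buffer full: sender must stop
print(before_ack, after_ack, zero_window)
```

When the usable window reaches zero, the sender stops transmitting until a later ACK advertises free buffer space.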

Congestion Control

Objective

Prevent network congestion collapse. Adjust sender rate based on network conditions to maintain fairness and efficiency.

Key Algorithms

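Slow Start: exponential increase of congestion window (cwnd) initially. Congestion Avoidance: linear increase once cwnd reaches slow start threshold (ssthresh). Fast Retransmit: treat three duplicate ACKs as loss and resend missing segment without waiting for timeout. Fast Recovery: after fast retransmit, halve cwnd and continue in congestion avoidance instead of returning to slow start.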

Congestion Window (cwnd)

Sender-side variable limiting bytes in flight. Interacts with receiver advertised window. Effective window = min(cwnd, advertised window).

Algorithm:
  Initialize: cwnd = 1 MSS
  While no loss and cwnd < ssthresh: cwnd = cwnd * 2 per RTT (Slow Start)
  Once cwnd >= ssthresh: cwnd = cwnd + 1 MSS per RTT (Congestion Avoidance)
  On loss detected by timeout: ssthresh = cwnd / 2; cwnd = 1 MSS (Slow Start restart)
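
The algorithm can be traced round by round in Python. This is a simplified sketch, not a full implementation: cwnd is counted in whole MSS units, one update is applied per RTT, every loss is treated as a timeout, and slow start is capped at ssthresh for readability. The function name and parameters are illustrative.

```python
def simulate_cwnd(rounds: int, ssthresh: int, loss_rounds: set) -> list:
    """Return cwnd (in MSS) at the start of each RTT round."""
    cwnd = 1
    history = []
    for rnd in range(rounds):
        history.append(cwnd)
        if rnd in loss_rounds:
            # Loss: remember half the current window, restart slow start.
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif cwnd < ssthresh:
            # Slow start: double per RTT (capped at ssthresh in this sketch).
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: add one MSS per RTT.
            cwnd += 1
    return history


trace = simulate_cwnd(rounds=10, ssthresh=8, loss_rounds={6})
print(trace)
```

The trace shows the familiar sawtooth: exponential growth up to ssthresh, linear growth after it, then a collapse to 1 MSS on loss.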

Reliable Data Transfer

Mechanisms

Sequence numbers: track byte ordering. Acknowledgments: confirm receipt, cumulative ACKs. Retransmissions: resend lost or corrupted segments. Checksums: detect errors in header and data. Timers: trigger retransmission on timeout.
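
The checksum mentioned above is a 16-bit ones' complement sum. The sketch below shows only that arithmetic, under the assumption that the caller supplies the bytes to be summed; a real TCP checksum also covers a pseudo-header built from the IP addresses, protocol number, and segment length, which is omitted here. The function name is illustrative.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


payload = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(payload)
print(hex(checksum))

# A receiver summing the data together with the transmitted checksum gets zero.
verified = internet_checksum(payload + checksum.to_bytes(2, "big")) == 0
print(verified)
```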

Duplicate Acknowledgments

Indicate missing segment. Trigger fast retransmission without waiting for timeout.

Selective Acknowledgment (SACK)

Enhancement allowing receiver to report non-contiguous blocks of received data beyond the cumulative ACK point. Sender retransmits only the missing segments. Improves efficiency in case of multiple losses.
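
Cumulative ACKs, SACK blocks, and duplicate-ACK counting can be illustrated together. This is a sketch under simplifying assumptions: segments are (first_byte, length) pairs, sequence numbers do not wrap, and a duplicate ACK is any ACK that repeats the previous value (real stacks apply further conditions). Both function names are hypothetical.

```python
def ack_state(received: list, start: int = 0):
    """Return (cumulative_ack, sack_blocks) for the byte ranges received so far."""
    next_expected = start
    blocks = []  # (left_edge, right_edge) with the right edge exclusive
    for first, length in sorted(received):
        end = first + length
        if first <= next_expected:
            next_expected = max(next_expected, end)     # extends in-order data
        elif blocks and first <= blocks[-1][1]:
            blocks[-1] = (blocks[-1][0], max(blocks[-1][1], end))  # merge with last block
        else:
            blocks.append((first, end))                 # new block beyond a gap
    return next_expected, blocks


def fast_retransmit_point(acks: list, threshold: int = 3):
    """Index at which the threshold-th duplicate ACK arrives, or None."""
    duplicates = 0
    for i in range(1, len(acks)):
        if acks[i] == acks[i - 1]:
            duplicates += 1
            if duplicates == threshold:
                return i
        else:
            duplicates = 0
    return None


# Segments 100-199 and 400-499 are missing.
cumulative, sack = ack_state([(0, 100), (200, 100), (300, 100), (500, 100)])
print(cumulative, sack)
print(fast_retransmit_point([100, 100, 100, 100]))
```

With cumulative ACKs alone the sender learns only that byte 100 is missing; the SACK blocks additionally reveal the second gap at byte 400.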

TCP Segmentation and Reassembly

Segmentation

Breaks large application data into segments no larger than Maximum Segment Size (MSS). Each side announces its MSS in an option on its SYN segment during handshake. Keeps segments within underlying network MTU to avoid IP fragmentation.

Reassembly

Receiver reorders segments according to sequence numbers. Buffers out-of-order segments. Delivers data to application as continuous byte stream.
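
Segmentation and reassembly can be demonstrated end to end. A minimal sketch, assuming no loss, no duplicates, and no sequence number wraparound; segment, reassemble, and the 1460-byte MSS value are illustrative choices.

```python
import random


def segment(data: bytes, mss: int, isn: int = 0) -> list:
    """Split data into (sequence_number, payload) pairs of at most mss bytes."""
    return [(isn + off, data[off:off + mss]) for off in range(0, len(data), mss)]


def reassemble(segments: list, isn: int = 0) -> bytes:
    """Rebuild the byte stream, buffering segments that arrive out of order."""
    buffer = {}
    next_seq = isn
    stream = bytearray()
    for seq, payload in segments:
        buffer[seq] = payload
        # Deliver everything that is now contiguous with the stream.
        while next_seq in buffer:
            chunk = buffer.pop(next_seq)
            stream += chunk
            next_seq += len(chunk)
    return bytes(stream)


data = bytes(range(256)) * 20                 # 5120 bytes of sample data
segments = segment(data, mss=1460)
shuffled = segments[:]
random.Random(0).shuffle(shuffled)            # simulate out-of-order arrival
restored = reassemble(shuffled)
print(len(segments), restored == data)
```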

Fragmentation vs Segmentation

Fragmentation: IP layer splits packets if exceeding MTU. Segmentation: TCP divides data before IP layer. TCP aware of segment boundaries, IP not.

TCP Timers and Retransmission

Retransmission Timer

Starts when segment sent. Timeout triggers retransmission. Timer value adaptive based on Round Trip Time (RTT) estimation.

RTT Estimation

Uses exponential weighted moving average (EWMA) to smooth measurements. Prevents premature timeouts or delayed retransmissions.

Other Timers

Delayed ACK timer: reduces ACK traffic by waiting briefly before sending ACK. Persist timer: probes zero-window to avoid deadlock. Keepalive timer: checks if connection is alive.

RTT estimation formulas:
  SampleRTT = measured RTT
  EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT
  DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|
  TimeoutInterval = EstimatedRTT + 4 * DevRTT
  Where α = 1/8, β = 1/4
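
The formulas translate directly into a small estimator class. Two details are assumptions not stated above and follow RFC 6298: the first sample initializes EstimatedRTT to the sample and DevRTT to half of it, and DevRTT is updated before EstimatedRTT so that the deviation uses the previous estimate. The class and attribute names are illustrative.

```python
class RttEstimator:
    """EWMA-based retransmission timeout estimator (times in seconds)."""

    def __init__(self, first_sample: float, alpha: float = 1 / 8, beta: float = 1 / 4):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample      # assumed initialization (RFC 6298)
        self.dev_rtt = first_sample / 2        # assumed initialization (RFC 6298)

    def update(self, sample_rtt: float) -> None:
        # Deviation first, so it is measured against the previous estimate.
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * abs(
            sample_rtt - self.estimated_rtt
        )
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt

    @property
    def timeout_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator(first_sample=0.100)
estimator.update(0.200)   # one slow sample raises both the estimate and the deviation
print(estimator.estimated_rtt, estimator.dev_rtt, estimator.timeout_interval)
```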

Performance Optimization Techniques

Window Scaling

Extends 16-bit window field to support windows larger than 65,535 bytes. Uses scale factor negotiated in options.
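
The scaling itself is a left shift of the 16-bit window field. A short sketch, using the maximum shift count of 14 from RFC 7323; the helper names and the 1,000,000-byte target are illustrative.

```python
MAX_SHIFT = 14  # largest scale factor permitted by RFC 7323


def effective_window(window_field: int, shift: int) -> int:
    """Window in bytes represented by a 16-bit field and a negotiated shift count."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field is 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return window_field << shift


def shift_needed(target_bytes: int) -> int:
    """Smallest shift count whose maximum window covers target_bytes (capped at 14)."""
    shift = 0
    while (0xFFFF << shift) < target_bytes and shift < MAX_SHIFT:
        shift += 1
    return shift


largest = effective_window(0xFFFF, MAX_SHIFT)
print(largest)                    # just under 1 GiB
print(shift_needed(1_000_000))    # shift required for a window of about 1 MB
```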

Selective Acknowledgment (SACK)

Allows acknowledgment of non-contiguous blocks. Improves retransmission efficiency in lossy networks.

Timestamps

Enhances RTT measurement accuracy. Resolves ambiguity when 32-bit sequence numbers wrap on fast, long-lived connections. Used in PAWS (Protection Against Wrapped Sequences).
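
The wraparound problem and the timestamp check can be sketched with modular comparison. This is a simplified illustration, assuming 32-bit values compared within half the number space; seq_lt and paws_accept are hypothetical helpers, and a real stack applies additional rules (for example for idle connections).

```python
MOD = 2 ** 32
HALF = 2 ** 31


def seq_lt(a: int, b: int) -> bool:
    """True if a comes before b in 32-bit wraparound arithmetic."""
    return a != b and ((b - a) % MOD) < HALF


def paws_accept(seg_tsval: int, ts_recent: int) -> bool:
    """Accept a segment unless its timestamp is older than the most recent one seen."""
    return not seq_lt(seg_tsval, ts_recent)


# A value just past the wrap point still counts as newer than one just before it.
print(seq_lt(0xFFFFFFF0, 0x10))
print(paws_accept(998, 999))    # older timestamp: treated as a stale duplicate
print(paws_accept(1000, 999))
```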

Delayed ACK

Waits short interval before sending ACK. Reduces overhead by combining ACKs with data segments.

TCP Variants and Extensions

TCP Reno

Classic congestion control with fast retransmit and fast recovery. After three duplicate ACKs, halves cwnd instead of restarting slow start; falls back to slow start only on timeout.

TCP NewReno

Improves fast recovery to handle multiple packet losses better.

TCP Vegas

Proactive congestion avoidance based on RTT variation. Adjusts sending rate before losses.

TCP Cubic

Default in Linux. Uses cubic function for congestion window growth. Better scalability on high-bandwidth networks.
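
The cubic growth function can be evaluated directly. The sketch below uses the window formula and the constants C = 0.4 and beta = 0.7 from RFC 8312; it models only the curve after a loss event, not a complete CUBIC implementation, and the function names are illustrative.

```python
def cubic_k(w_max: float, c: float = 0.4, beta: float = 0.7) -> float:
    """Time in seconds for the window to grow back to w_max after a loss."""
    return (w_max * (1 - beta) / c) ** (1 / 3)


def cubic_window(t: float, w_max: float, c: float = 0.4, beta: float = 0.7) -> float:
    """Congestion window (in MSS) t seconds after the last loss event."""
    k = cubic_k(w_max, c, beta)
    return c * (t - k) ** 3 + w_max


w_max = 100.0  # hypothetical window size (in MSS) at the moment of loss
k = cubic_k(w_max)
for t in (0.0, k / 2, k, 2 * k):
    print(f"t={t:5.2f}s  cwnd={cubic_window(t, w_max):7.2f} MSS")
```

The curve starts at beta * w_max right after the loss, flattens as it approaches w_max, then accelerates again to probe for more bandwidth.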

Explicit Congestion Notification (ECN)

Allows routers to signal congestion without packet loss. TCP reacts by reducing sending rate.

References

  • W. Richard Stevens, "TCP/IP Illustrated, Volume 1: The Protocols," Addison-Wesley, 1994, pp. 523-610.
  • J. Postel, "Transmission Control Protocol," RFC 793, IETF, 1981, pp. 1-85.
  • V. Jacobson, "Congestion Avoidance and Control," SIGCOMM '88, ACM, 1988, pp. 314-329.
  • Mathis et al., "TCP Selective Acknowledgment Options," RFC 2018, IETF, 1996, pp. 1-13.
  • Allman et al., "TCP Congestion Control," RFC 5681, IETF, 2009, pp. 1-38.