TCP for me (part two)

Hi all -- with this post I elaborate on the Transmission Control Protocol, including the basics of initiating, using and ending a TCP connection.

TCP is used extensively for webpages (WWW), FTP, SSH, email, and much else. It's one of the foundational protocols for internet communications, so it's well worth understanding its tradeoffs. It's optimized for accurate delivery over timely delivery, so take that into consideration when choosing the right protocol for your transmission needs.

Data structure

TCP provides reliability in network packet transmission. Data is sent as a stream of octets. The term octet is used explicitly to make clear that each unit is exactly 8 bits of data, with no extra parity bit of the kind sometimes found in storage hardware. The connection is a full-duplex virtual circuit, which means that both hosts can send data to each other simultaneously.

The basic data unit in TCP is a "segment". Each segment contains a source and destination port, a sequence number, an acknowledgement number, a window, and a checksum, along with additional metadata, padding, and the actual data.
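To make the segment layout a little more concrete, here is a minimal sketch that unpacks the fixed 20-byte part of a TCP header. The names `TcpHeader` and `parse_tcp_header` are my own for illustration, and the sketch ignores options, padding and the payload entirely.

```python
import struct
from typing import NamedTuple


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int          # sequence number
    ack: int          # acknowledgement number
    data_offset: int  # header length, counted in 32-bit words
    flags: int        # the six classic control bits (URG..FIN)
    window: int
    checksum: int
    urgent_ptr: int


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Unpack the fixed 20-byte TCP header (network byte order)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    src, dst, seq, ack, off_flags, window, checksum, urg = struct.unpack(
        "!HHIIHHHH", segment[:20]
    )
    # The top 4 bits hold the data offset, the low 6 bits the control flags.
    return TcpHeader(src, dst, seq, ack, off_flags >> 12, off_flags & 0x3F,
                     window, checksum, urg)


# Build a hand-made SYN header (made-up ports and numbers) and parse it back.
raw = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
header = parse_tcp_header(raw)
```

A data offset of 5 means the header is 5 x 4 = 20 bytes long, i.e. it carries no options.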

How it works

All segments containing data are expected to be acknowledged as received. If they are lost or corrupted, they are sent again until they are acknowledged or a timeout has occurred. To enable a stable, reliable connection, TCP requires a heavier implementation, with more complicated handshakes for setup and teardown.

TCP defines a set of control flags, carried in the header of each segment, that signal the different stages of the TCP lifecycle:

URG: Urgent pointer field is valid
ACK: Acknowledgement field is valid
PSH: This segment requests a push
RST: Reset the connection
SYN: Synchronize sequence numbers
FIN: Sender has reached end of its byte stream 

Segments that carry only an ACK (and no data) are not themselves acknowledged, as this would cause an infinite loop of acknowledgements.
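The flags listed above are single bits in the segment header, so several can be set at once (SYN and ACK together, for example). A minimal sketch of decoding them, where `TCP_FLAGS` and `decode_flags` are names I made up for illustration:

```python
from typing import List

# Bit masks for the six classic control flags, lowest bit first.
TCP_FLAGS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}


def decode_flags(bits: int) -> List[str]:
    """Return the names of all flags set in the given flag bits."""
    return [name for name, mask in TCP_FLAGS.items() if bits & mask]


print(decode_flags(0x12))  # ['SYN', 'ACK']
```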

Requesting a TCP connection

Establishing a virtual circuit uses a 3-way handshake, and the connection stays open until the end of the data transfer.

  1. Client sends a SYN signal to request a connection.
  2. Server responds with a SYN-ACK signal, acknowledging the previous request and sending a request for connection.
  3. Client sends ACK signal to server, and is cleared to use the TCP connection.
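
The three steps above can be sketched as a toy simulation that tracks the sequence and acknowledgement numbers. `Segment` and `three_way_handshake` are hypothetical names, and the initial sequence numbers are passed in by hand (real stacks pick them themselves). The one detail added here is that a SYN uses up one sequence number, which is why each side acknowledges the other's initial number plus one.

```python
from dataclasses import dataclass
from typing import List, Tuple

SEQ_SPACE = 2 ** 32  # sequence numbers are 32 bits and wrap around


@dataclass
class Segment:
    flags: Tuple[str, ...]
    seq: int
    ack: int = 0


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged to open a connection."""
    # 1. Client asks for a connection, announcing its initial sequence number.
    syn = Segment(("SYN",), seq=client_isn)
    # 2. Server acknowledges the SYN and sends its own SYN in one segment.
    syn_ack = Segment(("SYN", "ACK"), seq=server_isn,
                      ack=(syn.seq + 1) % SEQ_SPACE)
    # 3. Client acknowledges the server's SYN; the connection is now open.
    ack = Segment(("ACK",), seq=syn_ack.ack,
                  ack=(syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```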

Closing a TCP connection

In theory, a TCP connection can stay open indefinitely. Either side can start the close; when the client starts it, the sequence is as follows:

  1. Client sends a FIN signal. At this point, the client can still receive data, but can't send any more data.
  2. Server responds with ACK signal.
  3. Server sends a FIN signal to request closing the connection, and no longer sends any data.
  4. Client sends ACK signal to acknowledge the FIN segment.
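
The close sequence above can be observed from application code. In this small sketch over the loopback interface, the operating system sends the FIN and ACK segments for us: `shutdown(SHUT_WR)` triggers the client's FIN, and an empty `recv()` is how the other side sees it. The function name `half_close_demo` and the payloads are made up, and everything runs in one thread only because the messages are tiny.

```python
import socket
from typing import Tuple


def half_close_demo() -> Tuple[bytes, bytes]:
    """Open a loopback connection, half-close it, and return what each side got."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=5)
    server, _ = listener.accept()
    server.settimeout(5)

    client.sendall(b"request")
    client.shutdown(socket.SHUT_WR)  # step 1: client sends FIN, can still receive

    received = b""
    while True:
        chunk = server.recv(1024)
        if not chunk:  # empty read: the client's FIN has arrived
            break
        received += chunk

    server.sendall(b"response")  # the server may still send after the client's FIN
    server.close()               # step 3: server sends its own FIN

    reply = b""
    while True:
        chunk = client.recv(1024)
        if not chunk:  # empty read: the server's FIN has arrived
            break
        reply += chunk

    client.close()
    listener.close()
    return received, reply


print(half_close_demo())
```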

Once the TCP connection has been closed, the side that closed first waits through a cooldown period (the TIME_WAIT state) to ensure that there is no confusion between this connection and the next connection.

Reliable transmission and network congestion

In contrast to UDP, packets in TCP can't be fired and forgotten. The throughput of all data transmitted over TCP is managed: all data sent must be acknowledged, and the sender may only have as much unacknowledged data in flight as fits within the window advertised by the receiving host.

A sequence number is assigned to each byte transmitted, and each byte requires a positive acknowledgement from the receiving host. If the ACK is not received within a timeout period, the data is retransmitted with a longer timeout period, until the maximum is reached.
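
Here is a rough sketch of that retransmission loop. It assumes the timeout doubles on every retry and is capped at a maximum, which is a common backoff policy but not the only one. The function names, the default numbers, and the `try_send` callback (which should return True if an ACK arrived within the given timeout) are all hypothetical.

```python
from typing import Callable, List


def retransmission_schedule(initial_rto: float, max_rto: float,
                            max_retries: int) -> List[float]:
    """Timeouts for the first attempt and each retry, doubling up to max_rto."""
    timeouts = []
    rto = initial_rto
    for _ in range(max_retries + 1):  # one initial attempt plus the retries
        timeouts.append(rto)
        rto = min(rto * 2, max_rto)
    return timeouts


def send_with_retries(try_send: Callable[[float], bool],
                      initial_rto: float = 1.0, max_rto: float = 60.0,
                      max_retries: int = 5) -> int:
    """Call try_send(timeout) until it reports an ACK; return the attempt index."""
    schedule = retransmission_schedule(initial_rto, max_rto, max_retries)
    for attempt, rto in enumerate(schedule):
        if try_send(rto):
            return attempt
    raise TimeoutError("no acknowledgement after %d attempts" % len(schedule))


print(retransmission_schedule(1.0, 60.0, 5))  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
```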

The window slides over the entire data sequence until all of it has been received and acknowledged. This window regulates TCP throughput, and also decreases network congestion by having hosts transmit only the data that the receiving end is ready to receive.
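
As a very simplified sketch of the window idea: the sender below never has more than `window` unacknowledged bytes in flight, and a pretend lossless receiver acknowledges each batch in full before the window moves on. Real TCP slides the window forward as each ACK arrives and resizes it as it goes, so treat this as a toy model. The name `send_in_windows` and its parameters are made up.

```python
from typing import List


def send_in_windows(data: bytes, window: int, segment_size: int) -> List[List[bytes]]:
    """Split data into segments, grouped by what fits in the window at once."""
    if window <= 0 or segment_size <= 0:
        raise ValueError("window and segment_size must be positive")
    rounds = []
    base = 0  # sequence number of the oldest unacknowledged byte
    while base < len(data):
        limit = min(base + window, len(data))  # right edge of the window
        in_flight = [data[i:min(i + segment_size, limit)]
                     for i in range(base, limit, segment_size)]
        rounds.append(in_flight)
        # Assumed lossless receiver: everything in flight is acknowledged,
        # so the window slides forward to the first byte not yet sent.
        base = limit
    return rounds


print(send_in_windows(b"abcdefghij", window=6, segment_size=3))
# [[b'abc', b'def'], [b'ghi', b'j']]
```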

Thanks for reading. Feel free to tweet at me if you have any questions.

Part three will be a breakdown of the different major routing protocols:
Border Gateway Protocol (BGP), Routing Information Protocol (RIP), and Open Shortest Path First (OSPF).

Sources