This is my first post in a series covering the basics of internet networking. This is a new topic to me, so I'll likely spend the next couple of weeks focused on learning this material. Feel free to tweet at me if you have any requests or suggestions.
In 1974 Vint Cerf and Bob Kahn published a paper entitled "A Protocol for Packet Network Intercommunication", which described a way of sharing resources between different networks. The ideas and original "Transmission Control Program" model were refactored into a more modular system adhering to the end-to-end principle, known today as the Internet Protocol (IP) Suite.
Packet Switching Networks
Packet switching, as conceived by Donald Davies, was incorporated into the design of the US Department of Defense's early ARPANET. The ARPANET is well known as the direct antecedent of the internet, and it's because of this that packet-switching networks are the primary basis for data communications in computer networks worldwide.
In contrast to circuit switching, which pre-allocates network resources for each communication, packet switching enables all channels to be agnostic of the data being sent through, so long as each unit of data has a distinguishable header explaining how it is to be treated. For instance, in a circuit-switching network, a communication session is leased at a certain bitrate and latency for a certain period of time, while a packet-switching network can handle transferring any type of data at any bitrate, so long as it conforms to the datagram structure, composed of a header and payload.
Each channel in a packet-switching network can carry a variable-bit-rate stream of packets, so long as header information is available indicating the final destination. As the packets encounter switches and routers, they are received, buffered, queued, and forwarded, immediately freeing up the channel for other packets.
In a connectionless packet-switching network, each packet header includes a destination address, source address, port numbers, and other metadata required for successful routing. The packets are routed individually, and can arrive out of order. The packets are referred to as datagrams (the underlying data type of IP), and the process is referred to as datagram switching. Examples of protocols that use datagram switching are the Internet Protocol, the User Datagram Protocol (which adds a few nice features such as checksumming), and Ethernet.
A connection-oriented network uses a virtual circuit (also called a virtual connection or virtual channel), which is established across the nodes between the origin and the destination. From that point, much of the header information can be stripped out, enabling a simple byte stream of data to be delivered between the nodes. Crucially, a virtual circuit enables higher-level protocols to simply deal with the data without having to deal with individual digital transmission units, such as segments, packets, or frames. The destination node receives the data in order.
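To make the header-plus-payload idea a bit more concrete, here is a rough Python sketch that packs and unpacks the fixed 20-byte IPv4 header with the standard struct module. The function names (build_datagram, parse_datagram) and the sample addresses are made up for illustration, the header checksum is simply left at zero, and IP options are ignored. Note that in the IP suite the port numbers travel in the transport header inside the payload, so they don't appear in this header.

```python
import socket
import struct

# Fixed 20-byte IPv4 header (no options), network byte order:
# version/IHL, TOS, total length, identification, flags/fragment offset,
# TTL, protocol, header checksum, source address, destination address.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def build_datagram(src: str, dst: str, payload: bytes, protocol: int = 17) -> bytes:
    """Prepend a simplified IPv4 header to payload (checksum left as 0)."""
    version_ihl = (4 << 4) | 5  # version 4, header length of 5 32-bit words
    total_length = IPV4_HEADER.size + len(payload)
    header = IPV4_HEADER.pack(
        version_ihl, 0, total_length, 0, 0, 64, protocol, 0,
        socket.inet_aton(src), socket.inet_aton(dst),
    )
    return header + payload


def parse_datagram(datagram: bytes) -> dict:
    """Split a datagram back into a few header fields and its payload."""
    (version_ihl, _tos, total_length, _ident, _frag, ttl, protocol,
     _checksum, src, dst) = IPV4_HEADER.unpack_from(datagram)
    header_len = (version_ihl & 0x0F) * 4  # IHL counts 32-bit words
    return {
        "version": version_ihl >> 4,
        "ttl": ttl,
        "protocol": protocol,  # 17 is UDP, 6 is TCP
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
        "payload": datagram[header_len:total_length],
    }


datagram = build_datagram("192.0.2.1", "198.51.100.7", b"hello")
print(parse_datagram(datagram))
```

Everything a router needs to forward the packet sits in those first 20 bytes; the payload is opaque to it.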
The "IP" in TCP/IP refers to the fundamental internet layer: a connectionless datagram service responsible for relaying datagrams across a network in a "best-effort" fashion. It doesn't handle security, integrity, or the order in which data is received; its only job is to route packets based on the IP addresses in the packet headers.
IP protocols (how IP messages are formed, sent, and received) have developed over time. We're currently seeing a transition from IPv4 to IPv6.
IPv4 is the most widely used version of the protocol. It carries the version number 4 because the first three were experimental versions, used between 1977 and 1979. The successor to IPv4 is IPv6, which currently accounts for approximately 25% of internet addresses. The overwhelming reason to switch to IPv6 is that there are more addresses available. Many, many more.
IPv4 uses 32 bits for addressing, which means there can only exist about 4.3 billion addresses. When we consider the growth of the internet, and the inclusion of a whole "internet of things", it's obvious that this is not nearly enough. IPv6 uses 128-bit addresses, which Wikipedia tells me yields 340 undecillion addresses. That's a lot of IoT lightbulbs.
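The arithmetic behind those numbers is easy to check, since an n-bit address gives 2 to the power n possible values. A quick Python sketch (the variable names are my own):

```python
ipv4_total = 2 ** 32    # 32-bit addresses
ipv6_total = 2 ** 128   # 128-bit addresses

print(f"IPv4: {ipv4_total:,}")   # 4,294,967,296 (about 4.3 billion)
print(f"IPv6: {ipv6_total:.3e}") # about 3.403e+38, i.e. 340 undecillion
print(f"IPv6 addresses per IPv4 address: 2**{128 - 32}")
```

This counts raw bit patterns only; in practice some ranges are reserved, so the usable totals are smaller.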
Another major distinction between IPv4 and IPv6 is that IPv4 routers can automatically fragment a datagram into smaller units for transmission, while IPv6 routers do not. IPv6 relies on the end stations and higher-level protocols for this, which is consistent with the end-to-end principle.
There are also differences in the types of addresses. IPv4 historically has five address classes (A, B, C, D, E), with D reserved for multicast. IPv6 has three address types: unicast, multicast, and anycast.
Finally, the representation of the addresses is quite different. Most people are familiar with the look of IPv4 addresses, in their common "xxx.xxx.xxx.xxx" format, with "localhost" as "127.0.0.1". IPv6 looks much more complicated, as it is denoted by eight groups of four hexadecimal digits, separated by colons: "xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx".
While the overriding reason to switch to IPv6 is the availability of more addresses, the additional differences in the protocol allow for more efficient routing and reduced management requirements between nodes on the network.
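Python's standard ipaddress module is a handy way to poke at both notations. This is just a small sketch: "::1" is the IPv6 loopback address (the counterpart of 127.0.0.1), and the variable names are my own.

```python
import ipaddress

v4 = ipaddress.ip_address("127.0.0.1")
v6 = ipaddress.ip_address("::1")  # IPv6 loopback

print(v4.version, int(v4))      # 4 2130706433 (the same address as one 32-bit integer)
print(v6.version, v6.exploded)  # 6 0000:0000:0000:0000:0000:0000:0000:0001
print(v6.compressed)            # ::1
```

The exploded form is the full eight-group notation, while the compressed form collapses a run of all-zero groups into "::", which is why IPv6 addresses often look shorter in practice than the full format suggests.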
On top of IP, at the transport layer, are found the more sophisticated features of communications networking. UDP retains the connectionless datagram approach to data transmission, but adds an optional checksum to provide data integrity. TCP provides a connection-oriented service allowing more flexibility and reliability in data transmission between two parties.
A unit of data transferred in TCP is called a "segment", and TCP uses sequence and acknowledgment numbers to recover lost segments, detect out-of-order segments, and resolve transmission errors. This ensures a reliable stream of segments, with the added overhead of a more complicated handshake when setting up the virtual circuit.
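To give a feel for what such a checksum does, here is a minimal Python sketch of the 16-bit ones' complement sum described in RFC 1071, which is the style of checksum UDP uses. The function name internet_checksum is my own, and this is a simplification: a real UDP checksum is also computed over a pseudo-header and the UDP header, not just the data.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


message = b"hello, world"  # even length, so the checksum aligns as one word
checksum = internet_checksum(message)

# The receiver recomputes over data plus checksum; zero means it verifies.
received = message + checksum.to_bytes(2, "big")
print(internet_checksum(received) == 0)   # True

corrupted = b"jello, world" + checksum.to_bytes(2, "big")
print(internet_checksum(corrupted) == 0)  # False
```

A checksum like this catches accidental corruption in transit, but it offers no protection against deliberate tampering.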
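The role of sequence numbers can be illustrated with a toy Python sketch. This is not how a real TCP stack is written (real sequence numbers are 32-bit, start from a negotiated initial value, and interact with windows and timers); the reassemble function and its (sequence_number, payload) tuples are hypothetical, meant only to show how numbering bytes lets a receiver reorder segments, drop duplicates, and spot a gap.

```python
def reassemble(segments, initial_seq=0):
    """Rebuild the in-order byte stream from segments in arrival order.

    segments: iterable of (sequence_number, payload) pairs, where the
    sequence number is the stream offset of the payload's first byte.
    Returns (in_order_bytes, next_expected_seq); the second value plays
    the role of the acknowledgment number sent back to the sender.
    """
    pending = {}
    for seq, payload in segments:
        pending.setdefault(seq, payload)  # ignore duplicate segments

    stream = bytearray()
    expected = initial_seq
    while expected in pending:
        payload = pending.pop(expected)
        stream += payload
        expected += len(payload)
    return bytes(stream), expected


# Segments arrive out of order, one is duplicated, and the last is lost.
arrived = [(6, b"world"), (0, b"hello "), (6, b"world")]
print(reassemble(arrived))                   # (b'hello world', 11)

# After the sender retransmits the missing segment, the stream completes.
print(reassemble(arrived + [(11, b"!")]))    # (b'hello world!', 12)
```

The returned next-expected value is what tells the sender which byte to retransmit from.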
My next post will cover the workings of this handshake, how TCP can provide secure connections, and how it provides for HTTP and WebSockets.