Writing your own TCP/IP stack may seem like a daunting task. Indeed, TCP has accumulated many specifications over its lifetime of more than thirty years. The core specification, however, is seemingly compact [1] – the important parts being TCP header parsing, the state machine, congestion control and retransmission timeout computation.
The most common layer 2 and layer 3 protocols, Ethernet and IP respectively, pale in comparison to TCP’s complexity. In this blog series, we will implement a minimal userspace TCP/IP stack for Linux.
The purpose of these posts and the resulting software is purely educational – to learn network and system programming at a deeper level.
- TUN/TAP devices
- Ethernet Frame Format
- Ethernet Frame Parsing
- Address Resolution Protocol
- Address Resolution Algorithm
- Conclusion
- Sources
To intercept low-level network traffic from the Linux kernel, we will use a Linux TAP device. In short, TUN and TAP devices are virtual network interfaces that userspace networking applications use to read and write traffic directly: a TUN device carries layer 3 (IP) packets, while a TAP device carries layer 2 (Ethernet) frames. A popular use is tunneling, where a packet is wrapped inside the payload of another packet.
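To make the idea of tunneling concrete, here is a minimal sketch of wrapping one packet inside another. The `outer_hdr` layout and the `encapsulate` helper are invented for this illustration and do not correspond to any real tunneling protocol; the header fields are also left in host byte order for brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-byte outer header, made up for this example only. */
struct outer_hdr {
    uint16_t tunnel_id;  /* which tunnel the wrapped packet belongs to */
    uint16_t inner_len;  /* length of the wrapped packet in bytes */
};

/*
 * Copy the outer header followed by the untouched inner packet into `out`.
 * Returns the number of bytes written, or 0 if the result would not fit.
 */
static size_t encapsulate(uint16_t tunnel_id,
                          const uint8_t *inner, size_t inner_len,
                          uint8_t *out, size_t out_cap)
{
    struct outer_hdr hdr;

    if (inner_len > UINT16_MAX || out_cap < sizeof(hdr) + inner_len)
        return 0;

    hdr.tunnel_id = tunnel_id;
    hdr.inner_len = (uint16_t)inner_len;

    memcpy(out, &hdr, sizeof(hdr));               /* outer header first */
    memcpy(out + sizeof(hdr), inner, inner_len);  /* inner packet as payload */

    return sizeof(hdr) + inner_len;
}
```

The receiving end of such a tunnel would strip the outer header and hand the inner packet on unchanged, which is roughly what a VPN program does with the traffic it reads from a TUN/TAP device.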
The advantage of TUN/TAP devices is that they’re easy to set up in a userspace program and they are already being used in a multitude of programs, such as OpenVPN.
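The choice between the two device types comes down to a single flag in the request passed to the kernel. The sketch below shows only that part; `tuntap_request` is a hypothetical helper name, while `IFF_TUN`, `IFF_TAP` and `IFF_NO_PI` are the flags defined in `<linux/if_tun.h>`.

```c
#include <string.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/*
 * Hypothetical helper: prepare an ifreq describing the device we want.
 * layer2 != 0 asks for a TAP device (Ethernet frames),
 * layer2 == 0 asks for a TUN device (IP packets).
 */
static void tuntap_request(struct ifreq *ifr, const char *name, int layer2)
{
    memset(ifr, 0, sizeof(*ifr));

    /* IFF_NO_PI tells the kernel not to prepend its extra
     * packet-information header to each frame or packet. */
    ifr->ifr_flags = (layer2 ? IFF_TAP : IFF_TUN) | IFF_NO_PI;

    /* An empty name lets the kernel pick one. The memset above
     * guarantees the copied name stays NUL-terminated. */
    if (name && *name)
        strncpy(ifr->ifr_name, name, IFNAMSIZ - 1);
}
```

A request prepared this way would then be handed to the `TUNSETIFF` ioctl on the opened device file descriptor.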
As we want to build the networking stack from the layer 2 up, we need a TAP device. We instantiate it like so:
/*
* Taken from Kernel Documentation/networking/tuntap.txt
*/
int tun_alloc(char *dev)
{
struct ifreq ifr;
int fd, err;
if( (fd = open("/dev/net/tap", O_RDWR)) < 0 ) {