While working on TheFleet I came across questions that were difficult to answer with my minimal understanding of the system-level functions used to send messages over a network. So I read a guide on the subject [1] and took notes. The aim of this article is to present those notes in a way that clarifies how certain C functions allow one to communicate over a network via the abstraction of writing to and reading from a unix file.
File descriptors
Each unix process has its own file descriptor table. [2] Each entry in that table refers to an open file that the kernel tracks on the process's behalf. When a process calls a system-level function that operates on a file, it must pass the file's file descriptor [3] as a parameter to the function.
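As a small illustration of passing descriptors to system calls, the sketch below pushes a short message through a pipe. The helper name pipe_roundtrip is made up for this example, and it assumes the message is small enough to fit in the pipe's buffer so that write does not block.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: pushes msg through a pipe and reads it back.
 * Returns the number of bytes read into out, or -1 on error. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t outlen)
{
    int fds[2];  /* two entries in this process's descriptor table */

    if (pipe(fds) == -1)
        return -1;

    /* The descriptor, not a file name, tells write and read what to act on. */
    ssize_t n = write(fds[1], msg, strlen(msg));  /* fds[1] is the write end */
    if (n > 0)
        n = read(fds[0], out, outlen);            /* fds[0] is the read end */

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

The two integers that pipe stores in fds are just indexes into the table; every later call refers to the pipe only through them.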
Sockets
int socket(int domain, int type, int protocol); [4]
A socket is a special type of unix file used for either local interprocess communication or communication over a network. The function socket creates a socket and returns its socket descriptor. [5]
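A minimal sketch of creating a socket and then asking the kernel what it created. The helper names make_tcp_socket and socket_type are invented for this example; the getsockopt call with SO_TYPE is standard.

```c
#include <sys/socket.h>

/* Creates a TCP socket; returns its descriptor, or -1 on error. */
int make_tcp_socket(void)
{
    /* AF_INET = IPv4, SOCK_STREAM = byte stream, 0 = default protocol (TCP) */
    return socket(AF_INET, SOCK_STREAM, 0);
}

/* Asks the kernel what type of socket sockfd refers to (e.g. SOCK_STREAM).
 * Returns -1 if sockfd is not a socket or another error occurs. */
int socket_type(int sockfd)
{
    int type = 0;
    socklen_t len = sizeof type;

    if (getsockopt(sockfd, SOL_SOCKET, SO_TYPE, &type, &len) == -1)
        return -1;
    return type;
}
```

Calling socket_type on an ordinary file descriptor fails, which is one way to tell a socket descriptor apart from other entries in the same table.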
Making a socket available to the world
int bind(int sockfd, const struct sockaddr *my_addr, socklen_t addrlen); [6]
Calling bind tells the kernel to tie a socket to an (ip address, port) pair. [7] The pair serves as the address of the socket on the internet.
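The following sketch binds a socket to the loopback address and reads the resulting port back with getsockname. The helpers bind_loopback and bound_port are illustrative names, and IPv4 is assumed throughout.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Binds sockfd to 127.0.0.1:port. Port 0 asks the kernel to choose a port.
 * Returns 0 on success, -1 on error. */
int bind_loopback(int sockfd, unsigned short port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* 127.0.0.1 */
    addr.sin_port = htons(port);                    /* network byte order */

    return bind(sockfd, (struct sockaddr *)&addr, sizeof addr);
}

/* Returns the local port sockfd is bound to (host byte order), or -1. */
int bound_port(int sockfd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    if (getsockname(sockfd, (struct sockaddr *)&addr, &len) == -1)
        return -1;
    return ntohs(addr.sin_port);
}
```

Note the cast from struct sockaddr_in to the generic struct sockaddr that bind expects, and the byte-order conversions on both the address and the port.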
Establishing connections
int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen); [8]
connect establishes a connection to the remote socket at the (ip address, port) pair given in serv_addr. If the socket has not already been bound, the kernel also ties it to a local (ip address, port) pair, much as bind would. This local pair is used as the return address for messages sent to the remote socket, and the kernel picks an available port for it (not necessarily at random). Thus when a process sends a packet over the internet, it almost always uses a port number different from the one in the foreign address. [9]
A process can implement typical server behavior by passing a socket descriptor to the functions bind, listen, and accept in order to establish many connections on the same local port.
int listen(int sockfd, int backlog); [10]
listen tells the kernel to create a limited-size queue for incoming connections. When the queue is full, new connection attempts are refused or ignored.
int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
accept pops a connection off the listen queue. It returns a new socket descriptor to be used for that connection.
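To see bind, listen, connect, and accept working together, the sketch below plays both roles inside a single process over loopback. The function name loopback_handshake is invented for this example. It relies on the kernel completing the TCP handshake for a listening socket before accept is called, which is why no second thread or process is needed here.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Connects a client to a listener inside one process, over loopback.
 * On success returns 0 and stores the server's port and the client's
 * kernel-chosen source port; returns -1 on any error. */
int loopback_handshake(int *server_port, int *client_port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int listener = -1, client = -1, conn = -1, rc = -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);  /* let the kernel choose the server port */

    listener = socket(AF_INET, SOCK_STREAM, 0);
    client = socket(AF_INET, SOCK_STREAM, 0);
    if (listener == -1 || client == -1)
        goto done;

    /* Server side: bind, listen, then learn which port was assigned. */
    if (bind(listener, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(listener, 5) == -1 ||
        getsockname(listener, (struct sockaddr *)&addr, &len) == -1)
        goto done;
    *server_port = ntohs(addr.sin_port);

    /* Client side: connect without calling bind first. */
    if (connect(client, (struct sockaddr *)&addr, sizeof addr) == -1)
        goto done;

    /* accept hands back a new descriptor for this one connection. */
    conn = accept(listener, NULL, NULL);
    if (conn == -1)
        goto done;

    len = sizeof addr;
    if (getsockname(client, (struct sockaddr *)&addr, &len) == -1)
        goto done;
    *client_port = ntohs(addr.sin_port);
    rc = 0;

done:
    if (conn != -1) close(conn);
    if (client != -1) close(client);
    if (listener != -1) close(listener);
    return rc;
}
```

Comparing the two ports after a successful call shows the client using a different, kernel-assigned source port, while the listener keeps its own descriptor and remains free to accept further connections.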
Established connections are uniquely determined by (source ip, source port, foreign ip, foreign port) tuples. So a machine with one ip address has a theoretical limit of 65536 * 65536 established connections per foreign ip address. Resource limitations make the maximum number of possible connections much smaller in practice.
One can see established connections with the command:
netstat -natu | grep 'ESTABLISHED'
Sending and receiving messages
Once a socket has an established connection, the functions send and recv are used to write to and read from the socket.
ssize_t send(int sockfd, const void *msg, size_t len, int flags);
ssize_t recv(int sockfd, void *buf, size_t len, int flags);
send and recv do not guarantee that all len bytes are written or read.
send returns the number of bytes actually sent, or -1 on error.
recv returns the number of bytes actually read, 0 if the remote side closed the connection, or -1 on error.
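Because of these partial transfers, callers usually wrap send and recv in loops. The sketch below shows one common shape for such loops; send_all and recv_all are illustrative names, not part of the sockets API, and the sketch does not retry calls interrupted by a signal (EINTR).

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Keeps calling send until all len bytes are written.
 * Returns 0 on success, -1 on error. */
int send_all(int sockfd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = send(sockfd, p, len, 0);
        if (n == -1)
            return -1;
        p += n;            /* skip past the bytes already sent */
        len -= (size_t)n;
    }
    return 0;
}

/* Keeps calling recv until len bytes arrive or the peer closes.
 * Returns the number of bytes read, or -1 on error. */
ssize_t recv_all(int sockfd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = recv(sockfd, p + got, len - got, 0);
        if (n == -1)
            return -1;
        if (n == 0)        /* remote side closed the connection */
            break;
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

A return value from recv_all that is smaller than len means the remote side closed the connection before the full message arrived.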
Closing connections
int close(int sockfd);
close, the function used for closing files in general, is used to close a connection.
- Perhaps I should have spent more time looking for a better source of information. The author of this guide seems to have some right ideas and is occasionally funny,
They say absence makes the heart grow fonder, and in this case, I believe it to be true. (Or maybe it’s age.) But what I can say is that after a decade-plus of not using Microsoft OSes for my personal work, I’m much happier! As such, I can sit back and safely say, “Sure, feel free to use Windows!” …Ok yes, it does make me grit my teeth to say that.
but despite knowing better he bows to the crowd
At this point in the guide, historically, I’ve done a bit of bagging on Windows, simply due to the fact that I don’t like it very much. But I should really be fair and tell you that Windows has a huge install base and is obviously a perfectly fine operating system.
The guide also includes sections on writing IPv6-compatible code. I skimmed over them. IPv6 is the network equivalent of Gavin monster blocks. [↩]
- If a process forks, the child process is given a copy of the parent process's table. Later changes to one table are not reflected in the other. If both parent and child try to read from the same file, the one that calls read first gets the data. [↩]
- i.e. the index of the file in the process's file descriptor table [↩]
- The parameters given to socket determine whether the socket will use tcp or udp, as well as whether it will be used for network or local communication. [↩]
- A socket descriptor is a file descriptor that refers to a socket. Socket descriptors and file descriptors share the same table. [↩]
- sockaddr is a struct of address information that includes the (ip address, port) pair. [↩]
- The ip address can be a loopback address such as 127.0.0.1 if the socket is being used for local communication. [↩]
- For connect, sockaddr refers to the remote address. [↩]
- Previously, I had the misconception that the same port was used for each side of the client and server pair. [↩]
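- The parameter backlog determines the length of the queue of incoming connections. It should be less than or equal to the system limit, which varies by system; the guide cites about 20, and the SOMAXCONN constant in <sys/socket.h> gives a platform-defined maximum. [↩]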
This is traditionally called the BSD sockets API due to its origin. Looks like the guide gave a decent overview; your next step would be the man pages now that you know where to look and the general terminology.
> The parameters given to socket determine whether the socket will use the tcp or udp as well as whether it will be used for network or local communication.
Picture a 2x2 grid. On one axis is the domain: inet or unix. On the other, the type: stream or datagram. There are some more obscure options, but for those named, all combinations are valid. Thus TCP and UDP provide the "wire protocol" implementation for the stream and datagram abstractions in the inet domain; while the same split exists for the unix domain, just without externally visible protocols. Unix-domain sockets are always local, though as you noted inet sockets can be local too.
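The grid can be checked directly by asking the kernel for each combination. This is only a sketch; count_valid_combinations is a made-up name, and passing 0 as the protocol lets the system pick the default for each (domain, type) pair.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Tries each (domain, type) pair in the 2x2 grid and returns how many
 * of the four the kernel accepts. */
int count_valid_combinations(void)
{
    const int domains[] = { AF_INET, AF_UNIX };
    const int types[] = { SOCK_STREAM, SOCK_DGRAM };
    int ok = 0;

    for (int d = 0; d < 2; d++) {
        for (int t = 0; t < 2; t++) {
            int fd = socket(domains[d], types[t], 0);
            if (fd != -1) {
                ok++;
                close(fd);
            }
        }
    }
    return ok;
}
```

On a typical unix system all four calls succeed, with (AF_INET, SOCK_STREAM) giving TCP and (AF_INET, SOCK_DGRAM) giving UDP.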
> Just like bind, connect tells the kernel to tie the given socket to an (ip address, port) pair. This pair is used as the return address for messages sent to the remote socket. The kernel picks a random available port for this purpose.
At first I thought you were talking about the destination address+port, as that's primarily what connect does. The source address+port auto-assignment happens if the socket isn't already bound (and note that it's not guaranteed to be all that random). In the case of tcp, connect indeed starts the process of establishing the connection, but it can also be used on datagram sockets, e.g. to enable use of send/recv rather than sendto/recvfrom. listen and accept, however, don't apply to datagram sockets.
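Here is a rough sketch of connect on a datagram socket, using two UDP sockets on loopback. The name udp_connected_roundtrip is invented for the example, and it assumes loopback delivery does not drop the datagram; otherwise the final recv would block.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sends msg between two UDP sockets on loopback using connect + send/recv
 * instead of sendto/recvfrom. Returns bytes received, or -1 on error. */
ssize_t udp_connected_roundtrip(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    ssize_t n = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);  /* kernel picks the receiver's port */

    if (rx != -1 && tx != -1 &&
        bind(rx, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        getsockname(rx, (struct sockaddr *)&addr, &len) == 0 &&
        /* No handshake happens here: connect only records the default
         * destination and auto-assigns tx a source port. */
        connect(tx, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        send(tx, msg, strlen(msg), 0) != -1)
        n = recv(rx, out, outlen, 0);  /* blocks until the datagram arrives */

    if (rx != -1) close(rx);
    if (tx != -1) close(tx);
    return n;
}
```

The receiving socket never calls listen or accept; it only binds and reads.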
To implement a typical server supporting concurrent clients, you either use multiple threads or processes, or else set the sockets to nonblocking and implement an event loop using poll or select.
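The core step of a poll-based event loop might look like the sketch below: wait on several descriptors at once and report which one has something to read. first_readable and MAX_FDS are names made up for this example, and a real loop would go on to call accept or recv on the returned descriptor and then poll again.

```c
#include <poll.h>
#include <sys/socket.h>  /* for the recv/accept calls a caller would make next */

#define MAX_FDS 16  /* arbitrary cap for this sketch */

/* Waits until one of fds[0..n-1] is readable. Returns its index,
 * or -1 on timeout or error. */
int first_readable(const int *fds, int n, int timeout_ms)
{
    struct pollfd pfds[MAX_FDS];

    if (n < 1 || n > MAX_FDS)
        return -1;
    for (int i = 0; i < n; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;  /* interested in incoming data */
        pfds[i].revents = 0;
    }

    if (poll(pfds, (nfds_t)n, timeout_ms) <= 0)
        return -1;

    /* POLLHUP is included so a closed peer is also reported. */
    for (int i = 0; i < n; i++)
        if (pfds[i].revents & (POLLIN | POLLHUP))
            return i;
    return -1;
}
```

A listening socket also reports POLLIN when a connection is waiting in its queue, so the same call can watch the listener alongside established connections.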
A final call that belongs in this set is shutdown. You can do with just close in many cases, but shutdown allows closing one direction first, or forcibly closing when other processes sharing the socket may still have it open.
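To illustrate closing one direction first, this sketch uses a connected unix-domain socket pair as a stand-in for a tcp connection (an assumption made to keep the example in one process). half_close_demo is a hypothetical name.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Half-closes one end of a connected pair and checks both directions.
 * Returns 0 if the peer saw end-of-stream yet could still reply, else -1. */
int half_close_demo(void)
{
    int sv[2];
    char c = 0;
    int rc = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* sv[0] stops sending but can still receive. */
    if (shutdown(sv[0], SHUT_WR) == 0 &&
        recv(sv[1], &c, 1, 0) == 0 &&       /* 0 = peer closed its side */
        send(sv[1], "k", 1, 0) == 1 &&      /* reverse direction still open */
        recv(sv[0], &c, 1, 0) == 1 && c == 'k')
        rc = 0;

    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

The descriptors still have to be released with close afterwards; shutdown only changes the state of the connection.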