Cserve - About
CServe is a web server written in C. Inspired from nginx, and started with a huge motivation to learn network programming and system-level thinking. The project is divided into chapters, each of which focuses on a specific feature or technique. The chapters are ordered by difficulty level, starting with the basics and gradually increasing in complexity.
Note
This document shows the 1st phase of the project implementation (April 13 - June 8). The 0.2.0 version is in progress...
The overall plan was as follows:
0️⃣ Basic TCP Server
Goal
Create a foundation that listens for and accepts TCP connections, setting up the socket infrastructure for later HTTP functionality.
Implementation
Use a Server
struct to store socket configuration. Implement server_constructor
to initialize the struct, create a socket with socket()
, bind it to a port using bind()
and start listening with listen()
. Don't forget to set the S0_REUSEADDR
option to avoid "address already in use" errors.
Create a launch()
function that runs an infinite loop, calling accept()
to handle incoming connections and logging each connection.
1️⃣ Basic HTTP Server
Goal
Extend the TCP server to parse HTTP requests and respond with a static HTTP response, introducing HTTP protocol handling.
Implementation
Define an HTTPRequest
struct to store parsed request data:
- method
, path
and version
- Headers
- header_count
- Body
- body_length
Implement http_request_parser
to read the request from a client socket, tokenize the request line, and parse headers until an empty line (\r\n
).
Create an HTTPResponse
struct for responses.
Best Practices
- Allocate memory dynamically for HTTPRequest
fields and free them with http_request_free
to prevent leaks.
- Test with curl http://localhost:8080 to verify the response.
- Use Valgrind to check for memory leaks in http_request_parse
.
2️⃣ Static File Serving
Goal
Serve static files (e.g., HTML, CSS, images) from a directory, adding a key Nginx feature.
Implementation
Introduce an HTTPServer
struct to manage HTTP-specific state: a reference to the Server
struct, a static_dir
path, and a handler
function pointer for processing requests. The constructor should also initialize static_dir
.
Best Practices
- Allocate a buffer for file contents dynamically based on file size.
- Test with a browser to verify files (e.g., index.html, images) are served correctly.
- Log file access errors (e.g., “File not found”) to debug issues.
- Hardcode common Content-Type
values (e.g., text/html
for .html, image/png
for .png) in a lookup table, deferring MIME type detection libraries.
- Check for path traversal attacks (e.g., ../../etc/passwd
) by validating the resolved path stays within static_dir
.
3️⃣ Reverse Proxy
Goal
Forward HTTP requests to a backend server, let's say FastAPI app on Uvicorn, enabling the server to act as a reverse proxy. Relay the backend’s response back to the client.
Implementation
Extend HTTPServer
to include a proxy_backends
array, which includes strings like “localhost:8000”
and backend_count
. Implement proxy_request
to:
- Create a new socket to connect to the backend using socket()
and connect()
.
- Reconstruct the client’s HTTP request (method, path, headers) and send it to the backend.
- Read the backend’s response and forward it to the client.
Update the handler function in HTTPServer
to decide whether to serve a static file or proxy based on the request path (e.g., proxy /api/*
to the backend, serve /static/*
from disk).
Best Practices
- Handle connection failures (e.g., backend down) by returning a 502 Bad Gateway
response.
- Test with a Python FastAPI app running on localhost:8000
and curl
to verify proxying.
- Use strace
to debug socket issues between the server and backend.
4️⃣ Load Balancing
Goal
Distribute requests across multiple backends to improve scalability, adding load balancing to the reverse proxy. Support multiple backend servers (e.g., localhost:8000
, localhost:8001
). Implement a round-robin algorithm to distribute requests evenly.
Implementation
Define a Backend
struct with host (e.g., “localhost”
) and port (e.g., 8000
). Store backends in HTTPServer
’s proxy_backends
as an array of strings, parsed into Backend
structs at runtime. Implement choose_backend
to select the next backend in a round-robin fashion, tracking the current index with a static variable. Modify proxy_request
to use the selected backend’s host and port. Update handler
to call choose_backend
for proxied requests.
Best Practices
- Log which backend handles each request to verify distribution.
- Test with multiple FastAPI instances (e.g., on ports 8000
and 8001
) and wrk
to simulate load.
- Assume backends are always available, adding health checks in a future iteration if needed.
5️⃣ Config File (.ini)
Goal
Add a configuration file to make the server flexible, allowing users to specify the port, static directory, and backends without recompiling. Parse a simple .ini
configuration file to set server parameters. Use the parsed values to initialize the Server
and HTTPServer
structs.
Implementation
Define a config file format like:
parse_config
to read the file line by line, extracting key-value pairs (e.g., port
, static_dir
, backend
). Then store parsed backends in a dynamic array for HTTPServer
. Update main
to call parse_config
and pass the results to server_constructor
and http_server_constructor
.
Best Practices
- Validate config values (e.g., port range, valid directory) to prevent crashes.
- Provide default values (e.g., port 8080
) if the config file is missing.
- Test with different config files to ensure flexibility.
6️⃣ Concurrency (Event-based)
Goal
Enable the server to handle multiple clients concurrently using an event-based model, improving performance and scalability.
- Replace the single-threaded accept()
loop with an event-driven model to handle multiple connections simultaneously. That way, process client requests (parsing, static files, proxying) without blocking other clients.
Implementation
Use select()
(then epoll()
later on) to monitor multiple sockets for events.
Modify launch()
to:
- Initialize a fd_set
for select
.
- Accept new connections and add client sockets to the set.
- Read and process requests from ready sockets, calling handler for HTTP logic.
- Close sockets after processing to free resources.
Maintain a dynamic array or list of active client sockets in HTTPServer to track state.
Best Practices
- Limit the number of concurrent connections (e.g., 1024
) to avoid select
’s file descriptor limit.
- Test with wrk
to measure throughput under load (e.g., 100 concurrent clients).
- Use Valgrind
to ensure no memory leaks in the event loop.
- Stick to a single thread to avoid threading complexity, mimicking Nginx’s event-driven model.