The article was originally posted here. Some of the gifs are not displayed here due to dev.to's restrictions.
What is a web server?
A web server is a program that takes a request to your website from a user and does some processing on it. Then, it might hand the request to the application layer. Two of the most popular web servers are Nginx and Apache. (They also offer features such as reverse proxying and load balancing, but primarily they act as web servers.)
Now, let me ask a question. Is the server that runs on your localhost during development a web server? After all, it processes whatever request you send and then loads the appropriate page. It might seem like a web server, but more technically it is called an app server. The app server loads the code and keeps the app in memory. When your app server gets a request from your web server, it tells your app about it. After your app is done handling the request, the app server sends the response back to the web server (and eventually to the user). For Rails in particular there are many app servers, such as Unicorn, Puma, Thin, and Rainbows.
But if there are so many servers that are tested by the community and used by thousands, why should we bother building another? Well, by building one from scratch we will get a better understanding of how these work.
What actions does an HTTP server actually perform?
So, let's break down what an HTTP server does.
Steps involved
When we visit a URL, the browser sends an HTTP request to the server. Now, what is HTTP? It is an application-level protocol that both sides of the conversation have to agree upon. There are many other protocols, such as FTP (File Transfer Protocol) and SMTP (Simple Mail Transfer Protocol) at the application level, and TCP (Transmission Control Protocol) at the transport level beneath them. HTTP, or HyperText Transfer Protocol, is simply the most popular of these on the web and is used by web applications and web servers to communicate with each other.
So, when we type a URL in the browser, it makes an HTTP "request" to the web server. The web server processes that request and sends back an HTTP "response", which gets rendered to the user in the browser.
History
The first HTTP standard, HTTP/1.0, was released in 1996 by Tim Berners-Lee and others. Now we have HTTP/2, published in 2015, which is a more efficient expression of HTTP's semantics "on the wire". Also, did you know that there is another successor, HTTP/3, which at the time of writing is already in use by over 4% of websites? (It runs over QUIC, which uses UDP instead of TCP as the transport protocol.)
How should we start?
So we need a tool that allows bi-directional communication between client and server: basically, a socket. A socket is an endpoint of a two-way communication channel between two programs running on a network. It has to be bound to a port so that the TCP layer can find the application the data is sent to. The server creates a listener socket and the client connects to it. We will not be implementing sockets ourselves; Ruby already ships a socket implementation in its standard library.
require "socket"
The socket library provides specific classes for handling the common transports as well as a generic interface for handling the rest. Basically, it talks to the operating system and performs the necessary actions for us.
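To see both endpoints of a connection in one place, here is a minimal sketch that opens a listener, connects a client to it, and passes one line in each direction. The helper name socket_round_trip is made up for this illustration, and it assumes the loopback address 127.0.0.1 is usable on your machine.

```ruby
require "socket"

# Hypothetical helper: starts a listener on a free port, connects a client to
# it, and sends one line in each direction over the same connection.
def socket_round_trip
  server = TCPServer.new("127.0.0.1", 0) # port 0 asks the OS for any free port
  port = server.addr[1]                  # the port the OS actually assigned

  client = TCPSocket.new("127.0.0.1", port) # client end of the connection
  session = server.accept                   # server end of the same connection

  client.puts "ping"
  from_client = session.gets # the server reads what the client wrote
  session.puts "pong"
  from_server = client.gets  # the client reads what the server wrote

  [from_client, from_server]
ensure
  [client, session, server].each { |s| s&.close }
end

p socket_round_trip
```

Both ends are ordinary I/O objects, which is why puts and gets work on them just like on a file.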
What should be the basic steps of the web server?
- Listen for connections
- Parse the request
- Process and send the response
1. Listen for connections
First, let's open a port and listen to all messages sent to that particular port. We can do that using the TCPServer.new or TCPServer.open method. [According to the docs they are synonymous.]
require "socket"
server = TCPServer.new("localhost", 8000)
Feel free to choose any port, but make sure it is available. Use the command "netstat -lntu" to list the ports that are currently used by a process, and don't use those.
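If you would rather check from Ruby than from netstat, one possible approach is to try binding to the port and see whether the operating system refuses. This is only a sketch: port_available? is a hypothetical helper, not something the socket library provides, and the answer can change between the check and the moment you actually start your server.

```ruby
require "socket"

# Hypothetical helper: true when we can bind to the port, false when the OS
# reports that it is already in use (or that we are not allowed to use it).
def port_available?(port, host = "127.0.0.1")
  TCPServer.new(host, port).close
  true
rescue Errno::EADDRINUSE, Errno::EACCES
  false
end

puts "8000 is free" if port_available?(8000)
```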
Now we would like to loop infinitely to process our incoming connections. When a client connects to our server, server.accept will return a TCPSocket, which can be used like any other Ruby I/O object. Since the connection was opened to make a request, we also want to read that request, which we can do with the gets method. It returns the first line of the request.
So now we have:
require "socket"
port = (ARGV[0] || 8000).to_i # take the port from the first command-line argument, default 8000
server = TCPServer.new("localhost", port)
while (session = server.accept)
  puts "Client connected..."
  puts "Request: #{session.gets}"
  session.close # we are done with this client
end
How to test this?
Open two terminals. In one, run the Ruby script; in the other, open irb and enter the following commands:
> require "socket"
> soc = TCPSocket.open("localhost", 8000)
> soc.puts "Hello There"
A much easier way to test is to run the script and visit that port in the browser. If your port is 8000, just visit http://localhost:8000. You will see something like this in the terminal running the script:
Client connected...
Request: GET / HTTP/1.1
You can also use the curl command for the same purpose.
Why just GET / HTTP/1.1?
Because an HTTP request is a multi-line string, and we read only its first line. Run the command curl -v localhost:8000 and you will notice something like this:
* Trying ::1:8000...
* Connected to localhost (::1) port 8000 (#0)
> GET / HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.74.0
> Accept: */*
>
In our script we used session.gets, which reads only one line from the IO stream. So, let's replace that with readpartial(2048). Here 2048 is the maximum number of bytes we are willing to read in one call. We can increase that, but for our case it is enough.
So far we have:
require "socket"
port = (ARGV[0] || 8000).to_i
server = TCPServer.new("localhost", port)
while (session = server.accept)
  puts "Request: #{session.readpartial(2048)}"
  session.close
end
Now run the script and the curl command again. It will print all of the HTTP request data.
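The difference between gets and readpartial can be seen without a network at all. The sketch below feeds the same raw request into a pipe, which stands in for the socket returned by server.accept. The helper fake_session and the constant RAW_REQUEST are invented for this example.

```ruby
# Hypothetical helper: writes a raw request into a pipe and returns the read
# end, standing in for the socket that server.accept would return.
def fake_session(raw_request)
  reader, writer = IO.pipe
  writer.write(raw_request)
  writer.close
  reader
end

RAW_REQUEST = "GET / HTTP/1.1\r\nHost: localhost:8000\r\nAccept: */*\r\n\r\n"

first_line = fake_session(RAW_REQUEST).gets              # stops at the first newline
everything = fake_session(RAW_REQUEST).readpartial(2048) # up to 2048 bytes that are already available

p first_line
p everything == RAW_REQUEST
```

Keep in mind that readpartial returns whatever has arrived so far, up to the limit. For a small request from curl or a browser that is normally the whole thing, but a large or slow request may need more than one read.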
2. Parsing the HTTP request
Right now we are just receiving the request as a string. We need to parse it so that our server can understand and process it further.
Let's look into the request once again:
GET / HTTP/1.1 # GET is the method, the / is the path, the HTTP part is the protocol
Host: localhost:8000 # Headers
User-Agent: curl/7.74.0
Accept: */*
The first line gives us
- method
- path
- protocol
All the lines after that are headers. So we write this function to parse the raw request string:
def parse(request_string)
  # the first line looks like "GET / HTTP/1.1"
  method, path, version = request_string.lines[0].split
  {
    method: method,
    version: version,
    path: path,
    headers: parse_headers(request_string),
  }
end
It calls another function, parse_headers, to parse the headers:
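def normalize(header)
  header.tr(":", "").to_sym
end
def parse_headers(request)
  headers = {}
  request.lines[1..-1].each do |line|
    break if line == "\r\n" # a blank line ends the header section
    # split on the first colon only, so values may contain spaces and colons
    header, value = line.split(":", 2)
    headers[normalize(header)] = value.to_s.strip
  end
  headers # always return the hash, even if no blank line was seen
end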
Now, instead of just printing the request, parse it first:
require "socket"
require "awesome_print" # gem install awesome_print; provides ap
server = TCPServer.new("localhost", 8000)
while (session = server.accept)
  ap parse(session.readpartial(2048))
end
I am using awesome_print to display the data in a formatted manner; you can replace ap with puts. You should now see the parsed request as a hash with the method, version, path, and headers.
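To try the parsing step without starting the server, you can feed it a hard-coded request. The sketch below uses a compact stand-in named parse_request (a hypothetical name, chosen so it does not clash with the parse function above) and a sample request modelled on the curl output shown earlier.

```ruby
# Hypothetical, compact variant of the parser described above, included only
# so that this example runs on its own.
def parse_request(raw)
  request_line, *header_lines = raw.lines
  method, path, version = request_line.split
  headers = {}
  header_lines.each do |line|
    break if line.strip.empty?       # a blank line ends the headers
    name, value = line.split(":", 2) # split on the first colon only
    headers[name.to_sym] = value.to_s.strip
  end
  { method: method, version: version, path: path, headers: headers }
end

sample = "GET /about.html HTTP/1.1\r\n" \
         "Host: localhost:8000\r\n" \
         "User-Agent: curl/7.74.0\r\n" \
         "Accept: */*\r\n" \
         "\r\n"

pp parse_request(sample)
```

The resulting hash has "GET" under :method, "/about.html" under :path, "HTTP/1.1" under :version, and a nested :headers hash with one entry per header line.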
3. Process and send the HTTP response
Now that we have all the data, we have to prepare and send the response. If the path of the request is "/", which refers to the home page, we respond with index.html. If it is something else, like localhost:8000/about.html, we respond with the file at that path, about.html.
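def prepare(parsed_req)
  path = parsed_req[:path]
  if path == "/"
    respond_with("index.html")
  else
    # drop the leading "/" so the file is looked up relative to the current directory
    respond_with(path.delete_prefix("/"))
  end
end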
respond_with checks whether the file exists. If it does, it responds with the file's contents; otherwise it returns a 404.
def respond_with(path)
  # File.exists? was removed in Ruby 3.2; File.exist? works on all versions
  if File.exist?(path)
    ok_response(File.binread(path))
  else
    error_response
  end
end
For the responses, we will be sending a string of this format. This is according to the HTTP spec. You can read more about the HTTP spec here.
def response(code, body = "")
  "HTTP/1.1 #{code}\r\n" +
    "Content-Length: #{body.bytesize}\r\n" + # length in bytes, not characters
    "\r\n" + # blank line separates headers from body
    body
end
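The helper above sends only the numeric status code. Real servers usually also send a short reason phrase after the code and a Content-Type header so the browser knows how to render the body. Here is one way that could look; REASON_PHRASES and full_response are hypothetical names introduced for this sketch, and text/html is assumed as the default type.

```ruby
# Hypothetical lookup table; only the two codes used in this article are listed.
REASON_PHRASES = { 200 => "OK", 404 => "Not Found" }.freeze

# Hypothetical variant of the response helper that also emits a reason phrase
# and a Content-Type header.
def full_response(code, body = "", content_type = "text/html")
  "HTTP/1.1 #{code} #{REASON_PHRASES.fetch(code, "")}\r\n" \
  "Content-Type: #{content_type}\r\n" \
  "Content-Length: #{body.bytesize}\r\n" \
  "\r\n" \
  "#{body}"
end

print full_response(404, "Not Found")
```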
So our ok_response and error_response will look like this:
def ok_response(body)
  response(200, body)
end
def error_response
  response(404)
end
Now that we have our response, we can send it back to the client. I have refactored the code a little bit; you can find the entire code here:
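For readers who want to see the three steps wired together in one file, here is a compact sketch. It is not the refactored code linked above: handle, serve, and demo are names invented for this example, the parsing is reduced to the request line, and the demo serves a temporary directory on a port picked by the OS so that it can run and exit on its own.

```ruby
require "socket"
require "tmpdir"

# Hypothetical helper: turns a raw request into a raw response by serving
# files from the directory root.
def handle(raw_request, root)
  _method, path, _version = raw_request.lines.first.to_s.split
  return "HTTP/1.1 400 Bad Request\r\nContent-Length: 0\r\n\r\n" if path.nil?

  path = "/index.html" if path == "/"
  root = File.expand_path(root)
  file = File.expand_path(File.join(root, path))
  # refuse paths such as /../secret that would leave the served directory
  inside_root = file.start_with?(File.join(root, ""))

  if inside_root && File.file?(file)
    body = File.binread(file)
    "HTTP/1.1 200 OK\r\nContent-Length: #{body.bytesize}\r\n\r\n" + body
  else
    "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"
  end
end

# Hypothetical helper: the accept loop. Reads one request per connection,
# writes the response, and closes the connection.
def serve(server, root)
  while (session = server.accept)
    begin
      session.write(handle(session.readpartial(2048), root))
    rescue EOFError
      # the client closed the connection without sending anything
    ensure
      session.close
    end
  end
end

# Hypothetical demo: serves a temporary directory, makes one request against
# it, and returns the raw response text.
def demo
  Dir.mktmpdir do |root|
    File.write(File.join(root, "index.html"), "<h1>Hello</h1>")
    server = TCPServer.new("127.0.0.1", 0) # port 0: let the OS pick a free port
    thread = Thread.new { serve(server, root) }

    client = TCPSocket.new("127.0.0.1", server.addr[1])
    client.write("GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
    reply = client.read # read until the server closes the connection
    client.close

    thread.kill
    thread.join
    server.close
    reply
  end
end

puts demo
```

To serve the current directory instead of running the demo, you would call serve(TCPServer.new("localhost", 8000), Dir.pwd) and stop it with Ctrl+C.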
Once everything is in place, we can finally run the script and visit the URL http://localhost:8000, which will render the contents of index.html. If you have any other pages in the same folder, like about.html, visiting http://localhost:8000/about.html will render them as well.
Yayy!! We have successfully built our own HTTP server