Previously, we dived into how to make an HTTP parser with no dependencies. In this post, let's extend our knowledge further by building an HTTP client CLI😉
(I call our brand-new CLI fetch, by the way.)
Implementing every feature expected of a modern HTTP client CLI would be very time-consuming. Instead, let's build a simple one that only supports HTTP/1.1 over IPv4 (no HTTP/2, HTTP/3, or IPv6, and some HTTP/1.1 features will be missing too).
API Design
Let's first think about how our brand-new CLI will be used. It should be similar to the well-known tool curl, but I like to pass the HTTP method explicitly as an argument.
We may also want to POST data to a server in the HTTP body (or not). So up to three arguments can be passed to the CLI:
- HTTP method
- URL
- HTTP body (Optional)
And here is an example command.
fetch post example.com '{"foo": "bar"}'
Now we can see what our internal API looks like. Let's write our main function:
use std::process::exit;

fn main() {
    let mut args = std::env::args();
    let program = args.next().unwrap();
    // Both the HTTP method and the URL are required
    if args.len() <= 1 {
        display_usage(&program);
        exit(1);
    }
    let method: Method = args.next().unwrap().parse().unwrap();
    let url = args.next().unwrap();
    // The HTTP body is optional
    let body = args.next();
    let client = Client::new(); // Initialize a client
    // Make an HTTP request
    let response = client.perform(method, url, body).unwrap();
    println!("{response}");
}
That's enough for now.
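The main function relies on `parse()` to turn the first argument into a `Method`, which means `Method` has to implement `FromStr`. The real type comes from the previous project and isn't shown here, so the following is only a minimal sketch of what it could look like; the variant names and the case-insensitive matching are my assumptions.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical stand-in for the project's `Method` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Method {
    Get,
    Post,
    Put,
    Delete,
}

impl FromStr for Method {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Accept `post`, `POST`, `Post`, ... so `fetch post example.com` works.
        match s.to_ascii_uppercase().as_str() {
            "GET" => Ok(Method::Get),
            "POST" => Ok(Method::Post),
            "PUT" => Ok(Method::Put),
            "DELETE" => Ok(Method::Delete),
            other => Err(format!("unsupported HTTP method: {other}")),
        }
    }
}

impl fmt::Display for Method {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // HTTP method names are written in upper case on the wire.
        let name = match self {
            Method::Get => "GET",
            Method::Post => "POST",
            Method::Put => "PUT",
            Method::Delete => "DELETE",
        };
        f.write_str(name)
    }
}
```

With something like this in place, `"post".parse::<Method>()` yields `Ok(Method::Post)`, and an unknown method becomes an error instead of a silent default.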
DNS Client
Now let's write our client struct and its corresponding method called perform:
pub struct Client {}

impl Client {
    pub fn new() -> Self {
        Self {}
    }
}

impl Client {
    pub fn perform(
        &self,
        method: Method,
        url: String,
        body: Option<String>,
    ) -> Result<String, String> {
        // Default to plain HTTP when the URL has no scheme
        let (protocol, url) = url.split_once("://").unwrap_or(("http", &url));
        // Everything before the first '/' is the hostname, the rest is the path
        let (hostname, url) = match url.split_once('/') {
            Some((hostname, url)) => (hostname, format!("/{url}")),
            None => (url, "/".to_string()),
        };
        // ...!?
    }
}
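As a side note, the two `split_once` calls at the top of perform can be tried in isolation. Here is a minimal sketch; `split_url` is a hypothetical helper name of mine, not something from the project, and it deliberately ignores ports and other URL details.

```rust
/// Hypothetical helper: splits a URL into (scheme, hostname, path).
pub fn split_url(url: &str) -> (&str, &str, String) {
    // No scheme given: assume plain HTTP.
    let (scheme, rest) = url.split_once("://").unwrap_or(("http", url));
    // Everything up to the first '/' is the hostname; the remainder is the path.
    match rest.split_once('/') {
        Some((hostname, path)) => (scheme, hostname, format!("/{path}")),
        None => (scheme, rest, "/".to_string()),
    }
}
```

So `split_url("example.com")` gives `("http", "example.com", "/")`, which is exactly what the example command from earlier needs.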
At this point I realized we need a way to resolve the hostname to an IP address in our project😓 Don't worry! I have implemented a DNS client before.
Let's just copy and paste the code for the DNS client from my repository.
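If you don't have a hand-written DNS client lying around, the standard library can ask the operating system's resolver instead. This is a different technique from the one used in this project, and `resolve_ipv4` is a hypothetical helper name; treat it as a sketch of the idea rather than the project's code.

```rust
use std::net::{Ipv4Addr, SocketAddr, ToSocketAddrs};

/// Hypothetical helper: resolves `hostname` to its first IPv4 address using
/// the operating system's resolver (not the hand-written DNS client).
pub fn resolve_ipv4(hostname: &str) -> Result<Ipv4Addr, String> {
    // `ToSocketAddrs` requires a port, but it does not affect the lookup.
    let mut addrs = (hostname, 80u16)
        .to_socket_addrs()
        .map_err(|e| e.to_string())?;
    // Our CLI only speaks IPv4, so skip any IPv6 results.
    addrs
        .find_map(|addr| match addr {
            SocketAddr::V4(v4) => Some(*v4.ip()),
            SocketAddr::V6(_) => None,
        })
        .ok_or_else(|| format!("no IPv4 address found for {hostname}"))
}
```

A literal address such as `127.0.0.1` is parsed directly without any lookup, which makes the helper easy to try offline.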
HTTP Request
Now we have an HTTP method, the server's hostname, a URL path, a payload to be sent, and the server's IP address.
So what do we do next? Well, we need to build our HTTP request from some of the parameters above. Let's do it.
let request = HTTPRequest::new(method, hostname, &url, body);
In short, our HTTPRequest struct looks like this:
#[derive(Debug, Clone)]
pub struct HTTPRequest {
    request_line: RequestLine,
    headers: HTTPHeaders,
    body: Option<String>,
}

impl HTTPRequest {
    pub fn new(method: Method, hostname: &str, url: &str, body: Option<String>) -> Self {
        let request_line = RequestLine::new(method, url);
        let headers: HTTPHeaders = vec![("Host".to_string(), hostname.to_string())].into();
        Self {
            request_line,
            headers,
            body,
        }
    }
}
Once we create an HTTP request, we can serialize it and send it to the server over a TCP stream:
// Connect to a server
let mut stream = match protocol {
    Protocol::HTTP => TcpStream::connect((addr, 80)).map_err(|e| e.to_string())?,
    Protocol::HTTPS => unimplemented!(),
};
// Send an HTTP request to the server
let request = HTTPRequest::new(method, hostname, &url, body).to_string();
// write_all keeps writing until every byte is sent; a single write may stop early
stream
    .write_all(request.as_bytes())
    .map_err(|e| e.to_string())?;
println!("sent {} bytes", request.len());
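The serialization itself (the `to_string()` call) isn't shown in this post, so here is a rough sketch of what an HTTP/1.1 request looks like on the wire. `SimpleRequest` is a hypothetical, flattened stand-in for HTTPRequest, and adding a Content-Length header for the body is my assumption about what a server needs, not necessarily what the project does.

```rust
use std::fmt;

/// Hypothetical, simplified request type used only for this illustration.
pub struct SimpleRequest {
    pub method: String,
    pub path: String,
    pub headers: Vec<(String, String)>,
    pub body: Option<String>,
}

impl fmt::Display for SimpleRequest {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Request line: method, request target and HTTP version, ended by CRLF.
        write!(f, "{} {} HTTP/1.1\r\n", self.method, self.path)?;
        // One "Name: value" line per header.
        for (name, value) in &self.headers {
            write!(f, "{name}: {value}\r\n")?;
        }
        // Tell the server how many body bytes follow, if there is a body.
        if let Some(body) = &self.body {
            write!(f, "Content-Length: {}\r\n", body.len())?;
        }
        // An empty line separates the headers from the body.
        f.write_str("\r\n")?;
        if let Some(body) = &self.body {
            f.write_str(body)?;
        }
        Ok(())
    }
}
```

For the example command from the beginning, this produces a request line `POST / HTTP/1.1`, a `Host: example.com` header, a `Content-Length: 14` header, an empty line, and then the JSON payload.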
HTTP Response
Finally, we can receive the HTTP response from the server and turn it into something processable (our HTTPResponse):
// After sending HTTP request, create a buf reader and get data in it
let reader = BufReader::new(stream);
let response = HTTPResponse::try_from(reader)?;
println!("{:?}", response);
And our HTTPResponse struct looks like this:
#[derive(Debug, Clone)]
pub struct HTTPResponse {
    status_line: StatusLine,
    headers: HTTPHeaders,
    body: Option<String>,
}

impl<R: Read> TryFrom<BufReader<R>> for HTTPResponse {
    type Error = String;

    fn try_from(reader: BufReader<R>) -> Result<Self, Self::Error> {
        let mut iterator = reader.lines().map_while(Result::ok).peekable();
        let status_line: StatusLine = iterator
            .next()
            .ok_or("failed to get status line")?
            .parse()?;
        let headers = HTTPHeaders::new(&mut iterator)?;
        // Whatever is left after the headers is the body
        let body = if iterator.peek().is_some() {
            Some(iterator.collect())
        } else {
            None
        };
        Ok(HTTPResponse {
            status_line,
            headers,
            body,
        })
    }
}
We are done! Unfortunately, not really.
Test
When I cargo run it, I found that the program doesn't finish after receiving an HTTP response from the server.
cargo run -- get example.com
// -> this just blocks and never ends...
Why? Our HTTPResponse parser implementation worked in the previous project. So it's supposed to work this time too...
Well, it turned out that in my previous project I tested my HTTPResponse parser using data in a text file. In a real-world HTTP response, however, the stream has no end of file. So we need to somehow stop reading the byte stream when we find an empty line.
Here is the updated version of my implementation:
impl<R: Read> TryFrom<BufReader<R>> for HTTPResponse {
    type Error = String;

    fn try_from(reader: BufReader<R>) -> Result<Self, Self::Error> {
        // ...
        let mut body = vec![];
        for data in iterator {
            // Break if it's just an empty line
            if data.is_empty() {
                break;
            }
            body.push(data);
        }
        Ok(HTTPResponse {
            status_line,
            headers,
            body: Some(body.join("\n")),
        })
    }
}
Then, let’s try it again.
cargo run -- get example.com
// -> works!
Congratulations!!
However, here is another challenge: when I run it against another URL (say, google.com), it again doesn't finish…
And this was the biggest lesson I learned in this project.
HTTP/1.1 Persistent Connections
By default, HTTP/1.1 uses persistent connections. This means a single TCP connection can be used to send and receive multiple HTTP requests and responses, which avoids the overhead of re-establishing a connection for each request. It also means the server does not close the connection after sending a response, so our reader never sees the end of the stream.
Many servers, including google.com, use chunked transfer encoding for the body. This encoding allows the server to send the body in chunks of variable size, with each chunk preceded by its size information.
Simply put, some HTTP responses don't have an empty line at the end of the body, so a parser that waits for one blocks forever.
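To make the chunked format a bit more concrete: each chunk starts with a line holding its size in hexadecimal, followed by that many bytes of data and a CRLF, and a chunk of size 0 marks the end. Below is a minimal sketch of a decoder for it. `read_chunked_body` is a hypothetical helper that is not part of this project; it assumes the reader is positioned right after the empty line that ends the headers, and it skips chunk extensions and trailer fields rather than interpreting them.

```rust
use std::io::{BufRead, Read};

/// Hypothetical helper: decodes a chunked HTTP body from `reader`.
pub fn read_chunked_body<R: BufRead>(reader: &mut R) -> Result<Vec<u8>, String> {
    let mut body = Vec::new();
    loop {
        // Each chunk starts with its size in hexadecimal, e.g. "1a\r\n".
        let mut size_line = String::new();
        let read = reader.read_line(&mut size_line).map_err(|e| e.to_string())?;
        if read == 0 {
            return Err("unexpected end of stream".into());
        }
        // Ignore optional chunk extensions after ';'.
        let size_str = size_line.trim().split(';').next().unwrap_or("").trim();
        let size = usize::from_str_radix(size_str, 16).map_err(|e| e.to_string())?;
        if size == 0 {
            // Last chunk: skip optional trailer fields up to the final empty line.
            loop {
                let mut line = String::new();
                let n = reader.read_line(&mut line).map_err(|e| e.to_string())?;
                if n == 0 || line.trim_end().is_empty() {
                    break;
                }
            }
            return Ok(body);
        }
        // Read exactly `size` bytes of chunk data.
        let start = body.len();
        body.resize(start + size, 0);
        reader
            .read_exact(&mut body[start..])
            .map_err(|e| e.to_string())?;
        // The chunk data is followed by a CRLF that is not part of the body.
        let mut crlf = [0u8; 2];
        reader.read_exact(&mut crlf).map_err(|e| e.to_string())?;
    }
}
```

Feeding it `4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n` yields the bytes of `Wikipedia`. The important part is that the decoder always knows how many bytes to expect next, so it never has to wait for the connection to close.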
My idea, after understanding the cause of the issue, was to read the Content-Length HTTP header and then read exactly that many bytes from the byte stream. (A response sent with chunked transfer encoding carries no Content-Length header, so this approach only covers responses that do.)
So here is my final implementation of the HTTPResponse parser:
impl<R: Read> TryFrom<BufReader<R>> for HTTPResponse {
    type Error = String;

    fn try_from(reader: BufReader<R>) -> Result<Self, Self::Error> {
        // .lines() splits the stream on newlines (\n or \r\n) and strips them,
        // so the exact bytes of the HTTP body cannot be recovered from it.
        // Instead, use .split(b'\n') and put the delimiter back ourselves.
        let mut iterator = reader.split(b'\n').map_while(Result::ok).peekable();
        let status_line: StatusLine = iterator
            .next()
            .ok_or("failed to get status line")?
            .try_into()?;
        let headers = HTTPHeaders::new(&mut iterator)?;
        // The length of the HTTP body in bytes
        let length = headers
            .0
            .get("Content-Length")
            .ok_or("HTTP header doesn't have Content-Length header in it")?
            .parse::<usize>()
            .map_err(|e| e.to_string())?;
        let mut body: Vec<u8> = Vec::with_capacity(length);
        // Check the length before pulling the next line, so that we never
        // wait for data past the end of the body (and never underflow a usize)
        while body.len() < length {
            let Some(mut data) = iterator.next() else { break };
            data.push(b'\n');
            body.extend(data);
        }
        // The last line may not have ended with '\n'; drop anything extra
        body.truncate(length);
        let body = String::from_utf8(body).map_err(|e| e.to_string())?;
        Ok(HTTPResponse {
            status_line,
            headers,
            body: Some(body),
        })
    }
}
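One caveat with the line-based loop: `.split(b'\n')` has to wait for a `\n` (or the end of the stream) before it yields the last line, so a body that doesn't end in a newline can still stall on a kept-alive connection. A possible alternative, sketched below under the assumption that the headers were read from the same reader (for example with `read_line` on a `&mut BufReader`), is to read the body with `read_exact`. `read_body` is a hypothetical helper name, not part of the project.

```rust
use std::io::Read;

/// Hypothetical helper: reads exactly `content_length` bytes of body.
pub fn read_body<R: Read>(reader: &mut R, content_length: usize) -> Result<String, String> {
    let mut body = vec![0u8; content_length];
    // `read_exact` returns as soon as the buffer is full, so it never waits
    // for data beyond the end of this response.
    reader.read_exact(&mut body).map_err(|e| e.to_string())?;
    String::from_utf8(body).map_err(|e| e.to_string())
}
```

It fails with an error if the stream ends before `content_length` bytes arrive, and it leaves any following bytes untouched in the reader.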
And here is the final result:
🎉
Thanks for reading 😉