This post follows on from an earlier post on my personal blog about learning Rust. While I was going through the initial "Hello, world!" phases I had a horrible idea for a first project:
"Asterisk is terrible! How hard could it be to write a PBX from scratch?"
Project Goals
Now, I've been pretty bad at documenting my projects so far; I've only written a single blog post on my Space Station 13 project. Hopefully I can be more proactive about this one.
The initial goal for this project was not to handle any calls at all. I only wanted to be able to point a softphone at my local IP address and have the PBX successfully reject the registration. This would prove that the software can at least talk to a real phone!
Incidentally, the name, Telex, comes from both the teleprinter network and a contraction of Telephone Exchange.
What does a PBX do exactly?
So, uh. Back to the actual project! Modern PBXes do two things: They handle the out-of-band signalling and handle the media (voice, video) streams.
The former involves actually placing, receiving and terminating calls. This is done using a protocol called SIP (Session Initiation Protocol), which is a lot like HTTP. By sending requests to and from the phone and PBX, the phone can authenticate itself, register itself to receive calls, and actually do the whole telephone thing. There are extensions on top that support text messaging, presence, etc. If you're really interested in the nitty-gritty, look at RFC 3261.
The latter is done by SIP instructing the phones to blindly send media via RTP (Real-time Transport Protocol) to some destination in the ether. This isn't something I am currently worrying about, as the PBX can tell the phones to send media packets to each other directly.
SIP can be performed over TCP, but usually SIP and RTP are both performed over UDP, which has !!fun!! networking implications.
Building The Foundations
The first thing I did was build representations of the SIP requests and responses. This is where the lack of hierarchical inheritance started to mess with my brain! I dug out my copy of Wireshark and a copy of the RFC and started pulling together example exchanges between the softphone on my desktop and the Asterisk PBX on my home server.
This involved writing a lot of boilerplate, but it allowed me to pull together a structured SIP request from a string. With all of the cases for the headers and the string conversion logic, the file exceeded 200 lines. Handily, Rust easily allows you to define functionality in other files - so I pulled out the implementations for converting to/from strings and validating into their own files, making it easier to organise.
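To give a flavour of what "a structured SIP request from a string" means, here is a minimal sketch. The `SipRequest` type and its fields are hypothetical names made up for illustration, not Telex's actual code, and it assumes CRLF line endings and no folded (multi-line) headers.

```rust
use std::str::FromStr;

/// A hypothetical, stripped-down SIP request (not Telex's real types).
#[derive(Debug, PartialEq)]
struct SipRequest {
    method: String,
    uri: String,
    version: String,
    /// Header names and values, kept in the order they arrived.
    headers: Vec<(String, String)>,
    body: String,
}

impl FromStr for SipRequest {
    type Err = String;

    fn from_str(raw: &str) -> Result<Self, Self::Err> {
        // The head (request line plus headers) ends at the first blank line;
        // whatever follows is the body.
        let (head, body) = raw.split_once("\r\n\r\n").unwrap_or((raw, ""));
        let mut lines = head.split("\r\n");

        // Request line, e.g. "REGISTER sip:example.com SIP/2.0".
        let request_line = lines.next().unwrap_or("");
        let mut parts = request_line.split(' ');
        let (method, uri, version) =
            match (parts.next(), parts.next(), parts.next(), parts.next()) {
                (Some(m), Some(u), Some(v), None) => (m, u, v),
                _ => return Err(format!("malformed request line: {:?}", request_line)),
            };

        // Each remaining line is "Name: value". Folded headers are not handled.
        let mut headers = Vec::new();
        for line in lines {
            let (name, value) = line
                .split_once(':')
                .ok_or_else(|| format!("malformed header: {:?}", line))?;
            headers.push((name.trim().to_string(), value.trim().to_string()));
        }

        Ok(SipRequest {
            method: method.to_string(),
            uri: uri.to_string(),
            version: version.to_string(),
            headers,
            body: body.to_string(),
        })
    }
}
```

With this in place, `raw.parse::<SipRequest>()` turns the text captured in Wireshark into something you can inspect field by field. A real parser needs a lot more validation than this.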
Rust does support metaprogramming, in the form of macros, that can generate this kind of string-conversion boilerplate for you. Unfortunately, mapping headers to enum variants isn't a simple string comparison: header names can contain dashes, which enum variant names cannot. Additionally, header names are case-insensitive, and SIP also defines shortened versions of some headers to help fit larger requests inside the maximum UDP packet size.
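As an illustration of why the header mapping ends up hand-written, here is a small sketch of a header-name enum. `HeaderName` and its variants are hypothetical and cover only a handful of headers; the single-letter compact forms are the ones listed in RFC 3261.

```rust
/// A hypothetical subset of SIP header names, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum HeaderName {
    Via,
    From,
    To,
    CallId,
    Contact,
    ContentLength,
    CSeq,
    /// Anything not recognised is kept as it was received.
    Other(String),
}

impl HeaderName {
    /// Matches case-insensitively and accepts the compact forms.
    fn parse(name: &str) -> HeaderName {
        let name = name.trim();
        match name.to_ascii_lowercase().as_str() {
            "via" | "v" => HeaderName::Via,
            "from" | "f" => HeaderName::From,
            "to" | "t" => HeaderName::To,
            "call-id" | "i" => HeaderName::CallId,
            "contact" | "m" => HeaderName::Contact,
            "content-length" | "l" => HeaderName::ContentLength,
            "cseq" => HeaderName::CSeq,
            _ => HeaderName::Other(name.to_string()),
        }
    }

    /// The long form used when writing the header back out.
    fn as_str(&self) -> &str {
        match self {
            HeaderName::Via => "Via",
            HeaderName::From => "From",
            HeaderName::To => "To",
            HeaderName::CallId => "Call-ID",
            HeaderName::Contact => "Contact",
            HeaderName::ContentLength => "Content-Length",
            HeaderName::CSeq => "CSeq",
            HeaderName::Other(name) => name.as_str(),
        }
    }
}
```

So `Call-ID`, `call-id` and `i` all land on the same variant, which a derive that just matches on the variant's own name can't give you.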
The next issue to tackle was supporting both TCP and UDP connections, since SIP can happen over either. TCP requests require you to send a response back across the same connection, whereas with UDP each request must fit inside a single packet (limited to around 1454 bytes normally). To support this, I had to pull the handler code out into its own file and write wrapper code to handle TCP and UDP differently.
I ended up making the handler work with streams, pulling lines one-by-one from input in order to keep things consistent. This involved, in UDP's case, putting the string into a buffer and feeding that into the request handler. Eventually, though, I had it all to the point where responses were being sent for each request.
However, the softphone ignored all of the responses.
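For reference, the stream-based approach described above could look roughly like the sketch below. The function names (`read_head`, `handle_tcp`, `handle_udp`) are made up for this example and Telex's actual handler may be structured differently; the point is that anything implementing `BufRead` can feed the same line-by-line reader, whether it's a TCP connection or a UDP datagram wrapped in a `Cursor`.

```rust
use std::io::{self, BufRead, BufReader, Cursor};
use std::net::{TcpStream, UdpSocket};

/// Reads one message head (start line plus headers) line by line,
/// stopping at the blank line that ends the headers or at end of input.
fn read_head<R: BufRead>(reader: &mut R) -> io::Result<Vec<String>> {
    let mut lines = Vec::new();
    loop {
        let mut line = String::new();
        if reader.read_line(&mut line)? == 0 {
            break; // end of input
        }
        // Strip the trailing CRLF (or a bare LF).
        let line = line.trim_end_matches(|c: char| c == '\r' || c == '\n');
        if line.is_empty() {
            break; // blank line: headers are finished
        }
        lines.push(line.to_string());
    }
    Ok(lines)
}

/// TCP: wrap the connection in a buffered reader.
fn handle_tcp(stream: TcpStream) -> io::Result<Vec<String>> {
    read_head(&mut BufReader::new(stream))
}

/// UDP: receive a single datagram into a buffer, then read from that buffer.
fn handle_udp(socket: &UdpSocket) -> io::Result<Vec<String>> {
    let mut buf = [0u8; 65_535];
    let (len, _peer) = socket.recv_from(&mut buf)?;
    read_head(&mut Cursor::new(&buf[..len]))
}
```

The sketch only returns the lines; a real handler would also keep hold of the TCP stream or the UDP peer address so it knows where to send the response.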
Debugging Black Box Issues
Debugging this issue was incredibly frustrating. All I had to go on was comparing the exchanges of data between my softphone and Telex, and between my softphone and Asterisk. The RFC is an incredibly dense technical document covering a lot of edge cases and desired behaviour, and it's hard to see the wood for the trees.
To start off with I was just trying to respond with 503 Service Unavailable, to get the softphone to abort registering from the outset, since that would mean it was communicating. No dice: it kept trying to send requests over UDP until eventually timing out. I then tried changing the response to a 401 Unauthorized in order to prompt the phone into providing credentials. Nada.
The second issue I considered was line endings. SIP defines that all line endings are CRLFs, but my responses were just LF, in line with *nix standards. Changing all my calls to manually insert CRLFs also did not do the trick.
After leaving it for a couple of evenings I came back to it and went through the headers with a fine-toothed comb, referring to the RFC to see which were required by the specification. Eventually I found an anomaly: the Via header was being sent back from Asterisk despite there being no proxies in between. I decided to give it a go and just send back the same header in the response as in the request.
Bam! That did it! The softphone immediately sent a second request with an authorization header! Telex, not having any concept of authentication, sent another 401 Unauthorized in response and the softphone immediately gave up and stated that registration failed. It worked!
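As a rough sketch of the fix, here is a response builder that copies the relevant headers back from the request and terminates every line with CRLF. `build_response` is a hypothetical helper, not Telex's code. The post only identifies Via as the missing piece; RFC 3261 (section 8.2.6.2) also has From, To, Call-ID and CSeq copied from the request, so the sketch echoes those too.

```rust
/// Builds a bare-bones SIP response from the request's headers.
/// Simplified: a real server also adds a tag to the To header, and a
/// 401 needs a WWW-Authenticate challenge. Both are left out here.
fn build_response(status: u16, reason: &str, request_headers: &[(String, String)]) -> String {
    // Headers copied from the request into the response.
    const ECHOED: [&str; 5] = ["Via", "From", "To", "Call-ID", "CSeq"];

    // Status line, e.g. "SIP/2.0 401 Unauthorized", ending in CRLF.
    let mut out = format!("SIP/2.0 {} {}\r\n", status, reason);

    for (name, value) in request_headers {
        // Header names are case-insensitive, so compare accordingly.
        if ECHOED.iter().any(|echoed| name.eq_ignore_ascii_case(echoed)) {
            out.push_str(&format!("{}: {}\r\n", name, value));
        }
    }

    // No body, then the blank line that ends the headers.
    out.push_str("Content-Length: 0\r\n\r\n");
    out
}
```

Since the headers are copied in request order, multiple Via values (when proxies are involved) keep their original ordering. Note that this version does not expand compact header names such as `v`.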
Going Forwards
Now that Telex is actually communicating with the softphone I can work on actually supporting persistent state such as registrations and calls. As previously mentioned I'm going to avoid messing with RTP media and just instruct the phones to send media directly to each other.
The eventual roadmap, though, is building a basic, functional PBX that I can slowly extend with more and more functionality. I highly doubt there'll be much interest in the project, since Asterisk is a well-established and mature piece of software with enterprise support - but it's fun to build something saner, and to have something to work with that isn't just a freemium front for Digium.
If you're interested in seeing the code, I've thrown it up on GitHub here.
If I write another post on Telex, I'll link to it down here.
Top comments (1)
Media support is not essential for a SIP PBX, although it can make the cloud use case much simpler if you want to avoid STUN and TURN by using symmetric RTP at the PBX, which means your media reply goes to where the media source appears to actually come from rather than relying on the SDP to be correct. UDP, especially for SIP, can also be easier in the cloud if you bake in this kind of assumption and use keep-alives to keep router UDP pinholes open without requiring explicit firewall forwarding rules. Few do it this way, but it does make a lot of the nasty problems and complexities with NAT, firewalls, etc. go away.