This article is the 13th day article of Makuake Advent Calendar 2020.
This article is based on the following article with some modifications.
Introduction
I will write about the process of creating a URL router in Golang.
Routing of logic, it will adopt the algorithm trie-tree as the base rather than the regular expression.
There are various router packages in Go.
ex.
Complex algorithms of the tree structure for optimization of performance in many of the library has been adopted.
This time, I would like to implement it in a form that reduces the difficulty of the algorithm by defining a data structure that seems to be relatively easy to understand.
What is a URL router?
URL router is an application for sorting processes based on URL.
It may not be necessary to explain, but I would like to confirm the definition of the word URL and router.
URL
URL represents the address of a page on the Internet and is an abbreviation for Uniform Resource Locator.
The format of the URL string is defined as follows.
<scheme>:<scheme-specific-part>
Protocol names such as http, https, and ftp are often used for the part, but schema names other than the protocol name are also defined.
cf. Uniform Resource Identifier (URI) Schemes
In the <sceme-specific-part>
part, a string based on the schema is defined.
For example, in the case of http and https schemes, the rule is that the domain name and path name (or directory name) are defined.
See RFC1738 for detailed URL specifications.
cf. rfc1738 - Uniform Resource Locators (URL)
RFC1738 is positioned as the Internet Standard (STD1).
Router
As you know, Routing means routing control.
For example, a router that relays communication between networks uses a route table to perform routing that mediates the transfer of packets.
The URL router receives information called a URL from a browser request and acts as an intermediary between the URL and processing in order to process the data based on the received URL.
Just as routers in your network have a data structure called a routing table, URL routers need to have a data structure for routing.
Requirement definition
There are various multifunctional routers, but this time we will implement a router with the following functions.
- Supporting static routing
- The URL
/foo/bar/
can be matched with the static routing definition/foo/bar/
and the process can be returned.
- The URL
- Supporting dynamic routing
- The URL
/foo/bar/123
can be matched with the routing definition that has dynamic path parameters such as/foo/bar /:id
, and the process can be returned using the path parameter data.
- The URL
Data flow
I think it's not particularly difficult, but I would like to confirm the data flow.
The router receives the path part of the URL as input data, determines the data that matches the path, and calls the process.
How to match the path and the process is the basis of the implementation.
Tree structure
The basic implementation that matches the path and the process implements the algorithm using a data structure called a tree structure.
A tree structure is a data structure that has roots, branches, nodes, and leaves (nodes at the end of Leaf).
There are various types of tree structures depending on the pattern of data storage method and search method.
Since we want to handle strings mainly in URL routing, we adopt a data structure called a trie tree that specializes in string search.
Trie tree structure
A trie tree is a type of tree structure that handles a set of strings.
Also called a prefix tree.
Each node has a single or multiple string or number, and expresses a word by searching from the root node toward the leaf and connecting the values.
Each node does not necessarily have to have a value.
Trie tree are used for IP address search in the network layer, http routing in the application layer, and morphological analysis in the context of machine learning.
The image of the algorithm is the following visualization tool
I think it's easy to grasp when you experience it.
If you want to optimize the memory efficiency of the trie tree, you should consider a tree structure that efficiently stores the prefix of the string such as Radix Tree (Patricia Trie).
The computational complexity of the trie tree is the same for both search and insert, and when the length of the search key is m, the worst computational complexity is O(m).
I have prepared sample code so please take a look.
How to handle trie trees?
Customize the trie tree to make it easier to use with your URL router.
As an example, the figure above defines a routing that only supports GET.
/foo/
/baz/
/foo/bar/
/foo/:name[string]/
-
/foo/bar/:id[int]
A node for HTTP methods is created directly under the root, and the path is stored for each HTTP method.
Nodes starting with :
are nodes that store path parameters.
The DSL [int]
is for mapping regular expressions to their parameters.
If the node that was searched and arrived has the processing data, the matching will be successful, and if it is not, the matching will fail.
Below is an implementation of this customized tri-tree.
[goblin/trie.go(https://github.com/bmf-san/goblin/blob/master/trie.go)
Implementation of router in Golang
In Golang, the standard package net/http
has a routing function.
That feature is called multiplexer in Golang.
However, the standard package does not support path parameters, so if you want to use it, you need to prepare an external package or extend the standard multiplexer.
This time, I will create my own routing, so I need to extend the multiplexer.
Before learning about multiplexer extensions, it's a good idea to familiarize yourself with the http server specification.
http server code reading
Reference implementation
We will read it with reference to the following implementation.
package main
import (
"net/http"
)
func main() {
mux := http.NewServeMux()
handler := new(HelloHandler)
mux.Handle("/", handler)
s := http.Server{
Addr: ":3000",
Handler: mux,
}
s.ListenAndServe()
}
type HelloHandler struct{}
func (h *HelloHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
w.Write([]byte("Hello World"))
}
I'm going to read this redundantly written code line by line, simplifying the code.
ServeHttp(w ResponseWriter, r *Request)
I'm going to read this redundantly written code line by line, simplifying the code.
type HelloHandler struct{}
func (h *HelloHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
w.Write([]byte("Hello World"))
}
ServeHTTP (w ResponseWriter, r * Request)
is an implementation of the Handler
interface.
// url: https://golang.org/src/net/http/server.go?s=61586:61646#L1996
func (mux *ServeMux) ServeHTTP(w ResponseWriter, r *Request) {
if r.RequestURI == "*" {
if r.ProtoAtLeast(1, 1) {
w.Header().Set("Connection", "close")
}
w.WriteHeader(StatusBadRequest)
return
}
h, _ := mux.Handler(r)
h.ServeHTTP(w, r)
}
// url: https://golang.org/src/net/http/server.go?s=61586:61646#L79
type Handler interface {
ServeHTTP(ResponseWriter, *Request)
}
In the reference implementation, the HelloHandler
structure is prepared for ServeHTTP (w ResponseWriter, r * Request)
,
you can rewrite it more concisely by using HandlerFunc
.
// url: https://golang.org/src/net/http/server.go?s=61509:61556#L1993
type HandlerFunc func(ResponseWriter, *Request)
func (f HandlerFunc) ServeHTTP(w ResponseWriter, r *Request) {
f(w, r)
}
Rewriting the reference implementation looks like this.
package main
import (
"net/http"
)
func main() {
mux := http.NewServeMux()
mux.Handle("/", http.HandlerFunc(hello))
s := http.Server{
Addr: ":3000",
Handler: mux,
}
s.ListenAndServe()
}
func hello(w http.ResponseWriter, r *http.Request) {
w.Write([]byte("Hello World"))
}
I was able to rewrite the part that used ServeHTTP (w ResponseWriter, r * Request)
.
By the way, the contents of mux.Handle
are implemented like this.
// url: https://golang.org/src/net/http/server.go?s=75321:75365#L2390
func (mux *ServeMux) Handle(pattern string, handler Handler) {
mux.mu.Lock()
defer mux.mu.Unlock()
if pattern == "" {
panic("http: invalid pattern")
}
if handler == nil {
panic("http: nil handler")
}
if _, exist := mux.m[pattern]; exist {
panic("http: multiple registrations for " + pattern)
}
if mux.m == nil {
mux.m = make(map[string]muxEntry)
}
e := muxEntry{h: handler, pattern: pattern}
mux.m[pattern] = e
if pattern[len(pattern)-1] == '/' {
mux.es = appendSorted(mux.es, e)
}
if pattern[0] != '/' {
mux.hosts = true
}
}
ServeMux
Let's take a closer look at the previous code.
mux := http.NewServeMux()
mux.Handle("/", http.HandlerFunc(hello))
s := http.Server{
Addr: ":3000",
Handler: mux,
}
Since the part of mux.Handle ("/", http.HandlerFunc (hello))
can be processed internally by using HandleFunc
, you can write shorter.
url: https://golang.org/src/net/http/server.go?s=75575:75646#L2448
func HandleFunc(pattern string, handler func(ResponseWriter, *Request)) {
DefaultServeMux.HandleFunc(pattern, handler)
}
url: https://golang.org/src/net/http/server.go?s=75575:75646#L2435
func (mux *ServeMux) HandleFunc(pattern string, handler func(ResponseWriter, *Request)) {
if handler == nil {
panic("http: nil handler")
}
mux.Handle(pattern, HandlerFunc(handler))
}
If you rewrite it in consideration of the above, it will look like this.
package main
import (
"net/http"
)
func main() {
http.HandleFunc("/", hello)
s := http.Server{
Addr: ":3000",
Handler: http.DefaultServeMux,
}
s.ListenAndServe()
}
func hello(w http.ResponseWriter, r *http.Request) {
w.Write([]byte("Hello World"))
}
DefaultServeMux
is internally a variable that contains a pointer to the ServeMux
structure.
HandleFunc
is a method that allows you to register a url pattern match to DefaultServeMux
.
url: https://golang.org/src/net/http/server.go?s=75575:75646#L2207
// DefaultServeMux is the default ServeMux used by Serve.
var DefaultServeMux = &defaultServeMux
var defaultServeMux ServeMux
url: https://golang.org/src/net/http/server.go?s=68149:68351#L2182
type ServeMux struct {
mu sync.RWMutex
m map[string]muxEntry
es []muxEntry // slice of entries sorted from longest to shortest.
hosts bool // whether any patterns contain hostnames
}
Server
The last thing to look at is this part.
s := http.Server{
Addr: ":3000",
Handler: http.DefaultServeMux,
}
s.ListenAndServe()
The implementation of s.ListenAndServe ()
are here.
url: https://golang.org/src/net/http/server.go?s=68149:68351#L3093
func (srv *Server) ListenAndServe() error {
if srv.shuttingDown() {
return ErrServerClosed
}
addr := srv.Addr
if addr == "" {
addr = ":http"
}
ln, err := net.Listen("tcp", addr)
if err != nil {
return err
}
return srv.Serve(ln)
}
If you don't need to give detailed settings to Server
, you can write it short by using ListenAndServe ()
.
Please refer to golang.org --server.go for the setting value of Server
.
url: https://golang.org/src/net/http/server.go?s=68149:68351#L3071
func ListenAndServe(addr string, handler Handler) error {
server := &Server{Addr: addr, Handler: handler}
return server.ListenAndServe()
}
It looks like this in short.
package main
import (
"net/http"
)
func main() {
http.HandleFunc("/", hello)
http.ListenAndServe(":3000", nil)
}
func hello(w http.ResponseWriter, r *http.Request) {
w.Write([]byte("Hello World"))
}
It looks like this when used with an anonymous function.
package main
import (
"net/http"
)
func main() {
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
w.Write([]byte("Hello World"))
})
<span class="n">http</span><span class="o">.</span><span class="n">ListenAndServe</span><span class="p">(</span><span class="s">":3000"</span><span class="p">,</span> <span class="no">nil</span><span class="p">)</span>
}
Extend multiplexer
Here is the code for the part that extends the multiplexer.
Golang is easy to extend because the interface of the standard package is properly maintained.
Conclusion
This is the package actually developed in this way.
The amount of code is small, so please read it.
Of course, contributions are also welcome. :D
Appendix
goblin listed in awesome-go
I posted an article related to HTTP Router. Please read this too if you like.
Top comments (0)