- WSGI Basics
- Simple WSGI Server
- Application Object
- Application Caller
- File Wrapper
- Input Stream
- Exceptions
- Chunked Input With wsgi.input_terminated
- Chunked Response
- Range Requests
- Conclusion
In the last installment I covered various protocols that can be used to connect with a Python application. What was missing from all of that is how to structure the actual application code. This article looks at WSGI as one way of solving that challenge.
WSGI Basics
The Web Server Gateway Interface (WSGI) standard was first established with PEP 333; PEP 3333 later superseded it and is the current version. It standardizes a few callable signatures and outlines how the application and server should interact with each other. To help with implementation, Python includes a wsgiref module showing the basic format of a WSGI application.
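As a quick preview of what that contract looks like, here is about the smallest conforming application I can sketch (later sections build this up properly):

```python
# Minimal sketch of the callable signature PEP 3333 standardizes.
# The server supplies `environ` (a dict of CGI-style and wsgi.* keys)
# and `start_response` (a callable taking a status string and header list).
def application(environ, start_response):
    body = b'Hello from a bare WSGI app\n'
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]  # any iterable of bytes works
```

Everything else in this article is elaboration on these two arguments and the returned iterable.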
Simple WSGI Server
The first useful feature of wsgiref is a basic server that can be spun up with minimal code:
from wsgiref.simple_server import make_server, demo_app

with make_server('localhost', 8000, demo_app) as httpd:
    print("Serving HTTP on port 8000...")
    # Respond to requests until process is killed
    httpd.serve_forever()
Along with a simple client:
import requests
response = requests.get('http://localhost:8000')
print(response.text)
The demo_app prints "Hello world!" and then a dump of all the environment variables:
$ python simple_client.py
Hello world!
<snip>
QUERY_STRING = ''
REMOTE_ADDR = '127.0.0.1'
REMOTE_HOST = ''
REQUEST_METHOD = 'GET'
SCRIPT_NAME = ''
SERVER_NAME = 'localhost'
SERVER_PORT = '8000'
SERVER_PROTOCOL = 'HTTP/1.1'
SERVER_SOFTWARE = 'WSGIServer/0.2'
<snip>
wsgi.errors = <_io.TextIOWrapper name='<stderr>' mode='w' encoding='utf-8'>
wsgi.file_wrapper = <class 'wsgiref.util.FileWrapper'>
wsgi.input = <_io.BufferedReader name=4>
wsgi.multiprocess = False
wsgi.multithread = False
wsgi.run_once = False
wsgi.url_scheme = 'http'
wsgi.version = (1, 0)
Along with CGI-like variables there are WSGI-specific ones, as outlined by the standard. Now we'll look at a more expanded example to see what's going on behind the scenes.
Application Object
The application object is the callable that WSGI servers interact with. A more expanded version could look something like this:
from wsgiref.simple_server import make_server
from wsgiref.validate import validator

def application(env, start_response):
    content = []
    content.append(b'Hello World ')
    content.append(f"{env['wsgi.version'][0]}.{env['wsgi.version'][1]}".encode('utf-8'))
    content_length = sum(len(i) for i in content)
    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(content_length))
    ]
    start_response(status, response_headers)
    return content

validated_app = validator(application)
with make_server('localhost', 8000, validated_app) as httpd:
    print("Serving HTTP on port 8000...")
    # Respond to requests until process is killed
    httpd.serve_forever()
This prints out "Hello World 1.0": a string plus the WSGI version tuple. There is also a validator which wraps around our application and ensures various parts of it are WSGI compliant. This can be useful during development to catch potential issues quickly; in production, however, you're better off performance-wise exposing the bare application. The first important part of this is the application signature:
def application(env, start_response):
It takes two arguments: an environment dict and a start_response callable. Next comes the content to pass back to the client. The WSGI standard defines this as an iterable, which we'll use a list for:
def application(env, start_response):
    content = []
    content.append(b'Hello World ')
    content.append(f"{env['wsgi.version'][0]}.{env['wsgi.version'][1]}".encode('utf-8'))
    content_length = sum(len(x) for x in content)
Another consideration here is that the returned content must be bytes, which is why the formatted string is encoded as UTF-8. The content length is then calculated for the header value. This means that the output can now be sent:
    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(content_length))
    ]
    start_response(status, response_headers)
    return content
Response headers are passed in as a list of tuples. start_response is the callable that was passed into the application; providing this value is handled by the wsgiref simple server. From there the content is returned to the caller for the client to receive. Generator functions can be used as an alternative:
from wsgiref.simple_server import make_server
from wsgiref.validate import validator

def application(env, start_response):
    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/plain'),
    ]
    start_response(status, response_headers)

    def generate_content():
        yield b'Hello World '
        yield f"{env['wsgi.version'][0]}.{env['wsgi.version'][1]}\n".encode('utf-8')

    return generate_content()

validated_app = validator(application)
with make_server('localhost', 8000, validated_app) as httpd:
    print("Serving HTTP on port 8000...")
    # Respond to requests until process is killed
    httpd.serve_forever()
Ideally, yield would be used for more intensive processing where memory exhaustion might be an issue.
Application Caller
So looking at all of this, what's actually happening with the caller? What's providing start_response and wsgi.version? The server generally handles this in one of two ways:
- The server is pure Python: it sets up the WSGI environment variables, then imports the application module and calls it with the generated environment and a start_response callable
- Mostly the same as the first, except there is some kind of embedded interpreter tied in with the Python C API (or cffi)
gunicorn is an example of the first approach; mod_wsgi is an example of the second. If you're just doing development work I'd highly recommend the first method. As a long-running script it also has the potential to tap into JIT optimizations if you're using PyPy.
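To make the first approach concrete, here's a rough sketch of the work such a pure-Python caller does per request. The function and environment values here are illustrative, not gunicorn's actual code:

```python
import io
import sys

def call_wsgi_app(app, method='GET', path='/'):
    """Illustrative sketch of how a pure-Python server drives a WSGI app."""
    # Build the CGI-style environment plus the wsgi.* keys the standard requires.
    environ = {
        'REQUEST_METHOD': method,
        'PATH_INFO': path,
        'SERVER_NAME': 'localhost',
        'SERVER_PORT': '8000',
        'SERVER_PROTOCOL': 'HTTP/1.1',
        'wsgi.version': (1, 0),
        'wsgi.url_scheme': 'http',
        'wsgi.input': io.BytesIO(b''),
        'wsgi.errors': sys.stderr,
        'wsgi.multithread': False,
        'wsgi.multiprocess': False,
        'wsgi.run_once': False,
    }
    response = {}

    def start_response(status, headers, exc_info=None):
        # A real server would serialize these onto the socket.
        response['status'] = status
        response['headers'] = headers

    body = b''.join(app(environ, start_response))
    return response['status'], response['headers'], body

def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hi']

status, headers, body = call_wsgi_app(app)
```

A real server additionally reads the socket, parses the HTTP request into those environ keys, and writes status, headers, and body back out, but the application-facing half is essentially this.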
File Wrapper
For files, WSGI also has a wsgi.file_wrapper variable, which may or may not be available and which provides chunked file streaming:
from os.path import getsize
from wsgiref.simple_server import make_server
from wsgiref.validate import validator

def application(env, start_response):
    status = '200 OK'
    content_length = str(getsize('large-file.json'))
    response_headers = [
        ('Content-Type', 'application/json'),
        ('Content-Length', content_length)
    ]
    start_response(status, response_headers)
    return env['wsgi.file_wrapper'](open('large-file.json', 'rb'))

validated_app = validator(application)
with make_server('localhost', 8000, validated_app) as httpd:
    print("Serving HTTP on port 8000...")
    # Respond to requests until process is killed
    httpd.serve_forever()
In this case I know wsgi.file_wrapper is available, but that's not always the case. iter(lambda: filelike.read(block_size), b'') can be used as a replacement where it's not (note the b'' sentinel, since reads from a binary file return bytes). os.path.getsize (or os.stat().st_size, which it uses behind the scenes) can be used to obtain the Content-Length value.
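Putting that together, here's a sketch of a helper that prefers the server's wrapper and falls back to the iterator pattern. The helper name is my own invention, and wsgiref's FileWrapper stands in for whatever the server would supply:

```python
from wsgiref.util import FileWrapper  # stand-in; a real server supplies its own

def send_file(environ, filelike, block_size=8192):
    """Return a response iterable for a file, per the PEP 3333 pattern."""
    wrapper = environ.get('wsgi.file_wrapper')
    if wrapper is not None:
        # Let the server stream the file; it may use sendfile() internally.
        return wrapper(filelike, block_size)
    # Fallback: read fixed blocks until the b'' sentinel signals EOF.
    return iter(lambda: filelike.read(block_size), b'')
```

Either branch produces an iterable of byte blocks, so the application code consuming it doesn't need to care which path was taken.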
Input Stream
Input from the client is obtained via the wsgi.input environment variable. While it could be stdin, some implementations may use a buffer of some kind instead. I'll use an echo client and server to demonstrate:
wsgi_input_client.py
import requests

post_data = {
    'test1': 'foobar',
    'test2': 'foobar',
    'test3': 'foobar'
}
r = requests.post('http://localhost:8000/', data=post_data)
print(r.content)
wsgi_input_server.py
from wsgiref.simple_server import make_server
from wsgiref.validate import validator

def application(env, start_response):
    stream = env['wsgi.input']  # avoid shadowing the input() builtin
    data = stream.read(int(env['CONTENT_LENGTH']))
    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', env['CONTENT_LENGTH']),
    ]
    start_response(status, response_headers)
    return [data]

validated_app = validator(application)
with make_server('localhost', 8000, validated_app) as httpd:
    print("Serving HTTP on port 8000...")
    # Respond to requests until process is killed
    httpd.serve_forever()
Running the server and then calling the client against it:
$ python wsgi_input_client.py
b'test1=foobar&test2=foobar&test3=foobar'
So the way of handling input is fairly standard and not much different from how you'd implement it in a CGI script.
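Since the echoed body is ordinary URL-encoded form data, the standard library can take it from there. For example:

```python
from urllib.parse import parse_qs

raw = b'test1=foobar&test2=foobar&test3=foobar'
# parse_qs maps each field name to a list of values, since HTML forms
# allow repeated fields.
fields = parse_qs(raw.decode('utf-8'))
print(fields['test1'])  # ['foobar']
```

A framework would normally do this parsing for you, but it's useful to see there's no magic involved.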
Exceptions
An alternative form of the start_response call can be used when something goes wrong:
import sys
from wsgiref.simple_server import make_server
from wsgiref.validate import validator

def application(env, start_response):
    try:
        content = []
        content.append(b'Hello World ')
        content.append(f"{env['wsgi.version'][0]}.{env['wsgi.version'][1]}".encode('utf-8'))
        content_length = sum(len(i) for i in content)
        status = '200 OK'
        response_headers = [
            ('Content-Type', 'text/plain'),
            ('Content-Length', str(content_length))
        ]
        start_response(status, response_headers)
        1/0  # deliberately trigger an exception
        return content
    except Exception:
        status = '500 Internal Server Error'
        response_headers = [
            ('Content-Type', 'text/plain')
        ]
        start_response(status, response_headers, sys.exc_info())
        return [b'An error has occurred']

validated_app = validator(application)
with make_server('localhost', 8000, validated_app) as httpd:
    print("Serving HTTP on port 8000...")
    # Respond to requests until process is killed
    httpd.serve_forever()
In this case the client will receive the "An error has occurred" message along with a 500 status, due to the division by zero. sys.exc_info() is the standard value to pass as the third argument of start_response, allowed only while an exception is being handled (which is also the only time start_response may be called a second time).
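On the server side, PEP 3333 spells out what that third argument triggers: if the headers have already been sent the server must re-raise the exception, otherwise the new status and headers replace the pending ones. A condensed sketch of that logic (the class name is illustrative, not taken from wsgiref):

```python
class ResponseState:
    """Condensed sketch of the exc_info rules PEP 3333 imposes on servers."""

    def __init__(self):
        self.headers_sent = False
        self.pending = None  # (status, headers) not yet written to the wire

    def start_response(self, status, response_headers, exc_info=None):
        if exc_info is not None:
            try:
                if self.headers_sent:
                    # Headers are already on the wire: too late to change
                    # the response, so re-raise the original error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # break the traceback reference cycle
        elif self.pending is not None:
            raise AssertionError('start_response called twice without exc_info')
        self.pending = (status, response_headers)
```

This is why the 500 response in the example above cleanly replaces the 200 that was already queued: nothing had been flushed to the client yet.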
Chunked Input With wsgi.input_terminated
While wsgiref.simple_server is useful for basic cases, it has one particular area developers struggled with: chunked input. MDN has an example of what chunked encoding looks like:
HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked
7\r\n
Mozilla\r\n
11\r\n
Developer Network\r\n
0\r\n
\r\n
Data is sent in chunks: a hex value giving the length of the data, then the data itself. The end is indicated by a 0 size followed by \r\n on its own line. Unfortunately the wsgiref server doesn't handle this. To work around the limitation, Armin Ronacher of the Flask framework proposed a wsgi.input_terminated flag, which has already been implemented in many WSGI server solutions. To showcase this I'll use the werkzeug library, which provides various WSGI utilities, via pip install werkzeug:
from werkzeug.serving import make_server
from werkzeug.wrappers import Request, Response

def application(environ, start_response):
    request = Request(environ)
    with open('test.json', 'wb') as stream_fp:
        stream_fp.write(request.stream.read())
    resp = Response('Hello World!', mimetype='text/plain')
    return resp(environ, start_response)

if __name__ == '__main__':
    HOST = '127.0.0.1'
    PORT = 8123
    httpd = make_server(HOST, PORT, application)
    print(f'Serving on http://{HOST}:{PORT}')
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        print('^C')
In this example werkzeug provides wrappers around the request and response to make it easier to access properties of each. To test this out I'll chunk-post a 25MB JSON file, which will be written to the working directory of the server; "Hello World!" will be printed when the process is done. To see more of the request I'll use curl instead of the usual python requests script:
$ curl -v -H "Transfer-Encoding: chunked" -d @large-file.json http://127.0.0.1:8123/
* Trying 127.0.0.1:8123...
* Connected to 127.0.0.1 (127.0.0.1) port 8123 (#0)
> POST / HTTP/1.1
> Host: 127.0.0.1:8123
> User-Agent: curl/7.74.0
> Accept: */*
> Transfer-Encoding: chunked
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 100 Continue
* Signaling end of chunked upload via terminating chunk.
* Mark bundle as not supporting multiuse
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Server: Werkzeug/2.3.6 Python/3.10.12
< Date: Sun, 20 Aug 2023 10:55:16 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 12
< Connection: close
<
* Closing connection 0
Hello World!
So the request was written, but the code didn't change too much from the standard WSGI version. That's because things are handled behind the scenes:
if environ.get("HTTP_TRANSFER_ENCODING", "").strip().lower() == "chunked":
    environ["wsgi.input_terminated"] = True
    environ["wsgi.input"] = DechunkedInput(environ["wsgi.input"])
So werkzeug sets wsgi.input_terminated if it finds a Transfer-Encoding header with the value chunked. The DechunkedInput class handles reading the chunk segments:
line = self._rfile.readline().decode("latin1")
_len = int(line.strip(), 16)
Here the length value is read in: a hex-encoded integer followed by \r\n. Then readinto handles reading based on that value, and also checks for the terminating size 0 with an \r\n on its own line after it.
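A stripped-down sketch of that same dechunking loop, minus werkzeug's error handling and chunk-extension support, might look like this:

```python
import io

def dechunk(rfile):
    """Minimal sketch of decoding a Transfer-Encoding: chunked body."""
    body = b''
    while True:
        # Each chunk starts with its size as a hex number on its own line.
        size = int(rfile.readline().strip(), 16)
        if size == 0:
            rfile.readline()  # consume the final \r\n after the 0-size chunk
            return body
        body += rfile.read(size)
        rfile.readline()  # consume the \r\n that terminates the chunk data

# Feed it the MDN example from earlier; the chunks simply concatenate.
wire = io.BytesIO(b'7\r\nMozilla\r\n11\r\nDeveloper Network\r\n0\r\n\r\n')
print(dechunk(wire))  # b'MozillaDeveloper Network'
```

Note how 11 is interpreted as hex (17 bytes), which is exactly why the length line must be parsed with base 16.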
Chunked Response
When the werkzeug server has threading or multiprocess enabled, it will utilize HTTP/1.1 and can return chunked responses as well. This makes it easier to deal with the fact that getting content length from dynamic data can be tedious. As an example:
from werkzeug.serving import make_server
from werkzeug.wrappers import Request, Response

def application(environ, start_response):
    request = Request(environ)
    with open('test.json', 'wb') as stream_fp:
        stream_fp.write(request.stream.read())

    def generate_response():
        yield "Line 1"
        yield "Line 2"
        yield "Line 3"
        yield "Line 4"

    resp = Response(generate_response(), mimetype='text/plain')
    return resp(environ, start_response)

if __name__ == '__main__':
    HOST = '127.0.0.1'
    PORT = 8123
    httpd = make_server(HOST, PORT, application, threaded=True)
    print(f'Serving on http://{HOST}:{PORT}')
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        print('^C')
Running curl in raw mode shows the chunked data it normally abstracts away from us:
$ curl -iv --raw -H "Transfer-Encoding: chunked" -d @large-file.json http://127.0.0.1:8123/
* Trying 127.0.0.1:8123...
* Connected to 127.0.0.1 (127.0.0.1) port 8123 (#0)
> POST / HTTP/1.1
> Host: 127.0.0.1:8123
> User-Agent: curl/7.74.0
> Accept: */*
> Transfer-Encoding: chunked
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 100 Continue
HTTP/1.1 100 Continue
* Mark bundle as not supporting multiuse
< HTTP/1.1 100 Continue
HTTP/1.1 100 Continue
* Signaling end of chunked upload via terminating chunk.
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Server: Werkzeug/2.3.6 Python/3.10.12
Server: Werkzeug/2.3.6 Python/3.10.12
< Date: Sun, 20 Aug 2023 11:22:23 GMT
Date: Sun, 20 Aug 2023 11:22:23 GMT
< Content-Type: text/plain; charset=utf-8
Content-Type: text/plain; charset=utf-8
< Transfer-Encoding: chunked
Transfer-Encoding: chunked
< Connection: close
Connection: close
<
6
Line 1
6
Line 2
6
Line 3
6
Line 4
0
* Closing connection 0
The response now uses Transfer-Encoding: chunked, and the chunked framing of the data is visible as well.
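The framing on the way out is simple enough to sketch by hand. This illustrative helper (my own, not werkzeug's actual code) frames an iterable the same way the curl output above shows:

```python
def encode_chunked(parts):
    """Sketch of framing an iterable of byte strings as a chunked body."""
    out = b''
    for part in parts:
        if part:  # a zero-length chunk would terminate the stream early
            # Each chunk: hex length, CRLF, the data itself, CRLF.
            out += b'%x\r\n%s\r\n' % (len(part), part)
    out += b'0\r\n\r\n'  # terminating zero-size chunk
    return out

print(encode_chunked([b'Line 1', b'Line 2']))
```

Because each chunk carries its own length, the server can start writing before it knows the total size, which is exactly what makes generator responses practical.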
Range Requests
Range is a special HTTP feature that lets you download a specific portion of a resource. Its primary use is to support resuming transfers from a certain byte. As an example:
from werkzeug.serving import make_server
from werkzeug.wrappers import Request, Response

def application(environ, start_response):
    request = Request(environ)
    start, end = request.range.ranges[0]
    # Binary mode so seek/read offsets line up with byte positions
    with open('large-file.json', 'rb') as stream_fp:
        stream_fp.seek(start)
        data = stream_fp.read(end - start)
    resp = Response(data, mimetype='text/plain')
    return resp(environ, start_response)

if __name__ == '__main__':
    HOST = '127.0.0.1'
    PORT = 8123
    httpd = make_server(HOST, PORT, application, threaded=True)
    print(f'Serving on http://{HOST}:{PORT}')
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        print('^C')
Note that request.range.ranges is a list of tuples because a request can declare multiple ranges. In this case it's a controlled session where I know there will only be one range value. Using curl again with a range modifier, we can see that the requested 200 bytes were returned:
$ curl -v -r 1000-1199 http://127.0.0.1:8123/
* Trying 127.0.0.1:8123...
* Connected to 127.0.0.1 (127.0.0.1) port 8123 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:8123
> Range: bytes=1000-1199
> User-Agent: curl/7.74.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Server: Werkzeug/2.3.6 Python/3.10.12
< Date: Sun, 20 Aug 2023 11:48:04 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 200
< Connection: close
<
* Closing connection 0
cacfcd","before":"437c03652caa0bc4a7554b18d5c0a394c2f3d326","commits":[{"sha":"6b089eb4a43f728f0a594388092f480f2ecacfcd","author":{"email":"5c682c2d1ec4073e277f9ba9f4bdf07e5794dabe@rspt.ch","name":"rs
Note that 200 bytes come back even though the range reads 1000-1199: HTTP ranges are inclusive, and werkzeug exposes them as half-open (start, end) pairs so that end - start gives the byte count directly. In practice you'll want to iterate through the ranges list to support multi-range declarations, and a fully compliant server would also respond with 206 Partial Content plus a Content-Range header rather than the plain 200 seen here.
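For completeness, here's a sketch of honoring every declared range rather than just the first. The helper name is my own; the ranges are the half-open (start, end) pairs werkzeug provides, with end possibly None for open-ended ranges like 1000-:

```python
def read_ranges(path, ranges):
    """Illustrative multi-range read; `ranges` as werkzeug provides them."""
    parts = []
    with open(path, 'rb') as fp:
        for start, end in ranges:
            fp.seek(start)
            # end is exclusive; None means "to the end of the file".
            parts.append(fp.read() if end is None else fp.read(end - start))
    return parts
```

A full multi-range response would then wrap each part in a multipart/byteranges body, but that's beyond what this example needs.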
Conclusion
This concludes our look at WSGI and at using the werkzeug library to simplify some of the more advanced HTTP-related features. Thanks to being a PEP standard, you won't have much trouble finding software that supports it. In the next part of the series I'll look at WSGI server solutions for delivering WSGI content.