HTTP requests in Python web applications are served as shown below.
Web developers usually work with web frameworks like Django or Flask, but they rarely know the actual mechanics of how HTTP requests flow through the system and how the desired HTTP response is generated.
A web framework is a major component of the whole process. These days, most applications are written on top of a web framework, but as a developer it is essential to understand each component of the request -> response cycle. In Python web apps, WSGI is a very important concept, and it is what you will learn about in this post.
Some History
Before moving on to the details of WSGI, a short look at the history is useful, so let's start from when the web came into being.
The web, or HTTP, was static by design: a server could only return a static file stored on it. Say a web server receives the request GET /index.html, as shown in the image below. The server finds that page on disk, reads the index.html file, and returns a response with the content of index.html. The web could also return CSS files or other static assets, but the whole idea back then was this simple.
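A minimal sketch of that static behaviour, assuming a hypothetical serve_static helper and a document root directory passed in by the caller (neither is taken from any real web server):

```python
from pathlib import Path


def serve_static(doc_root, request_path):
    """Return (status, body) for a request path such as /index.html.

    Hypothetical helper: it only maps a request path to a file under doc_root.
    """
    root = Path(doc_root).resolve()
    target = (root / request_path.lstrip("/")).resolve()
    # Refuse anything outside the document root, and anything that is not a file.
    if root not in target.parents or not target.is_file():
        return "404 Not Found", b""
    # The response body is simply the bytes of the file on disk.
    return "200 OK", target.read_bytes()
```

Nothing here is computed per request: the same file always produces the same response, which is the limitation the next section is about.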
External Scripts
By the early 90s, it was clear that serving static files was not an efficient or interactive way to handle dynamic HTTP requests, where the response changes from request to request without anyone having to update static HTML pages every time. Some dynamic method was needed to process inputs like HTML forms and sessions.
That was achieved by running external scripts that would process those new input types, i.e. HTML forms and, later, JSON.
At that time, external scripts were mostly written in Perl or PHP, but Python worked as well. External scripts were named after the input they processed, like form.py and json.py: form.py would process form data and json.py would process JSON data. Now, instead of receiving bare file names in requests like GET /index.html, web servers started to get requests like GET form.py. That told the server to run form.py, an external script, to process an HTML form request.
The web server would fork its own process and run the script in the child. Anything the external script wrote with a print() statement became the HTTP response sent to the client.
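A minimal sketch of what such an external script could look like. The render_response helper and the greeting are made up for illustration; the only fixed part is the output shape, headers first, then a blank line, then the body:

```python
def render_response(name):
    """Build the text a CGI-style script would print to standard output."""
    body = f"<h1>Hello, {name}!</h1>"
    # Headers come first, then a blank line, then the response body.
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # The web server captures whatever the script prints
    # and sends it back to the client as the HTTP response.
    print(render_response("world"))
```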
CGI (Common Gateway Interface)
Web servers usually parse an HTTP request and create some new environment variables from it, which the external scripts then inherit in their process environment. But different web servers named those environment variables arbitrarily, which caused conflicts: each server had its own specification of environment variables.
To fix this, a common standard for request environment variables was established, named the Common Gateway Interface.
Some CGI request environment vars
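Commonly used CGI variables include REQUEST_METHOD, PATH_INFO, QUERY_STRING, CONTENT_TYPE, CONTENT_LENGTH and SERVER_NAME. A minimal sketch of a script reading a few of them; describe_request is a hypothetical helper, and the fallback values are assumptions made so the function also runs outside a web server:

```python
import os


def describe_request(environ=os.environ):
    """Summarise a request using standard CGI variable names."""
    method = environ.get("REQUEST_METHOD", "GET")
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")
    # Append the query string only when the request actually has one.
    return f"{method} {path}?{query}" if query else f"{method} {path}"
```

Because every CGI server sets the same names, the same script works behind any of them.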
WSGI
The Python community took this a step further: along with the CGI environment variables, they standardized the way external scripts must be executed. This standard was named WSGI (Web Server Gateway Interface). It is described in PEP 3333.
This specification ensures that every Python external script called from a web server exposes a particular callable object (a function, a method, a class, or an instance with a __call__ method) as below:
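A minimal sketch of such a callable, following the signature given in PEP 3333. The function name application and the response text are illustrative choices, not requirements of the spec:

```python
def application(environ, start_response):
    """A minimal WSGI callable.

    environ is a dict of CGI-style request variables;
    start_response is a callback supplied by the WSGI server.
    """
    body = b"Hello, WSGI!\n"
    status = "200 OK"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # Hand the status line and headers to the server...
    start_response(status, headers)
    # ...then return an iterable of bytes for the response body.
    return [body]
```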
There are 2 sides of WSGI:
- Application/Framework: Python web frameworks act as the external scripts. They conform to the WSGI protocol by implementing a callable object as described in the figure above. WSGI-compatible servers execute this object for each request.
- Server/Gateway: WSGI servers implement the functionality to execute the callable object. Web servers like Apache and Nginx cannot do this on their own, so intermediate servers like gunicorn and uWSGI sit in between. They take HTTP requests from the web server, call the WSGI object with the request, and send the response back to the web server.
The power of WSGI is that a Python web application built with any WSGI-compliant framework can run on any WSGI server, such as gunicorn or uWSGI. Python developers can write future web servers and web frameworks against the WSGI specification, so that any server or framework can be swapped for another available option.
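To make the server/gateway side concrete, here is a rough sketch of what a WSGI server does for one request. call_app and hello_app are hypothetical names; setup_testing_defaults comes from the standard library's wsgiref.util and fills in the environ keys a real server would set:

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """Stand-in for any framework's WSGI callable."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"path=" + environ["PATH_INFO"].encode("utf-8")]


def call_app(app, path="/"):
    """Hypothetical gateway helper: run one request through a WSGI app."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # adds the remaining required keys
    response = {}

    def start_response(status, headers, exc_info=None):
        # A real server would write these to the client socket.
        response["status"] = status
        response["headers"] = headers

    body = b"".join(app(environ, start_response))
    return response["status"], response["headers"], body
```

call_app never needs to know which framework produced the callable, and that is the whole point of the interface. To serve the same callable over real HTTP, the standard library's wsgiref.simple_server.make_server or a server like gunicorn can host it unchanged.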
Some more about WSGI Web Servers
- No Interpreter Restart
Going back to CGI external scripts for a moment: their limitation is that, for each new request, the web server has to start a new Python interpreter and run the script. This slows the process down a lot.
One of the benefits of a WSGI server is that the web server no longer has to spin up a new Python interpreter for each request. Instead, the web server passes the request to the WSGI server, which runs the WSGI callable repeatedly, once per request, in a long-lived process. This approach is far more efficient.
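One way to see the difference is module-level state. In this sketch (counting_app is a made-up name), the counter is created once when the module is imported and survives across requests, which would not happen if a fresh interpreter ran the script each time:

```python
import itertools

# Created once, when the WSGI server imports this module.
_request_ids = itertools.count(1)


def counting_app(environ, start_response):
    """WSGI callable that reports how many requests this process has served."""
    request_id = next(_request_ids)
    body = f"request #{request_id} served by the same interpreter\n".encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [body]
```

With several worker processes, each worker keeps its own counter, so this kind of state is per process rather than global.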
- Pre-Forking
One bottleneck is the time web servers take to create a new process for each request in order to execute the callable object.
Pre-forking is a technique where the server forks the external script processes ahead of time, while it is idle. In the Python world, WSGI servers create processes beforehand to run the callable object for each HTTP request. These processes are called workers, and each worker can also run several threads.
HTTP servers like Nginx can't pre-fork Python workers, which is another reason to put a WSGI server between Nginx and a Python web application.
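With gunicorn, pre-forking is mostly a matter of configuration. A minimal sketch of a gunicorn.conf.py, assuming gunicorn is installed; the bind address and the counts are illustrative values, not recommendations:

```python
# gunicorn.conf.py (all values below are illustrative assumptions)

# Address the gunicorn master process listens on.
bind = "127.0.0.1:8000"

# Number of worker processes the master forks up front.
workers = 4

# Number of threads each worker runs.
threads = 2
```

It could then be started with something like gunicorn -c gunicorn.conf.py myapp:application, where myapp is a hypothetical module that exposes the WSGI callable.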
for more, visit the links 1 2 3
hope you learnt something :)