PEP: 3333
Title: Python Web Server Gateway Interface v1.0.1
Version: $Revision$
Last-Modified: $Date$
Author: P.J. Eby <[email protected]>
Discussions-To: [email protected]
Status: Final
Type: Informational
Content-Type: text/x-rst
Created: 26-Sep-2010
Post-History: 26-Sep-2010, 04-Oct-2010
Replaces: 333
Preface for Readers of PEP 333
==============================
This is an updated version of :pep:`333`, modified slightly to improve
usability under Python 3, and to incorporate several long-standing
de facto amendments to the WSGI protocol. (Its code samples have
also been ported to Python 3.)
While for procedural reasons [6]_, this must be a distinct PEP, no
changes were made that invalidate previously-compliant servers or
applications under Python 2.x. If your 2.x application or server
is compliant with PEP \333, it is also compliant with this PEP.
Under Python 3, however, your app or server must also follow the
rules outlined in the sections below titled `A Note On String
Types`_ and `Unicode Issues`_.
For detailed, line-by-line diffs between this document and PEP \333,
you may view its SVN revision history [7]_, from revision 84854 forward.
Abstract
========
This document specifies a proposed standard interface between web
servers and Python web applications or frameworks, to promote web
application portability across a variety of web servers.
Original Rationale and Goals (from PEP 333)
===========================================
Python currently boasts a wide variety of web application frameworks,
such as Zope, Quixote, Webware, SkunkWeb, PSO, and Twisted Web -- to
name just a few [1]_. This wide variety of choices can be a problem
for new Python users, because generally speaking, their choice of web
framework will limit their choice of usable web servers, and vice
versa.
By contrast, although Java has just as many web application frameworks
available, Java's "servlet" API makes it possible for applications
written with any Java web application framework to run in any web
server that supports the servlet API.
The availability and widespread use of such an API in web servers for
Python -- whether those servers are written in Python (e.g. Medusa),
embed Python (e.g. mod_python), or invoke Python via a gateway
protocol (e.g. CGI, FastCGI, etc.) -- would separate choice of
framework from choice of web server, freeing users to choose a pairing
that suits them, while freeing framework and server developers to
focus on their preferred area of specialization.
This PEP, therefore, proposes a simple and universal interface between
web servers and web applications or frameworks: the Python Web Server
Gateway Interface (WSGI).
But the mere existence of a WSGI spec does nothing to address the
existing state of servers and frameworks for Python web applications.
Server and framework authors and maintainers must actually implement
WSGI for there to be any effect.
However, since no existing servers or frameworks support WSGI, there
is little immediate reward for an author who implements WSGI support.
Thus, WSGI **must** be easy to implement, so that an author's initial
investment in the interface can be reasonably low.
Thus, simplicity of implementation on *both* the server and framework
sides of the interface is absolutely critical to the utility of the
WSGI interface, and is therefore the principal criterion for any
design decisions.
Note, however, that simplicity of implementation for a framework
author is not the same thing as ease of use for a web application
author. WSGI presents an absolutely "no frills" interface to the
framework author, because bells and whistles like response objects and
cookie handling would just get in the way of existing frameworks'
handling of these issues. Again, the goal of WSGI is to facilitate
easy interconnection of existing servers and applications or
frameworks, not to create a new web framework.
Note also that this goal precludes WSGI from requiring anything that
is not already available in deployed versions of Python. Therefore,
new standard library modules are not proposed or required by this
specification, and nothing in WSGI requires a Python version greater
than 2.2.2. (It would be a good idea, however, for future versions
of Python to include support for this interface in web servers
provided by the standard library.)
In addition to ease of implementation for existing and future
frameworks and servers, it should also be easy to create request
preprocessors, response postprocessors, and other WSGI-based
"middleware" components that look like an application to their
containing server, while acting as a server for their contained
applications.
If middleware can be both simple and robust, and WSGI is widely
available in servers and frameworks, it allows for the possibility
of an entirely new kind of Python web application framework: one
consisting of loosely-coupled WSGI middleware components. Indeed,
existing framework authors may even choose to refactor their
frameworks' existing services to be provided in this way, becoming
more like libraries used with WSGI, and less like monolithic
frameworks. This would then allow application developers to choose
"best-of-breed" components for specific functionality, rather than
having to commit to all the pros and cons of a single framework.
Of course, as of this writing, that day is doubtless quite far off.
In the meantime, it is a sufficient short-term goal for WSGI to
enable the use of any framework with any server.
Finally, it should be mentioned that the current version of WSGI
does not prescribe any particular mechanism for "deploying" an
application for use with a web server or server gateway. At the
present time, this is necessarily implementation-defined by the
server or gateway. After a sufficient number of servers and
frameworks have implemented WSGI to provide field experience with
varying deployment requirements, it may make sense to create
another PEP, describing a deployment standard for WSGI servers and
application frameworks.
Specification Overview
======================
The WSGI interface has two sides: the "server" or "gateway" side, and
the "application" or "framework" side. The server side invokes a
callable object that is provided by the application side. The
specifics of how that object is provided are up to the server or
gateway. It is assumed that some servers or gateways will require an
application's deployer to write a short script to create an instance
of the server or gateway, and supply it with the application object.
Other servers and gateways may use configuration files or other
mechanisms to specify where an application object should be
imported from, or otherwise obtained.
In addition to "pure" servers/gateways and applications/frameworks,
it is also possible to create "middleware" components that implement
both sides of this specification. Such components act as an
application to their containing server, and as a server to a
contained application, and can be used to provide extended APIs,
content transformation, navigation, and other useful functions.
Throughout this specification, we will use the term "a callable" to
mean "a function, method, class, or an instance with a ``__call__``
method". It is up to the server, gateway, or application implementing
the callable to choose the appropriate implementation technique for
their needs. Conversely, a server, gateway, or application that is
invoking a callable **must not** have any dependency on what kind of
callable was provided to it. Callables are only to be called, not
introspected upon.
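As a minimal sketch of this rule, the snippet below calls a function, an instance with a ``__call__`` method, and a class in exactly the same way. All names here (``as_function``, ``AsInstance``, ``AsClass``, ``invoke``) are illustrative assumptions and are not defined by this specification.

```python
# Illustrative names only; none of these come from the specification.
def as_function(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'function\n']

class AsInstance:
    def __call__(self, environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'instance\n']

class AsClass:
    # Calling the class returns an instance, which is the iterable result.
    def __init__(self, environ, start_response):
        self.start = start_response

    def __iter__(self):
        self.start('200 OK', [('Content-Type', 'text/plain')])
        yield b'class\n'

def invoke(callable_obj):
    """Call the object the same way, whatever kind of callable it is."""
    statuses = []

    def start_response(status, response_headers, exc_info=None):
        statuses.append(status)
        return lambda data: None    # stand-in for a write() callable

    body = b''.join(callable_obj({}, start_response))
    return statuses, body

results = [invoke(c) for c in (as_function, AsInstance(), AsClass)]
```

The caller never inspects what it was given; it only calls it with two positional arguments and iterates over the result.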
A Note On String Types
----------------------
In general, HTTP deals with bytes, which means that this specification
is mostly about handling bytes.
However, the content of those bytes often has some kind of textual
interpretation, and in Python, strings are the most convenient way
to handle text.
But in many Python versions and implementations, strings are Unicode,
rather than bytes. This requires a careful balance between a usable
API and correct translations between bytes and text in the context of
HTTP... especially to support porting code between Python
implementations with different ``str`` types.
WSGI therefore defines two kinds of "string":

* "Native" strings (which are always implemented using the type
  named ``str``) that are used for request/response headers and
  metadata

* "Bytestrings" (which are implemented using the ``bytes`` type
  in Python 3, and ``str`` elsewhere), that are used for the bodies
  of requests and responses (e.g. POST/PUT input data and HTML page
  outputs).

Do not be confused, however: even if Python's ``str`` type is actually
Unicode "under the hood", the *content* of native strings must
still be translatable to bytes via the Latin-1 encoding! (See
the section on `Unicode Issues`_ later in this document for more
details.)
In short: where you see the word "string" in this document, it refers
to a "native" string, i.e., an object of type ``str``, whether it is
internally implemented as bytes or unicode. Where you see references
to "bytestring", this should be read as "an object of type ``bytes``
under Python 3, or type ``str`` under Python 2".
And so, even though HTTP is in some sense "really just bytes", there
are many API conveniences to be had by using whatever Python's
default ``str`` type is.
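The following sketch, written for Python 3, illustrates the Latin-1 requirement on native strings. The helper names ``native_to_bytes`` and ``bytes_to_native`` are assumptions made for this example only.

```python
def native_to_bytes(value):
    """Encode a native string for the wire; fails if it is not Latin-1 safe."""
    return value.encode('iso-8859-1')

def bytes_to_native(raw):
    """Decode wire bytes to a native string; each byte becomes one code point."""
    return raw.decode('iso-8859-1')

header_value = 'text/plain; charset=utf-8'
wire = native_to_bytes(header_value)
round_trip = bytes_to_native(wire)

# Bytes that were UTF-8 on the wire still fit in a native string, one
# character per byte, and the original text can be recovered later.
raw_path = '/caf\u00e9'.encode('utf-8')
native_path = bytes_to_native(raw_path)
recovered = native_path.encode('iso-8859-1').decode('utf-8')

# A code point outside Latin-1 cannot appear in a native string's content.
try:
    native_to_bytes('\u20ac')       # EURO SIGN
    euro_ok = True
except UnicodeEncodeError:
    euro_ok = False
```

Because Latin-1 maps every byte value to exactly one code point, the round trip between bytes and native strings loses nothing.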
The Application/Framework Side
------------------------------
The application object is simply a callable object that accepts
two arguments. The term "object" should not be misconstrued as
requiring an actual object instance: a function, method, class,
or instance with a ``__call__`` method are all acceptable for
use as an application object. Application objects must be able
to be invoked more than once, as virtually all servers/gateways
(other than CGI) will make such repeated requests.
(Note: although we refer to it as an "application" object, this
should not be construed to mean that application developers will use
WSGI as a web programming API! It is assumed that application
developers will continue to use existing, high-level framework
services to develop their applications. WSGI is a tool for
framework and server developers, and is not intended to directly
support application developers.)
Here are two example application objects; one is a function, and the
other is a class::

    HELLO_WORLD = b"Hello world!\n"

    def simple_app(environ, start_response):
        """Simplest possible application object"""
        status = '200 OK'
        response_headers = [('Content-type', 'text/plain')]
        start_response(status, response_headers)
        return [HELLO_WORLD]

    class AppClass:
        """Produce the same output, but using a class

        (Note: 'AppClass' is the "application" here, so calling it
        returns an instance of 'AppClass', which is then the iterable
        return value of the "application callable" as required by
        the spec.

        If we wanted to use *instances* of 'AppClass' as application
        objects instead, we would have to implement a '__call__'
        method, which would be invoked to execute the application,
        and we would need to create an instance for use by the
        server or gateway.
        """

        def __init__(self, environ, start_response):
            self.environ = environ
            self.start = start_response

        def __iter__(self):
            status = '200 OK'
            response_headers = [('Content-type', 'text/plain')]
            self.start(status, response_headers)
            yield HELLO_WORLD

The Server/Gateway Side
-----------------------
The server or gateway invokes the application callable once for each
request it receives from an HTTP client that is directed at the
application. To illustrate, here is a simple CGI gateway, implemented
as a function taking an application object. Note that this simple
example has limited error handling, because by default an uncaught
exception will be dumped to ``sys.stderr`` and logged by the web
server.
::

    import os, sys

    enc, esc = sys.getfilesystemencoding(), 'surrogateescape'

    def unicode_to_wsgi(u):
        # Convert an environment variable to a WSGI "bytes-as-unicode" string
        return u.encode(enc, esc).decode('iso-8859-1')

    def wsgi_to_bytes(s):
        return s.encode('iso-8859-1')

    def run_with_cgi(application):
        environ = {k: unicode_to_wsgi(v) for k,v in os.environ.items()}
        environ['wsgi.input'] = sys.stdin.buffer
        environ['wsgi.errors'] = sys.stderr
        environ['wsgi.version'] = (1, 0)
        environ['wsgi.multithread'] = False
        environ['wsgi.multiprocess'] = True
        environ['wsgi.run_once'] = True

        if environ.get('HTTPS', 'off') in ('on', '1'):
            environ['wsgi.url_scheme'] = 'https'
        else:
            environ['wsgi.url_scheme'] = 'http'

        headers_set = []
        headers_sent = []

        def write(data):
            out = sys.stdout.buffer

            if not headers_set:
                raise AssertionError("write() before start_response()")

            elif not headers_sent:
                # Before the first output, send the stored headers
                status, response_headers = headers_sent[:] = headers_set
                out.write(wsgi_to_bytes('Status: %s\r\n' % status))
                for header in response_headers:
                    out.write(wsgi_to_bytes('%s: %s\r\n' % header))
                out.write(wsgi_to_bytes('\r\n'))

            out.write(data)
            out.flush()

        def start_response(status, response_headers, exc_info=None):
            if exc_info:
                try:
                    if headers_sent:
                        # Re-raise original exception if headers sent
                        raise exc_info[1].with_traceback(exc_info[2])
                finally:
                    exc_info = None     # avoid dangling circular ref
            elif headers_set:
                raise AssertionError("Headers already set!")

            headers_set[:] = [status, response_headers]

            # Note: error checking on the headers should happen here,
            # *after* the headers are set.  That way, if an error
            # occurs, start_response can only be re-called with
            # exc_info set.

            return write

        result = application(environ, start_response)
        try:
            for data in result:
                if data:    # don't send headers until body appears
                    write(data)
            if not headers_sent:
                # Send headers now if body was empty.  The argument must be
                # a bytestring, because it is written to sys.stdout.buffer.
                write(b'')
        finally:
            if hasattr(result, 'close'):
                result.close()

Middleware: Components that Play Both Sides
-------------------------------------------
Note that a single object may play the role of a server with respect
to some application(s), while also acting as an application with
respect to some server(s). Such "middleware" components can perform
such functions as:

* Routing a request to different application objects based on the
  target URL, after rewriting the ``environ`` accordingly

* Allowing multiple applications or frameworks to run side by side
  in the same process

* Load balancing and remote processing, by forwarding requests and
  responses over a network

* Performing content postprocessing, such as applying XSL stylesheets

The presence of middleware in general is transparent to both the
"server/gateway" and the "application/framework" sides of the
interface, and should require no special support. A user who
desires to incorporate middleware into an application simply
provides the middleware component to the server, as if it were
an application, and configures the middleware component to
invoke the application, as if the middleware component were a
server. Of course, the "application" that the middleware wraps
may in fact be another middleware component wrapping another
application, and so on, creating what is referred to as a
"middleware stack".
For the most part, middleware must conform to the restrictions
and requirements of both the server and application sides of
WSGI. In some cases, however, requirements for middleware
are more stringent than for a "pure" server or application,
and these points will be noted in the specification.
Here is a (tongue-in-cheek) example of a middleware component that
converts ``text/plain`` responses to pig Latin, using Joe Strout's
``piglatin.py``. (Note: a "real" middleware component would
probably use a more robust way of checking the content type, and
should also check for a content encoding. Also, this simple
example ignores the possibility that a word might be split across
a block boundary.)
::

    from piglatin import piglatin

    class LatinIter:

        """Transform iterated output to piglatin, if it's okay to do so

        Note that the "okayness" can change until the application yields
        its first non-empty bytestring, so 'transform_ok' has to be a mutable
        truth value.
        """

        def __init__(self, result, transform_ok):
            if hasattr(result, 'close'):
                self.close = result.close
            self._next = iter(result).__next__
            self.transform_ok = transform_ok

        def __iter__(self):
            return self

        def __next__(self):
            data = self._next()
            if self.transform_ok:
                return piglatin(data)   # call must be byte-safe on Py3
            else:
                return data

    class Latinator:

        # by default, don't transform output
        transform = False

        def __init__(self, application):
            self.application = application

        def __call__(self, environ, start_response):

            transform_ok = []

            def start_latin(status, response_headers, exc_info=None):

                # Reset ok flag, in case this is a repeat call
                del transform_ok[:]

                for name, value in response_headers:
                    if name.lower() == 'content-type' and value == 'text/plain':
                        transform_ok.append(True)
                        # Strip content-length if present, else it'll be wrong
                        response_headers = [(name, value)
                            for name, value in response_headers
                                if name.lower() != 'content-length'
                        ]
                        break

                write = start_response(status, response_headers, exc_info)

                if transform_ok:
                    def write_latin(data):
                        write(piglatin(data))   # call must be byte-safe on Py3
                    return write_latin
                else:
                    return write

            return LatinIter(self.application(environ, start_latin), transform_ok)


    # Run foo_app under a Latinator's control, using the example CGI gateway
    from foo_app import foo_app
    run_with_cgi(Latinator(foo_app))

Specification Details
=====================
The application object must accept two positional arguments. For
the sake of illustration, we have named them ``environ`` and
``start_response``, but they are not required to have these names.
A server or gateway **must** invoke the application object using
positional (not keyword) arguments. (E.g. by calling
``result = application(environ, start_response)`` as shown above.)
The ``environ`` parameter is a dictionary object, containing CGI-style
environment variables. This object **must** be a builtin Python
dictionary (*not* a subclass, ``UserDict`` or other dictionary
emulation), and the application is allowed to modify the dictionary
in any way it desires. The dictionary must also include certain
WSGI-required variables (described in a later section), and may
also include server-specific extension variables, named according
to a convention that will be described below.
The ``start_response`` parameter is a callable accepting two
required positional arguments, and one optional argument. For the sake
of illustration, we have named these arguments ``status``,
``response_headers``, and ``exc_info``, but they are not required to
have these names, and the application **must** invoke the
``start_response`` callable using positional arguments (e.g.
``start_response(status, response_headers)``).
The ``status`` parameter is a status string of the form
``"999 Message here"``, and ``response_headers`` is a list of
``(header_name, header_value)`` tuples describing the HTTP response
header. The optional ``exc_info`` parameter is described below in the
sections on `The start_response() Callable`_ and `Error Handling`_.
It is used only when the application has trapped an error and is
attempting to display an error message to the browser.
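As a rough illustration of the shapes described above, here is a sketch of two checks a server or validation tool might apply. The helper names and the exact regular expression are assumptions for this example; later sections of this specification remain the authority on what is actually required.

```python
import re

# Assumed pattern: three ASCII digits, one space, a non-empty reason phrase.
_STATUS_RE = re.compile(r'[0-9]{3} \S.*')

def looks_like_status(status):
    """True if 'status' is a native string shaped like '999 Message here'."""
    return isinstance(status, str) and _STATUS_RE.fullmatch(status) is not None

def looks_like_headers(response_headers):
    """True if 'response_headers' is a list of (name, value) string tuples."""
    return isinstance(response_headers, list) and all(
        isinstance(item, tuple)
        and len(item) == 2
        and isinstance(item[0], str)
        and isinstance(item[1], str)
        for item in response_headers
    )
```

For example, ``looks_like_status('404 Not Found')`` is true, while a bare ``'404'`` or the integer ``404`` is rejected.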
The ``start_response`` callable must return a ``write(body_data)``
callable that takes one positional parameter: a bytestring to be written
as part of the HTTP response body. (Note: the ``write()`` callable is
provided only to support certain existing frameworks' imperative output
APIs; it should not be used by new applications or frameworks if it
can be avoided. See the `Buffering and Streaming`_ section for more
details.)
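To make the difference concrete, the sketch below shows the same response produced through the ``write()`` callable and through the returned iterable. The names ``legacy_app``, ``preferred_app``, and ``collect`` are hypothetical, and ``collect`` is only a stand-in for a real server.

```python
def legacy_app(environ, start_response):
    # Imperative style: push body data through write()
    write = start_response('200 OK', [('Content-Type', 'text/plain')])
    write(b'Hello ')
    write(b'world!\n')
    return []

def preferred_app(environ, start_response):
    # Preferred style: return the body as an iterable of bytestrings
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello ', b'world!\n']

def collect(app):
    """Gather everything an application outputs, by either route."""
    chunks = []

    def start_response(status, response_headers, exc_info=None):
        return chunks.append    # serves as the write(body_data) callable

    for data in app({}, start_response):
        chunks.append(data)
    return b''.join(chunks)
```

Both applications produce ``b'Hello world!\n'`` here; only the second leaves the server in control of when each block is transmitted.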
When called by the server, the application object must return an
iterable yielding zero or more bytestrings. This can be accomplished in a
variety of ways, such as by returning a list of bytestrings, or by the
application being a generator function that yields bytestrings, or
by the application being a class whose instances are iterable.
Regardless of how it is accomplished, the application object must
always return an iterable yielding zero or more bytestrings.
The server or gateway must transmit the yielded bytestrings to the client
in an unbuffered fashion, completing the transmission of each bytestring
before requesting another one. (In other words, applications
**should** perform their own buffering. See the `Buffering and
Streaming`_ section below for more on how application output must be
handled.)
The server or gateway should treat the yielded bytestrings as binary byte
sequences: in particular, it should ensure that line endings are
not altered. The application is responsible for ensuring that the
bytestring(s) to be written are in a format suitable for the client. (The
server or gateway **may** apply HTTP transfer encodings, or perform
other transformations for the purpose of implementing HTTP features
such as byte-range transmission. See `Other HTTP Features`_, below,
for more details.)
If a call to ``len(iterable)`` succeeds, the server must be able
to rely on the result being accurate. That is, if the iterable
returned by the application provides a working ``__len__()``
method, it **must** return an accurate result. (See
the `Handling the Content-Length Header`_ section for information
on how this would normally be used.)
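One way a server might rely on ``len()`` is sketched below: if the iterable reports exactly one block, the length of that block is the length of the whole body. The function name ``guess_content_length`` is an assumption for this example.

```python
def guess_content_length(result):
    """Return the body length if it is knowable up front, else None."""
    try:
        blocks = len(result)
    except TypeError:
        return None     # e.g. generators, which have no __len__()
    if blocks == 1:
        # A single block is the entire body, so its size is the size.
        return len(next(iter(result)))
    return None
```

With ``[b'hello']`` this returns ``5``; with a generator, or a list of several blocks, it returns ``None`` and the server has to use some other strategy.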
If the iterable returned by the application has a ``close()`` method,
the server or gateway **must** call that method upon completion of the
current request, whether the request was completed normally, or
terminated early due to an application error during iteration or an early
disconnect of the browser. (The ``close()`` method requirement is to
support resource release by the application. This protocol is intended
to complement :pep:`342`'s generator support, and other common iterables
with ``close()`` methods.)
Applications returning a generator or other custom iterator **should not**
assume the entire iterator will be consumed, as it **may** be closed early
by the server.
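The sketch below shows the ``close()`` contract from the server's point of view, assuming a hypothetical ``serve`` loop in which ``max_chunks`` simulates an early client disconnect. ``TrackedBody`` and ``generator_body`` are likewise illustrative names.

```python
class TrackedBody:
    """An iterable response body that records whether close() was called."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self.closed = False

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._chunks)

    def close(self):
        self.closed = True

def serve(result, send, max_chunks=None):
    """Transmit blocks; always call close() if the iterable has one."""
    sent = 0
    try:
        for data in result:
            if max_chunks is not None and sent >= max_chunks:
                break           # simulated early disconnect
            send(data)
            sent += 1
    finally:
        if hasattr(result, 'close'):
            result.close()

events = []

def generator_body():
    try:
        yield b'a'
        yield b'b'
    finally:
        events.append('released')   # runs when the generator is closed

body = TrackedBody([b'a', b'b', b'c'])
out = []
serve(body, out.append, max_chunks=1)

gen_out = []
serve(generator_body(), gen_out.append, max_chunks=1)
```

In both runs only the first block is sent, yet ``body.closed`` is true and the generator's ``finally`` clause has run, because the server closed the iterable on its way out.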
(Note: the application **must** invoke the ``start_response()``
callable before the iterable yields its first body bytestring, so that the
server can send the headers before any body content. However, this
invocation **may** be performed by the iterable's first iteration, so
servers **must not** assume that ``start_response()`` has been called
before they begin iterating over the iterable.)
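A generator function is the most common case of this deferred call. In the sketch below (``lazy_app`` and the recording ``start_response`` are illustrative), nothing in the application body runs until the server asks for the first block.

```python
def lazy_app(environ, start_response):
    # Generator function: this body does not run until the first next().
    start_response('200 OK', [('Content-Type', 'text/plain')])
    yield b'late headers\n'

calls = []

def start_response(status, response_headers, exc_info=None):
    calls.append(status)
    return lambda data: None    # stand-in for a write() callable

result = lazy_app({}, start_response)
called_before_iteration = list(calls)   # still empty at this point

first = next(iter(result))
called_after_first_block = list(calls)  # now holds the status

result.close()
```

A server that checked for a status immediately after calling the application would wrongly conclude that ``start_response()`` was never invoked.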
Finally, servers and gateways **must not** directly use any other
attributes of the iterable returned by the application, unless it is an
instance of a type specific to that server or gateway, such as a "file
wrapper" returned by ``wsgi.file_wrapper`` (see `Optional
Platform-Specific File Handling`_). In the general case, only
attributes specified here, or accessed via e.g. the :pep:`234` iteration
APIs are acceptable.
``environ`` Variables
---------------------
The ``environ`` dictionary is required to contain these CGI
environment variables, as defined by the Common Gateway Interface
specification [2]_. The following variables **must** be present,
unless their value would be an empty string, in which case they
**may** be omitted, except as otherwise noted below.

``REQUEST_METHOD``
  The HTTP request method, such as ``"GET"`` or ``"POST"``. This
  cannot ever be an empty string, and so is always required.

``SCRIPT_NAME``
  The initial portion of the request URL's "path" that corresponds to
  the application object, so that the application knows its virtual
  "location". This **may** be an empty string, if the application
  corresponds to the "root" of the server.

``PATH_INFO``
  The remainder of the request URL's "path", designating the virtual
  "location" of the request's target within the application. This
  **may** be an empty string, if the request URL targets the
  application root and does not have a trailing slash.

``QUERY_STRING``
  The portion of the request URL that follows the ``"?"``, if any.
  May be empty or absent.

``CONTENT_TYPE``
  The contents of any ``Content-Type`` fields in the HTTP request.
  May be empty or absent.

``CONTENT_LENGTH``
  The contents of any ``Content-Length`` fields in the HTTP request.
  May be empty or absent.

``SERVER_NAME``, ``SERVER_PORT``
  When ``HTTP_HOST`` is not set, these variables can be combined to determine a
  default. See the `URL Reconstruction`_ section below for more detail.
  ``SERVER_NAME`` and ``SERVER_PORT`` are required strings and must never be
  empty.

``SERVER_PROTOCOL``
  The version of the protocol the client used to send the request.
  Typically this will be something like ``"HTTP/1.0"`` or ``"HTTP/1.1"``
  and may be used by the application to determine how to treat any
  HTTP request headers. (This variable should probably be called
  ``REQUEST_PROTOCOL``, since it denotes the protocol used in the
  request, and is not necessarily the protocol that will be used in the
  server's response. However, for compatibility with CGI we have to
  keep the existing name.)

``HTTP_`` Variables
  Variables corresponding to the client-supplied HTTP request headers
  (i.e., variables whose names begin with ``"HTTP_"``). The presence or
  absence of these variables should correspond with the presence or
  absence of the appropriate HTTP header in the request.

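As an illustration of how the variables above map back to request headers, here is a small sketch. The function name ``request_headers`` and the title-casing of header names are assumptions for this example; the reverse mapping is lossy, since the CGI naming convention does not record the original spelling of a header name.

```python
def request_headers(environ):
    """Collect request headers from CGI-style environ keys."""
    headers = {}
    for key, value in environ.items():
        if key.startswith('HTTP_'):
            # HTTP_USER_AGENT -> User-Agent
            headers[key[5:].replace('_', '-').title()] = value
    # These two headers are carried without the HTTP_ prefix.
    for key in ('CONTENT_TYPE', 'CONTENT_LENGTH'):
        if environ.get(key):
            headers[key.replace('_', '-').title()] = environ[key]
    return headers

example = request_headers({
    'REQUEST_METHOD': 'GET',
    'HTTP_HOST': 'example.com',
    'HTTP_USER_AGENT': 'demo',
    'CONTENT_TYPE': 'text/plain',
    'CONTENT_LENGTH': '',
})
```

Here ``example`` contains ``Host``, ``User-Agent``, and ``Content-Type``; the empty ``CONTENT_LENGTH`` is treated the same as an absent one.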
A server or gateway **should** attempt to provide as many other CGI
variables as are applicable. In addition, if SSL is in use, the server
or gateway **should** also provide as many of the Apache SSL environment
variables [5]_ as are applicable, such as ``HTTPS=on`` and
``SSL_PROTOCOL``. Note, however, that an application that uses any CGI
variables other than the ones listed above is necessarily non-portable
to web servers that do not support the relevant extensions. (For
example, web servers that do not publish files will not be able to
provide a meaningful ``DOCUMENT_ROOT`` or ``PATH_TRANSLATED``.)
A WSGI-compliant server or gateway **should** document what variables
it provides, along with their definitions as appropriate. Applications
**should** check for the presence of any variables they require, and
have a fallback plan in the event such a variable is absent.
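A minimal sketch of such defensive access follows. The function name ``describe_request`` and the chosen fallbacks are assumptions made for illustration, not requirements of this specification.

```python
def describe_request(environ):
    """Read environ values, with a fallback for each optional variable."""
    method = environ['REQUEST_METHOD']          # always required
    query = environ.get('QUERY_STRING', '')     # may be empty or absent
    user = environ.get('REMOTE_USER')           # absent if unauthenticated
    try:
        length = int(environ.get('CONTENT_LENGTH') or 0)
    except ValueError:
        length = 0                              # malformed value: assume none
    return method, query, user, length
```

Called with only ``{'REQUEST_METHOD': 'GET'}``, this returns ``('GET', '', None, 0)`` rather than raising ``KeyError``.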
Note: missing variables (such as ``REMOTE_USER`` when no
authentication has occurred) should be left out of the ``environ``
dictionary. Also note that CGI-defined variables must be native strings,
if they are present at all. It is a violation of this specification
for *any* CGI variable's value to be of any type other than ``str``.
In addition to the CGI-defined variables, the ``environ`` dictionary
**may** also contain arbitrary operating-system "environment variables",
and **must** contain the following WSGI-defined variables:

=====================  ===============================================
Variable               Value
=====================  ===============================================
``wsgi.version``       The tuple ``(1, 0)``, representing WSGI
                       version 1.0.

``wsgi.url_scheme``    A string representing the "scheme" portion of
                       the URL at which the application is being
                       invoked. Normally, this will have the value
                       ``"http"`` or ``"https"``, as appropriate.

``wsgi.input``         An input stream (file-like object) from which
                       the HTTP request body bytes can be read. (The
                       server or gateway may perform reads on-demand
                       as requested by the application, or it may
                       pre-read the client's request body and buffer
                       it in-memory or on disk, or use any other
                       technique for providing such an input stream,
                       according to its preference.)

``wsgi.errors``        An output stream (file-like object) to which
                       error output can be written, for the purpose of
                       recording program or other errors in a
                       standardized and possibly centralized location.
                       This should be a "text mode" stream; i.e.,
                       applications should use ``"\n"`` as a line
                       ending, and assume that it will be converted to
                       the correct line ending by the server/gateway.

                       (On platforms where the ``str`` type is unicode,
                       the error stream **should** accept and log
                       arbitrary unicode without raising an error; it
                       is allowed, however, to substitute characters
                       that cannot be rendered in the stream's
                       encoding.)

                       For many servers, ``wsgi.errors`` will be the
                       server's main error log. Alternatively, this
                       may be ``sys.stderr``, or a log file of some
                       sort. The server's documentation should
                       include an explanation of how to configure this
                       or where to find the recorded output. A server
                       or gateway may supply different error streams
                       to different applications, if this is desired.

``wsgi.multithread``   This value should evaluate true if the
                       application object may be simultaneously
                       invoked by another thread in the same process,
                       and should evaluate false otherwise.

``wsgi.multiprocess``  This value should evaluate true if an
                       equivalent application object may be
                       simultaneously invoked by another process,
                       and should evaluate false otherwise.

``wsgi.run_once``      This value should evaluate true if the server
                       or gateway expects (but does not guarantee!)
                       that the application will only be invoked this
                       one time during the life of its containing
                       process. Normally, this will only be true for
                       a gateway based on CGI (or something similar).
=====================  ===============================================

Finally, the ``environ`` dictionary may also contain server-defined
variables. These variables should be named using only lower-case
letters, numbers, dots, and underscores, and should be prefixed with
a name that is unique to the defining server or gateway. For
example, ``mod_python`` might define variables with names like
``mod_python.some_variable``.
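As an illustration of the rules above, the following sketch shows one
way an application might consult ``environ``. The function name
``describe_request`` and the ``"anonymous"`` fallback are assumptions
made for this example; they are not defined by this specification.

```python
def describe_request(environ):
    """Summarize a few environ entries, tolerating absent CGI variables."""
    # REMOTE_USER is left out entirely when no authentication occurred,
    # so use .get() with a fallback instead of indexing.
    user = environ.get("REMOTE_USER", "anonymous")

    # The WSGI-defined keys must be present, so plain indexing is fine.
    major, minor = environ["wsgi.version"]
    concurrency = []
    if environ["wsgi.multithread"]:
        concurrency.append("threads")
    if environ["wsgi.multiprocess"]:
        concurrency.append("processes")

    return {
        "user": user,
        "scheme": environ["wsgi.url_scheme"],
        "wsgi_version": "%d.%d" % (major, minor),
        "concurrency": concurrency,
        "run_once": bool(environ["wsgi.run_once"]),
    }
```

Only the optional CGI variable is given a fallback here; the required
``wsgi.*`` keys are read directly, since their absence would indicate a
non-conforming server.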
Input and Error Streams
~~~~~~~~~~~~~~~~~~~~~~~
The input and error streams provided by the server must support
the following methods:
===================  ==========  ========
Method               Stream      Notes
===================  ==========  ========
``read(size)``       ``input``   1
``readline()``       ``input``   1, 2
``readlines(hint)``  ``input``   1, 3
``__iter__()``       ``input``
``flush()``          ``errors``  4
``write(str)``       ``errors``
``writelines(seq)``  ``errors``
===================  ==========  ========
The semantics of each method are as documented in the Python Library
Reference, except for these notes as listed in the table above:
1. The server is not required to read past the client's specified
``Content-Length``, and **should** simulate an end-of-file
condition if the application attempts to read past that point.
The application **should not** attempt to read more data than is
specified by the ``CONTENT_LENGTH`` variable.
A server **should** allow ``read()`` to be called without an argument,
and return the remainder of the client's input stream.
A server **should** return empty bytestrings from any attempt to
read from an empty or exhausted input stream.
2. Servers **should** support the optional "size" argument to ``readline()``,
but as in WSGI 1.0, they are allowed to omit support for it.
(In WSGI 1.0, the size argument was not supported, on the grounds that
it might have been complex to implement, and was not often used in
practice... but then the ``cgi`` module started using it, and so
practical servers had to start supporting it anyway!)
3. Note that the ``hint`` argument to ``readlines()`` is optional for
both caller and implementer. The application is free not to
supply it, and the server or gateway is free to ignore it.
4. Since the ``errors`` stream may not be rewound, servers and gateways
are free to forward write operations immediately, without buffering.
In this case, the ``flush()`` method may be a no-op. Portable
applications, however, cannot assume that output is unbuffered
or that ``flush()`` is a no-op. They must call ``flush()`` if
they need to ensure that output has in fact been written. (For
example, to minimize intermingling of data from multiple processes
writing to the same error log.)
The methods listed in the table above **must** be supported by all
servers conforming to this specification. Applications conforming
to this specification **must not** use any other methods or attributes
of the ``input`` or ``errors`` objects. In particular, applications
**must not** attempt to close these streams, even if they possess
``close()`` methods.
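The following sketch shows one way an application might use these
streams within the limits described above. The helper name
``read_request_body`` is hypothetical, and the ``io`` objects at the end
merely stand in for streams that a real server would supply.

```python
import io


def read_request_body(environ):
    """Read at most CONTENT_LENGTH bytes from wsgi.input."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0  # treat an unparsable length as "no body"

    # Never ask for more data than CONTENT_LENGTH specifies.
    body = environ["wsgi.input"].read(length) if length > 0 else b""

    # wsgi.errors is a text-mode stream: write str, end lines with "\n",
    # and flush explicitly because the output may be buffered.
    errors = environ["wsgi.errors"]
    errors.write("read %d request body bytes\n" % len(body))
    errors.flush()
    return body


# Stand-in environ for demonstration purposes only.
environ = {
    "CONTENT_LENGTH": "5",
    "wsgi.input": io.BytesIO(b"hello, extra bytes"),
    "wsgi.errors": io.StringIO(),
}
body = read_request_body(environ)
```

Note that the sketch uses only ``read()``, ``write()``, and ``flush()``,
and never closes either stream.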
The ``start_response()`` Callable
---------------------------------
The second parameter passed to the application object is a callable
of the form ``start_response(status, response_headers, exc_info=None)``.
(As with all WSGI callables, the arguments must be supplied
positionally, not by keyword.) The ``start_response`` callable is
used to begin the HTTP response, and it must return a
``write(body_data)`` callable (see the `Buffering and Streaming`_
section, below).
The ``status`` argument is an HTTP "status" string like ``"200 OK"``
or ``"404 Not Found"``. That is, it is a string consisting of a
Status-Code and a Reason-Phrase, in that order and separated by a
single space, with no surrounding whitespace or other characters.
(See :rfc:`2616`, Section 6.1.1 for more information.) The string
**must not** contain control characters, and must not be terminated
with a carriage return, linefeed, or combination thereof.
The ``response_headers`` argument is a list of ``(header_name,
header_value)`` tuples. It must be a Python list; i.e.
``type(response_headers) is list``, and the server **may** change
its contents in any way it desires. Each ``header_name`` must be a
valid HTTP header field-name (as defined by :rfc:`2616`, Section 4.2),
without a trailing colon or other punctuation.
Each ``header_value`` **must not** include *any* control characters,
including carriage returns or linefeeds, either embedded or at the end.
(These requirements are to minimize the complexity of any parsing that
must be performed by servers, gateways, and intermediate response
processors that need to inspect or modify response headers.)
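A server or gateway could enforce the ``status`` and
``response_headers`` requirements along the following lines. This is a
minimal sketch: the function name ``check_response_start`` is
hypothetical, and requiring a non-empty Reason-Phrase is this example's
reading of the rule rather than something stated explicitly above.

```python
import re

# Any ASCII control character, including CR, LF, and TAB.
_CONTROL = re.compile(r"[\x00-\x1f\x7f]")
# The RFC 2616 "token" production used for header field-names.
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
# Three digits, one space, then a phrase starting with a non-space.
_STATUS = re.compile(r"[0-9]{3} \S.*")


def check_response_start(status, response_headers):
    """Raise TypeError or ValueError if the arguments break the rules."""
    if type(status) is not str:
        raise TypeError("status must be a native string")
    if _CONTROL.search(status) or status != status.strip():
        raise ValueError("bad whitespace or control character in status")
    if not _STATUS.fullmatch(status):
        raise ValueError("status must look like '200 OK'")

    if type(response_headers) is not list:
        raise TypeError("response_headers must be a list")
    for item in response_headers:
        if type(item) is not tuple or len(item) != 2:
            raise TypeError("each header must be a (name, value) tuple")
        name, value = item
        if type(name) is not str or type(value) is not str:
            raise TypeError("header names and values must be strings")
        if not _TOKEN.fullmatch(name):
            raise ValueError("invalid header field-name: %r" % name)
        if _CONTROL.search(value):
            raise ValueError("control character in value of %r" % name)
```

A trailing colon in a header name, or a stray ``"\r\n"`` at the end of
the status string, is rejected by these checks.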
In general, the server or gateway is responsible for ensuring that
correct headers are sent to the client: if the application omits
a header required by HTTP (or other relevant specifications that are in
effect), the server or gateway **must** add it. For example, the HTTP
``Date:`` and ``Server:`` headers would normally be supplied by the
server or gateway.
(A reminder for server/gateway authors: HTTP header names are
case-insensitive, so be sure to take that into consideration when
examining application-supplied headers!)
Applications and middleware are forbidden from using HTTP/1.1
"hop-by-hop" features or headers, any equivalent features in HTTP/1.0,
or any headers that would affect the persistence of the client's
connection to the web server. These features are the
exclusive province of the actual web server, and a server or gateway
**should** consider it a fatal error for an application to attempt
sending them, and raise an error if they are supplied to
``start_response()``. (For more specifics on "hop-by-hop" features and
headers, please see the `Other HTTP Features`_ section below.)
Servers **should** check for errors in the headers at the time
``start_response`` is called, so that an error can be raised while
the application is still running.
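One possible way for a server to perform the hop-by-hop check at the
time ``start_response`` is called is sketched below. It relies on
``wsgiref.util.is_hop_by_hop()`` from the standard library, which
compares names case-insensitively; the wrapper name
``reject_hop_by_hop`` is an assumption of this example.

```python
from wsgiref.util import is_hop_by_hop


def reject_hop_by_hop(response_headers):
    """Raise ValueError if any header is an HTTP/1.1 hop-by-hop header."""
    for name, _value in response_headers:
        if is_hop_by_hop(name):
            raise ValueError("hop-by-hop header not allowed: %r" % name)
```

Raising from within ``start_response`` means the application sees the
error while it is still running, as recommended above.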
However, the ``start_response`` callable **must not** actually transmit the
response headers. Instead, it must store them for the server or
gateway to transmit **only** after the first iteration of the
application return value that yields a non-empty bytestring, or upon
the application's first invocation of the ``write()`` callable. In
other words, response headers must not be sent until there is actual
body data available, or until the application's returned iterable is
exhausted. (The only possible exception to this rule is if the
response headers explicitly include a ``Content-Length`` of zero.)
This delaying of response header transmission is to ensure that buffered
and asynchronous applications can replace their originally intended
output with error output, up until the last possible moment. For
example, the application may need to change the response status from
"200 OK" to "500 Internal Server Error", if an error occurs while the body is
being generated within an application buffer.
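The deferred transmission described above can be sketched on the server
side as follows. This is an illustration only, loosely modelled on the
CGI-style ``Status:`` output convention; ``run_application``,
``demo_app``, and the use of a ``BytesIO`` as the output stream are all
assumptions of this example.

```python
import io


def run_application(application, environ, out):
    """Run one request, writing the raw response to binary stream out."""
    stored = []  # [(status, response_headers)] once start_response has run
    sent = []    # non-empty once the headers have been transmitted

    def send_headers():
        status, response_headers = stored[0]
        lines = ["Status: %s" % status]
        lines += ["%s: %s" % header for header in response_headers]
        out.write(("\r\n".join(lines) + "\r\n\r\n").encode("iso-8859-1"))
        sent.append(True)

    def write(data):
        if not stored:
            raise AssertionError("write() called before start_response()")
        if not sent:
            send_headers()  # first body data: headers go out just before it
        out.write(data)
        out.flush()

    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                if sent:
                    # Too late to change the response: re-raise the error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid a circular reference
        elif stored:
            raise AssertionError("start_response() already called")
        stored[:] = [(status, response_headers)]  # store; do not transmit
        return write

    result = application(environ, start_response)
    try:
        for data in result:
            if data:  # empty bytestrings do not trigger transmission
                write(data)
        if not sent:
            write(b"")  # iterable exhausted with no body: send headers now
    finally:
        if hasattr(result, "close"):
            result.close()


out = io.BytesIO()
snapshots = []


def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    snapshots.append(out.getvalue())  # taken right after start_response
    yield b""
    snapshots.append(out.getvalue())  # taken after an empty block
    yield b"hi"


run_application(demo_app, {}, out)
```

In this sketch nothing reaches ``out`` until the first non-empty
bytestring is yielded, so both entries in ``snapshots`` are empty.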
The ``exc_info`` argument, if supplied, must be a Python
``sys.exc_info()`` tuple. This argument should be supplied by the
application only if ``start_response`` is being called by an error
handler. If ``exc_info`` is supplied, and no HTTP headers have been
output yet, ``start_response`` should replace the currently-stored
HTTP response headers with the newly-supplied ones, thus allowing the
application to "change its mind" about the output when an error has
occurred.
However, if ``exc_info`` is provided, and the HTTP headers have already
been sent, ``start_response`` **must** raise an error, and **should**
re-raise using the ``exc_info`` tuple. That is::
    raise exc_info[1].with_traceback(exc_info[2])
This will re-raise the exception trapped by the application, and in
principle should abort the application. (It is not safe for the
application to attempt error output to the browser once the HTTP
headers have already been sent.) The application **must not** trap
any exceptions raised by ``start_response``, if it called
``start_response`` with ``exc_info``. Instead, it should allow
such exceptions to propagate back to the server or gateway. See
`Error Handling`_ below, for more details.
The application **may** call ``start_response`` more than once, if and
only if the ``exc_info`` argument is provided. More precisely, it is
a fatal error to call ``start_response`` without the ``exc_info``
argument if ``start_response`` has already been called within the
current invocation of the application. This includes the case where
the first call to ``start_response`` raised an error. (See the example
CGI gateway above for an illustration of the correct logic.)
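From the application's side, the ``exc_info`` protocol might be used as
in the sketch below. The ``render_page`` helper and its ``/boom`` path
are hypothetical, and exist only to provoke an error for demonstration.

```python
import sys


def render_page(environ):
    """Hypothetical page renderer that fails for one particular path."""
    if environ.get("PATH_INFO") == "/boom":
        raise RuntimeError("rendering failed")
    return b"Hello, world!\n"


def application(environ, start_response):
    try:
        body = render_page(environ)
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body]
    except Exception:
        # Error handler: pass exc_info so the server may replace any
        # stored headers, or re-raise if headers were already sent.
        # An exception raised by this call is deliberately not trapped.
        body = b"Internal error\n"
        start_response("500 Internal Server Error",
                       [("Content-Type", "text/plain")],
                       sys.exc_info())
        return [body]
```

The second call to ``start_response`` is legal only because it supplies
``exc_info``; calling it again without that argument would be a fatal
error.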
Note: servers, gateways, or middleware implementing ``start_response``
**should** ensure that no reference is held to the ``exc_info``
parameter beyond the duration of the function's execution, to avoid
creating a circular reference through the traceback and frames
involved. The simplest way to do this is something like::
    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                # do stuff w/exc_info here
                pass
            finally:
                exc_info = None    # Avoid circular ref.
The example CGI gateway provides another illustration of this
technique.
Handling the ``Content-Length`` Header
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If the application supplies a ``Content-Length`` header, the server
**should not** transmit more bytes to the client than the header
allows, and **should** stop iterating over the response when enough
data has been sent, or raise an error if the application tries to
``write()`` past that point. (Of course, if the application does
not provide *enough* data to meet its stated ``Content-Length``,
the server **should** close the connection and log or otherwise
report the error.)
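A server might enforce a declared ``Content-Length`` on iterated output
roughly as follows. This is a sketch under stated assumptions:
``limit_body`` and ``ShortBodyError`` are hypothetical names, and a real
server would close the connection and log the problem rather than
merely raise.

```python
class ShortBodyError(Exception):
    """Hypothetical error for a body shorter than its Content-Length."""


def limit_body(blocks, content_length):
    """Yield at most content_length bytes taken from blocks."""
    remaining = content_length
    if remaining > 0:
        for data in blocks:
            chunk = data[:remaining]
            remaining -= len(chunk)
            if chunk:
                yield chunk
            if remaining == 0:
                break  # enough data sent: stop iterating over the response
    if remaining > 0:
        # The application supplied too little data for its stated length.
        raise ShortBodyError("%d byte(s) missing" % remaining)
```

Because the loop breaks as soon as the limit is reached, the
application's iterable is not asked for blocks that could never be
transmitted.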
If the application does not supply a ``Content-Length`` header, a
server or gateway may choose one of several approaches to handling
it. The simplest of these is to close the client connection when
the response is completed.
Under some circumstances, however, the server or gateway may be
able to either generate a ``Content-Length`` header, or at least
avoid the need to close the client connection. If the application
does *not* call the ``write()`` callable, and returns an iterable
whose ``len()`` is 1, then the server can automatically determine
``Content-Length`` by taking the length of the first bytestring yielded
by the iterable.
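The single-block case can be detected with a check such as the one
sketched below. The function name ``infer_content_length`` and its
parameters are assumptions of this example; ``first_block`` is the first
bytestring the server has already obtained from the iterable.

```python
def infer_content_length(result, first_block, write_called=False):
    """Return the body length if it is known from one block, else None."""
    if write_called:
        return None  # write() output is not covered by this shortcut
    try:
        blocks = len(result)
    except TypeError:
        return None  # generators and other iterators have no len()
    return len(first_block) if blocks == 1 else None
```

Passing the first block in separately avoids iterating over ``result``
a second time, which would not be safe for one-shot iterables.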
And, if the server and client both support HTTP/1.1
:rfc:`"chunked encoding"<2616#section-3.6.1>`,
then the server **may** use chunked encoding to send
a chunk for each ``write()`` call or bytestring yielded by the iterable,
prefixing each chunk with its own size instead of sending a
``Content-Length`` header for the whole response. This
allows the server to keep the client connection alive, if it wishes
to do so. Note that the server **must** comply fully with :rfc:`2616`
when doing this, or else fall back to one of the other strategies for
dealing with the absence of ``Content-Length``.
(Note: applications and middleware **must not** apply any kind of
``Transfer-Encoding`` to their output, such as chunking or gzipping;
as "hop-by-hop" operations, these encodings are the province of the
actual web server/gateway. See `Other HTTP Features`_ below, for
more details.)
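For server and gateway authors, the chunk framing itself can be sketched
as follows. This code belongs in the server, never in an application or
middleware; the names ``encode_chunk`` and ``chunked`` are hypothetical,
and a real server must also send the ``Transfer-Encoding: chunked``
header and otherwise comply with :rfc:`2616`.

```python
def encode_chunk(data):
    """Frame one non-empty bytestring as an HTTP/1.1 chunk."""
    # Hexadecimal size, CRLF, the data itself, CRLF.
    return b"%x\r\n%s\r\n" % (len(data), data)


def chunked(blocks):
    """Yield the chunked encoding of an iterable of bytestrings."""
    for data in blocks:
        if data:  # a zero-length chunk would end the body prematurely
            yield encode_chunk(data)
    yield b"0\r\n\r\n"  # last-chunk marker, with no trailer fields
```

Empty bytestrings are skipped because a zero-size chunk is the
end-of-body marker in chunked encoding.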
Buffering and Streaming
-----------------------
Generally speaking, applications will achieve the best throughput
by buffering their (modestly-sized) output and sending it all at
once. This is a common approach in existing frameworks such as
Zope: the output is buffered in a StringIO or similar object, then
transmitted all at once, along with the response headers.
The corresponding approach in WSGI is for the application to simply
return a single-element iterable (such as a list) containing the
response body as a single bytestring. This is the recommended approach
for the vast majority of application functions, which render
HTML pages whose text easily fits in memory.
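A minimal sketch of this buffered style is shown below; the application
name ``simple_app`` and the page content are placeholders chosen for
this example.

```python
def simple_app(environ, start_response):
    """Render the whole page in memory and return it as one block."""
    body = "<html><body><h1>Hello</h1></body></html>".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]  # single-element iterable holding one bytestring
```

Since the body is complete before ``start_response`` is called, the
application can state an exact ``Content-Length`` itself.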
For large files, however, or for specialized uses of HTTP streaming
(such as multipart "server push"), an application may need to provide
output in smaller blocks (e.g. to avoid loading a large file into
memory). It's also sometimes the case that part of a response may
be time-consuming to produce, but it would be useful to send ahead the
portion of the response that precedes it.
In these cases, applications will usually return an iterator (often
a generator-iterator) that produces the output in a block-by-block
fashion. These blocks may be broken to coincide with multipart
boundaries (for "server push"), or just before time-consuming
tasks (such as reading another block of an on-disk file).
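The block-by-block style might look like the following sketch. The names
``file_blocks`` and ``download_app`` are hypothetical, the block size is
an arbitrary choice, and an in-memory ``BytesIO`` stands in for a large
on-disk file.

```python
import io


def file_blocks(fileobj, block_size=8192):
    """Yield successive blocks read from a binary file-like object."""
    while True:
        block = fileobj.read(block_size)
        if not block:
            break  # end of file reached
        yield block


def download_app(environ, start_response):
    # A BytesIO stands in for a large on-disk file in this sketch.
    source = io.BytesIO(b"x" * 20000)
    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    return file_blocks(source)
```

Each block is read only when the server asks for it, so at most one
block needs to be held in memory at a time.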
WSGI servers, gateways, and middleware **must not** delay the
transmission of any block; they **must** either fully transmit
the block to the client, or guarantee that they will continue
transmission even while the application is producing its next block.
A server/gateway or middleware may provide this guarantee in one of
three ways:
1. Send the entire block to the operating system (and request
that any O/S buffers be flushed) before returning control
to the application, OR