--threadless default for Python 3.8+ on mac and linux (#710)
* Explicit `multiprocessing.Manager.shutdown`

Multiprocessing manager is used within the eventing core. Per the
docs, it appears to start a BaseManager, which in turn starts a
server process. Using multiprocessing manager is a PITA and a
mistake, as it doesn't even give us the performance we expect.
Our proxy server can handle more requests than what the
multiprocessing manager can exchange between processes.

* `--threadless` is now ON by default for `Python 3.8+` on `mac` and `linux` environments

* Clarity around why multiprocessing.Manager must be deprecated

* Add `--threaded` flag, which can be used to fall back to threaded mode in environments where `--threadless` is now the default

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* never used

* Update README

* Use `threaded=True` in tests which were written for threaded model

* Fix issue where sharing manager between global event queue and subscriber can lead to TypeError

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
abhinavsingh and pre-commit-ci[bot] authored Nov 8, 2021
1 parent aadcc10 commit 98e6d0b
Showing 23 changed files with 248 additions and 137 deletions.
162 changes: 88 additions & 74 deletions README.md
@@ -1603,16 +1603,19 @@ Now point your CDT instance to `ws://localhost:8899/devtools`.

 ## Threads vs Threadless

-Pre v2.x, `proxy.py` used to spawn new threads for handling
-client requests.
+### Pre v2.x

-Starting v2.x, `proxy.py` added support for threadless execution of
-client requests using `asyncio`.
+`proxy.py` used to spawn new threads for handling client requests.

-In future, threadless execution will be the default mode.
+### Starting v2.0

-Till then if you are interested in trying it out,
-start `proxy.py` with `--threadless` flag.
+`proxy.py` added support for threadless execution of client requests using `asyncio`.
+
+### Starting v2.4.0
+
+Threadless execution was turned ON by default for `Python 3.8+` on `mac` and `linux` environments. `proxy.py` threadless execution has been reported safe on these environments by our users. If you are running into trouble, fallback to threaded mode using `--threaded` flag.
+
+For `windows` and `Python < 3.8`, you can still try out threadless mode by starting `proxy.py` with `--threadless` flag. If threadless works for you, consider sending a PR by editing `_env_threadless_compliant` method in the `proxy/common/constants.py` file.

 ## SyntaxError: invalid syntax

@@ -1883,19 +1886,23 @@ for list of tests.

```console
proxy -h
-usage: proxy [-h] [--enable-events] [--enable-conn-pool] [--threadless] [--pid-file PID_FILE]
-             [--backlog BACKLOG] [--hostname HOSTNAME] [--port PORT] [--num-workers NUM_WORKERS]
-             [--unix-socket-path UNIX_SOCKET_PATH] [--client-recvbuf-size CLIENT_RECVBUF_SIZE]
-             [--key-file KEY_FILE] [--timeout TIMEOUT] [--version] [--log-level LOG_LEVEL]
-             [--log-file LOG_FILE] [--log-format LOG_FORMAT] [--open-file-limit OPEN_FILE_LIMIT]
-             [--plugins PLUGINS] [--disable-http-proxy] [--ca-key-file CA_KEY_FILE]
-             [--ca-cert-dir CA_CERT_DIR] [--ca-cert-file CA_CERT_FILE] [--ca-file CA_FILE]
-             [--ca-signing-key-file CA_SIGNING_KEY_FILE] [--cert-file CERT_FILE]
-             [--disable-headers DISABLE_HEADERS] [--server-recvbuf-size SERVER_RECVBUF_SIZE]
-             [--basic-auth BASIC_AUTH] [--cache-dir CACHE_DIR]
-             [--filtered-upstream-hosts FILTERED_UPSTREAM_HOSTS] [--enable-web-server]
-             [--enable-static-server] [--static-server-dir STATIC_SERVER_DIR]
-             [--pac-file PAC_FILE] [--pac-file-url-path PAC_FILE_URL_PATH]
+usage: proxy [-h] [--enable-events] [--enable-conn-pool] [--threadless] [--threaded]
+             [--pid-file PID_FILE] [--backlog BACKLOG] [--hostname HOSTNAME]
+             [--port PORT] [--num-workers NUM_WORKERS]
+             [--unix-socket-path UNIX_SOCKET_PATH]
+             [--client-recvbuf-size CLIENT_RECVBUF_SIZE] [--key-file KEY_FILE]
+             [--timeout TIMEOUT] [--version] [--log-level LOG_LEVEL]
+             [--log-file LOG_FILE] [--log-format LOG_FORMAT]
+             [--open-file-limit OPEN_FILE_LIMIT] [--plugins PLUGINS] [--enable-dashboard]
+             [--disable-http-proxy] [--ca-key-file CA_KEY_FILE]
+             [--ca-cert-dir CA_CERT_DIR] [--ca-cert-file CA_CERT_FILE]
+             [--ca-file CA_FILE] [--ca-signing-key-file CA_SIGNING_KEY_FILE]
+             [--cert-file CERT_FILE] [--disable-headers DISABLE_HEADERS]
+             [--server-recvbuf-size SERVER_RECVBUF_SIZE] [--basic-auth BASIC_AUTH]
+             [--cache-dir CACHE_DIR] [--filtered-upstream-hosts FILTERED_UPSTREAM_HOSTS]
+             [--enable-web-server] [--enable-static-server]
+             [--static-server-dir STATIC_SERVER_DIR] [--pac-file PAC_FILE]
+             [--pac-file-url-path PAC_FILE_URL_PATH]
              [--filtered-client-ips FILTERED_CLIENT_IPS]
              [--filtered-url-regex-config FILTERED_URL_REGEX_CONFIG]
              [--cloudflare-dns-mode CLOUDFLARE_DNS_MODE]
@@ -1904,104 +1911,111 @@ proxy.py v2.4.0

 options:
   -h, --help            show this help message and exit
-  --enable-events       Default: False. Enables core to dispatch lifecycle events. Plugins
-                        can be used to subscribe for core events.
+  --enable-events       Default: False. Enables core to dispatch lifecycle events.
+                        Plugins can be used to subscribe for core events.
   --enable-conn-pool    Default: False. (WIP) Enable upstream connection pooling.
-  --threadless          Default: False. When disabled a new thread is spawned to handle each
+  --threadless          Default: True. Enabled by default on Python 3.8+ (mac, linux).
+                        When disabled a new thread is spawned to handle each client
                         connection.
+  --threaded            Default: False. Disabled by default on Python < 3.8 and
+                        windows. When enabled a new thread is spawned to handle each
+                        client connection.
   --pid-file PID_FILE   Default: None. Save parent process ID to a file.
-  --backlog BACKLOG     Default: 100. Maximum number of pending connections to proxy server
+  --backlog BACKLOG     Default: 100. Maximum number of pending connections to proxy
+                        server
   --hostname HOSTNAME   Default: ::1. Server IP address.
   --port PORT           Default: 8899. Server port.
   --num-workers NUM_WORKERS
                         Defaults to number of CPU cores.
   --unix-socket-path UNIX_SOCKET_PATH
-                        Default: None. Unix socket path to use. When provided --host and
-                        --port flags are ignored
+                        Default: None. Unix socket path to use. When provided --host
+                        and --port flags are ignored
   --client-recvbuf-size CLIENT_RECVBUF_SIZE
-                        Default: 1 MB. Maximum amount of data received from the client in a
-                        single recv() operation. Bump this value for faster uploads at the
-                        expense of increased RAM.
-  --key-file KEY_FILE   Default: None. Server key file to enable end-to-end TLS encryption
-                        with clients. If used, must also pass --cert-file.
-  --timeout TIMEOUT     Default: 10.0. Number of seconds after which an inactive connection
-                        must be dropped. Inactivity is defined by no data sent or received by
-                        the client.
+                        Default: 1 MB. Maximum amount of data received from the client
+                        in a single recv() operation. Bump this value for faster
+                        uploads at the expense of increased RAM.
+  --key-file KEY_FILE   Default: None. Server key file to enable end-to-end TLS
+                        encryption with clients. If used, must also pass --cert-file.
+  --timeout TIMEOUT     Default: 10.0. Number of seconds after which an inactive
+                        connection must be dropped. Inactivity is defined by no data
+                        sent or received by the client.
   --version, -v         Prints proxy.py version.
   --log-level LOG_LEVEL
-                        Valid options: DEBUG, INFO (default), WARNING, ERROR, CRITICAL. Both
-                        upper and lowercase values are allowed. You may also simply use the
-                        leading character e.g. --log-level d
+                        Valid options: DEBUG, INFO (default), WARNING, ERROR,
+                        CRITICAL. Both upper and lowercase values are allowed. You may
+                        also simply use the leading character e.g. --log-level d
   --log-file LOG_FILE   Default: sys.stdout. Log file destination.
   --log-format LOG_FORMAT
                         Log format for Python logger.
   --open-file-limit OPEN_FILE_LIMIT
                         Default: 1024. Maximum number of files (TCP connections) that
                         proxy.py can open concurrently.
   --plugins PLUGINS     Comma separated plugins
+  --enable-dashboard    Default: False. Enables proxy.py dashboard.
   --disable-http-proxy  Default: False. Whether to disable proxy.HttpProxyPlugin.
   --ca-key-file CA_KEY_FILE
-                        Default: None. CA key to use for signing dynamically generated HTTPS
-                        certificates. If used, must also pass --ca-cert-file and --ca-
-                        signing-key-file
+                        Default: None. CA key to use for signing dynamically generated
+                        HTTPS certificates. If used, must also pass --ca-cert-file and
+                        --ca-signing-key-file
   --ca-cert-dir CA_CERT_DIR
                         Default: ~/.proxy.py. Directory to store dynamically generated
                         certificates. Also see --ca-key-file, --ca-cert-file and --ca-
                         signing-key-file
   --ca-cert-file CA_CERT_FILE
-                        Default: None. Signing certificate to use for signing dynamically
-                        generated HTTPS certificates. If used, must also pass --ca-key-file
-                        and --ca-signing-key-file
+                        Default: None. Signing certificate to use for signing
+                        dynamically generated HTTPS certificates. If used, must also
+                        pass --ca-key-file and --ca-signing-key-file
   --ca-file CA_FILE     Default:
                         /Users/abhinavsingh/Dev/proxy.py/venv310/lib/python3.10/site-
-                        packages/certifi/cacert.pem. Provide path to custom CA bundle for
-                        peer certificate verification
+                        packages/certifi/cacert.pem. Provide path to custom CA bundle
+                        for peer certificate verification
   --ca-signing-key-file CA_SIGNING_KEY_FILE
-                        Default: None. CA signing key to use for dynamic generation of HTTPS
-                        certificates. If used, must also pass --ca-key-file and --ca-cert-
-                        file
+                        Default: None. CA signing key to use for dynamic generation of
+                        HTTPS certificates. If used, must also pass --ca-key-file and
+                        --ca-cert-file
   --cert-file CERT_FILE
-                        Default: None. Server certificate to enable end-to-end TLS encryption
-                        with clients. If used, must also pass --key-file.
+                        Default: None. Server certificate to enable end-to-end TLS
+                        encryption with clients. If used, must also pass --key-file.
   --disable-headers DISABLE_HEADERS
-                        Default: None. Comma separated list of headers to remove before
-                        dispatching client request to upstream server.
+                        Default: None. Comma separated list of headers to remove
+                        before dispatching client request to upstream server.
   --server-recvbuf-size SERVER_RECVBUF_SIZE
-                        Default: 1 MB. Maximum amount of data received from the server in a
-                        single recv() operation. Bump this value for faster downloads at the
-                        expense of increased RAM.
+                        Default: 1 MB. Maximum amount of data received from the server
+                        in a single recv() operation. Bump this value for faster
+                        downloads at the expense of increased RAM.
   --basic-auth BASIC_AUTH
-                        Default: No authentication. Specify colon separated user:password to
-                        enable basic authentication.
+                        Default: No authentication. Specify colon separated
+                        user:password to enable basic authentication.
   --cache-dir CACHE_DIR
-                        Default: A temporary directory. Flag only applicable when cache
-                        plugin is used with on-disk storage.
+                        Default: A temporary directory. Flag only applicable when
+                        cache plugin is used with on-disk storage.
   --filtered-upstream-hosts FILTERED_UPSTREAM_HOSTS
-                        Default: Blocks Facebook. Comma separated list of IPv4 and IPv6
-                        addresses.
+                        Default: Blocks Facebook. Comma separated list of IPv4 and
+                        IPv6 addresses.
   --enable-web-server   Default: False. Whether to enable proxy.HttpWebServerPlugin.
   --enable-static-server
-                        Default: False. Enable inbuilt static file server. Optionally, also
-                        use --static-server-dir to serve static content from custom
-                        directory. By default, static file server serves out of installed
-                        proxy.py python module folder.
+                        Default: False. Enable inbuilt static file server. Optionally,
+                        also use --static-server-dir to serve static content from
+                        custom directory. By default, static file server serves out of
+                        installed proxy.py python module folder.
   --static-server-dir STATIC_SERVER_DIR
-                        Default: "public" folder in directory where proxy.py is placed. This
-                        option is only applicable when static server is also enabled. See
-                        --enable-static-server.
-  --pac-file PAC_FILE   A file (Proxy Auto Configuration) or string to serve when the server
-                        receives a direct file request. Using this option enables
-                        proxy.HttpWebServerPlugin.
+                        Default: "public" folder in directory where proxy.py is
+                        placed. This option is only applicable when static server is
+                        also enabled. See --enable-static-server.
+  --pac-file PAC_FILE   A file (Proxy Auto Configuration) or string to serve when the
+                        server receives a direct file request. Using this option
+                        enables proxy.HttpWebServerPlugin.
   --pac-file-url-path PAC_FILE_URL_PATH
                         Default: /. Web server path to serve the PAC file.
   --filtered-client-ips FILTERED_CLIENT_IPS
                         Default: 127.0.0.1,::1. Comma separated list of IPv4 and IPv6
                         addresses.
   --filtered-url-regex-config FILTERED_URL_REGEX_CONFIG
-                        Default: No config. Comma separated list of IPv4 and IPv6 addresses.
+                        Default: No config. Comma separated list of IPv4 and IPv6
+                        addresses.
   --cloudflare-dns-mode CLOUDFLARE_DNS_MODE
-                        Default: security. Either "security" (for malware protection) or
-                        "family" (for malware and adult content protection)
+                        Default: security. Either "security" (for malware protection)
+                        or "family" (for malware and adult content protection)

Proxy.py not working? Report at: https://github.com/abhinavsingh/proxy.py/issues/new
```
12 changes: 11 additions & 1 deletion proxy/common/constants.py
@@ -9,6 +9,7 @@
     :license: BSD, see LICENSE for more details.
 """
 import os
+import sys
 import time
 import secrets
 import pathlib
@@ -19,6 +20,15 @@

 from .version import __version__

+
+def _env_threadless_compliant() -> bool:
+    """Returns true for Python 3.8+ across all platforms
+    except Windows."""
+    if os.name == 'nt':
+        return False
+    return sys.version_info >= (3, 8)
+
+
 PROXY_PY_START_TIME = time.time()

 # /path/to/proxy.py/proxy folder
@@ -86,7 +96,7 @@
 DEFAULT_PORT = 8899
 DEFAULT_SERVER_RECVBUF_SIZE = DEFAULT_BUFFER_SIZE
 DEFAULT_STATIC_SERVER_DIR = os.path.join(PROXY_PY_DIR, "public")
-DEFAULT_THREADLESS = False
+DEFAULT_THREADLESS = _env_threadless_compliant()
 DEFAULT_TIMEOUT = 10.0
 DEFAULT_VERSION = False
 DEFAULT_HTTP_PORT = 80
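
The platform check behind `DEFAULT_THREADLESS` can be exercised without changing interpreters by passing the inputs in explicitly. This is only a sketch: `env_threadless_compliant` and its `os_name` and `version` parameters are hypothetical names for illustration, not part of `proxy.py`.

```python
import os
import sys
from typing import Tuple


def env_threadless_compliant(
    os_name: str = os.name,
    version: Tuple[int, int] = (sys.version_info.major, sys.version_info.minor),
) -> bool:
    """Hypothetical, parameterized variant of the check in the diff above."""
    # Windows reports os.name == 'nt' and is excluded regardless of version.
    if os_name == 'nt':
        return False
    # Tuples compare element by element, so (3, 10) >= (3, 8) holds.
    return version >= (3, 8)


if __name__ == '__main__':
    for name, ver in [('posix', (3, 8)), ('posix', (3, 7)), ('nt', (3, 10))]:
        print(name, ver, env_threadless_compliant(name, ver))
```

Note that `os.name` is `'posix'` on both mac and linux, so this check alone does not distinguish between them.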
1 change: 1 addition & 0 deletions proxy/common/flag.py
@@ -270,6 +270,7 @@ def initialize(
         )
         args.timeout = cast(int, opts.get('timeout', args.timeout))
         args.threadless = cast(bool, opts.get('threadless', args.threadless))
+        args.threaded = cast(bool, opts.get('threaded', args.threaded))
         args.enable_events = cast(
             bool,
             opts.get(
13 changes: 11 additions & 2 deletions proxy/common/utils.py
@@ -20,17 +20,26 @@
 from types import TracebackType
 from typing import Optional, Dict, Any, List, Tuple, Type, Callable

-from .constants import HTTP_1_1, COLON, WHITESPACE, CRLF, DEFAULT_TIMEOUT
+from .constants import HTTP_1_1, COLON, WHITESPACE, CRLF, DEFAULT_TIMEOUT, DEFAULT_THREADLESS

 if os.name != 'nt':
     import resource

 logger = logging.getLogger(__name__)


+def is_threadless(threadless: bool, threaded: bool) -> bool:
+    # if default is threadless then return true unless
+    # user has overridden mode using threaded flag.
+    #
+    # if default is not threadless then return true
+    # only if user has overridden using --threadless flag
+    return (DEFAULT_THREADLESS and not threaded) or (not DEFAULT_THREADLESS and threadless)
+
+
 def is_py2() -> bool:
     """Exists only to avoid mocking sys.version_info in tests."""
-    return sys.version_info[0] == 2
+    return sys.version_info.major == 2


 def text_(s: Any, encoding: str = 'utf-8', errors: str = 'strict') -> Any:
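
The boolean expression in `is_threadless` is easier to check as a truth table. The sketch below assumes a hypothetical `resolve_threadless` helper that takes the platform default as an argument instead of reading the module-level `DEFAULT_THREADLESS`, so that every combination can be enumerated on one machine.

```python
from itertools import product


def resolve_threadless(default_threadless: bool, threadless: bool, threaded: bool) -> bool:
    """Hypothetical variant of is_threadless with the default passed in."""
    return (default_threadless and not threaded) or (not default_threadless and threadless)


if __name__ == '__main__':
    # Enumerate every combination of platform default and flag values.
    for default, threadless, threaded in product([True, False], repeat=3):
        mode = 'threadless' if resolve_threadless(default, threadless, threaded) else 'threaded'
        print(f'default={default!s:5} threadless={threadless!s:5} threaded={threaded!s:5} -> {mode}')
```

Under this logic, when the default is threadless only `threaded` affects the result, and when the default is threaded only `threadless` does.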
21 changes: 17 additions & 4 deletions proxy/core/acceptor/acceptor.py
@@ -27,6 +27,7 @@
 from ..event import EventQueue, eventNames
 from ...common.constants import DEFAULT_THREADLESS
 from ...common.flag import flags
+from ...common.utils import is_threadless
 from ...common.logger import Logger

 logger = logging.getLogger(__name__)
@@ -36,7 +37,19 @@
     '--threadless',
     action='store_true',
     default=DEFAULT_THREADLESS,
-    help='Default: False. When disabled a new thread is spawned '
+    help='Default: ' + ('True' if DEFAULT_THREADLESS else 'False') + '. ' +
+    'Enabled by default on Python 3.8+ (mac, linux). ' +
+    'When disabled a new thread is spawned '
     'to handle each client connection.',
 )

+flags.add_argument(
+    '--threaded',
+    action='store_true',
+    default=not DEFAULT_THREADLESS,
+    help='Default: ' + ('True' if not DEFAULT_THREADLESS else 'False') + '. ' +
+    'Disabled by default on Python < 3.8 and windows. ' +
+    'When enabled a new thread is spawned '
+    'to handle each client connection.',
+)
+
@@ -150,7 +163,7 @@ def run_once(self) -> None:
         conn, addr = self.sock.accept()
         addr = None if addr == '' else addr
         if (
-            self.flags.threadless and
+            is_threadless(self.flags.threadless, self.flags.threaded) and
             self.threadless_client_queue and
             self.threadless_process
         ):
@@ -173,15 +186,15 @@ def run(self) -> None:
         )
         try:
             self.selector.register(self.sock, selectors.EVENT_READ)
-            if self.flags.threadless:
+            if is_threadless(self.flags.threadless, self.flags.threaded):
                 self.start_threadless_process()
             while not self.running.is_set():
                 self.run_once()
         except KeyboardInterrupt:
             pass
         finally:
             self.selector.unregister(self.sock)
-            if self.flags.threadless:
+            if is_threadless(self.flags.threadless, self.flags.threaded):
                 self.shutdown_threadless_process()
             self.sock.close()
             logger.debug('Acceptor#%d shutdown', self.idd)
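
Both flags use `action='store_true'`, so a flag whose default is already `True` cannot be switched off from the command line; that is presumably why the acceptor consults both values rather than `self.flags.threadless` alone. A minimal sketch of the effect, assuming a hypothetical `build_parser` helper (not part of `proxy.py`) that takes the platform default as an argument:

```python
import argparse


def build_parser(default_threadless: bool) -> argparse.ArgumentParser:
    """Hypothetical helper mirroring the two flag definitions in the diff."""
    parser = argparse.ArgumentParser(prog='proxy-sketch')
    parser.add_argument('--threadless', action='store_true', default=default_threadless)
    parser.add_argument('--threaded', action='store_true', default=not default_threadless)
    return parser


if __name__ == '__main__':
    # On a platform where threadless is the default, passing --threaded
    # leaves threadless at its default of True, so both end up True.
    args = build_parser(default_threadless=True).parse_args(['--threaded'])
    print(args.threadless, args.threaded)
```

Since both attributes can be `True` at once, reading either flag in isolation is not enough to decide the mode.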
4 changes: 2 additions & 2 deletions proxy/core/event/manager.py
@@ -75,10 +75,10 @@ def setup(self) -> None:
         logger.debug('Thread ID: %d', self.dispatcher_thread.ident)

     def shutdown(self) -> None:
-        assert self.dispatcher_shutdown
-        assert self.dispatcher_thread
+        assert self.dispatcher_shutdown and self.dispatcher_thread and self.manager
         self.dispatcher_shutdown.set()
         self.dispatcher_thread.join()
+        self.manager.shutdown()
         logger.debug(
             'Shutdown of global event dispatcher thread %d successful',
             self.dispatcher_thread.ident,
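
The explicit `manager.shutdown()` call added in this hunk can be illustrated in isolation. This is a minimal standard-library sketch, not `proxy.py` code; `roundtrip` is a hypothetical helper name.

```python
import multiprocessing
from typing import Any


def roundtrip(value: Any) -> Any:
    """Pass a value through a manager-backed queue, then stop the manager."""
    # Manager() returns an already started SyncManager backed by a server process.
    manager = multiprocessing.Manager()
    try:
        queue = manager.Queue()
        queue.put(value)
        return queue.get()
    finally:
        # Stop the manager's server process explicitly rather than
        # relying on garbage collection or interpreter exit.
        manager.shutdown()


if __name__ == '__main__':
    print(roundtrip('event'))
```

Every `put` and `get` here goes through the manager's server process, which is the inter-process exchange cost the commit message refers to.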