Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MOD-3841]: run web server in separate thread #2255

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

kramstrom
Copy link
Contributor

@kramstrom kramstrom commented Sep 20, 2024

Describe your changes

Allows blocking calls in web_server decorated functions by running the user callable in a thread so you can do something like:

import modal

image = modal.Image.debian_slim().pip_install("lavague-core", "lavague-server").env({"LAVAGUE_TELEMETRY": "NONE"})
app = modal.App("lavague-server", image=image)


@app.function()
@modal.web_server(port=8000)
def websocket_app():
    from lavague.core import ActionEngine, WorldModel
    from lavague.core.agents import WebAgent
    from lavague.server import AgentSession
    from lavague.server.base import AgentServer
    from lavague.server.driver import DriverServer

    def create_agent(session: AgentSession):
        world_model = WorldModel()
        driver = DriverServer(session)
        action_engine = ActionEngine(driver)
        return WebAgent(world_model, action_engine)

    server = AgentServer(create_agent, port=8000)
    server.serve()

instead of having to run it in a thread, since it can be confusing that the web_server decorator currently expects users to run everything in a background thread

Backward/forward compatibility checks

Check these boxes or delete any item (or this section) if not relevant for this PR.

  • Client+Server: this change is compatible with old servers
  • Client forward compatibility: this change ensures client can accept data intended for later versions of itself

Note on protobuf: protobuf message changes in one place may have impact to
multiple entities (client, server, worker, database). See points above.


Changelog

Changes the modal.web_server decorator to run the user defined function in a thread instead of forcing users to handle this themselves. Previously you would have to do something like:

@app.function()
@web_server(8000)
def server():
    import subprocess

    self.subprocess = subprocess.Popen(["python", "-m", "http.server", "-b", "0.0.0.0", "8000"])

but now we also support the blocking alternative:

@app.function()
@web_server(8000)
def blocking_web_server():
    from http.server import HTTPServer, SimpleHTTPRequestHandler

    httpd = HTTPServer(("0.0.0.0", 8000), SimpleHTTPRequestHandler)
    httpd.serve_forever()

This also changes the startup_timeout functionality which previously would start at the end of the function but will now start at the function beginning which may result in a need to increase the timeout.

@ekzhang
Copy link
Member

ekzhang commented Sep 20, 2024

If you wanted to use uvicorn.run(), couldn't you just use asgi_app in that case?

Right now web_server is designed to wait until the function finishes executing before it starts polling for the server to start up, this is more flexible than always running the code in a thread and also will propagate initialization errors sooner.

This is also a breaking change since the startup_timeout would start at function begin, not at function end.

@kramstrom
Copy link
Contributor Author

kramstrom commented Sep 20, 2024

If you wanted to use uvicorn.run(), couldn't you just use asgi_app in that case?

Right now web_server is designed to wait until the function finishes executing before it starts polling for the server to start up, this is more flexible than always running the code in a thread and also will propagate initialization errors sooner.

This is also a breaking change since the startup_timeout would start at function begin, not at function end.

Yeah the uvicorn was just an example of something that will block the thread. I think the reasoning for running the function in a thread is that it can be a footgun for users that their web_server decorated function can be blocking and never therefore never return without indicating clearly why

@ekzhang
Copy link
Member

ekzhang commented Sep 20, 2024

Yeah the uvicorn was just an example of something that will block the thread.

Right, I'm just saying that if the concern is ergonomics in eliminating one line of code (Thread().start()), we should design based on examples that exist rather than theoretical examples.

@kramstrom
Copy link
Contributor Author

kramstrom commented Sep 23, 2024

Yeah the uvicorn was just an example of something that will block the thread.

Right, I'm just saying that if the concern is ergonomics in eliminating one line of code (Thread().start()), we should design based on examples that exist rather than theoretical examples.

Yup, that's totally fair, updated the example now to a real user issue that prompted this

@aksh-at
Copy link
Contributor

aksh-at commented Jan 7, 2025

Do we want to merge this? The issue isn't eliminating one line of code for the user, it's that users don't realize they are supposed to launch something in the background. It's a bit of a footgun.

@kramstrom kramstrom requested a review from freider January 8, 2025 07:20
@kramstrom kramstrom marked this pull request as ready for review January 8, 2025 07:21
@freider
Copy link
Contributor

freider commented Jan 8, 2025

Can we add a small test?

I agree this is valuable and prevents a common footgun. Since this is a slightly breaking change (startup_timeout, potential threading issues) we should definitely add a changelog entry about it. Maybe add a minor version bump as well?

@@ -76,7 +77,8 @@ def construct_webhook_callable(

elif webhook_config.type == api_pb2.WEBHOOK_TYPE_WEB_SERVER:
# Function spawns an HTTP web server listening at a port.
user_defined_callable()
server_thread = threading.Thread(target=user_defined_callable, name="web_server", daemon=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat separate from this (but made more apparent here), but it bothers me a bit that there is no clear graceful way to shut down a web server launched this way on container shutdown. (Assigning some variable to a class and adding an exit lifecycle method would work I suppose, but that's not symmetric to the launch itself)

@kramstrom kramstrom requested a review from freider January 8, 2025 14:00
…rom/mod-3841-run-web_server-function-in-a-separate-thread-in-case-user
@mwaskom
Copy link
Contributor

mwaskom commented Jan 8, 2025

Should the changelog call out the new timeout semantics?

@ekzhang
Copy link
Member

ekzhang commented Jan 8, 2025

Do we want to merge this? The issue isn't eliminating one line of code for the user, it's that users don't realize they are supposed to launch something in the background. It's a bit of a footgun.

Yeah I understand that, but it's pretty quick to notice in development the issue and how to support it. I'd argue that the alternative is less flexible and could lead to subtle bugs in production.

Developers already seem to set quite high startup timeouts, several minutes or more. With this change it would:

  1. Cause exceptions at process startup to not be caught until after the startup timeout, making the container idle for that time
  2. the startup timeout would also need to be extended further to include any data loading step in the function

Would also like to implement a healthcheck / liveness loop and Elias said, which could be returned from this function. The current API is built to be forward compatible with this in mind.

@ekzhang
Copy link
Member

ekzhang commented Jan 8, 2025

Also just wanted to mention this change is much harder to reverse than to apply, which makes it worth thinking about forward compatibility with a better API to support production workloads imo

…rom/mod-3841-run-web_server-function-in-a-separate-thread-in-case-user
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants