Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add REST API for remote functions RFC #25

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

tdcmeehan
Copy link
Contributor

No description provided.

@tdcmeehan tdcmeehan marked this pull request as draft August 15, 2024 12:24
@tdcmeehan tdcmeehan force-pushed the functions branch 3 times, most recently from 1a17d1a to 35b6047 Compare August 15, 2024 19:56
@tdcmeehan tdcmeehan marked this pull request as ready for review August 27, 2024 19:17
@tdcmeehan tdcmeehan requested a review from aditi-pandit August 27, 2024 19:17
@prithvip
Copy link

"This RFC proposes to extend the
remote function design to allow for dynamic function registration and execution."

Don't we already support dynamically registered functions today? We can do CREATE FUNCTION and it will create the function in the function namespace manager, which could create it in Metastore.

@prithvip
Copy link

Why is REST API like GET /v1/functions necessary? Since we can get this information already through "SHOW FUNCTIONS" today?

@tdcmeehan
Copy link
Contributor Author

Don't we already support dynamically registered functions today? We can do CREATE FUNCTION and it will create the function in the function namespace manager, which could create it in Metastore.

Let me clarify, while we can CREATE FUNCTION, this takes a SQL expression. Our SQL grammer permits the addition of an identifier, but we don't use it, so effectively we can't create an externally defined function (like a JAR or container), it is presumed that this is externally managed. So the goal is to enable that use case, and also provide a reference implementation that shows how that works end to end.

Why is REST API like GET /v1/functions necessary? Since we can get this information already through "SHOW FUNCTIONS" today?

The purpose of a REST spec is to allow for a straightfoward implementation of function namespace managers. Currently, Presto does not have an opinion on the format of the remote function server. This adds an implementation of a function namespace manager that interacts with this REST spec, a reference implementation of this REST spec that can be used as is or for testing or to enable Java SPI-defined functions for C++ clusters, and integration with Prestissimo so that C++ clusters can make calls to the function server.

Copy link

@aditi-pandit aditi-pandit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tdcmeehan : Had a bunch of comments.

### Dynamic functions in remote function servers

A new REST API is defined, along with a REST plugin implementation, which allows for consistent and unified metadata and execution
of remote functions in a single API definition. This API is designed to be extensible, allowing for the definition of new functions

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean dynamic here instead of extensible ? Can we presume that new functions can be defined and used without a server restart ?


Fundamentally, the design of the REST API for remote functions will be based on the existing `FunctionNamespaceManager` interface SPI.
Additionally, for Presto C++, currently there is no corresponding SPI for function namespace managers. This RFC proposes to
create a new REST-based implementation of the function execution framework. By creating a REST API for remote functions that

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you evaluate REST vs gRPC ?


#### Presto C++ special considerations

The current Presto C++ implementation does not have a `FunctionNamespaceManager` SPI. This RFC proposes to extend the Velox

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the function servers configured in a static list or is there a dynamic discovery of function servers ? Will the function server need to announce itself to a discovery service ?

>
> Response body: JSON array of function metadata objects

Returns the complete listing of functions in the specified schema with the specified function name.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should return the function Ids as well so that they can be used in subsequent calls ?

>
> Request body: JSON object representing the function to be added
>
> Response body: the function ID of the newly created function

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be great to give more details about functionId. functionId is local to the function server. Will Presto use it at all ?


Returns the complete listing of functions in the specified schema with the specified function name.

#### Add a function

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you envision this being used ? PrestoSQL could expose CREATE FUNCTION that uses this API. But then how do you give access to the function body ?

How do you make this work across function implementations in different languages ?

specific function, which is useful to differentiate multiple functions which share the same name but have different
arguments.

#### Update a function

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the e2e flow of interactions in which this API is used ?

#### Presto C++ special considerations

The current Presto C++ implementation does not have a `FunctionNamespaceManager` SPI. This RFC proposes to extend the Velox
function server to support the REST API for remote functions. This will allow for the definition of functions at runtime in

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function server and Presto server should have independent lifetimes. If the user wanted to upgrade their function servers, then how does that proceed ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants