Adds deployment configuration for extproc #98
Changes from 1 commit
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,6 +2,7 @@ package v1alpha1 | |
|
||
import ( | ||
egv1a1 "github.com/envoyproxy/gateway/api/v1alpha1" | ||
corev1 "k8s.io/api/core/v1" | ||
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" | ||
gwapiv1 "sigs.k8s.io/gateway-api/apis/v1" | ||
gwapiv1a2 "sigs.k8s.io/gateway-api/apis/v1alpha2" | ||
|
@@ -57,7 +58,8 @@ type LLMRouteSpec struct { | |
// Each rule is a subset of the HTTPRoute in the Gateway API (https://gateway-api.sigs.k8s.io/api-types/httproute/). | ||
// | ||
// AI Gateway controller will generate a HTTPRoute based on the configuration given here with the additional | ||
// modifications to achieve the necessary jobs, notably inserting the AI Gateway external processor filter. | ||
// modifications to achieve the necessary jobs, notably inserting the AI Gateway filter responsible for | ||
// the transformation of the request and response, etc. | ||
// | ||
// In the matching conditions in the LLMRouteRule, `x-envoy-ai-gateway-model` header is available | ||
// if we want to describe the routing behavior based on the model name. The model name is extracted | ||
|
@@ -69,6 +71,14 @@ type LLMRouteSpec struct { | |
// +kubebuilder:validation:Required | ||
// +kubebuilder:validation:MaxItems=128 | ||
Rules []LLMRouteRule `json:"rules"` | ||
// FilterConfig is the configuration for the AI Gateway filter inserted in the generated HTTPRoute. | ||
// | ||
// An AI Gateway filter is responsible for the transformation of the request and response | ||
// as well as the routing behavior based on the model name extracted from the request content, etc. | ||
// | ||
// Currently, the filter is only implemented as an external process filter, which might be | ||
// extended to other types of filters in the future. See https://github.com/envoyproxy/ai-gateway/issues/90 | ||
FilterConfig *LLMRouteFilterConfig `json:"filterConfig,omitempty"` | ||
} | ||
|
||
// LLMRouteRule is a rule that defines the routing behavior of the LLMRoute. | ||
|
@@ -122,6 +132,45 @@ type LLMRouteRuleMatch struct { | |
Headers []gwapiv1.HTTPHeaderMatch `json:"headers,omitempty"` | ||
} | ||
|
||
type LLMRouteFilterConfig struct { | ||
// Type specifies the type of the filter configuration. | ||
// | ||
// Currently, only ExternalProcess is supported, and it is the default. | ||
// | ||
// +kubebuilder:default=ExternalProcess | ||
Type LLMRouteFilterConfigType `json:"type"` | ||
|
||
// ExternalProcess is the configuration for the external process filter. | ||
// This is optional, and if not set, the default values of Deployment spec will be used. | ||
// | ||
// +optional | ||
ExternalProcess *LLMRouteFilterConfigExternalProcess `json:"externalProcess,omitempty"` | ||
} | ||
|
||
// LLMRouteFilterConfigType specifies the type of the filter configuration. | ||
// | ||
// +kubebuilder:validation:Enum=ExternalProcess;DynamicModule | ||
type LLMRouteFilterConfigType string | ||
|
||
const ( | ||
LLMRouteFilterConfigTypeExternalProcess LLMRouteFilterConfigType = "ExternalProcess" | ||
LLMRouteFilterConfigTypeDynamicModule LLMRouteFilterConfigType = "DynamicModule" // Reserved for https://github.com/envoyproxy/ai-gateway/issues/90 | ||
) | ||
|
||
type LLMRouteFilterConfigExternalProcess struct { | ||
// Replicas is the number of desired pods of the external process deployment. | ||
// | ||
// +optional | ||
Replicas *int32 `json:"replicas,omitempty"` | ||
// Resources required by the external process container. | ||
// More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ | ||
// | ||
// +optional | ||
Resources *corev1.ResourceRequirements `json:"resources,omitempty"` | ||
Comment on lines +161 to +169:
I could have used either an embedded appv1.DeploymentSpec or EG's KubernetesDeploymentSpec, but both seem too complex for this purpose, on top of the implementation cost. We can add additional fields on demand.
||
// TODO: maybe add an option not to deploy the external process filter and let the user deploy it manually? | ||
// Not sure if it is worth it as we are migrating to dynamic modules. | ||
} | ||
|
||
// +kubebuilder:object:root=true | ||
|
||
// LLMBackend is a resource that represents a single backend for LLMRoute. | ||
|
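To make the serialized shape of the new field concrete, here is a minimal sketch of how `filterConfig` would appear in JSON given the struct tags in the diff above. The types are local, simplified stand-ins (not the real `v1alpha1` package); `Rules` and `Resources` are left out, and the names `RouteSpec`, `FilterConfig`, and `toJSON` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified stand-ins mirroring the JSON tags shown in the diff.
// These are illustrative only and are not the actual API types.
type FilterConfigExternalProcess struct {
	Replicas *int32 `json:"replicas,omitempty"`
}

type FilterConfig struct {
	Type            string                       `json:"type"`
	ExternalProcess *FilterConfigExternalProcess `json:"externalProcess,omitempty"`
}

type RouteSpec struct {
	// Rules is omitted here to keep the sketch short.
	FilterConfig *FilterConfig `json:"filterConfig,omitempty"`
}

// toJSON is a hypothetical helper that marshals a spec to a compact string.
func toJSON(spec RouteSpec) string {
	b, err := json.Marshal(spec)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// With no FilterConfig set, omitempty drops the field entirely.
	fmt.Println(toJSON(RouteSpec{}))

	// With an explicit external process configuration.
	replicas := int32(2)
	fmt.Println(toJSON(RouteSpec{
		FilterConfig: &FilterConfig{
			Type:            "ExternalProcess",
			ExternalProcess: &FilterConfigExternalProcess{Replicas: &replicas},
		},
	}))
}
```

Because `filterConfig` is a pointer with `omitempty`, existing LLMRoute manifests that do not mention it keep serializing as before.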
This indirection will help us deprecate and keep the API clean before/after extproc removal per #90
fyi I will work on the compatibility policy doc next
Also, this could be AIGatewayFilterConfig per #76.
right, let's change it all together.
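
As a rough illustration of the `Type` indirection discussed in this thread, the sketch below shows how a consumer of the config might branch on the filter type. It uses local stand-in types with the same names as in the diff (with `Resources` omitted). The helper `describeFilter` is hypothetical, and the fallback to `ExternalProcess` when the config is nil or `Type` is empty is an assumption about controller behavior, not something this PR confirms.

```go
package main

import "fmt"

// Local stand-ins mirroring the types in the diff; Resources is omitted.
type LLMRouteFilterConfigType string

const (
	LLMRouteFilterConfigTypeExternalProcess LLMRouteFilterConfigType = "ExternalProcess"
	LLMRouteFilterConfigTypeDynamicModule   LLMRouteFilterConfigType = "DynamicModule"
)

type LLMRouteFilterConfigExternalProcess struct {
	Replicas *int32
}

type LLMRouteFilterConfig struct {
	Type            LLMRouteFilterConfigType
	ExternalProcess *LLMRouteFilterConfigExternalProcess
}

// describeFilter is a hypothetical helper that branches on the filter type.
// Assumption: a nil config or an empty Type falls back to ExternalProcess.
func describeFilter(cfg *LLMRouteFilterConfig) (string, error) {
	t := LLMRouteFilterConfigTypeExternalProcess
	if cfg != nil && cfg.Type != "" {
		t = cfg.Type
	}
	switch t {
	case LLMRouteFilterConfigTypeExternalProcess:
		replicas := "deployment default"
		if cfg != nil && cfg.ExternalProcess != nil && cfg.ExternalProcess.Replicas != nil {
			replicas = fmt.Sprintf("%d", *cfg.ExternalProcess.Replicas)
		}
		return "external process, replicas: " + replicas, nil
	case LLMRouteFilterConfigTypeDynamicModule:
		// Reserved in the enum but not implemented in this sketch.
		return "", fmt.Errorf("filter type %q is reserved and not implemented", t)
	default:
		return "", fmt.Errorf("unknown filter type %q", t)
	}
}

func main() {
	three := int32(3)
	configs := []*LLMRouteFilterConfig{
		nil,
		{
			Type:            LLMRouteFilterConfigTypeExternalProcess,
			ExternalProcess: &LLMRouteFilterConfigExternalProcess{Replicas: &three},
		},
		{Type: LLMRouteFilterConfigTypeDynamicModule},
	}
	for _, cfg := range configs {
		desc, err := describeFilter(cfg)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(desc)
	}
}
```

Keeping the per-type settings behind a discriminator means a future filter type can get its own optional sub-struct without changing the fields used by existing routes.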