Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Host Environment Variable Forwarding #4842

Merged
merged 2 commits into from
Feb 9, 2025

Conversation

wdbaruni
Copy link
Member

@wdbaruni wdbaruni commented Feb 9, 2025

This PR adds the ability to securely forward host environment variables to job executions. This enables passing credentials and secrets from the host to jobs through a controlled allowlist mechanism.

Key Changes

  • Added support for referencing host environment variables using env: prefix
  • Implemented allowlist-based security controls at the compute node level
  • Added early validation in bid strategy to fail fast when jobs request non-allowlisted variables

Usage Example

# Job specification
Tasks:
  - Name: main
    Env:
      API_KEY: "env:API_KEY"     # Forward host's API_KEY
      LOG_PATH: "/logs"          # Regular literal value
    Engine:
      Type: docker
      Params:
        Image: ubuntu:latest

# Compute node configuration
compute:
  env:
    allowlist:
      - "API_*"    # Allow forwarding of any env var starting with API_

Security Design

  • Host variables must be explicitly allowlisted using patterns (e.g., API_*)
  • Jobs must explicitly request variables using env: prefix
  • Early validation during bid phase prevents scheduling jobs that request non-allowlisted variables
  • Creates clear audit trail of which credentials each job requested

Future Work

The architecture introduced in this PR is designed to be extensible to support secret management systems. The env: prefix pattern will evolve to support additional sources like:

Tasks:
  - Name: main
    Env:
      API_KEY: "vault:secrets/api-key"    # HashiCorp Vault
      DB_PASS: "aws:prod/db/password"     # AWS Secrets Manager
      CERT: "azure:certificates/prod"      # Azure Key Vault

This foundation enables:

  • Integration with popular secret vaults and cloud provider secret managers
  • Dynamic credential generation and rotation
  • More granular access control patterns

The key difference from the previous implementation is that we now support referencing host environment variables (prefixed with env:) in addition to literal values. This is implemented with security in mind - only explicitly allowlisted patterns can be forwarded, and jobs must declare which variables they need for audit purposes.

Summary by CodeRabbit

  • New Features

    • Introduced dynamic environment variable resolution for jobs and compute nodes, allowing secure usage of host variables through customizable allow lists.
    • Integrated environment variable handling into job execution and bidding workflows for more flexible configurations.
  • Documentation

    • Updated API specifications and schema descriptions to clarify how environment variables can be configured, including support for direct values and host references.

Copy link

linear bot commented Feb 9, 2025

Copy link
Contributor

coderabbitai bot commented Feb 9, 2025

Walkthrough

This update introduces comprehensive environment variable resolution across multiple components. New resolver strategies, interfaces, and types have been added to validate and resolve environment variables using configurable allowlists. The changes span bid strategies, executor preparation, compute node instantiation, configuration, API documentation, and test suites. In addition, existing models and tests have been updated to use a structured representation for environment variables, improving error handling and validation throughout the system.

Changes

File(s) Changed Change Summary
pkg/bidstrategy/semantic/env_resolver.go
pkg/bidstrategy/semantic/env_resolver_test.go
Adds EnvResolverStrategy implementing bid resolution logic based on environment variable resolution with corresponding tests.
pkg/compute/env/constants.go
pkg/compute/env/errors.go
pkg/compute/env/host_resolver.go
pkg/compute/env/host_resolver_test.go
pkg/compute/env/resolver_map.go
pkg/compute/env/resolver_map_test.go
Introduces new env package components: constants, error functions, HostResolver, ResolverMap, and associated tests for validating allowed environment variables and their resolution.
pkg/compute/envvars.go
pkg/compute/envvars_test.go
Implements ResolveEnvVars and updates GetExecutionEnvVars to accept an EnvVarResolver, with expanded test cases for error handling and variable resolution.
pkg/compute/executor.go Integrates EnvVarResolver into BaseExecutor by adding fields and updating PrepareRunArguments, allowing for external resolution of environment variables.
pkg/compute/types.go Introduces the EnvVarResolver interface with methods for validation and value retrieval of environment variables.
pkg/config/defaults.go
pkg/config/types/compute.go
pkg/config/types/generated_constants.go
pkg/config/types/generated_descriptions.go
Adds Env configuration fields and allowlists; updates logging levels and update intervals; and defines new config keys and descriptions for environment variables.
pkg/executor/docker/executor_test.go Updates test cases by changing the taskEnv type from map[string]string to map[string]models.EnvVarValue.
pkg/lib/envvar/utils.go Adds the FromEnvVarValues function to convert a map of models.EnvVarValue to a map of strings.
pkg/models/env_var.go
pkg/models/env_var_test.go
pkg/models/task.go
pkg/models/task_test.go
Introduces the EnvVarValue type with validation and conversion functions; updates Task models and tests to use EnvVarValue instead of plain strings for environment variables.
pkg/node/compute.go Integrates environment variable resolvers into compute node creation and bidder instantiation, updating function signatures to pass the resolver.
pkg/swagger/docs.go
pkg/swagger/swagger.json
webui/lib/api/schema/swagger.json
Updates API documentation to reflect new environment variable configurations, allowing both literal and host-derived values in tasks and compute nodes.
pkg/test/devstack/env_resolver_test.go
pkg/test/executor/test_runner.go
Introduces new test suites for environment resolver functionality in devstack and integrates the envResolver in the test runner for preparing run arguments.

Sequence Diagram(s)

sequenceDiagram
  actor Exec as Executor
  participant BE as BaseExecutor
  participant GEV as GetExecutionEnvVars
  participant REV as ResolveEnvVars
  participant R as ResolverMap/HostResolver

  Exec ->> BE: Start execution run
  BE ->> GEV: Call GetExecutionEnvVars(envResolver)
  GEV ->> REV: Request resolution for each environment variable
  REV ->> R: Validate variable (if prefix present)
  R -->> REV: Return resolved value or error
  REV -->> GEV: Return complete env map
  GEV -->> BE: Provide resolved environment variables
  BE -->> Exec: Complete run preparation
Loading
sequenceDiagram
  actor Bidder as Bidder
  participant ES as EnvResolverStrategy
  participant EVR as EnvVarResolver

  Bidder ->> ES: ShouldBid(context, request)
  ES ->> EVR: Validate each requested environment variable
  alt All variables valid
    EVR -->> ES: Valid response for each variable
    ES -->> Bidder: Return positive bid response
  else At least one variable invalid
    EVR -->> ES: Return error details
    ES -->> Bidder: Return negative bid response with error
  end
Loading

Poem

I'm a bunny in the code, hopping with delight,
Resolving env vars from morning till night.
With checks and validations, I leap so high,
Allowing only the best values to fly.
CodeRabbit Inc. made these changes shine—
A hop, a skip, and all systems align! 🐇💻

Warning

There were issues while running some tools. Please review the errors and either fix the tool’s configuration or disable the tool if it’s a critical failure.

🔧 golangci-lint (1.62.2)

{"Issues":[{"FromLinter":"typecheck","Text":"pattern build/: no matching files found","Severity":"","SourceLines":["//go:embed build/"],"Replacement":null,"Pos":{"Filename":"webui/webui.go","Offset":0,"Line":23,"Column":12},"ExpectNoLint":false,"ExpectedNoLintLinter":""}],"Report":{"Linters":[{"Name":"asasalint"},{"Name":"asciicheck"},{"Name":"bidichk"},{"Name":"bodyclose","Enabled":true},{"Name":"canonicalheader"},{"Name":"containedctx"},{"Name":"contextcheck"},{"Name":"copyloopvar","Enabled":true},{"Name":"cyclop"},{"Name":"decorder"},{"Name":"deadcode"},{"Name":"depguard","Enabled":true},{"Name":"dogsled","Enabled":true},{"Name":"dupl"},{"Name":"dupword"},{"Name":"durationcheck"},{"Name":"errcheck","Enabled":true,"EnabledByDefault":true},{"Name":"errchkjson"},{"Name":"errname"},{"Name":"errorlint"},{"Name":"execinquery"},{"Name":"exhaustive"},{"Name":"exhaustivestruct"},{"Name":"exhaustruct"},{"Name":"exportloopref"},{"Name":"forbidigo","Enabled":true},{"Name":"forcetypeassert"},{"Name":"fatcontext"},{"Name":"funlen","Enabled":true},{"Name":"gci"},{"Name":"ginkgolinter"},{"Name":"gocheckcompilerdirectives"},{"Name":"gochecknoglobals"},{"Name":"gochecknoinits","Enabled":true},{"Name":"gochecksumtype"},{"Name":"gocognit"},{"Name":"goconst","Enabled":true},{"Name":"gocritic"},{"Name":"gocyclo","Enabled":true},{"Name":"godot"},{"Name":"godox"},{"Name":"err113"},{"Name":"gofmt","Enabled":true},{"Name":"gofumpt"},{"Name":"goheader"},{"Name":"goimports","Enabled":true},{"Name":"golint"},{"Name":"mnd","Enabled":true},{"Name":"gomnd"},{"Name":"gomoddirectives"},{"Name":"gomodguard"},{"Name":"goprintffuncname","Enabled":true},{"Name":"gosec","Enabled":true},{"Name":"gosimple","Enabled":true,"EnabledByDefault":true},{"Name":"gosmopolitan"},{"Name":"govet","Enabled":true,"EnabledByDefault":true},{"Name":"grouper"},{"Name":"ifshort"},{"Name":"iface"},{"Name":"importas"},{"Name":"inamedparam"},{"Name":"ineffassign","Enabled":true,"EnabledByDefault":true},{"Name":"interfacebloat"},{"Name":"interfacer"},{"Name":"intrange"},{"Name":"ireturn"},{"Name":"lll","Enabled":true},{"Name":"loggercheck"},{"Name":"maintidx"},{"Name":"makezero"},{"Name":"maligned"},{"Name":"mirror"},{"Name":"misspell"},{"Name":"musttag"},{"Name":"nakedret","Enabled":true},{"Name":"nestif"},{"Name":"nilerr"},{"Name":"nilnil"},{"Name":"nlreturn"},{"Name":"noctx","Enabled":true},{"Name":"nonamedreturns"},{"Name":"nosnakecase"},{"Name":"nosprintfhostport"},{"Name":"paralleltest"},{"Name":"perfsprint"},{"Name":"prealloc"},{"Name":"predeclared"},{"Name":"promlinter"},{"Name":"protogetter"},{"Name":"reassign"},{"Name":"recvcheck"},{"Name":"revive"},{"Name":"rowserrcheck"},{"Name":"sloglint"},{"Name":"scopelint"},{"Name":"sqlclosecheck"},{"Name":"spancheck"},{"Name":"staticcheck","Enabled":true,"EnabledByDefault":true},{"Name":"structcheck"},{"Name":"stylecheck","Enabled":true},{"Name":"tagalign"},{"Name":"tagliatelle"},{"Name":"tenv"},{"Name":"testableexamples"},{"Name":"testifylint"},{"Name":"testpackage"},{"Name":"thelper"},{"Name":"tparallel"},{"Name":"typecheck","Enabled":true,"EnabledByDefault":true},{"Name":"unconvert","Enabled":true},{"Name":"unparam"},{"Name":"unused","Enabled":true,"EnabledByDefault":true},{"Name":"usestdlibvars"},{"Name":"varcheck"},{"Name":"varnamelen"},{"Name":"wastedassign"},{"Name":"whitespace","Enabled":true},{"Name":"wrapcheck"},{"Name":"wsl"},{"Name":"zerologlint"},{"Name":"nolintlint","Enabled":true}]}}

✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (21)
pkg/compute/envvars.go (2)

12-25: Consider returning partial resolutions or extended error details for simpler debugging.
Currently, when a single environment variable fails to resolve, the entire function returns an error without providing partial results or naming the failing key. This might be acceptable in many scenarios, but consider whether returning a partially resolved map alongside the failing key could give more context to the caller, aiding debugging or recovery.


29-60: Extract system environment variable construction into a dedicated helper function.
The logic for building system environment variables (lines 45–56) could be placed into its own function to improve readability and testability. This separation would also allow for targeted unit testing of system variable creation.

pkg/compute/env/host_resolver.go (2)

1-44: Validate or log invalid regex patterns at construction time.
In the constructor, consider pre-compiling each allowed pattern to catch regex errors early, rather than silently ignoring them in the matching loop. This ensures you know if a provided pattern is invalid, preventing confusion during resolution.


45-53: Optimize repetitive regex matching for large allowlists.
Currently, each environment variable resolution requires scanning the entire allowlist. If your allowlist or usage grows, it might be more efficient to compile each pattern once and store the compiled regexes, avoiding repeated interpretation.

pkg/bidstrategy/semantic/env_resolver.go (1)

33-50: Enhance error handling and logging in ShouldBid method.

The error handling could be more robust and informative:

  1. Consider adding context about which resolver rejected the variable
  2. Log validation failures for audit purposes
 func (s *EnvResolverStrategy) ShouldBid(
 	ctx context.Context,
 	request bidstrategy.BidStrategyRequest,
 ) (bidstrategy.BidStrategyResponse, error) {
 	if len(request.Job.Task().Env) == 0 {
 		return bidstrategy.NewBidResponse(true, noEnvVarsReason), nil
 	}

+	var validationErrors []string
 	for name, value := range request.Job.Task().Env {
 		if err := s.resolver.Validate(name, string(value)); err != nil {
-			return bidstrategy.NewBidResponse(false, fmt.Sprintf("resolve environment variable %s: %v", name, err)), nil
+			validationErrors = append(validationErrors,
+				fmt.Sprintf("failed to resolve environment variable %s: %v (resolver: %T)", 
+					name, err, s.resolver))
 		}
 	}
+	
+	if len(validationErrors) > 0 {
+		return bidstrategy.NewBidResponse(false, 
+			fmt.Sprintf("environment variable validation failed:\n%s", 
+				strings.Join(validationErrors, "\n"))), nil
+	}

 	return bidstrategy.NewBidResponse(true, canResolveReason), nil
 }
pkg/lib/envvar/utils.go (1)

13-19: Optimize map initialization and add nil safety.

The function could be more efficient with proper capacity initialization and nil handling.

 func FromEnvVarValues(env map[string]models.EnvVarValue) map[string]string {
+	if env == nil {
+		return nil
+	}
-	envMap := make(map[string]string)
+	envMap := make(map[string]string, len(env))
 	for k, v := range env {
 		envMap[k] = string(v)
 	}
 	return envMap
 }
pkg/models/env_var.go (2)

34-37: Consider using regex for name validation.

The current validation logic could be more maintainable using a single regex pattern.

+// Environment variable name pattern: must be uppercase, start with a letter,
+// and contain only letters, numbers, and underscores
+var envVarNamePattern = regexp.MustCompile(`^[A-Z][A-Z0-9_]*$`)

 func ValidateEnvVars(env map[string]EnvVarValue) error {
 	for name, value := range env {
 		if name == "" {
 			return fmt.Errorf("environment variable name cannot be empty")
 		}

-		// Must be uppercase
-		if name != strings.ToUpper(name) {
+		if !envVarNamePattern.MatchString(name) {
 			return fmt.Errorf("environment variable '%s' must be uppercase", name)
 		}

63-74: Add validation option to EnvVarsToStringMap.

The conversion function could optionally validate the environment variables during conversion.

-func EnvVarsToStringMap(env map[string]EnvVarValue) map[string]string {
+func EnvVarsToStringMap(env map[string]EnvVarValue, validate bool) (map[string]string, error) {
 	if env == nil {
-		return nil
+		return nil, nil
+	}
+	
+	if validate {
+		if err := ValidateEnvVars(env); err != nil {
+			return nil, fmt.Errorf("environment variable validation failed: %w", err)
+		}
 	}
+
 	result := make(map[string]string, len(env))
 	for k, v := range env {
 		result[k] = v.String()
 	}
-	return result
+	return result, nil
 }
pkg/compute/types.go (1)

42-49: Enhance interface documentation for clarity.

The interface methods would benefit from more detailed documentation explaining:

  • The expected format of input values (e.g., env: prefix)
  • The error conditions for each method
  • The relationship between Validate and Value methods
 // EnvVarResolver is the interface for resolving environment variable references
 type EnvVarResolver interface {
-	// Validate checks if the value has valid syntax for this resolver
+	// Validate checks if the environment variable reference is allowed by this resolver.
+	// name: The name of the environment variable in the job specification
+	// value: The reference value (e.g., "env:HOST_VAR")
+	// Returns an error if the reference is not allowed or has invalid syntax
 	Validate(name string, value string) error

-	// Value returns the resolved value from the source
+	// Value resolves the environment variable reference to its actual value.
+	// value: The reference value (e.g., "env:HOST_VAR")
+	// Returns the resolved value and an error if:
+	// - The reference is not allowed (should be validated first)
+	// - The referenced variable doesn't exist
+	// - The reference has invalid syntax
 	Value(value string) (string, error)
 }
pkg/bidstrategy/semantic/env_resolver_test.go (1)

20-66: Add test cases for edge cases and pattern matching.

Consider adding test cases for:

  1. Invalid syntax in env var reference (e.g., "env:")
  2. Empty values
  3. Multiple allowed patterns
  4. Case sensitivity in patterns
 	testCases := []struct {
 		name      string
 		env       map[string]models.EnvVarValue
 		allowList []string
 		shouldBid bool
 	}{
+		{
+			name: "invalid env var reference syntax",
+			env: map[string]models.EnvVarValue{
+				"INVALID_VAR": "env:",
+			},
+			allowList: []string{"*"},
+			shouldBid: false,
+		},
+		{
+			name: "empty value",
+			env: map[string]models.EnvVarValue{
+				"EMPTY_VAR": "",
+			},
+			allowList: []string{"*"},
+			shouldBid: true,
+		},
+		{
+			name: "multiple allowed patterns",
+			env: map[string]models.EnvVarValue{
+				"AWS_VAR":   "env:AWS_SECRET",
+				"GCP_VAR":   "env:GCP_SECRET",
+				"AZURE_VAR": "env:AZURE_SECRET",
+			},
+			allowList: []string{"AWS_*", "GCP_*", "AZURE_*"},
+			shouldBid: true,
+		},
+		{
+			name: "case sensitive patterns",
+			env: map[string]models.EnvVarValue{
+				"CASE_VAR": "env:TEST_var",
+			},
+			allowList: []string{"TEST_VAR"},
+			shouldBid: false,
+		},
 		// ... existing test cases ...
 	}
pkg/compute/env/host_resolver_test.go (2)

66-71: Add cleanup for environment variables.

Environment variables set in tests should be cleaned up to prevent test pollution.

 func (s *HostResolverSuite) TestValue() {
+	// Save original environment
+	origEnv := map[string]string{
+		"ALLOWED_VAR": os.Getenv("ALLOWED_VAR"),
+		"TEST_VAR":    os.Getenv("TEST_VAR"),
+		"DENIED_VAR":  os.Getenv("DENIED_VAR"),
+	}
+
+	// Restore environment after test
+	defer func() {
+		for k, v := range origEnv {
+			if v == "" {
+				os.Unsetenv(k)
+			} else {
+				os.Setenv(k, v)
+			}
+		}
+	}()
+
 	// Set test environment variables
 	s.T().Setenv("ALLOWED_VAR", "allowed_value")
 	s.T().Setenv("TEST_VAR", "test_value")
 	s.T().Setenv("DENIED_VAR", "denied_value")

27-64: Use table-driven subtests for validation tests.

The validation tests could be better organized using table-driven subtests with more descriptive test cases.

 func (s *HostResolverSuite) TestValidate() {
 	tests := []struct {
 		name      string
 		varName   string
 		varValue  string
 		shouldErr bool
+		errMsg    string
 	}{
 		{
 			name:      "allowed pattern",
 			varName:   "job_var",
 			varValue:  "ALLOWED_VALUE",
 			shouldErr: false,
 		},
 		{
 			name:      "allowed exact match",
 			varName:   "job_var",
 			varValue:  "TEST_VAR",
 			shouldErr: false,
 		},
 		{
 			name:      "not allowed",
 			varName:   "job_var",
 			varValue:  "DENIED_VALUE",
 			shouldErr: true,
+			errMsg:    "not in allowed patterns",
 		},
+		{
+			name:      "empty variable name",
+			varName:   "",
+			varValue:  "ALLOWED_VALUE",
+			shouldErr: true,
+			errMsg:    "empty variable name",
+		},
+		{
+			name:      "empty variable value",
+			varName:   "job_var",
+			varValue:  "",
+			shouldErr: true,
+			errMsg:    "empty variable value",
+		},
 	}

 	for _, tt := range tests {
 		s.Run(tt.name, func() {
 			err := s.resolver.Validate(tt.varName, tt.varValue)
 			if tt.shouldErr {
 				s.Error(err)
+				if tt.errMsg != "" {
+					s.Contains(err.Error(), tt.errMsg)
+				}
 			} else {
 				s.NoError(err)
 			}
 		})
 	}
 }
pkg/config/types/compute.go (1)

16-18: Enhance configuration documentation.

The environment configuration documentation would benefit from more detailed explanations and examples.

-	// Env specifies environment variable configuration for the compute node
+	// Env specifies environment variable configuration for the compute node.
+	// This configuration controls which host environment variables can be securely
+	// forwarded to jobs. See EnvConfig for details.
 	Env EnvConfig `yaml:"Env,omitempty" json:"Env,omitempty"`

 // EnvConfig specifies environment variable configuration for the compute node
 type EnvConfig struct {
-	// AllowList specifies which host environment variables can be forwarded to jobs.
-	// Supports glob patterns (e.g., "AWS_*", "API_*")
+	// AllowList specifies which host environment variables can be forwarded to jobs
+	// using glob patterns. Only variables matching these patterns can be referenced
+	// in job specifications using the "env:" prefix.
+	//
+	// Examples:
+	// - "AWS_*": Allow all AWS-related variables
+	// - "API_KEY_*": Allow all API key variables
+	// - "GITHUB_TOKEN": Allow a specific variable
+	//
+	// Security Note:
+	// Be cautious when configuring patterns to avoid exposing sensitive credentials.
+	// Use specific patterns instead of overly broad ones like "*".
+	//
+	// Related Configuration:
+	// - Set RequireTLS to true in ComputeTLS for secure variable transmission
 	AllowList []string `yaml:"AllowList,omitempty" json:"AllowList,omitempty"`

Also applies to: 42-47

pkg/test/devstack/env_resolver_test.go (2)

30-71: LGTM! Good test coverage for successful environment variable resolution.

The test effectively verifies that both literal values and host environment variables are correctly resolved and passed to jobs.

Consider adding test cases for:

  • Empty environment variables
  • Variables with special characters
  • Multiple environment variables in the same job

73-108: LGTM! Good security test for denied variables.

The test properly verifies that non-allowlisted environment variables are blocked at the bid phase, which aligns with the PR's security objectives.

Consider adding test cases for:

  • Variables that partially match the allowlist pattern
  • Variables with mixed case in the prefix (e.g., "ENV:var")
  • Empty allowlist configuration
pkg/compute/env/resolver_map_test.go (2)

26-66: LGTM! Good test coverage for value parsing.

The test cases effectively cover various scenarios including literal values, environment references, empty values, and special characters.

Consider adding test cases for:

  • Multiple colons in the value (e.g., "env:prefix:value")
  • Unicode characters in values
  • Very long values

66-136: LGTM! Thorough validation test coverage.

The test cases properly verify environment variable name restrictions and reserved prefixes.

Consider adding test cases for:

  • Maximum length validation
  • Unicode characters in names
  • Names with underscores only
pkg/test/executor/test_runner.go (1)

92-95: Consider making the allowlist pattern configurable.

While using TEST_* is appropriate for testing, consider allowing the pattern to be configurable through the test case configuration to support different testing scenarios.

 // Create a permissive env resolver for testing
 envResolver := env.NewResolver(env.ResolverParams{
-    AllowList: []string{"TEST_*"}, // Allow only TEST_* environment variables to be forwarded
+    AllowList: testCase.GetEnvAllowList(), // Get allowlist from test case config
 })
pkg/models/env_var_test.go (2)

20-64: LGTM! Good serialization test coverage.

The test cases effectively verify JSON marshaling and unmarshaling for various value types.

Consider adding test cases for:

  • JSON escape sequences
  • Very long values
  • Values with newlines and control characters

66-136: LGTM! Thorough validation test coverage.

The test cases properly verify environment variable naming restrictions.

Consider adding test cases for:

  • Names at maximum length boundary
  • Names with only underscores
  • Empty variable names
webui/lib/api/schema/swagger.json (1)

2652-2663: New EnvConfig with AllowList Feature
The newly introduced "types.EnvConfig" defines an AllowList that accepts an array of strings (supporting glob patterns like "AWS_" or "API_"). This provides a clear mechanism for securely specifying which host environment variables can be forwarded to jobs, in line with the security design outlined in the PR objectives. It would be beneficial to ensure that corresponding validations and tests are in place to enforce this allowlist appropriately.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d151162 and cac475f.

⛔ Files ignored due to path filters (2)
  • webui/lib/api/generated/services.gen.ts is excluded by !**/generated/**
  • webui/lib/api/generated/types.gen.ts is excluded by !**/generated/**
📒 Files selected for processing (28)
  • pkg/bidstrategy/semantic/env_resolver.go (1 hunks)
  • pkg/bidstrategy/semantic/env_resolver_test.go (1 hunks)
  • pkg/compute/env/constants.go (1 hunks)
  • pkg/compute/env/errors.go (1 hunks)
  • pkg/compute/env/host_resolver.go (1 hunks)
  • pkg/compute/env/host_resolver_test.go (1 hunks)
  • pkg/compute/env/resolver_map.go (1 hunks)
  • pkg/compute/env/resolver_map_test.go (1 hunks)
  • pkg/compute/envvars.go (2 hunks)
  • pkg/compute/envvars_test.go (4 hunks)
  • pkg/compute/executor.go (7 hunks)
  • pkg/compute/types.go (1 hunks)
  • pkg/config/defaults.go (1 hunks)
  • pkg/config/types/compute.go (2 hunks)
  • pkg/config/types/generated_constants.go (1 hunks)
  • pkg/config/types/generated_descriptions.go (1 hunks)
  • pkg/executor/docker/executor_test.go (3 hunks)
  • pkg/lib/envvar/utils.go (1 hunks)
  • pkg/models/env_var.go (1 hunks)
  • pkg/models/env_var_test.go (1 hunks)
  • pkg/models/task.go (3 hunks)
  • pkg/models/task_test.go (2 hunks)
  • pkg/node/compute.go (6 hunks)
  • pkg/swagger/docs.go (4 hunks)
  • pkg/swagger/swagger.json (3 hunks)
  • pkg/test/devstack/env_resolver_test.go (1 hunks)
  • pkg/test/executor/test_runner.go (2 hunks)
  • webui/lib/api/schema/swagger.json (3 hunks)
✅ Files skipped from review due to trivial changes (1)
  • pkg/compute/env/constants.go
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: testcontainers-suite / tests
🔇 Additional comments (30)
pkg/compute/env/errors.go (1)

1-23: Approach looks solid; consider aligning naming conventions and visibility.
Both error constructors are minimal and consistent. If usage evolves beyond this package, you might make them public or rename them with a consistent prefix (e.g., “ErrNotAllowed()”) to reflect Go’s convention for exported entities. Otherwise, these private constructors neatly encapsulate environment variable error creation.

pkg/compute/env/resolver_map_test.go (1)

138-184: LGTM! Good test coverage for map conversion.

The test cases effectively cover nil maps, empty maps, and mixed literal/reference values.

pkg/models/env_var_test.go (1)

138-184: LGTM! Good map conversion test coverage.

The test cases effectively cover various map scenarios including nil, empty, and mixed values.

pkg/compute/envvars_test.go (3)

17-19: LGTM! Good security practice with allowlist patterns.

The resolver is correctly configured with a restrictive allowlist that only allows specific patterns (MY_*, OTHER_*, TEST_*).


87-92: LGTM! Good test coverage for environment variable overrides.

The test correctly verifies that system variables (BACALHAU_*) cannot be overridden by user-provided values, maintaining system security.


110-128: LGTM! Proper error handling for unauthorized variables.

The test case properly verifies that attempts to use non-allowlisted environment variables fail fast.

pkg/config/defaults.go (1)

161-163: LGTM! Secure test configuration for environment variables.

The test configuration correctly restricts environment variable forwarding to only TEST_* patterns, aligning with the PR's security objectives.

pkg/models/task.go (2)

21-25: LGTM! Clear documentation and type safety for environment variables.

The updated type and documentation clearly specify the supported formats for environment variables:

  • Direct values: "debug-mode"
  • Host env vars: "env:HOST_VAR"

58-60: LGTM! Proper initialization of environment variables map.

The Normalize method correctly initializes the Env map with the new EnvVarValue type.

pkg/config/types/generated_constants.go (1)

24-24: LGTM! Consistent naming for configuration constant.

The new constant ComputeEnvAllowListKey follows the established naming pattern and provides a type-safe way to reference the environment variable allowlist configuration.

pkg/models/task_test.go (3)

197-199: LGTM! Good test coverage for environment variable validation.

The test case properly validates the restriction on environment variables starting with 'BACALHAU_' prefix.


209-213: LGTM! Good test coverage for valid environment variables.

The test case properly validates both regular environment variables and variables containing 'BACALHAU' in non-prefix positions.


256-256: LGTM! Proper deep copy test for environment variables.

The test ensures that environment variables are properly deep copied, maintaining data integrity.

pkg/node/compute.go (2)

104-106: LGTM! Clean initialization of environment variable resolver.

The resolver is properly initialized with the configured allowlist from the compute node configuration.


329-331: LGTM! Good integration with bid strategy.

The environment resolver is properly integrated into the semantic bid strategies, enabling early validation of environment variables during the bid phase.

pkg/config/types/generated_descriptions.go (2)

26-26: LGTM! Clear description of environment variable allowlist.

The description clearly explains the purpose and format of the allowlist, including support for glob patterns.


29-29: LGTM! Clear deprecation notice.

The description properly marks the field as deprecated and provides the alternative to use.

pkg/compute/executor.go (3)

35-35: LGTM! Clean addition of environment resolver.

The environment resolver is properly added to both the parameters and struct, maintaining clean dependency injection.

Also applies to: 49-49


187-190: LGTM! Good error handling for environment variable resolution.

The code properly handles and wraps errors from environment variable resolution, providing clear error messages.


253-253: LGTM! Clean integration with executor start.

The environment resolver is properly passed to PrepareRunArguments during execution start.

pkg/executor/docker/executor_test.go (2)

175-175: LGTM! Environment variable conversion is handled correctly.

The use of models.EnvVarsToStringMap ensures proper conversion of environment variables from the new structured type to string map for execution.


582-620: undefined

pkg/swagger/docs.go (3)

1-1: LGTM! Package comment updated correctly.

The redundant package name has been removed from the generated code comment.


2233-2238: LGTM! Clear documentation for environment variable values.

The description clearly explains the two supported formats:

  • Direct values (e.g., "debug-mode")
  • Host environment variables (e.g., "env:HOST_VAR")

2656-2667: LGTM! Well-structured environment configuration type.

The EnvConfig type provides a clear way to configure allowlisted environment variables with support for glob patterns (e.g., "AWS_", "API_").

webui/lib/api/schema/swagger.json (2)

2228-2234: Clarify Task Environment Variable Usage
The updated description in the "Env" property for the Task model now clearly states that environment variable values can be provided either as a direct literal (e.g., "debug-mode") or as a host environment variable by prefixing with "env:". This documentation is clear and aligns perfectly with the PR objective of introducing host environment variable forwarding.


2530-2537: Introduce Compute Node Environment Configuration
The addition of the "Env" property in the Compute type makes it explicit that compute nodes can be configured to handle environment variables. By referencing the new "types.EnvConfig" via an allOf construct, the schema now clearly communicates the role of environment variables in compute node configuration. This change is consistent with the feature’s requirements.

pkg/swagger/swagger.json (3)

2228-2234: Clarify environment variable usage in task definitions
The update to the "Env" property under models.Task now clearly explains that environment variable values can be provided directly (e.g., "debug-mode") or as host environment variables by using the "env:" prefix (e.g., "env:HOST_VAR"). This added explanation aligns perfectly with the PR’s objective of introducing host environment variable forwarding.


2530-2537: Introduce compute node environment configuration
The new "Env" property in the types.Compute section effectively enables configuration of environment variable forwarding on compute nodes. By using an "allOf" reference to types.EnvConfig, it lays the groundwork for secure, allowlist-based forwarding. Ensure that the compute node implementation parses and applies these configurations consistently.


2652-2663: Define structured EnvConfig with an allowlist
The addition of the types.EnvConfig block provides a clear schema for specifying which host environment variables are permitted to be forwarded. The "AllowList" field—with support for glob patterns like "AWS_" or "API_"—is critical for enforcing security controls and aligns with the PR’s goal of transparent and secure variable forwarding.

pkg/compute/env/resolver_map.go Show resolved Hide resolved
@wdbaruni
Copy link
Member Author

wdbaruni commented Feb 9, 2025

@coderabbitai review with assertive profile

Copy link
Contributor

coderabbitai bot commented Feb 9, 2025

@wdbaruni This is a well-designed PR that introduces host environment variable forwarding with robust security controls. The implementation stands out in several aspects:

  1. Strong Security Design: The allowlist-based security model with early validation in the bid strategy effectively prevents unauthorized access to host variables. The requirement for explicit env: prefix creates a clear audit trail.

  2. Clean Architecture: The code is well-structured with clear separation of concerns. The EnvVarResolver interface provides a solid foundation, while the HostResolver and ResolverMap implementations maintain single responsibility principle.

  3. Comprehensive Testing: The test coverage is thorough, including unit tests for all components and integration tests in devstack. Edge cases and error scenarios are well-covered.

  4. Future-Ready: The architecture is designed to support future enhancements like secret management systems. The interface-based design allows for easy addition of new resolvers.

The implementation is production-ready and merits approval.

✅ Actions performed

Comments resolved.

@wdbaruni wdbaruni merged commit e36b9ab into main Feb 9, 2025
14 checks passed
@wdbaruni wdbaruni deleted the eng-612-pass-host-env-variables-to-jobs branch February 9, 2025 09:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant