Problem Statement
Before a sandbox can use a provider, the provider must be registered with the openshell provider create CLI command:
openshell provider create --name my-provider --type generic \
  --credential API_KEY="${API_KEY}" \
  --credential TOKEN="${TOKEN}"
In Kubernetes deployments that use the Helm chart, this typically means running a separate Job or script after install, which is inconvenient and imperative rather than declarative.
Proposed Design
Allow providers to be configured declaratively via a config file or Helm values:
openshell:
  providers:
    - name: cimpress-bedrock
      type: generic
      credentials:
        - key: OPENAI_API_KEY
          valueFrom:
            secretKeyRef:
              name: openshell-providers
              key: openai-api-key
    ....
The gateway would automatically register these providers on startup using the configured credentials.
Alternatives Considered
- Separate init container
- Lifecycle hooks (postStart): Would require upstream chart to expose lifecycle hook configuration, which currently isn't available.
Agent Investigation
Problem Statement
OpenShell providers must currently be registered via openshell provider create CLI commands, which requires post-install Kubernetes Jobs that run after gateway deployment. This adds operational complexity: operators must create separate Jobs with CLI installation, manage timing dependencies with gateway readiness, and script provider registration commands. For multi-tenant Kubernetes deployments using GitOps workflows, this imperative approach doesn't align with declarative configuration patterns used throughout the stack.
This feature request proposes adding declarative provider configuration via Helm chart values and gateway.toml, allowing providers to be defined alongside other gateway configuration and automatically synced at startup. Credentials would be sourced from Kubernetes Secrets, matching standard Kubernetes patterns for secret management.
Technical Context
The current provider management system stores providers in the gateway's database (PostgreSQL or SQLite) and manages them through a gRPC API. The CLI (openshell provider create) parses credentials, calls the gRPC endpoint, and the gateway validates and persists the provider record. Providers include metadata (name, type), credentials (map of key-value pairs stored encrypted), and configuration options. A background refresh worker mints short-lived tokens for OAuth2/JWT flows.
The gateway.toml config file is already parsed at startup using the serde-based ConfigFile struct, but providers are not part of the config schema. The gateway startup sequence connects to the database, initializes components, and resumes persisted sandboxes, but does not reconcile any declarative provider definitions.
Adding declarative configuration requires building a provider sync mechanism that runs during gateway startup, reads declared providers from gateway.toml (sourced from Helm-generated ConfigMaps), resolves credentials from Kubernetes Secrets, and ensures the providers exist in the database before sandboxes resume.
Affected Components
| Component | Key Files | Role |
|---|---|---|
| Gateway Config | crates/openshell-server/src/config_file.rs | Parse gateway.toml including new providers section |
| Gateway Startup | crates/openshell-server/src/lib.rs | Orchestrate provider sync during run_server() initialization |
| Provider Sync (new) | crates/openshell-server/src/provider_sync.rs | Reconcile declared providers with database state, resolve credentials from Secrets |
| Provider gRPC | crates/openshell-server/src/grpc/provider.rs | Reuse existing create_provider_record() validation and persistence logic |
| Helm Chart | deploy/helm/openshell/values.yaml, templates | Define providers values schema, generate ConfigMap, update RBAC for Secret reads |
| Gateway Config Docs | docs/reference/gateway-config.mdx | Document new [openshell.gateway.providers] TOML section and credential sourcing patterns |
Technical Investigation
Architecture Overview
Current Provider Storage:
- Providers are stored in a database (PostgreSQL or SQLite) as protobuf objects in the objects table
- Schema: object_type = "provider", name (unique), id (UUID), payload (protobuf binary), resource_version (for optimistic concurrency control)
- Each provider has metadata, type, credentials (encrypted map<string,string>), config (map<string,string>), and credential expiry timestamps
- Credentials are redacted in API responses and only exposed to sandboxes via GetInferenceBundleRequest or environment injection
Provider Creation Flow:
- CLI parses openshell provider create command → validates type, discovers credentials from flags or environment
- CLI calls gRPC CreateProviderRequest with credentials and config
- Gateway create_provider_record() validates the provider type, checks for name collisions, and persists to database with MustCreate CAS condition
- Background refresh worker mints short-lived tokens for OAuth2/JWT providers (e.g., Vertex AI)
Gateway Startup Sequence:
- Parse CLI args and load gateway.toml (config_file.rs)
- Connect to database (Store::connect())
- Initialize OIDC, sandbox index, compute drivers
- Build ServerState with store, compute, auth components
- Resume persisted sandboxes
- Spawn watchers and background workers including provider refresh
- Start gRPC/HTTP listeners
No existing reconciliation mechanism: The gateway does not currently check for declarative provider definitions at startup. Providers are only created via explicit gRPC API calls.
Code References
| Location | Description |
|---|---|
| crates/openshell-cli/src/main.rs:717-827 | CLI ProviderCommands::Create struct and argument parsing |
| crates/openshell-cli/src/run.rs:4476-4653 | provider_create() function: credential discovery, validation, gRPC call |
| crates/openshell-server/src/grpc/provider.rs:61-140 | create_provider_record() gRPC handler: validates type, persists to database |
| crates/openshell-server/src/persistence/mod.rs:114-199 | Store enum dispatching to PostgreSQL/SQLite backends, CAS operations |
| proto/datamodel.proto:32-44 | Provider protobuf message schema |
| crates/openshell-server/src/config_file.rs:38-156 | ConfigFile and GatewayFileSection structs for gateway.toml parsing |
| crates/openshell-server/src/lib.rs:194-351 | run_server() gateway startup orchestration, provider refresh worker spawn at line 351 |
| deploy/helm/openshell/values.yaml | Helm chart values schema (needs providers array addition) |
Current Behavior
Providers are created imperatively via CLI commands. The CLI:
- Parses --credential KEY=VALUE or --credential KEY (discovers from environment) flags
- Validates provider type against supported types (generic, vertex-ai, etc.)
- Calls gRPC CreateProviderRequest with credentials map and config map
- Gateway validates and persists to database with MustCreate (fails if name already exists)
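The two --credential forms described above can be illustrated with a small parser. This is a minimal sketch and not the CLI's actual code: parse_credential and its lookup parameter are hypothetical names, and the environment is passed in as a closure so the behavior is easy to test.

```rust
/// Hypothetical helper, not the CLI's actual parser.
/// `KEY=VALUE` carries the value inline; a bare `KEY` is resolved through `lookup`.
pub fn parse_credential<F>(arg: &str, lookup: F) -> Result<(String, String), String>
where
    F: Fn(&str) -> Option<String>,
{
    match arg.split_once('=') {
        // `KEY=VALUE`: split at the first `=` so the value itself may contain `=`.
        Some((key, value)) if !key.is_empty() => Ok((key.to_string(), value.to_string())),
        // `=VALUE` has no key to store the value under.
        Some(_) => Err(format!("credential `{arg}` has an empty key")),
        // Bare `KEY`: discover the value through `lookup`.
        None if !arg.is_empty() => lookup(arg)
            .map(|value| (arg.to_string(), value))
            .ok_or_else(|| format!("credential `{arg}` is not set in the environment")),
        None => Err("credential key must not be empty".to_string()),
    }
}
```

In real use the lookup closure would typically be something like |k| std::env::var(k).ok(), so a bare key fails early when the variable is unset.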
Kubernetes deployments require a post-install Job that:
- Installs openshell CLI
- Registers the gateway using bootstrap token
- Runs openshell provider create commands for each provider
- Manages timing/readiness checks to ensure gateway is available
What Would Need to Change
1. Config File Schema (config_file.rs)
- Add providers: Option<Vec<ProviderDeclaration>> to GatewayFileSection struct
- Define new types:
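    use std::collections::HashMap;
    use serde::Deserialize;

    #[derive(Debug, Deserialize)]
    pub struct ProviderDeclaration {
        pub name: String,
        // The config key is `type`, which is a Rust keyword, so rename on deserialize.
        #[serde(rename = "type")]
        pub provider_type: String,
        pub credentials: Vec<CredentialSource>,
        #[serde(default)]
        pub config: HashMap<String, String>,
    }

    // Where a credential value comes from; the wire format is still an open design point.
    #[derive(Debug, Deserialize)]
    pub enum CredentialSource {
        SecretKeyRef { secret_name: String, key: String },
        EnvVar { name: String },
    }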
- Validation: reject Literal credential sources (force operators to use Secrets)
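The validation rule above can be sketched as follows. This is an illustration under assumptions: the Literal variant is hypothetical (the enum in this proposal does not define it) and is included only so the validator has something to reject, and validate_declarations is not an existing function. The types are simplified copies without serde so the block stands alone.

```rust
/// Simplified copy of the proposed enum plus a hypothetical `Literal` variant.
#[derive(Debug, Clone, PartialEq)]
pub enum CredentialSource {
    SecretKeyRef { secret_name: String, key: String },
    EnvVar { name: String },
    /// Hypothetical: an inline value written directly in the config file.
    Literal { value: String },
}

#[derive(Debug, Clone)]
pub struct ProviderDeclaration {
    pub name: String,
    pub provider_type: String,
    pub credentials: Vec<CredentialSource>,
}

/// Collects every violation instead of stopping at the first one,
/// so an operator can fix the whole config in a single pass.
pub fn validate_declarations(decls: &[ProviderDeclaration]) -> Result<(), Vec<String>> {
    let mut errors = Vec::new();
    for decl in decls {
        for source in &decl.credentials {
            if matches!(source, CredentialSource::Literal { .. }) {
                errors.push(format!(
                    "provider `{}`: literal credential values are not allowed; reference a Secret instead",
                    decl.name
                ));
            }
        }
    }
    if errors.is_empty() {
        Ok(())
    } else {
        Err(errors)
    }
}
```

The error text deliberately names the provider but never echoes the rejected value, since that value is the credential itself.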
2. Provider Sync Module (new provider_sync.rs)
- sync_declarative_providers(store: &Store, config_file: &ConfigFile, namespace: &str) -> Result<()>
  - Read config_file.gateway.providers
  - For each declared provider:
    - Resolve credentials from Kubernetes Secrets or environment variables
    - Check if provider exists in database (by name)
    - If missing: call create_provider_record() to persist
    - If exists: log warning (create-only mode, no updates)
  - Return Err on any failure (fail-fast: gateway does not start if provider sync fails)
- resolve_credential_sources(credentials: &[CredentialSource], namespace: &str) -> Result<HashMap<String, String>>
  - For SecretKeyRef: use kube client to GET /api/v1/namespaces/{namespace}/secrets/{secret_name}, extract data[key], base64 decode
  - For EnvVar: read from std::env::var()
  - Fail if Secret does not exist, key is missing, or env var is unset
- ensure_provider(store: &Store, decl: &ProviderDeclaration, credentials: HashMap<String, String>) -> Result<()>
  - Call existing create_provider_record() logic (reuse validation)
  - Handle MustCreate conflict error → log warning, skip (provider already exists)
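The create-only, fail-fast loop described above might look roughly like this. It is a sketch against an in-memory stand-in, not the gateway's Store or create_provider_record(): MemStore, ResolvedProvider, SyncOutcome, and sync_create_only are all hypothetical names, and credentials are assumed to be resolved already.

```rust
use std::collections::HashMap;

/// Hypothetical declaration whose credentials are already resolved to values.
#[derive(Debug, Clone)]
pub struct ResolvedProvider {
    pub name: String,
    pub provider_type: String,
    pub credentials: HashMap<String, String>,
}

#[derive(Debug, PartialEq)]
pub enum SyncOutcome {
    Created,
    SkippedExisting,
}

/// Hypothetical in-memory stand-in for the gateway store.
#[derive(Default)]
pub struct MemStore {
    providers: HashMap<String, ResolvedProvider>,
}

impl MemStore {
    /// MustCreate-style insert: returns false instead of overwriting an existing name.
    fn create_if_absent(&mut self, provider: ResolvedProvider) -> bool {
        if self.providers.contains_key(&provider.name) {
            return false;
        }
        self.providers.insert(provider.name.clone(), provider);
        true
    }
}

/// Creates missing providers, skips existing ones, and stops at the first invalid declaration.
pub fn sync_create_only(
    store: &mut MemStore,
    declared: Vec<ResolvedProvider>,
    supported_types: &[&str],
) -> Result<Vec<(String, SyncOutcome)>, String> {
    let mut report = Vec::new();
    for provider in declared {
        if !supported_types.contains(&provider.provider_type.as_str()) {
            // Fail fast: the caller is expected to abort startup on Err.
            return Err(format!(
                "provider `{}`: unsupported type `{}`",
                provider.name, provider.provider_type
            ));
        }
        let name = provider.name.clone();
        let outcome = if store.create_if_absent(provider) {
            SyncOutcome::Created
        } else {
            // Create-only mode: an existing provider is left untouched.
            eprintln!("warning: provider `{name}` already exists; leaving it unchanged");
            SyncOutcome::SkippedExisting
        };
        report.push((name, outcome));
    }
    Ok(report)
}
```

One property worth noting: the loop is not transactional, so providers created before a failing declaration stay in the store. Under create-only semantics that appears harmless, because the next startup attempt simply skips them.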
3. Gateway Startup (lib.rs)
- Insert provider sync call after database connection (around line 206, after Store::connect())
- Must run before SandboxIndex::new() and sandbox resume to ensure providers exist before sandboxes try to use them
- Pass config_file, store, and detected namespace (from K8s API or --namespace flag)
- Fail gateway startup if sync_declarative_providers() returns Err
4. Kubernetes Client Integration
- Provider sync module needs kube crate to read Secrets (already a dependency for K8s compute driver)
- Reuse existing K8s client initialization from openshell-driver-kubernetes
- Handle in-cluster auth (ServiceAccount token) vs out-of-cluster kubeconfig
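One way to keep credential resolution testable without a cluster is to hide the Secret read behind a small trait. The sketch below is an assumption about how that could be structured, not existing code: SecretReader, CredentialRef, FakeSecrets, and resolve_credentials are hypothetical names, and a real implementation would back the trait with the kube client. CredentialRef pairs each source with the credential key it fills (for example GITLAB_TOKEN), which the Helm values imply but the proposed enum does not carry.

```rust
use std::collections::HashMap;

pub enum CredentialSource {
    SecretKeyRef { secret_name: String, key: String },
    EnvVar { name: String },
}

/// Hypothetical: the credential key exposed to the provider plus where its value comes from.
pub struct CredentialRef {
    pub key: String,
    pub source: CredentialSource,
}

/// Hypothetical abstraction over "read one key of one Secret".
pub trait SecretReader {
    fn read_key(&self, namespace: &str, secret_name: &str, key: &str) -> Option<String>;
}

/// Resolves every reference or fails on the first one that cannot be resolved.
pub fn resolve_credentials(
    refs: &[CredentialRef],
    namespace: &str,
    secrets: &dyn SecretReader,
    env: &dyn Fn(&str) -> Option<String>,
) -> Result<HashMap<String, String>, String> {
    let mut resolved = HashMap::new();
    for r in refs {
        let value = match &r.source {
            CredentialSource::SecretKeyRef { secret_name, key } => secrets
                .read_key(namespace, secret_name, key)
                .ok_or_else(|| {
                    format!("Secret {namespace}/{secret_name} has no readable key `{key}`")
                })?,
            CredentialSource::EnvVar { name } => env(name.as_str())
                .ok_or_else(|| format!("environment variable `{name}` is unset"))?,
        };
        resolved.insert(r.key.clone(), value);
    }
    Ok(resolved)
}

/// In-memory reader keyed by (namespace, secret name, key), intended for tests.
pub struct FakeSecrets(pub HashMap<(String, String, String), String>);

impl SecretReader for FakeSecrets {
    fn read_key(&self, namespace: &str, secret_name: &str, key: &str) -> Option<String> {
        self.0
            .get(&(namespace.to_string(), secret_name.to_string(), key.to_string()))
            .cloned()
    }
}
```

With this split, the in-cluster versus kubeconfig distinction stays inside whichever type implements SecretReader, and the sync logic only ever sees resolved strings.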
5. Helm Chart Updates
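- values.yaml: Add providers: [] array schema

  providers:
    - name: gitlab
      type: generic
      credentials:
        - key: GITLAB_TOKEN
          valueFrom:
            secretKeyRef:
              name: openshell-providers
              key: gitlab-token

- ConfigMap template: Render providers into gateway.toml as TOML array
- RBAC: Update gateway ServiceAccount Role to allow get on provider Secrets. The intent is to scope this to Secrets labeled openshell.io/provider-credentials: "true", but Kubernetes RBAC rules cannot match on labels; a Role can only narrow access to named objects via resourceNames. The chart would therefore need to list the referenced Secret names in the Role, with the label checked elsewhere if it is kept.
- Example Secret manifest: Provide example Secret in chart README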
Alternative Approaches Considered
1. --auto-providers flag (rejected)
- Already exists but relies on environment variable credential discovery
- Doesn't work for custom provider types with non-standard credential keys
- Still requires passing all credentials as environment variables (no Secret sourcing)
- Doesn't support GitOps declarative patterns
2. Simplified post-install Job (rejected)
- Reduces complexity by running job only once (not on upgrades)
- Still requires CLI installation, scripting, timing coordination
- Not declarative, adds operational overhead
3. Init containers (rejected)
- Tightly coupled to gateway pod lifecycle
- Still requires CLI installation and scripting
- Harder to debug than separate Jobs
4. Lifecycle hooks (postStart) (rejected)
- Helm chart does not expose lifecycle hook configuration
- Would require upstream chart changes
- postStart hooks block pod readiness, delaying service availability
5. Full reconciliation with prune (rejected for initial version)
- Declared providers become authoritative, CLI-created providers not in config are deleted
- Dangerous: breaks existing workflows, destroys user-managed providers
- Better to start with create-only mode and add prune as opt-in feature later
Patterns to Follow
1. Config File Parsing
- Follow existing TOML deserialization patterns in config_file.rs
- Use #[serde(default)] for optional fields
- Add validation in ConfigFile::validate() method
2. Error Handling
- Return anyhow::Result<()> for provider sync
- Use context() to add breadcrumbs for debugging (e.g., "failed to read Secret {name}/{key}")
- Fail fast: gateway does not start if provider sync fails (matches existing behavior for required config)
3. Kubernetes Client Usage
- Reuse openshell-driver-kubernetes client setup
- Handle both in-cluster and out-of-cluster auth
- Use label selectors for RBAC restrictions
4. Provider Creation
- Reuse existing create_provider_record() validation logic (don't duplicate)
- Respect MustCreate CAS semantics (fail if provider already exists)
- Log provider creation events at INFO level for audit trail
5. Testing
- Unit tests for credential resolution (resolve_credential_sources)
- Integration tests for provider sync (in-memory SQLite database)
- E2E test for Helm deployment with declarative providers (requires K8s cluster)
Proposed Approach
Add a provider sync mechanism that runs during gateway startup:
- Extend gateway.toml schema to include an optional providers array with name, type, credentials (sourced from Secrets or env vars), and config.
- Build provider sync module that reads declared providers from config, resolves credentials from Kubernetes Secrets using the kube client, and ensures each provider exists in the database by calling the existing create_provider_record() function.
- Integrate sync into gateway startup immediately after database connection and before sandbox resume, ensuring providers are available when sandboxes start.
- Update Helm chart to expose providers in values.yaml, render them into a ConfigMap that mounts as gateway.toml, and update RBAC to allow reading Secrets with a specific label.
- Use create-only mode for initial version: declared providers are created if missing but ignored if they already exist. CLI-created providers are unaffected. Log warnings for name collisions. This avoids destructive operations and maintains backwards compatibility.
- Fail fast on sync errors: if any provider fails to sync (Secret missing, invalid type, database error), the gateway does not start. This gives operators immediate feedback that configuration is broken.
Scope Assessment
- Complexity: Medium
- Confidence: High (clear path, well-understood provider system, existing patterns to follow)
- Estimated files to change: 8-10 files (3 core Rust files, 4 Helm templates, 2 doc files)
- Issue type: feat
Risks & Open Questions
Risks:
- CWE-522 (Insufficiently Protected Credentials): HIGH. Declarative config encourages storing credentials in ConfigMaps. Mitigation: ONLY support valueFrom.secretKeyRef, reject literal credential values in TOML. Document that credentials MUST live in Secrets.
- CWE-269 (Improper Privilege Management): MEDIUM. Gateway ServiceAccount gains get permission on Secrets. Mitigation: limit the grant to provider Secrets that operators explicitly mark with the openshell.io/provider-credentials label. Because Kubernetes RBAC cannot filter by label, the Role itself would have to enumerate those Secrets under resourceNames.
- CWE-1188 (Insecure Default Configuration): MEDIUM. Default Helm chart might include example providers with placeholder credentials. Mitigation: Do NOT include any provider in default values.yaml. Document that providers: [] is the safe default.
Edge Cases:
- Secret does not exist: Fail provider sync with clear error, prevent gateway startup. Operators must create Secret before deploying gateway.
- Provider name collision: Declared provider name matches existing CLI-created provider. Log warning, skip creation (CLI takes precedence). Document this behavior.
- Partial sync failure: One provider succeeds, another fails. Gateway fails to start (all-or-nothing). Clear error messages guide operators to fix the broken provider.
Open Design Questions (need stakeholder input):
- Sync strategy: Create-only (recommended), full reconciliation, or label-based hybrid (managed-by: config)?
- Credential update policy: Should declarative config be allowed to update existing providers' credentials, or is create-only the permanent behavior?
- RBAC scope: Should gateway read all Secrets in the namespace, only labeled Secrets (openshell.io/provider-credentials: "true"), or support cross-namespace Secret reads?
- Failure mode: Fail gateway startup on any provider sync error (safe, recommended) or log error and continue with partial sync (forgiving, but leaves system in inconsistent state)?
- Helm integration depth: Should provider credentials live in a separate Secret created outside Helm (recommended, matches operator pattern), or should Helm chart accept credential values that it templates into the Secret (dangerous, credentials in values files)?
Test Considerations
Unit Tests:
- resolve_credential_sources() with mocked kube client: test SecretKeyRef resolution, env var fallback, error handling for missing Secrets/keys
- Provider declaration parsing: valid TOML → struct deserialization, invalid TOML → validation errors
Integration Tests:
- Provider sync with in-memory SQLite database
- Create-only behavior: declare provider, sync, declare same provider again, verify it's not duplicated
- Name collision: CLI-created provider + declarative provider with same name → warning logged, no overwrite
- Partial failure: multiple providers declared, one has invalid type → gateway fails to start with clear error
E2E Tests:
- Helm deployment with declarative providers (requires K8s cluster)
- Deploy gateway with providers in values.yaml, verify providers exist in database via openshell provider list
- Create Secret with credentials, deploy gateway referencing Secret, verify provider credentials work in sandbox
- RBAC test: deploy gateway without Secret read permission, verify sync fails with permission error
Test Patterns from Existing Code:
- Provider tests in crates/openshell-server/src/grpc/provider.rs use TestFixture with in-memory store
- E2E tests in .agents/skills/test-release-canary/ use openshell CLI for validation
- Follow existing test file organization: unit tests in module files, integration tests in tests/ directory
Created by spike investigation. Review the proposed approach and design decisions, then use build-from-issue to plan and implement.
Checklist