feat: CSI Token Requests #561

Open
ThirdEyeSqueegee wants to merge 7 commits into main from tokens

@ThirdEyeSqueegee (Member) commented Jan 30, 2026

Issue #, if available:
#400, #422

Description of changes:

Summary

This PR adopts the CSI driver's tokenRequests mechanism for obtaining pod service account tokens. tokenRequests is the standard way for a CSI driver to act on behalf of the pods it mounts volumes for: service account tokens are minted for each pod and passed through the driver to the provider. Previously, the provider minted these tokens itself through the K8s TokenRequest API, which required the serviceaccounts/token create RBAC permission. With this change, the provider receives pre-minted tokens from the CSI driver, which eliminates that permission and reduces the provider's role to consuming tokens rather than creating them.

Motivation

Previously, the provider held a Kubernetes client and called the TokenRequest API directly to create service account tokens for both IRSA (sts.amazonaws.com audience) and Pod Identity (pods.eks.amazonaws.com audience). This required:

  • serviceaccounts/token create permission in the ClusterRole
  • A K8s client threaded through the auth layer into credential providers
  • Complex mock infrastructure in tests (mock STS, mock K8s client, mock token creation)

The CSI driver already supports a tokenRequests feature that projects service account tokens into the mount request attributes. By leveraging this, the provider becomes a simpler consumer of pre-fetched tokens rather than an active participant in token creation.

How it works

The CSI driver's CSIDriver resource is configured with tokenRequests specifying the required audiences:

spec:
  tokenRequests:
    - audience: "sts.amazonaws.com"
    - audience: "pods.eks.amazonaws.com"

On each mount (and on each remount during rotation), fresh tokens are minted through the K8s TokenRequest API (with tokenRequests configured, the kubelet requests them on the pod's behalf and hands them to the CSI driver) and reach the provider in the csi.storage.k8s.io/serviceAccount.tokens volume attribute. The provider parses this JSON, selects the token for the appropriate audience (IRSA or Pod Identity), and passes it to the credential provider.
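The parse-then-select step described above can be sketched in Go as follows. This is a minimal illustration, not the code from the PR: the function and constant names follow the PR description, but the struct fields, error wording, and the `main` driver are assumptions. The JSON shape (audience mapped to an object with `token` and `expirationTimestamp`) is the one Kubernetes documents for the serviceAccount.tokens attribute.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Audience values as listed in the tokenRequests spec of the PR description.
const (
	IRSAAudience        = "sts.amazonaws.com"
	PodIdentityAudience = "pods.eks.amazonaws.com"
)

// ServiceAccountToken models one entry of the tokens JSON.
// The Go field names are assumptions made for this sketch.
type ServiceAccountToken struct {
	Token               string `json:"token"`
	ExpirationTimestamp string `json:"expirationTimestamp"`
}

// ParseServiceAccountTokens decodes the raw attribute value into a map
// keyed by audience. An empty attribute is reported as a likely
// tokenRequests misconfiguration.
func ParseServiceAccountTokens(raw string) (map[string]ServiceAccountToken, error) {
	if raw == "" {
		return nil, fmt.Errorf("serviceAccount.tokens is empty; check tokenRequests in the CSIDriver spec")
	}
	tokens := map[string]ServiceAccountToken{}
	if err := json.Unmarshal([]byte(raw), &tokens); err != nil {
		return nil, fmt.Errorf("failed to parse serviceAccount.tokens: %w", err)
	}
	return tokens, nil
}

// GetTokenForAudience returns the non-empty token for one audience.
func GetTokenForAudience(tokens map[string]ServiceAccountToken, audience string) (string, error) {
	entry, ok := tokens[audience]
	if !ok || entry.Token == "" {
		return "", fmt.Errorf("no token for audience %q; check tokenRequests in the CSIDriver spec", audience)
	}
	return entry.Token, nil
}

func main() {
	// Hypothetical attribute value carrying only the IRSA audience.
	raw := `{"sts.amazonaws.com":{"token":"irsa-token","expirationTimestamp":"2099-01-01T00:00:00Z"}}`

	tokens, err := ParseServiceAccountTokens(raw)
	if err != nil {
		panic(err)
	}

	// Select the audience the way Mount() is described to: by usePodIdentity.
	usePodIdentity := false
	audience := IRSAAudience
	if usePodIdentity {
		audience = PodIdentityAudience
	}

	token, err := GetTokenForAudience(tokens, audience)
	fmt.Println(token, err)
}
```

Flipping `usePodIdentity` to true in this sketch yields the missing-audience error, which is the situation TestMountTokenAudienceMismatch is described as covering.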

Before:

Mount() → auth.NewAuth() → auth.GetAWSConfig() → credential_provider → K8s TokenRequest API → STS/PodIdentity

After:

Mount() → parse CSI tokens → credential_provider(token) → STS/PodIdentity

Changes by file

Deleted files

  • auth/auth.go (113 lines) — The entire auth orchestration package is removed. This contained the Auth struct, NewAuth() constructor, GetAWSConfig() dispatcher, getAppID() helper, and the ProviderVersion/ProviderName constants. All of this functionality is either inlined into the server or eliminated.
  • auth/auth_test.go (221 lines) — All tests for the auth package (TestNewAuth, TestGetAWSConfig, TestAppID).

New files

  • utils/token_parser.go (42 lines) — Defines ParseServiceAccountTokens() which parses the CSI driver's serviceAccount.tokens JSON into a map[string]ServiceAccountToken, and GetTokenForAudience() which extracts a non-empty token for a specific audience. Also defines the IRSAAudience and PodIdentityAudience constants. Error messages explicitly point to tokenRequests configuration as the fix, since a missing/empty tokens attribute is the most likely misconfiguration.
  • utils/token_parser_test.go (134 lines) — Table-driven tests for both functions covering: empty input, invalid JSON, valid multi-audience, valid single-audience, missing audience, empty token value, and audience constant validation.

Application code changes

  • server/server.go — The core integration point. Mount() now extracts serviceAccount.tokens from the mount attributes, calls ParseServiceAccountTokens() upfront, then uses GetTokenForAudience() to select the right token based on usePodIdentity. The getRoleARN() method (previously in IRSACredentialProvider) is moved here since the server already has the K8s client. The appID() method and ProviderName/ProviderVersion constants are moved from the deleted auth package. getAwsConfigs() signature changes to accept roleArn and token directly instead of namespace/serviceAccount/podName. A pre-existing bug in the len(awsConfigs) > 2 guard was fixed — it previously returned nil, err where err was nil (from the preceding successful getAwsConfigs call), now returns a proper error.

  • credential_provider/irsa_credential_provider.go — Significantly simplified. The irsaTokenFetcher (which called the K8s TokenRequest API) is replaced with csiTokenFetcher, a trivial struct that returns a pre-fetched token string to satisfy the stscreds.IdentityTokenRetriever interface. NewIRSACredentialProvider() now takes (region, roleArn, appID, token string) instead of (stsClient, region, namespace, serviceAccount, appID, k8sClient). getRoleARN() is moved to the server. GetAWSConfig() creates an STS client inline rather than receiving one.

  • credential_provider/pod_identity_credential_provider.go — Same pattern. The podIdentityTokenFetcher (K8s TokenRequest API with BoundObjectRef) is replaced with csiTokenProvider, a trivial struct returning a pre-fetched token for the endpointcreds.AuthTokenProvider interface. NewPodIdentityCredentialProvider() drops the namespace, serviceAccount, podName, and k8sClient parameters, taking token string instead. No more input validation for nil K8s client.

  • credential_provider/credential_provider.go — Unchanged. The ConfigProvider interface remains the same.

  • main.go — Removes the auth package import. Changes auth.ProviderName to server.ProviderName. Adds a startup log message: "This provider requires tokenRequests to be configured in the CSIDriver spec".
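The "trivial struct" pattern described for csiTokenFetcher and csiTokenProvider might look roughly like the sketch below. The two interfaces are declared locally so the example compiles without the AWS SDK; they are intended to mirror the method sets of stscreds.IdentityTokenRetriever and endpointcreds.AuthTokenProvider. The struct field names are assumptions, and the real types may carry extra validation that this sketch omits.

```go
package main

import "fmt"

// identityTokenRetriever is a local stand-in for the method set of
// stscreds.IdentityTokenRetriever (assumption: declared here only so the
// sketch has no external dependencies).
type identityTokenRetriever interface {
	GetIdentityToken() ([]byte, error)
}

// authTokenProvider is a local stand-in for endpointcreds.AuthTokenProvider.
type authTokenProvider interface {
	GetToken() (string, error)
}

// csiTokenFetcher hands back the token the CSI driver already supplied
// (IRSA path). No Kubernetes API call is made.
type csiTokenFetcher struct {
	token string
}

func (f csiTokenFetcher) GetIdentityToken() ([]byte, error) {
	return []byte(f.token), nil
}

// csiTokenProvider does the same for the Pod Identity path.
type csiTokenProvider struct {
	token string
}

func (p csiTokenProvider) GetToken() (string, error) {
	return p.token, nil
}

// Compile-time checks that both structs satisfy the stand-in interfaces.
var (
	_ identityTokenRetriever = csiTokenFetcher{}
	_ authTokenProvider      = csiTokenProvider{}
)

func main() {
	irsa, _ := csiTokenFetcher{token: "irsa-token"}.GetIdentityToken()
	podID, _ := csiTokenProvider{token: "pod-identity-token"}.GetToken()
	fmt.Println(string(irsa), podID)
}
```

Because the token is fixed at construction time, a fresh fetcher would be built per mount request, which matches the description of tokens being re-supplied on every mount and remount.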

Test changes

  • server/server_test.go: buildMountReq() now injects CSI tokens JSON into the mount attributes (both IRSA and Pod Identity tokens with far-future expiration). Error message assertions updated from "An IAM role must be associated" to "IAM role must be associated". auth.ProviderName references changed to ProviderName. Four new tests added:

    • TestMountMissingTokensAttribute — verifies the error when serviceAccount.tokens is absent from mount attributes
    • TestMountTokenAudienceMismatch — verifies the error when usePodIdentity=true but only the IRSA token audience is present
    • TestMountMaxRegionsExceeded — validates the max-regions guard doesn't false-positive (and that the bug fix returns a real error)
    • TestAppID — table-driven test for appID() with and without EKS addon version override (coverage moved from deleted auth_test.go)
  • credential_provider/credential_provider_test.go — All mock types removed (mockSTS, mockK8sV1, mockK8sV1SA). Only shared test constants remain.

  • credential_provider/irsa_credential_provider_test.go — Rewritten from 172 to ~51 lines. Complex mock-based tests replaced with simple tests: TestNewIRSACredentialProvider, TestCSITokenFetcher, TestIRSACredentialProvider_GetAWSConfig.

  • credential_provider/pod_identity_credential_provider_test.go — Reduced from 489 to ~100 lines. Complex mock infrastructure removed. New simple tests plus two new dual-stack tests:

    • TestPodIdentityCredentialProvider_GetAWSConfig_AutoFallback — verifies IPv6 works when IPv4 is unreachable
    • TestPodIdentityCredentialProvider_GetAWSConfig_BothFail — verifies failure when both endpoints are unreachable
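The max-regions guard fix that TestMountMaxRegionsExceeded is described as covering can be illustrated with a small, self-contained sketch. The function names, the `maxConfigs` constant, and the error text are hypothetical; only the pattern (returning an `err` variable that is known to be nil versus constructing a real error) is taken from the PR description.

```go
package main

import "fmt"

// maxConfigs is a hypothetical name for the limit in the
// len(awsConfigs) > 2 guard.
const maxConfigs = 2

// getConfigs stands in for a getAwsConfigs call that succeeds.
func getConfigs(regions []string) ([]string, error) {
	return regions, nil
}

// buggyMount reproduces the pre-fix pattern: the guard reuses err from the
// preceding successful call, so the caller receives (nil, nil).
func buggyMount(regions []string) ([]string, error) {
	configs, err := getConfigs(regions)
	if err != nil {
		return nil, err
	}
	if len(configs) > maxConfigs {
		return nil, err // bug: err is nil at this point
	}
	return configs, nil
}

// fixedMount builds an explicit error when the guard trips.
func fixedMount(regions []string) ([]string, error) {
	configs, err := getConfigs(regions)
	if err != nil {
		return nil, err
	}
	if len(configs) > maxConfigs {
		return nil, fmt.Errorf("too many regions: got %d, limit is %d", len(configs), maxConfigs)
	}
	return configs, nil
}

func main() {
	regions := []string{"us-east-1", "us-west-2", "eu-west-1"}

	_, err := buggyMount(regions)
	fmt.Println("buggy guard error is nil:", err == nil)

	_, err = fixedMount(regions)
	fmt.Println("fixed guard error is nil:", err == nil)
}
```

A test for the fix would assert a non-nil error for three regions and no error for two, so the guard neither swallows the failure nor trips on valid input.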

RBAC changes

  • charts/secrets-store-csi-driver-provider-aws/templates/rbac.yaml — Removes the serviceaccounts/token create rule from the ClusterRole.
  • deployment/aws-provider-installer.yaml — Same removal, plus adds tolerations: [{operator: Exists}] for EKS Auto Mode compatibility.
  • deployment/private-installer.yaml — Same removal and toleration addition.

Helm chart changes

  • charts/secrets-store-csi-driver-provider-aws/values.yaml — Adds tokenRequests configuration to the CSI driver sub-chart dependency, specifying both audiences (sts.amazonaws.com and pods.eks.amazonaws.com).

CI changes

  • .github/workflows/docker-image.yml — Build arg changed from auth.ProviderVersion to server.ProviderVersion.
  • .github/workflows/integ.yml — Same change.
  • Makefile — LDFLAGS changed from auth.ProviderVersion to server.ProviderVersion.

Documentation changes

  • README.md — Adds a new "Separate CSI Driver Installation" section explaining how to configure tokenRequests when installing the CSI driver separately (both Helm and kubectl examples).

Integration test changes

  • tests/integration.bats.template — Adds --set tokenRequests[0].audience=sts.amazonaws.com --set tokenRequests[1].audience=pods.eks.amazonaws.com to the Helm install command. Adds a new test case "Verify serviceaccounts/token create permission is not granted to provider" that checks the ClusterRole.

Testing

  • All existing unit tests pass (adapted for the new token flow)
  • 6 new unit tests added covering token parsing, error paths, and edge cases
  • Integration tests pass across all 4 configurations (arm-irsa, arm-pod-identity, x64-irsa, x64-pod-identity) including rotation tests, confirming the CSI driver correctly passes fresh tokens on remount

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@ThirdEyeSqueegee ThirdEyeSqueegee requested a review from a team as a code owner January 30, 2026 23:48
codecov bot commented Jan 30, 2026

Codecov Report

❌ Patch coverage is 87.34177% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.57%. Comparing base (ea74e6c) to head (ada59f6).

Files with missing lines    Patch %    Lines
server/server.go            82.97%     4 Missing and 4 partials ⚠️
main.go                     0.00%      2 Missing ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #561      +/-   ##
==========================================
- Coverage   61.56%   59.57%   -2.00%     
==========================================
  Files          11       11              
  Lines         752      705      -47     
==========================================
- Hits          463      420      -43     
+ Misses        274      268       -6     
- Partials       15       17       +2     


@ThirdEyeSqueegee ThirdEyeSqueegee added the safe-to-test Pull Request has been manually reviewed and deemed to be safe to run integration tests on. label Jan 30, 2026
@github-actions github-actions bot removed the safe-to-test Pull Request has been manually reviewed and deemed to be safe to run integration tests on. label Feb 2, 2026
@ThirdEyeSqueegee ThirdEyeSqueegee added safe-to-test Pull Request has been manually reviewed and deemed to be safe to run integration tests on. and removed safe-to-test Pull Request has been manually reviewed and deemed to be safe to run integration tests on. labels Feb 2, 2026
@github-actions github-actions bot removed the safe-to-test Pull Request has been manually reviewed and deemed to be safe to run integration tests on. label Feb 2, 2026
@ThirdEyeSqueegee ThirdEyeSqueegee added the safe-to-test Pull Request has been manually reviewed and deemed to be safe to run integration tests on. label Feb 2, 2026
@simonmarty simonmarty added safe-to-test Pull Request has been manually reviewed and deemed to be safe to run integration tests on. and removed safe-to-test Pull Request has been manually reviewed and deemed to be safe to run integration tests on. labels Feb 3, 2026
@ThirdEyeSqueegee ThirdEyeSqueegee requested review from a team, i-am-SR and simonmarty and removed request for a team and simonmarty February 6, 2026 19:19
@github-actions github-actions bot removed the safe-to-test Pull Request has been manually reviewed and deemed to be safe to run integration tests on. label Feb 6, 2026
2 participants