Skip to content

THREESCALE-15096: Encrypt OIDC tokens#596

Open
jlledom wants to merge 7 commits into
masterfrom
THREESCALE-15096-encrypt-oidc-tokens
Open

THREESCALE-15096: Encrypt OIDC tokens#596
jlledom wants to merge 7 commits into
masterfrom
THREESCALE-15096-encrypt-oidc-tokens

Conversation

@jlledom

@jlledom jlledom commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Description:

In order to protect access tokens from leakages, this PR encrypts the column in DB.

We use the Rails 8.0 column encryption feature to implement the encryption. It's pretty easy:

  1. Declare the column as encrypted, in the rails model
  2. Provide the encryption key and salt to rails configuration

The PR includes a new initializer which configures the encryption feature from env variables: ZYNC_DATABASE_ENCRYPT_KEY and ZYNC_DATABASE_ENCRYPT_SALT.

Those must be generated accoring to Rails official docs: https://guides.rubyonrails.org/v8.0/active_record_encryption.html#setup

ZYNC_DATABASE_ENCRYPT_KEY will also accept comma-separated values, to be interpreted as an array of keys. This is needed by the key rotation mechanism. When multiple keys are provided, all keys are attempted to decrypt values, but only the last one is used to encrypt. So a value like ZYNC_DATABASE_ENCRYPT_KEY="leaked_key,new_key", will encrypt all the new records with new_key while still being able to read the old values that were encrypted with leaked_key. A full resync from porta will encrypt all tokens with new_key.

Decisions made:

  1. Deterministic encryption: I decided to use non-deterministic encryption, which is safer but has the drawback of preventing SELECT queries to the column by the plaintext value. Due to that, I added a commit that refactors the two cases where we use such queries.
  2. Read plaintext values: To keep backwards compatibility, we accept non-encrypted values in the table. New values will be encrypted so tokens will remain unencrypted until being rotated. A full resync from porta would encrypt the whole table.

The PR also includes changes in the Dockerfile and the CircleCI config, to provide the new env variables, which are mandatory now. Midstream for alpha might need to be updated as well.

Jira issue:

https://redhat.atlassian.net/browse/THREESCALE-15096

jlledom added 4 commits June 12, 2026 13:29
Implements ActiveRecord::Encryption for the Tenant model to protect
access tokens in the database. Uses non-deterministic encryption for
maximum security since access_token is never queried via SQL (only
read from loaded objects). Supports key rotation via comma-separated
ZYNC_DATABASE_ENCRYPT_KEY env var. Existing plaintext tokens remain
readable during transition via support_unencrypted_data flag.

Related to THREESCALE-15096

Assisted-by: Claude Code
Non-deterministic encryption produces different ciphertext for each
write, breaking SQL queries on encrypted columns. Refactored two
locations that previously queried by access_token:

- message_bus.rb: Changed from find_by(access_token:, id:) to
  find_by(id:) followed by Ruby comparison after decryption
- seeds.rb: Removed access_token from find_or_create_by! finder
  args to prevent duplicate creation on repeated runs

Both maintain the same behavior but work correctly with encrypted
values.

Assisted-by: Claude Code
Adds two new required environment variables for database field
encryption to all deployment configurations:

- ZYNC_DATABASE_ENCRYPT_KEY: Primary encryption key (supports
  comma-separated values for key rotation)
- ZYNC_DATABASE_ENCRYPT_SALT: Key derivation salt

Updated OpenShift secret template, deployment configs for both
zync-puma and zync-que containers, and Dockerfile build step with
placeholder values. These must be provisioned with actual random
values before deploying encryption-enabled code.

Assisted-by: Claude Code
Adds three test cases for Tenant encryption behavior:

1. Verifies access_token is encrypted at rest (ciphertext differs
   from plaintext, encrypted_attribute? returns true)
2. Confirms fixtures bypass encryption and remain plaintext
   (backward compatibility via support_unencrypted_data)
3. Tests key rotation: old key can decrypt, new key encrypts

Configures test encryption keys in test_helper.rb and adds them to
all CircleCI jobs so tests pass in CI.

Assisted-by: Claude Code
@jlledom jlledom self-assigned this Jun 12, 2026
# all keys decrypt. To rotate: set "oldkey,newkey", re-encrypt rows, then
# remove the old key.
encryption_config.primary_key = ENV["ZYNC_DATABASE_ENCRYPT_KEY"]&.split(",")
encryption_config.key_derivation_salt = ENV["ZYNC_DATABASE_ENCRYPT_SALT"]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requiring ZYNC_DATABASE_ENCRYPT_KEY and
ZYNC_DATABASE_ENCRYPT_SALT as separate secrets is unnecessary. Rails already
has Rails.application.key_generator which deterministically derives keys from
SECRET_KEY_BASE. The initializer could be:

config/initializers/active_record_encryption.rb

encryption_config = Rails.application.config.active_record.encryption

key_gen = Rails.application.key_generator
encryption_config.primary_key =
key_gen.generate_key("active_record_encryption/primary_key", 32)
encryption_config.key_derivation_salt =
key_gen.generate_key("active_record_encryption/key_derivation_salt", 32)
encryption_config.support_unencrypted_data = true

This eliminates both new env vars, all the OpenShift secret/deployment
template changes, the CI config additions, the Dockerfile placeholders, and
the test_helper env setup. Zero new operator burden.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the above is sound for an implementation even more so because the tokens would be temporary now.

But here is the advantage for having independent keys:

The Rails guides and community do recommend keeping them separate. The main
  security argument is blast radius isolation: SECRET_KEY_BASE is used broadly
  across the Rails stack (message verifiers, encryptors, cookie signing) and
  flows through more code paths, making it statistically more likely to be
  accidentally logged or exposed. If it leaks and encryption keys are derived
  from it, the attacker gets DB decryption for free. With separate keys,
  compromising SECRET_KEY_BASE only gives them sessions/cookies — not encrypted
  DB columns.

  That said, your counter-argument still holds weight for zync specifically:
  both secrets live in the same OpenShift Secret object. An attacker who can
  read one can read all of them. So the isolation is largely theoretical in this
  deployment model.

  It's a pragmatic trade-off: the Rails-recommended approach (separate keys) is
  more defensively correct, but for zync — an API-only app where all secrets
  share the same store — the real-world security gain is marginal vs. the
  concrete operator cost of managing extra secrets.

I don't think we need that and users don't need that complication. If you really want cryptographically independent keys, I would suggest to generate random keys when such are not found and store them encrypted by the secret key base. This would be best of both worlds. More complicated though so probably still not worth. Some AI thoughts about that below

  The practical question is where to persist the encrypted key material. In
  zync's containerized/OpenShift deployment, filesystem is ephemeral. You'd need
  to store it in the DB (a small encryption_keys table) — which works fine
  since the key protects column values, not the DB connection itself. But that
  adds a migration, bootstrap-on-first-run logic, and handling for
  SECRET_KEY_BASE rotation (re-encrypt the stored keys).

  It's a genuinely better security design, but for one encrypted attribute in a
  deployment where all secrets already share the same OpenShift Secret object,
  I'd say the complexity isn't justified today. If zync grows more encrypted
  attributes, that's when envelope encryption and per-attribute keys start
  paying for themselves.

  For now the simpler derivation from SECRET_KEY_BASE is proportionate to
  the actual threat model.

I'm fine with any approach that doesn't make our users suffer for questionable benefits.

@jlledom jlledom Jun 15, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: 08e50cd

The drawback is we lose the ability to rotate keys with zero downtime. If the secret key is leaked, the admin would need to launch the resync job from porta and assume Zync being down until it finishes.

I'm fine with any approach that doesn't make our users suffer for questionable benefits.

With the external encryption keys approach, users won't suffer. It only implies more work for the operator engineers, but it can be done in a way users don't have to do anything. The operator can create the keys and set the variables. Only in the case of a zero-downtime rotation the user would need to edit the secret. So from the user point of view, the only reason the new approach implies less trouble is because the only source of trouble would be rotation and rotation is just not supported in the new approach.

Comment thread config/initializers/message_bus.rb Outdated
Comment on lines +33 to +34
env['message_bus.tenant'] ||= Tenant.find_by(access_token: access_token, id: tenant_id)&.to_gid_param
tenant = Tenant.find_by(id: tenant_id)
env['message_bus.tenant'] ||= tenant.to_gid_param if tenant&.access_token == access_token

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AI review said this, I haven't verified. On the other hand, I'm not sure this message_bus is used anymore.

  Bug in message_bus.rb

  The refactored lookup at config/initializers/message_bus.rb:33 has a subtle
  behavioral change:

  # Before (line 33 on master):
  env['message_bus.tenant'] ||= Tenant.find_by(access_token: access_token, id:
  tenant_id)&.to_gid_param

  # After:
  tenant = Tenant.find_by(id: tenant_id)
  env['message_bus.tenant'] ||= tenant.to_gid_param if tenant&.access_token ==
  access_token

  The original ||= short-circuits the entire RHS — if env['message_bus.tenant']
  is already set, no DB query runs. The new code always runs find_by regardless.
  Should be:

  env['message_bus.tenant'] ||= begin
    tenant = Tenant.find_by(id: tenant_id)
    tenant.to_gid_param if tenant&.access_token == access_token
  end

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread test/models/tenant_test.rb Outdated
support_unencrypted_data: true
)
end
end No newline at end of file

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no new line 🫨

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I beg you to accept my apologies.

jlledom added 3 commits June 15, 2026 12:18
Eliminate ZYNC_DATABASE_ENCRYPT_KEY and ZYNC_DATABASE_ENCRYPT_SALT
env vars by deriving both keys from the existing SECRET_KEY_BASE
via Rails.application.key_generator. This reduces operator burden
(no new secrets to generate/rotate) while maintaining encryption
security. All secrets already share the same OpenShift Secret
object, so the blast-radius isolation of separate keys provides
negligible real-world benefit in this deployment model.

Removes encryption env vars from OpenShift templates, Dockerfile
build step, CI config, and test helper setup.

Assisted-by: Claude Code
Wrap the tenant lookup and comparison in a begin block so the
||= operator short-circuits when env['message_bus.tenant'] is
already set, avoiding an unnecessary DB query. The original
refactoring to support encrypted access_token comparison
inadvertently broke this optimization by always running
find_by regardless of the cached value.

Assisted-by: Claude Code
Update the key rotation test to read the derived primary_key and
key_derivation_salt from Rails.application.config.active_record.encryption
instead of ENV vars, since keys are now generated from SECRET_KEY_BASE
rather than read from external env vars. Also add trailing newline
to satisfy linter.

Assisted-by: Claude Code

test "existing plaintext access_token is readable" do
# Fixtures bypass AR Encryption (encrypt_fixtures defaults to false),
# so fixture values are stored as plaintext in the database.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we disallow reading non-encrypted entries that will break fixtures 🤔 but anyway, I guess world can't be perfect yet

@akostadinov akostadinov left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great one!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants