Skip to content

Add Bazel e2e test target and fix CityHash wire compatibility#511

Open
BYVoid wants to merge 3 commits into
ClickHouse:masterfrom
BYVoid:bazel-e2e-tests
Open

Add Bazel e2e test target and fix CityHash wire compatibility#511
BYVoid wants to merge 3 commits into
ClickHouse:masterfrom
BYVoid:bazel-e2e-tests

Conversation

@BYVoid

@BYVoid BYVoid commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Summary

Implements the server-dependent end-to-end tests under Bazel (the follow-up discussed in #510), and fixes a wire-compatibility bug those tests immediately surfaced.

Two independent commits:

  1. CityHash wire-compatibility fix — a correctness bug in the existing Bazel build, unrelated to the test infra but caught by it.
  2. //ut:e2e_tests target + Docker-backed CI — the actual E2E feature.

1. CityHash fix

The merged Bazel build pulls CityHash from the BCR @cityhash module, which packages Google's stock CityHash 1.0.2. ClickHouse's wire protocol checksums every compressed block with a frozen CityHash128 variant (see clickhouse/base/compressed.cpp) that differs from the stock one. Verified directly — same input, different output:

Source CityHash128("The quick brown fox…")
contrib/cityhash (what CMake uses) 69102202d326a2fde4b1346bbee531a1
BCR @cityhash (Google f5dc541) a7f9a86a2d60c968bf1498f876dbe279

So with the BCR module, a server rejects every LZ4/ZSTD block with DB::Exception: Checksum doesn't match: corrupted data. Any user of the Bazel-built library with compression enabled hits this. The fix builds the vendored contrib/cityhash as a Bazel target (:cityhash) and drops the BCR dependency, exactly mirroring the CMake build and guaranteeing byte-identical hashing.

This bug was introduced when the Bazel build switched cityhash to BCR; it went unnoticed precisely because there were no server-backed tests — which is what part 2 adds.

2. E2E test target

//ut:e2e_tests covers the client round-trip tests that need a live ClickHouse server (client_ut, Column_ut, roundtrip_tests, low_cardinality_nullable_tests, array_of_low_cardinality_tests, abnormal_column_names_test, readonly_client_test, connection_failed_client_test, performance_tests).

  • Tagged manual + external — excluded from wildcard expansion, so a bare bazel test //... stays hermetic and green without a server, and results are never cached (they depend on live server state).
  • Run it explicitly, exactly as suggested in Add Bazel unit test target and Windows (MSVC) support #510:
    bazel run //ut:e2e_tests
    CLICKHOUSE_HOST=my-host bazel test //ut:e2e_tests --test_output=all
    
  • Connects via the existing CLICKHOUSE_HOST / _PORT / _USER / _PASSWORD / _DB env vars (default localhost:9000).

CI

The Bazel workflow runs the E2E suite against a live server on every matrix platform (3 OS × 3 TLS = 9 jobs), not just Linux. Each OS starts the server the same way the corresponding CMake workflow already does — hosted runners can't all spin up a Linux container identically:

OS How the server is started Mirrors
Linux ci/docker-compose.yml via hoverkraft-tech/compose-action linux.yml
macOS native ClickHouse build downloaded and run as a process (./clickhouse server) macos.yml
Windows the Linux container under WSL2 + podman-compose windows_msvc.yml

All three expose the server on localhost:9000, which is the E2E suite's default target, so the test invocation (bazel test //ut:e2e_tests) is identical across platforms. //ut:e2e_tests is also built explicitly on every platform first — the manual tag means bazel build //... skips it — giving compile/link coverage everywhere even though the server run is what exercises it.

Notes / scope

  • No runfiles needed. These tests are pure env-driven network clients — none read data files — so the runfiles plumbing I anticipated in Add Bazel unit test target and Windows (MSVC) support #510 turned out unnecessary. (If a future test needs fixtures, that's where runfiles would come in.)
  • ssl_ut.cpp is intentionally not included: it connects to an external secure endpoint (play.clickhouse.com:9440) with OS-specific CA paths, which the local docker server doesn't provide. Covering it would mean configuring a TLS-enabled server in CI; left for a separate change.

Test plan

  • bazel test //... runs only //ut:unit_tests (e2e correctly excluded by manual), green.
  • bazel build //ut:e2e_tests succeeds for tls=boringssl|openssl|no.
  • bazel test //ut:e2e_tests against a local docker ClickHouse 25.12: 665 passed, 50 skipped. With the BCR cityhash (before the fix) the compression round-trips failed with checksum-mismatch; after the fix they pass.
  • bazel run //ut:e2e_tests -- --gtest_filter=... passes gtest args through.

BYVoid added 2 commits June 8, 2026 14:17
The BCR @cityhash module packages Google's stock CityHash 1.0.2, whose
CityHash128 differs from the frozen variant ClickHouse uses to checksum
compressed blocks. With the BCR module a server rejects every LZ4/ZSTD
block with "Checksum doesn't match: corrupted data". Build the copy in
contrib/ instead, exactly as the CMake build does.
Adds //ut:e2e_tests covering the client round-trip tests that need a
live ClickHouse server. Tagged manual+external so `bazel test //...`
stays hermetic; CI starts a server via docker-compose and runs it on
Linux, mirroring the CMake build's workflow.
@BYVoid BYVoid force-pushed the bazel-e2e-tests branch from ecfb1a9 to e4682da Compare June 8, 2026 21:17
BYVoid added a commit to BYVoid/bazel-central-registry that referenced this pull request Jun 8, 2026
…-cpp#511)

Migrates the CityHash dependency change from upstream
ClickHouse/clickhouse-cpp#511: instead of the BCR @cityhash module, build
CityHash from the vendored contrib/cityhash/ copy.

ClickHouse's wire protocol checksums compressed blocks with a frozen
CityHash128 variant; the stock Google CityHash 1.0.2 that BCR packages
produces different hashes, so a server rejects compressed blocks with
'Checksum doesn't match: corrupted data'. The contrib/ copy is byte-identical
to the CMake build, guaranteeing wire compatibility.

  * overlay/BUILD.bazel: add a vendored ':cityhash' cc_library (with the
    MSVC '/DWIN32' copt from bazelbuild#511 so config.h's __builtin_expect gate
    behaves), and point the main library at ':cityhash'.
  * overlay/MODULE.bazel: drop the 'cityhash' bazel_dep.

Only the CityHash change is migrated; the e2e test target and ut/ changes
from bazelbuild#511 are not.
macOS starts a native ClickHouse process and Windows runs the Linux
container under WSL2 + podman, mirroring macos.yml and windows_msvc.yml,
so the e2e suite exercises a live server on every platform rather than
Linux alone.
@BYVoid BYVoid force-pushed the bazel-e2e-tests branch from 63c8a04 to ebc87d1 Compare June 8, 2026 21:55
@BYVoid BYVoid marked this pull request as ready for review June 8, 2026 23:17
@BYVoid BYVoid requested review from mzitnik and slabko as code owners June 8, 2026 23:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant