Fix $formatInteger word/Roman/letter formatting for values beyond the… by rayokota · Pull Request #6 · rayokota/jsonata-cpp

rayokota · 2026-06-15T02:48:51Z

… int64 range

Summary

$formatInteger(1e46, 'w') (and other magnitudes above INT64_MAX) produced garbage or crashed instead of returning
"ten billion trillion trillion trillion".

The builtin coerced its argument with Utils::toLong, narrowing the double to int64_t. For values beyond the int64 range this is undefined behavior: on x86-64 it yields INT64_MIN (negative), on ARM64 it saturates to INT64_MAX. Either way the value then indexed out of bounds in the word tables (few[], decades[], magnitudes[]) — a silent junk read on release libc++/libstdc++, and a hard assertion abort on MSVC's checked iterators.

This case was previously hidden behind an ignoreError: true test override; it never actually computed the correct result on any platform.

Fix

Carry the value as a double all the way through the word/Roman/letter formatting path, matching the JSONata JS reference (which is the source of the test suite's expected value):

formatInteger builtin now reads the argument via Utils::toDouble instead of Utils::toLong.
formatInteger, numberToWords, and lookup signatures changed from int64_t to double.
lookup rewritten with double arithmetic (std::floor / std::fmod / std::pow); formatInteger floors the value up front (matching JS Math.floor) and renders the DECIMAL case via std::ostringstream with std::fixed/setprecision(0) to avoid scientific notation.
Removed the now-obsolete function-formatInteger/formatInteger.json_43 override so the case is verified for real.

Note on floating-point contraction

After the rewrite, the result was correct except for a spurious trailing remainder. The cause was FP contraction: at -O2, clang fuses num - mant * factor into a single full-precision FMA, whereas JS rounds mant * factor to a double before subtracting. Computing the product in its own statement (double product = mant * factor;) forces the intermediate rounding and reproduces the JS result exactly.

Compatibility note

The Java reference (jsonata-java) uses long throughout and cannot represent these magnitudes, so this is an intentional divergence from the Java port toward the JS reference for out-of-int64 values.

Testing

All 1655 generated tests pass, including the previously-overridden formatInteger case for 1e46.

… int64 range ## Summary `$formatInteger(1e46, 'w')` (and other magnitudes above `INT64_MAX`) produced garbage or crashed instead of returning `"ten billion trillion trillion trillion"`. The builtin coerced its argument with `Utils::toLong`, narrowing the double to `int64_t`. For values beyond the int64 range this is **undefined behavior**: on x86-64 it yields `INT64_MIN` (negative), on ARM64 it saturates to `INT64_MAX`. Either way the value then indexed out of bounds in the word tables (`few[]`, `decades[]`, `magnitudes[]`) — a silent junk read on release libc++/libstdc++, and a hard assertion abort on MSVC's checked iterators. This case was previously hidden behind an `ignoreError: true` test override; it never actually computed the correct result on any platform. ## Fix Carry the value as a `double` all the way through the word/Roman/letter formatting path, matching the JSONata **JS reference** (which is the source of the test suite's expected value): - `formatInteger` builtin now reads the argument via `Utils::toDouble` instead of `Utils::toLong`. - `formatInteger`, `numberToWords`, and `lookup` signatures changed from `int64_t` to `double`. - `lookup` rewritten with double arithmetic (`std::floor` / `std::fmod` / `std::pow`); `formatInteger` floors the value up front (matching JS `Math.floor`) and renders the DECIMAL case via `std::ostringstream` with `std::fixed`/`setprecision(0)` to avoid scientific notation. - Removed the now-obsolete `function-formatInteger/formatInteger.json_43` override so the case is verified for real. ### Note on floating-point contraction After the rewrite, the result was correct except for a spurious trailing remainder. The cause was FP contraction: at `-O2`, clang fuses `num - mant * factor` into a single full-precision FMA, whereas JS rounds `mant * factor` to a double before subtracting. Computing the product in its own statement (`double product = mant * factor;`) forces the intermediate rounding and reproduces the JS result exactly. ## Compatibility note The Java reference (`jsonata-java`) uses `long` throughout and cannot represent these magnitudes, so this is an intentional divergence from the Java port toward the JS reference for out-of-int64 values. ## Testing - All 1655 generated tests pass, including the previously-overridden `formatInteger` case for `1e46`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

rayokota merged commit db0bd7d into master Jun 15, 2026
4 checks passed

rayokota mentioned this pull request Jun 15, 2026

porting jsonata-cpp to use QJson instead of nlohmann json library #3

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix $formatInteger word/Roman/letter formatting for values beyond the…#6

Fix $formatInteger word/Roman/letter formatting for values beyond the…#6
rayokota merged 1 commit into
masterfrom
fix-format-int

rayokota commented Jun 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

rayokota commented Jun 15, 2026

Summary

Fix

Note on floating-point contraction

Compatibility note

Testing

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant