loc translations, integrity test, token approx by raghavm243512 · Pull Request #147 · ServiceNow/eva

raghavm243512 · 2026-06-11T23:07:56Z

Force user goals to use translated location names so that they are not spoken in English.
Database integrity test: constructs the expected DB from the initial DB and the expected trace for every language. Names, phone numbers, and location names are all translated per language during the replay.
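
A rough sketch of the replay idea, for readers who have not opened the test. It is not the code in this PR: the `tool_call` event shape, the `WRITE_TOOLS` registry, the `update_phone` tool, and the flat `mapping` of source strings to translated strings are all hypothetical stand-ins. Only write tools are applied, and string values are translated before each write.

```python
# Hypothetical write tool: mutates the DB in place.
def update_phone(db, params):
    db["users"][params["user_id"]]["phone"] = params["phone"]


# Hypothetical registry; tools missing from it are treated as read-only.
WRITE_TOOLS = {"update_phone": update_phone}


def translate(value, mapping):
    """Recursively replace string values that appear in mapping.

    Builds new dicts and lists, so the input is never mutated.
    Dict keys are left untouched.
    """
    if isinstance(value, str):
        return mapping.get(value, value)
    if isinstance(value, dict):
        return {k: translate(v, mapping) for k, v in value.items()}
    if isinstance(value, list):
        return [translate(v, mapping) for v in value]
    return value


def build_expected_db(initial_db, trace, mapping):
    """Replay the write tools of an expected trace for one language."""
    db = translate(initial_db, mapping)
    for event in trace:
        if event.get("event_type") != "tool_call":
            continue  # utterances and other events do not change the DB
        tool = WRITE_TOOLS.get(event["tool_name"])
        if tool is None:
            continue  # read-only tool
        tool(db, translate(event["params"], mapping))
    return db
```

A test along these lines would then compare `build_expected_db(...)` for each language against the stored expected DB for that language.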

Also added a small fallback for the LLM and alm_vllm reasoning token counts, and compatibility with the latest vllm versions.
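
One possible shape for such a fallback, purely as an illustration: the `usage` dict, the `reasoning_tokens` key, and the 4-characters-per-token ratio are assumptions, not what this PR implements. The reported count wins when it is present; otherwise the count is approximated from the length of the reasoning text.

```python
import math


def reasoning_token_count(usage, reasoning_text, chars_per_token=4.0):
    """Return the reported reasoning token count, or a length-based estimate.

    `usage` is a hypothetical dict that may carry a "reasoning_tokens" entry.
    `chars_per_token` is a rough heuristic, not a tokenizer-accurate value.
    """
    reported = (usage or {}).get("reasoning_tokens")
    if reported is not None:
        return int(reported)
    if not reasoning_text:
        return 0
    # Fallback: approximate the token count from the character count.
    return math.ceil(len(reasoning_text) / chars_per_token)
```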


gabegma · 2026-06-12T19:42:35Z

@@ -3437,6 +3437,42 @@
            "event_type": "user_utterance",
            "utterance": "Yes, please submit it."
          },
+          {


It might not matter a lot, but these tool calls are not in the right place in the expected trace, right?

Yeah, those were hand-filled; the trace is really only used to extract and apply write tools.

I'm just nervous people might use them for other purposes, so ideally, the tool calls would still be correctly placed.

gabegma · 2026-06-12T19:59:24Z

-    resolved["starting_utterance"] = _replace_in(
-        utt, first, last, first_rom, last_rom, phone, comp_first, comp_first_rom
-    )
+    resolved_utt = _replace_in(utt, first, last, first_rom, last_rom, phone, comp_first, comp_first_rom)


This is not new, but it would be better to add `*` to the parameters of `_replace_in` and to pass keyword arguments here, since it would be easy to mess up the argument order.

gabegma

Excellent work, Raghav. Thanks for taking the time to make this solution more robust and fail-proof with a test!

raghavm243512 added 4 commits June 11, 2026 15:37

loc translations, integrity test, token approx

afcab21

use loc tags in trace

092c462

fallback for llm

23003b2

Merge branch 'main' of github.com:ServiceNow/eva into pr/loc_translat…

e3831cc

…ions