loc translations, integrity test, token approx #147

Merged
raghavm243512 merged 5 commits into main from pr/loc_translations on Jun 12, 2026

@raghavm243512 raghavm243512 commented Jun 11, 2026

Collaborator

Force user goals to see translated location names so that location names are not spoken in English.
Database integrity test: constructs the expected DB from the initial DB and the expected trace for every language. Names, phone numbers, and location names are all translated per language during the replay.
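
For illustration, the replay idea described above could be sketched as follows. This is only a sketch under assumptions: the `tool_call` event shape (`tool_name`, `params`), the write tool `update_ticket`, the `tickets` table, and the translation table are all hypothetical and not taken from the repository. Only `event_type` and `utterance` appear in the dataset excerpt in this thread.

```python
import copy

# Hypothetical per-language translation table (illustrative values only).
TRANSLATIONS = {
    "fr": {"London": "Londres"},
}

# Hypothetical set of tools that mutate the database.
WRITE_TOOLS = {"update_ticket"}


def translate(value, lang):
    """Translate a string value for `lang`, leaving unknown values unchanged."""
    if isinstance(value, str):
        return TRANSLATIONS.get(lang, {}).get(value, value)
    return value


def build_expected_db(initial_db, trace, lang):
    """Replay the write tool calls in `trace` on a copy of `initial_db`."""
    db = copy.deepcopy(initial_db)
    for event in trace:
        if event.get("event_type") != "tool_call":
            continue  # utterances and other events do not change the DB
        if event.get("tool_name") not in WRITE_TOOLS:
            continue  # read-only tools are ignored
        # Translate every parameter value before applying the write.
        params = {k: translate(v, lang) for k, v in event["params"].items()}
        ticket_id = params.pop("ticket_id")
        db["tickets"].setdefault(ticket_id, {}).update(params)
    return db


initial_db = {"tickets": {"T1": {"location": "Paris"}}}
trace = [
    {"event_type": "user_utterance", "utterance": "Yes, please submit it."},
    {
        "event_type": "tool_call",
        "tool_name": "update_ticket",
        "params": {"ticket_id": "T1", "location": "London"},
    },
]
expected_fr = build_expected_db(initial_db, trace, "fr")
```

A test in this style would then compare `expected_fr` against the final DB recorded for the French scenario, once per language.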

Also threw in a small fallback for the LLM and alm_vllm reasoning token count, and added compatibility with the latest vllm versions.
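
One possible shape for such a fallback is sketched below. The actual implementation is not shown in this thread, so everything here is an assumption: the `usage` dictionary, its `reasoning_tokens` key, and the rough four-characters-per-token ratio are illustrative only.

```python
def approx_token_count(text, chars_per_token=4.0):
    """Rough token estimate from character length (a heuristic, not a tokenizer)."""
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))


def reasoning_token_count(usage, reasoning_text):
    """Prefer the reported count; approximate only when it is missing."""
    reported = (usage or {}).get("reasoning_tokens")  # hypothetical key
    if reported is not None:
        return reported
    return approx_token_count(reasoning_text)
```

Preferring the reported value keeps the approximation out of the way whenever the backend does provide a count.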

Comment thread data/itsm_dataset.json
@@ -3437,6 +3437,42 @@
"event_type": "user_utterance",
"utterance": "Yes, please submit it."
},
{

Collaborator

It might not matter a lot, but these tool calls are not in the right place in the expected trace, right?

Collaborator Author

Yeah, those were hand-filled; the trace is really only used to extract and apply the write tool calls.

Collaborator

I'm just nervous people might use them for other purposes, so ideally, the tool calls would still be correctly placed.

Comment thread src/eva/assistant/pipeline/alm_vllm.py Outdated
Comment thread src/eva/assistant/services/llm.py Outdated
Comment thread src/eva/assistant/services/llm.py
Comment thread src/eva/utils/culture.py Outdated
Comment thread src/eva/utils/culture.py Outdated
Comment thread src/eva/utils/culture.py Outdated
resolved["starting_utterance"] = _replace_in(
utt, first, last, first_rom, last_rom, phone, comp_first, comp_first_rom
)
resolved_utt = _replace_in(utt, first, last, first_rom, last_rom, phone, comp_first, comp_first_rom)

Collaborator

This is not new, but it would be better to add * to the parameters of _replace_in and to pass keyword arguments here, since it would be easy to mess up the argument order.

Comment thread src/eva/utils/culture.py Outdated
Comment thread tests/unit/data/test_db_integrity.py
Comment thread tests/unit/utils/test_multilingual_integrity.py Outdated
Comment thread tests/unit/data/test_db_integrity.py
Comment thread tests/unit/data/test_db_integrity.py

@gabegma gabegma left a comment

Collaborator

Excellent work, Raghav - thanks for taking the time to make this solution more robust and fail-proof with a test!

@raghavm243512 raghavm243512 enabled auto-merge June 12, 2026 21:03
@raghavm243512 raghavm243512 added this pull request to the merge queue Jun 12, 2026
Merged via the queue into main with commit 2a58980 Jun 12, 2026
1 check passed
@raghavm243512 raghavm243512 deleted the pr/loc_translations branch June 12, 2026 21:07
2 participants