enabled embeddings of raw turns, small other edits#22
Conversation
There was a problem hiding this comment.
Pull request overview
This PR introduces an opt-in switch to embed raw conversation turns on write (so they can be vector-searched), and updates both the SDK and Function App plumbing to support searching either the “memories” container or the “turns” container. It also updates Cosmos container provisioning so the turns container is always created with vector + full-text policies (primed for future turn-search) and standardizes vector index type to quantizedFlat.
Changes:
- Add
ENABLE_TURN_EMBEDDINGS/enable_turn_embeddingstoggle to control whetherturndocs get embeddings on write. - Add
targetparameter to search APIs to route vector search to either memories or turns containers. - Update Cosmos provisioning (SDK + IaC) and tests/docs to reflect always-on turns vector/full-text policies and
quantizedFlatvector indexes.
Reviewed changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/test_thresholds.py | Adds unit coverage for ENABLE_TURN_EMBEDDINGS env parsing/defaults. |
| tests/unit/test_cosmos_memory_client.py | Updates provisioning expectations: turns container now includes vector/full-text policies and quantizedFlat. |
| tests/unit/test_container_routing.py | Adds tests for resolving search(target=...) to the correct container. |
| tests/unit/store/test_memory_store.py | Adds coverage ensuring turn embeddings are skipped by default and enabled when configured; validates search routing to turns. |
| tests/unit/function_app/test_pipeline_factory.py | Ensures FA pipeline wires enable_turn_embeddings into MemoryStore. |
| tests/unit/aio/test_cosmos_memory_client.py | Async equivalent of turns container provisioning assertions. |
| tests/unit/aio/store/test_memory_store.py | Async equivalent coverage for turn embedding toggle and search routing. |
| infra/modules/functions.bicep | Adds enableTurnEmbeddings parameter and app setting ENABLE_TURN_EMBEDDINGS. |
| infra/modules/cosmos.bicep | Switches vector index type to quantizedFlat; primes turns container with vector/full-text policies and updates excluded path pattern for turns embedding. |
| infra/main.parameters.json | Adds parameter wiring for enableTurnEmbeddings. |
| infra/main.bicep | Adds top-level enableTurnEmbeddings parameter and wires into functions module; updates embedding-dimensions description for quantizedFlat. |
| function_app/shared/pipeline_factory.py | Passes enable_turn_embeddings from config into MemoryStore. |
| function_app/shared/config.py | Adds get_enable_turn_embeddings() to read ENABLE_TURN_EMBEDDINGS. |
| function_app/local.settings.json.template | Documents and adds local ENABLE_TURN_EMBEDDINGS setting. |
| Docs/public_api.md | Documents new constructor arg enable_turn_embeddings and search_cosmos(..., target=...). |
| Docs/concepts.md | Documents turn-embedding default/flag and how to search turns. |
| azure/cosmos/agent_memory/thresholds.py | Adds defaults + env parsing for enabling turn embeddings. |
| azure/cosmos/agent_memory/store/memory_store.py | Implements enable_turn_embeddings behavior + search(..., target=...) routing. |
| azure/cosmos/agent_memory/processors/inprocess.py | Wires SDK in-process processor store creation to the env-based toggle. |
| azure/cosmos/agent_memory/cosmos_memory_client.py | Adds enable_turn_embeddings constructor arg; primes turns container with vector/full-text policies; adds search_cosmos(..., target=...). |
| azure/cosmos/agent_memory/aio/store/memory_store.py | Async equivalent: turn embedding toggle + search(..., target=...). |
| azure/cosmos/agent_memory/aio/processors/inprocess.py | Async equivalent wiring for env-based toggle. |
| azure/cosmos/agent_memory/aio/cosmos_memory_client.py | Async equivalent constructor/provisioning/search changes. |
| azure/cosmos/agent_memory/_utils.py | Updates container policy builder: quantizedFlat, salience composite toggle, and excluded embedding path pattern. |
| azure/cosmos/agent_memory/_container_routing.py | Adds resolve_search_target() mapping public targets to containers. |
| azure/cosmos/agent_memory/_base/base_client.py | Adds base config wiring so enable_turn_embeddings comes from arg or env default. |
| .env.template | Documents and adds ENABLE_TURN_EMBEDDINGS default. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| indexing_policy = { | ||
| "includedPaths": [{"path": "/*"}], | ||
| "excludedPaths": [ | ||
| {"path": "/embedding/*"}, | ||
| {"path": "/source_memory_ids/*"}, | ||
| {"path": "/supersedes_ids/*"}, | ||
| ], |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (2)
infra/modules/cosmos.bicep:111
- The containers’ indexingPolicy.excludedPaths no longer excludes the /embedding field. Embedding vectors are large arrays; indexing them in the standard index can significantly increase RU/index size without improving vector search (vectorIndexes/vectorEmbeddingPolicy cover vector search).
Recommend re-adding an excludedPaths entry for /embedding to keep indexing costs predictable.
excludedPaths: [
{
path: '/source_memory_ids/*'
}
{
infra/modules/cosmos.bicep:196
- Same as above for the turns container: indexing the /embedding array in the standard index can inflate RU/index/storage costs. Excluding /embedding from scalar indexing still allows vector search via vectorIndexes/vectorEmbeddingPolicy.
excludedPaths: [
{
path: '/source_memory_ids/*'
}
{
aayush3011
left a comment
There was a problem hiding this comment.
Overall the PR looks, some changes to be made before it can be merged.
| if not include_superseded: | ||
| qb.add_is_null_or_undefined("c.superseded_by") | ||
|
|
||
| query = f"SELECT * FROM c{qb.build_where()} ORDER BY c.created_at DESC" |
There was a problem hiding this comment.
We need to project explicit columns here excluding embedding. With enable_turn_embedding=True, get_thread() will return the vectors as well.
| if not include_superseded: | ||
| qb.add_is_null_or_undefined("c.superseded_by") | ||
|
|
||
| query = f"SELECT * FROM c{qb.build_where()} ORDER BY c.created_at DESC" |
There was a problem hiding this comment.
We need to project explicit columns here excluding embedding. With enable_turn_embedding=True, get_thread() will return the vectors as well.
| store = MemoryStore( | ||
| containers=containers, | ||
| embeddings_client=embeddings_client, | ||
| enable_turn_embeddings=get_enable_turn_embeddings(), |
There was a problem hiding this comment.
Turn embeddings only ever get created in one spot: when the client saves a turn (through store.add() / store.push() ). That's the only code that looks at this switch.
Setting the enable_turn_embeddings doesn't do anything here, the value is picked from the client.
we should remove this.
| store = AsyncMemoryStore( | ||
| containers=containers, | ||
| embeddings_client=embeddings_client, | ||
| enable_turn_embeddings=get_enable_turn_embeddings(), |
There was a problem hiding this comment.
Turn embeddings only ever get created in one spot: when the client saves a turn (through store.add() / store.push() ). That's the only code that looks at this switch.
Setting the enable_turn_embeddings doesn't do anything here, the value is picked from the client.
we should remove this.
| store = MemoryStore( | ||
| containers=containers, | ||
| embeddings_client=embeddings, | ||
| enable_turn_embeddings=config.get_enable_turn_embeddings(), |
There was a problem hiding this comment.
Same as others
Turn embeddings only ever get created in one spot: when the client saves a turn (through store.add() / store.push() ). That's the only code that looks at this switch.
Setting the enable_turn_embeddings doesn't do anything here, the value is picked from the client.
we should remove this.
| ) | ||
|
|
||
|
|
||
| def get_enable_turn_embeddings() -> bool: |
There was a problem hiding this comment.
Same as others
Turn embeddings only ever get created in one spot: when the client saves a turn (through store.add() / store.push() ). That's the only code that looks at this switch.
Setting the enable_turn_embeddings doesn't do anything here, the value is picked from the client.
we should remove this.
| name: 'MEMORY_PROCESSOR_OWNER' | ||
| value: memoryProcessorOwner | ||
| } | ||
| { |
There was a problem hiding this comment.
Same as others
Turn embeddings only ever get created in one spot: when the client saves a turn (through store.add() / store.push() ). That's the only code that looks at this switch.
Setting the enable_turn_embeddings doesn't do anything here, the value is picked from the client.
we should remove this.
| ]) | ||
| param memoryProcessorOwner string | ||
|
|
||
| @description('Embed raw conversation turns on write so they can be vector-searched. The turns container is always provisioned with a vector index; this only toggles embedding generation. Default false.') |
There was a problem hiding this comment.
Same as others
Turn embeddings only ever get created in one spot: when the client saves a turn (through store.add() / store.push() ). That's the only code that looks at this switch.
Setting the enable_turn_embeddings doesn't do anything here, the value is picked from the client.
we should remove this.
No description provided.