feat(triton): add embed and chatComplete support #1677

Open: goingforstudying-ctrl wants to merge 1 commit into Portkey-AI:main from goingforstudying-ctrl:feat/triton-embed-chatcomplete

Conversation

@goingforstudying-ctrl

Summary

This PR adds embeddings and chat completions support to the Triton inference server provider. Previously, both request types failed with an "is not supported by triton" error.

Changes

  • Added TritonEmbedConfig and TritonEmbedResponseTransform for /v1/embeddings
  • Added TritonChatCompleteConfig, TritonChatCompleteResponseTransform, and TritonChatCompleteStreamChunkTransform for /v1/chat/completions
  • Updated TritonAPIConfig to route chatComplete and embed to the correct OpenAI-compatible endpoints
  • Registered all new transforms in TritonConfig
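The routing and transform changes listed above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: the names `ENDPOINTS`, `getEndpoint`, and `embedResponseTransform` and the response shapes are hypothetical, and only the two endpoint paths and the error message wording come from the description above. It assumes Triton's OpenAI-compatible frontend returns OpenAI-shaped JSON, so the transform mostly passes the body through and tags it with the provider name.

```typescript
// Hypothetical sketch, not the PR's actual implementation.
type SupportedFn = 'chatComplete' | 'embed';

// Endpoint paths as named in the PR description.
const ENDPOINTS: Record<SupportedFn, string> = {
  chatComplete: '/v1/chat/completions',
  embed: '/v1/embeddings',
};

// Resolve a gateway function name to a Triton endpoint path.
// Unknown functions keep failing with the "not supported" error.
function getEndpoint(fn: string): string {
  if (Object.prototype.hasOwnProperty.call(ENDPOINTS, fn)) {
    return ENDPOINTS[fn as SupportedFn];
  }
  throw new Error(`${fn} is not supported by triton`);
}

// Assumed behavior: pass an OpenAI-shaped embeddings response through
// unchanged, adding a provider tag; wrap upstream errors in an error object.
function embedResponseTransform(
  response: Record<string, unknown>,
  status: number
): Record<string, unknown> {
  if (status !== 200 || 'error' in response) {
    return {
      error: response.error ?? { message: 'Unknown error from triton' },
      provider: 'triton',
    };
  }
  return { ...response, provider: 'triton' };
}
```

Under these assumptions, `getEndpoint('embed')` resolves to `/v1/embeddings`, while any function without a mapping still throws.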

Test Plan

  • Build passes successfully
  • App starts without errors
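Beyond the build and startup checks above, a manual smoke request against a running gateway could exercise the new embeddings route. The sketch below only builds the request and does not send it. Everything in it is an assumption rather than something stated in this PR: the helper name `buildEmbedSmokeRequest`, the `x-portkey-provider` and `x-portkey-custom-host` header names, and the URL and model values passed in by the caller.

```typescript
// Hypothetical smoke-test helper; header names and values are assumptions.
interface SmokeRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
  body: string;
}

function buildEmbedSmokeRequest(
  gatewayUrl: string, // base URL of the locally running gateway (assumed)
  tritonHost: string, // base URL of the Triton server (assumed)
  model: string // name of a model deployed on that Triton server
): SmokeRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${gatewayUrl.replace(/\/+$/, '')}/v1/embeddings`,
    method: 'POST',
    headers: {
      'content-type': 'application/json',
      'x-portkey-provider': 'triton', // assumed provider-selection header
      'x-portkey-custom-host': tritonHost, // assumed custom-host header
    },
    body: JSON.stringify({ model, input: 'hello world' }),
  };
}
```

The resulting object can be passed to any HTTP client. A response containing a `data` array of embeddings, rather than the "is not supported by triton" error, would suggest the route is wired up.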

Fixes #1189

- Add TritonEmbedConfig and TritonEmbedResponseTransform for /v1/embeddings
- Add TritonChatCompleteConfig, TritonChatCompleteResponseTransform,
  and TritonChatCompleteStreamChunkTransform for /v1/chat/completions
- Update TritonAPIConfig with endpoints for chatComplete and embed
- Register new transforms in TritonConfig index

Fixes Portkey-AI#1189

@goingforstudying-ctrl (Author)

Hi, just checking in on this PR. It has been a few days since the last update. Please let me know if there is anything else needed from my side to move this forward. Thanks for your time.

Development

Successfully merging this pull request may close these issues.

Bug Report: Self-Hosted Gateway's triton provider lacks support for embeddings and chat/completions
