Add Gemma 4 architecture support to TransformerBridge #1377
Conversation
Adds a text-only adapter covering both Gemma4ForConditionalGeneration (E2B/E4B/31B/26B-A4B) and Gemma4UnifiedForConditionalGeneration (12B), addressing TransformerLensOrg#1297.
Gemma 4 layers are heterogeneous: KV-shared layers drop k/v projections, K==V layers drop v_proj, and per-layer-embedding / MoE submodules appear only on some variants. All of these are mapped as optional and delegated to HF. Unlike Gemma 1-3, Gemma4RMSNorm has no (1+weight) offset.
Adds DelegatedAttentionBlockBridge (drops the split-QKV fork aliases, as MLABlockBridge does) so hook-alias resolution stays clean when attention is delegated wholesale to HF.
google/gemma-4-E2B-it passes verification (P1 100%, P2 100%, P4 94.7%).
- New adapter + four-place registration + gemma4/gemma4_unified model_type mappings
- 10 checkpoints added to the model registry
- Unit + integration tests (logit parity vs HF on all three structural variants)
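The RMSNorm difference noted above is easy to get wrong when porting weights, so here is a minimal sketch of the two conventions in plain Python. The function name `rms_norm`, the `unit_offset` flag, and the `eps` value are hypothetical and chosen for illustration only; this is not the code from the PR or from HF, and the Gemma 4 behaviour is taken from the description above rather than checked against the HF source.

```python
import math


def rms_norm(x, weight, eps=1e-6, unit_offset=False):
    """Normalise x by its root-mean-square, then apply a learned gain.

    unit_offset=True  -> gain is (1 + weight), the Gemma 1-3 convention.
    unit_offset=False -> gain is weight, as the PR describes for Gemma 4.
    eps is an assumed value, not taken from any checkpoint config.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    gains = [(1.0 + w) if unit_offset else w for w in weight]
    return [v * inv_rms * g for v, g in zip(x, gains)]


x = [3.0, 4.0]

# A stored weight of 0 under the offset convention is the identity gain...
offset_out = rms_norm(x, [0.0, 0.0], unit_offset=True)
# ...which matches a stored weight of 1 under the plain convention.
plain_out = rms_norm(x, [1.0, 1.0], unit_offset=False)

# Reading offset-style weights with the plain convention silently zeroes the output.
mismatched_out = rms_norm(x, [0.0, 0.0], unit_offset=False)
```

The two conventions agree only when the stored weights differ by exactly 1, so an adapter that reuses the Gemma 1-3 norm handling on a checkpoint without the offset would scale every normalised activation incorrectly. Delegating the norm to HF, as this PR does, avoids having to reimplement either convention.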
|
@punishell We do have a different contributor actively working on this already. Once his implementation is ready I'll review both and determine which is correct for TransformerLens. We will also want full multimodal support, not just text only (See Gemma3ForConditional's architecture adapter for details on how that works) |
|
I needed it for my current project so made it and pushed :)
|
@punishell Oh that's wonderful! Glad to hear that it's working for you, and thank you for using TransformerLens! |
|
@punishell after reviewing both this solution and @huseyincavusbi's draft for his initial setup, I am hoping that Hüseyin can build on what you've put together here. I especially like your solution for delegating the attention bridge, very clever. Thanks again for putting this up! |
|
@punishell quick update! I will be closing this PR. We rebased #1385 onto your code, and you have been credited as a co-author on that PR's commit (8e76367). This was done to ensure Gemma4 is merged with multimodal support. Thank you for putting this together, it was invaluable in getting Gemma4 integrated into the project. |
|
Thank you
|
Description
Adds TransformerBridge support for Google's Gemma 4 family (released April 2026), which had no support in TransformerLens.
Fixes #1297
A single text-only adapter covers both architectures:
- Gemma4ForConditionalGeneration: E2B / E4B / 31B / 26B-A4B
- Gemma4UnifiedForConditionalGeneration: the encoder-free 12B (needs transformers >= 5.10)
Gemma 4 layers are heterogeneous, so the adapter delegates all math to HF and maps variant-specific submodules optional: KV-shared layers drop k/v projections, K==V layers drop v_proj, and Per-Layer-Embedding / MoE submodules appear only on some variants. Unlike Gemma 1-3, Gemma4RMSNorm has no (1 + weight) offset.
Adds DelegatedAttentionBlockBridge (drops the split-QKV fork aliases, mirroring MLABlockBridge) so hook-alias resolution stays clean when attention is delegated wholesale to HF.
google/gemma-4-E2B-it passes verify_models (P1 100%, P2 100%, P4 94.7%).
Type of change
Checklist: