Audio: MFCC: Use the MFCC module as compress PCM encoder with discontinuous stream by singalsu · Pull Request #10814 · thesofproject/sof

singalsu · 2026-05-26T15:46:04Z

This PR adds commits on top of the previous VAD addition PR #10782:

audio: mfcc: switch to source/sink API, int32 output, and DTX
base_fw: advertise BESPOKE codec for MFCC compress capture
audio: mfcc: update decode tools and add Python compress scripts
tools: topology: add MFCC compress capture for jack and DMIC

A kernel PR that fixes ALSA control handling for the encoder widget type is needed to run this.

singalsu · 2026-05-26T16:33:10Z

Note: Running the MFCC compress topologies requires kernel patches thesofproject/linux#5647 and thesofproject/linux#5789.

Copilot

Pull request overview

This PR extends the SOF MFCC component and related tooling/topology to support VAD + DTX behavior and to use MFCC as a compress PCM “encoder” that can emit discontinuous (DTX-suppressed) feature frames, including optional IPC4 control notifications for VAD state.

Changes:

Add MFCC VAD/DTX support in firmware (new VAD implementation, frame header with VAD/energy fields, optional IPC4 notifications, and compress-output mode).
Add/adjust topology2 definitions to expose MFCC feature capture for both normal PCM and compress PCM on SDW jack/DMIC, including new build targets.
Update MFCC tuning/export and host-side decode/visualization/transcription tools (Matlab/Octave + Python scripts), plus new documentation.

Reviewed changes

Copilot reviewed 40 out of 40 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
tools/topology/topology2/platform/intel/sdw-jack-audio-feature.conf	Adds MFCC frame sizing define and VAD mixer control naming for jack feature capture.
tools/topology/topology2/platform/intel/sdw-jack-audio-feature-compress.conf	New compress PCM MFCC feature-capture topology for jack (MFCC encoder type, blob selection, VAD control).
tools/topology/topology2/platform/intel/sdw-dmic-audio-feature.conf	Adds MFCC frame sizing define and VAD mixer control naming for DMIC feature capture.
tools/topology/topology2/platform/intel/sdw-dmic-audio-feature-compress.conf	New compress PCM MFCC feature-capture topology for DMIC (MFCC encoder type, blob selection, VAD control).
tools/topology/topology2/platform/intel/dmic1-mfcc.conf	Renames MFCC bytes control and adds VAD mixer control naming.
tools/topology/topology2/include/pipelines/cavs/host-gateway-src-mfcc-capture.conf	Adds MFCC_FRAME_BYTES-driven ibs/obs to support variable-sized (compress) MFCC frames.
tools/topology/topology2/include/components/mfcc/mel80.conf	Updates exported MFCC configuration blob.
tools/topology/topology2/include/components/mfcc/mel80_compress.conf	New exported MFCC configuration blob for compress output.
tools/topology/topology2/include/components/mfcc/mel80_compress_dtx.conf	New exported MFCC configuration blob for compress output + DTX.
tools/topology/topology2/include/components/mfcc/default.conf	Updates exported default MFCC configuration blob.
tools/topology/topology2/include/components/mfcc/ceps13_compress_dtx.conf	New exported MFCC configuration blob for cepstral output + compress + DTX.
tools/topology/topology2/include/components/mfcc.conf	Adds mixer control template to MFCC widget and allows type override (e.g., encoder).
tools/topology/topology2/include/common/common_definitions.conf	Adds default feature flags for SDW jack/DMIC compress MFCC capture.
tools/topology/topology2/include/bench/mfcc_controls_playback.conf	Enables an MFCC mixer switch control in bench playback controls.
tools/topology/topology2/include/bench/mfcc_controls_capture.conf	Enables an MFCC mixer switch control in bench capture controls.
tools/topology/topology2/development/tplg-targets.cmake	Renames MFCC topology targets and adds compress MFCC mel/ceps variants with frame sizing + blob selection.
tools/topology/topology2/cavs-sdw.conf	Adds feature-gated includes for new compress MFCC capture topologies.
src/include/user/mfcc.h	Extends MFCC config ABI with VAD/DTX/compress flags and timing parameters.
src/include/sof/audio/mfcc/mfcc_vad.h	New VAD API/state definitions for MFCC.
src/include/sof/audio/mfcc/mfcc_comp.h	Refactors MFCC component interfaces (source/sink API, frame header, VAD/DTX state, IPC4 helpers).
src/audio/mfcc/tune/sof_mel_to_text_live_dsp_vad.py	New live Whisper transcription script using DSP VAD embedded in PCM stream.
src/audio/mfcc/tune/sof_mel_to_text_live_compress.py	New live Whisper transcription script for compress PCM + DTX/discontinuous frames.
src/audio/mfcc/tune/sof_mel_spectrogram_compress.py	New live mel spectrogram viewer for compress PCM MFCC frames.
src/audio/mfcc/tune/sof_ceps_spectrogram_compress.py	New live cepstral viewer for compress PCM MFCC frames.
src/audio/mfcc/tune/setup_mfcc.m	Updates blob export for new config layout; adds compress + DTX blob exports.
src/audio/mfcc/tune/README.txt	Removed in favor of README.md.
src/audio/mfcc/tune/README.md	New markdown documentation for tuning, decoding, and live scripts.
src/audio/mfcc/tune/decode_mel.m	Updates decoder for new int32 + header format and DTX gap filling.
src/audio/mfcc/tune/decode_ceps.m	Updates decoder for new int32 + header format and DTX gap filling.
src/audio/mfcc/tune/decode_all.m	Updates batch decode to new decoder signatures and int32 outputs.
src/audio/mfcc/mfcc.c	Moves MFCC to source/sink API processing, hooks VAD notifications and compress/DTX behavior.
src/audio/mfcc/mfcc_vad.c	New VAD implementation (noise floor tracking + weighted energy + hangover).
src/audio/mfcc/mfcc_setup.c	Adds VAD init, DTX/compress state init, buffer free fixes, sample-rate limit check.
src/audio/mfcc/mfcc_ipc4.c	New IPC4 control notification plumbing for VAD state reporting.
src/audio/mfcc/mfcc_hifi4.c	Removes old stream-buffer source copy implementations (now in common source/sink code).
src/audio/mfcc/mfcc_hifi3.c	Removes old stream-buffer source copy implementations (now in common source/sink code).
src/audio/mfcc/mfcc_generic.c	Removes old stream-buffer source copy implementations (now in common source/sink code).
src/audio/mfcc/mfcc_common.c	Adds source/sink copy funcs, header/VAD handling, legacy vs compress output paths, and DTX suppression logic.
src/audio/mfcc/CMakeLists.txt	Registers new mfcc_vad.c and conditionally mfcc_ipc4.c in build.
src/audio/base_fw.c	Advertises BESPOKE codec capability for MFCC compress capture.
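
For orientation, the combination the summary attributes to mfcc_vad.c (noise floor tracking, weighted energy, hangover) can be sketched generically as below. This is not the firmware code: the class name, the band weights, the dB margin, and the floor update rule are all assumptions chosen for illustration.

```python
import math


def weighted_energy_db(band_energies, weights):
    """Weighted sum of per-band energies, expressed in dB (weights are hypothetical)."""
    energy = sum(w * e for w, e in zip(weights, band_energies))
    return 10.0 * math.log10(max(energy, 1e-12))


class SimpleVad:
    """Generic energy VAD sketch; all parameters are illustrative assumptions."""

    def __init__(self, margin_db=6.0, hangover_frames=3, rise_db=0.1):
        self.margin_db = margin_db              # speech must exceed floor by this much
        self.hangover_frames = hangover_frames  # frames held active after speech ends
        self.rise_db = rise_db                  # max upward floor drift per frame
        self.noise_floor_db = None
        self.hang = 0

    def update(self, energy_db):
        """Feed one frame energy in dB, return True while voice is considered active."""
        if self.noise_floor_db is None:
            self.noise_floor_db = energy_db
        elif energy_db < self.noise_floor_db:
            self.noise_floor_db = energy_db     # follow drops immediately
        else:
            # creep upward slowly so speech bursts do not raise the floor
            self.noise_floor_db = min(energy_db, self.noise_floor_db + self.rise_db)

        if energy_db > self.noise_floor_db + self.margin_db:
            self.hang = self.hangover_frames
            return True
        if self.hang > 0:
            self.hang -= 1                      # hangover keeps the decision active
            return True
        return False
```

The hangover counter is what keeps short pauses inside an utterance from toggling the VAD state on every frame.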

Copilot

Pull request overview

Copilot reviewed 40 out of 40 changed files in this pull request and generated 4 comments.

Copilot

Pull request overview

Copilot reviewed 40 out of 40 changed files in this pull request and generated 2 comments.

Copilot

Pull request overview

Copilot reviewed 40 out of 40 changed files in this pull request and generated 7 comments.

Copilot

Copilot encountered an error and was unable to review this pull request.

Copilot

Pull request overview

Copilot reviewed 32 out of 32 changed files in this pull request and generated 1 comment.

Copilot

Pull request overview

Copilot reviewed 32 out of 32 changed files in this pull request and generated no new comments.

kv2019i

Code changes look good; some notes on the newly added Python apps.

lyakh

Nothing critical; can be addressed later at the next convenience.

kv2019i

My comments addressed.

Copilot

Pull request overview

Copilot reviewed 32 out of 32 changed files in this pull request and generated 5 comments.

Switch from process_audio_stream to source/sink API. Add compress PCM output mode (variable-size frames, no zero padding) alongside legacy mode (full period with zero-fill). Unify all output to int32 Q9.23 regardless of source format. Remove out_data_ptr_32, mel_spectra int16 copy, mfcc_func typedef, and per-format output functions from mfcc_common/hifi3/hifi4.

Add DTX for compress mode: suppress silence frames after configurable trailing count, with optional periodic keepalive.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
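
The DTX rule described in this commit message (keep sending for a trailing count after speech ends, then suppress, with an optional periodic keepalive) can be sketched as a small state machine. This is a minimal illustration only: `DtxState`, its field names, and the exact counting convention are assumptions, not the firmware's structures.

```python
from dataclasses import dataclass


@dataclass
class DtxState:
    """Hypothetical DTX state; names are illustrative, not taken from the firmware."""
    trailing_frames: int     # silence frames still sent after speech ends
    keepalive_interval: int  # send one frame per N suppressed frames, 0 disables
    silence_run: int = 0     # consecutive silence frames seen so far
    suppressed_run: int = 0  # frames suppressed since the last frame sent


def dtx_should_send(state: DtxState, vad_active: bool) -> bool:
    """Return True if the current feature frame should be written to the stream."""
    if vad_active:
        state.silence_run = 0
        state.suppressed_run = 0
        return True
    state.silence_run += 1
    if state.silence_run <= state.trailing_frames:
        return True                      # still inside the trailing window
    state.suppressed_run += 1
    if state.keepalive_interval and state.suppressed_run >= state.keepalive_interval:
        state.suppressed_run = 0
        return True                      # periodic keepalive frame
    return False                         # frame suppressed, stream is discontinuous


def simulate(vad_flags, trailing_frames=2, keepalive_interval=3):
    """Run the decision over a sequence of per-frame VAD flags."""
    state = DtxState(trailing_frames, keepalive_interval)
    return [dtx_should_send(state, bool(v)) for v in vad_flags]
```

With two trailing frames and a keepalive every third suppressed frame, two speech frames followed by long silence yield four sent frames, then one frame out of every three.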

Register SND_AUDIOCODEC_BESPOKE capture in codec info TLV when CONFIG_COMP_MFCC is enabled so the kernel detects compress capture support via IPC4_SOF_CODEC_INFO.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>

Update Octave decode scripts for int32 Q9.23 output and DTX gap filling. Add DTX blob generation to setup_mfcc.m.

Add Python compress capture tools: sof_mel_spectrogram_compress.py, sof_ceps_spectrogram_compress.py, sof_mel_to_text_live_compress.py. Refactor sof_mel_to_text_live_dsp_vad.py to use shared compress capture code. Add README with usage examples.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
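
As a rough illustration of what decoding Q9.23 output and filling DTX gaps involves: a Q9.23 value has 23 fractional bits, so the float value is the raw int32 divided by 2^23. The gap-filling part below assumes each received frame comes with a frame index and that missing frames are filled with a constant. Both are assumptions for this sketch; the actual scripts may locate and fill gaps differently.

```python
Q23_SCALE = 1 << 23  # Q9.23: 23 fractional bits


def q9_23_to_float(raw: int) -> float:
    """Convert one signed Q9.23 fixed-point value to float."""
    return raw / Q23_SCALE


def fill_dtx_gaps(frames, fill_value=0.0):
    """Expand a discontinuous frame list into a dense one.

    frames: list of (frame_index, values) sorted by index (the index is a
    hypothetical field used here only to locate the gaps).
    """
    if not frames:
        return []
    width = len(frames[0][1])
    dense = []
    expected = frames[0][0]
    for index, values in frames:
        while expected < index:
            dense.append([fill_value] * width)  # placeholder for a suppressed frame
            expected += 1
        dense.append(list(values))
        expected = index + 1
    return dense
```

A dense matrix like this is what a spectrogram viewer needs so that the time axis stays linear across suppressed stretches.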

Add sdw-jack-audio-feature-compress.conf (PCM 53, pipeline 132) and sdw-dmic-audio-feature-compress.conf (PCM 54, pipeline 133) for compress MFCC capture with DTX blobs.

Fix buffer sizes: set MFCC obs and host-copier ibs/obs to 344 bytes (24-byte header + 80 x int32).

Add mel and ceps compress topology targets for MTL and ARL. Rename normal MFCC topologies to *-mfcc-mel-normal for clarity.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
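
The 344-byte figure follows from 24 + 80 × 4. A host-side reader could split one such frame as sketched below. The header is kept as opaque bytes because its field layout is not given here, and little-endian int32 payload is an assumption of this sketch.

```python
import struct

HEADER_BYTES = 24   # frame header size stated in the commit message
NUM_BINS = 80       # mel bins per frame in the 80-bin configuration
FRAME_BYTES = HEADER_BYTES + NUM_BINS * 4  # 24 + 320 = 344


def split_frame(frame: bytes):
    """Split one MFCC frame into (opaque header bytes, tuple of int32 values).

    Little-endian payload is assumed; the header fields are not decoded.
    """
    if len(frame) != FRAME_BYTES:
        raise ValueError(f"expected {FRAME_BYTES} bytes, got {len(frame)}")
    header = frame[:HEADER_BYTES]
    coeffs = struct.unpack(f"<{NUM_BINS}i", frame[HEADER_BYTES:])
    return header, coeffs
```

Configurations with a different coefficient count would need a different `NUM_BINS` and therefore a different frame size.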

singalsu changed the title q Audio: MFCC: Use the MFCC module as compress PCM encoder with discontinuous stream May 26, 2026

singalsu mentioned this pull request May 26, 2026

ASoC: dapm: Add encoder and decoder widget types to kcontrol handling thesofproject/linux#5789

Merged

singalsu commented May 26, 2026

Comment thread src/audio/mfcc/mfcc.c Outdated

Comment thread src/audio/mfcc/mfcc.c

Comment thread src/audio/mfcc/mfcc_generic.c Outdated

singalsu force-pushed the mfcc_compress_encoder branch from d5267b3 to 969d644 Compare May 27, 2026 08:16

singalsu changed the title ~~Audio: MFCC: Use the MFCC module as compress PCM encoder with discontinuous stream~~ [DNM] Audio: MFCC: Use the MFCC module as compress PCM encoder with discontinuous stream May 27, 2026

singalsu marked this pull request as ready for review May 27, 2026 09:54

Copilot AI review requested due to automatic review settings May 27, 2026 09:54

singalsu requested review from a team, dbaluta, jsarha, kv2019i, lbetlej, lgirdwood, mmaka1, plbossart and ranj063 as code owners May 27, 2026 09:54

Copilot started reviewing on behalf of singalsu May 27, 2026 09:54

Copilot AI reviewed May 27, 2026

singalsu requested a review from Copilot May 27, 2026 13:14

Copilot started reviewing on behalf of singalsu May 27, 2026 13:14

Copilot AI reviewed May 27, 2026

singalsu force-pushed the mfcc_compress_encoder branch from 35e56d5 to 71404ce Compare May 27, 2026 14:23

singalsu requested a review from Copilot May 27, 2026 14:24

Copilot started reviewing on behalf of singalsu May 27, 2026 14:24

Copilot AI reviewed May 27, 2026

Comment thread src/audio/mfcc/mfcc_common.c

Comment thread src/audio/mfcc/mfcc_common.c Outdated

singalsu force-pushed the mfcc_compress_encoder branch from 71404ce to 97a3c57 Compare May 28, 2026 13:03

singalsu requested a review from Copilot May 28, 2026 13:04

Copilot started reviewing on behalf of singalsu May 28, 2026 13:04

Copilot AI reviewed May 28, 2026

Copilot AI reviewed May 28, 2026

singalsu requested a review from Copilot May 28, 2026 18:09

Copilot started reviewing on behalf of singalsu May 28, 2026 18:09

Copilot AI reviewed May 28, 2026

Comment thread src/audio/mfcc/mfcc.c

lyakh reviewed May 29, 2026

Comment thread src/audio/mfcc/mfcc.c Outdated

Comment thread src/audio/mfcc/mfcc.c Outdated

Comment thread src/audio/mfcc/mfcc_common.c

Comment thread src/audio/mfcc/mfcc_common.c Outdated

singalsu force-pushed the mfcc_compress_encoder branch from b491cf1 to 65792ff Compare May 29, 2026 09:27

singalsu requested a review from Copilot May 29, 2026 09:30

Copilot started reviewing on behalf of singalsu May 29, 2026 09:30

Copilot AI reviewed May 29, 2026

singalsu force-pushed the mfcc_compress_encoder branch from 65792ff to 4ff0a7e Compare May 29, 2026 10:20

singalsu requested a review from lyakh May 29, 2026 10:21

lyakh reviewed Jun 1, 2026

Comment thread src/audio/mfcc/mfcc_common.c

singalsu force-pushed the mfcc_compress_encoder branch from 4ff0a7e to 66743ea Compare June 3, 2026 10:39

singalsu requested a review from lyakh June 3, 2026 10:46

kv2019i requested changes Jun 3, 2026

Comment thread src/audio/mfcc/tune/sof_ceps_spectrogram_compress.py

Comment thread src/audio/mfcc/tune/sof_mel_spectrogram_compress.py

Comment thread src/audio/mfcc/tune/sof_mel_to_text_live_compress.py

singalsu force-pushed the mfcc_compress_encoder branch from 66743ea to 687e01b Compare June 5, 2026 11:56

singalsu requested a review from kv2019i June 5, 2026 11:59

lyakh reviewed Jun 8, 2026

Comment thread src/audio/mfcc/mfcc.c Outdated

Comment thread src/audio/mfcc/mfcc_common.c Outdated

kv2019i approved these changes Jun 10, 2026

singalsu force-pushed the mfcc_compress_encoder branch from 687e01b to e1fa45b Compare June 11, 2026 11:30

singalsu requested review from Copilot and lyakh June 11, 2026 11:30

Copilot started reviewing on behalf of singalsu June 11, 2026 16:46

Copilot AI reviewed Jun 11, 2026

Comment thread src/audio/mfcc/tune/sof_mel_to_text_live_compress.py

Comment thread src/audio/mfcc/tune/sof_mel_to_text_live_compress.py

Comment thread src/audio/mfcc/mfcc_common.c

Comment thread src/audio/mfcc/tune/README.md Outdated

Comment thread src/audio/mfcc/tune/README.md Outdated

singalsu force-pushed the mfcc_compress_encoder branch from e1fa45b to 9f14d81 Compare June 16, 2026 07:17

singalsu added 4 commits June 16, 2026 10:23

base_fw: advertise BESPOKE codec for MFCC compress capture

25eade1

Register SND_AUDIOCODEC_BESPOKE capture in codec info TLV when CONFIG_COMP_MFCC is enabled so the kernel detects compress capture support via IPC4_SOF_CODEC_INFO.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>

singalsu force-pushed the mfcc_compress_encoder branch from 9f14d81 to 4c40079 Compare June 16, 2026 07:24
