Better spectrograms for UI #452

Merged
akashmjn merged 13 commits into main from
claude/debug-orchestrator-images-Ag6nR
Mar 14, 2026

Collaborator

akashmjn commented on Mar 13, 2026

Summary

  • The spectrogram visualizer now generates a mel spectrogram with frequency labels for better readability. It uses the src.audio_frontend module (the same one used for model inference).
  • 480 mel bins match the 480px image height for 1:1 pixel-per-bin rendering (n_fft=4096, hop_length=1024)
  • Adds an optional test_spectrogram_viz pytest test for local visual inspection (--save-debug)
  • Changes the colormap from magma to Blues
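
The sizing in the summary can be sanity-checked with a little arithmetic. The sketch below is not code from this PR. It assumes a 48 kHz sample rate (the native hydrophone rate mentioned in the commit notes), a 60-second clip, and centered framing; the helper name `stft_geometry` is made up for illustration.

```python
def stft_geometry(sample_rate, n_fft, hop_length, clip_seconds):
    """Return basic STFT sizing figures for a clip (illustrative helper, not from the PR)."""
    n_samples = int(sample_rate * clip_seconds)
    return {
        # Number of linear-frequency bins produced by a real-valued FFT.
        "linear_bins": n_fft // 2 + 1,
        # Spacing between adjacent linear-frequency bins, in Hz.
        "bin_width_hz": sample_rate / n_fft,
        # Time step between consecutive frames, in milliseconds.
        "hop_ms": 1000.0 * hop_length / sample_rate,
        # Frame count, assuming centered (padded) framing.
        "frames": 1 + n_samples // hop_length,
    }


# Assumed inputs: 48 kHz audio, 60 s clip; n_fft and hop_length are from the summary.
geom = stft_geometry(sample_rate=48_000, n_fft=4096, hop_length=1024, clip_seconds=60)
for name, value in geom.items():
    print(f"{name}: {value}")
```

Under these assumptions the mel filterbank reduces the linear bins to 480 rows, one per pixel of image height. The frame count is larger than the 1280px image width, so some horizontal resampling would be needed at render time; how the visualizer handles that is not stated in this PR.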

Test plan

  • Manually run pytest tests/test_audio_preprocessing.py -k "spectrogram_viz" --save-debug, which generates a 1280x480 PNG in tests/tmp/ for inspection
[Image: spectrogram_viz]
  • Verify that the orchestrator still runs end-to-end, saving images to Azure with the *LocalDebug orch_config
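
For readers unfamiliar with how a flag like --save-debug reaches a test, here is a minimal conftest.py sketch using pytest's standard hooks. It is an assumption about how such a flag could be wired up, not the PR's actual conftest.py; the marker name `local_debug` and the help text are hypothetical.

```python
# conftest.py (illustrative sketch, not the file from this PR)

def pytest_addoption(parser):
    """Register a --save-debug command-line flag with pytest."""
    parser.addoption(
        "--save-debug",
        action="store_true",
        default=False,
        help="Keep generated debug artifacts (e.g. spectrogram PNGs) for inspection",
    )


def pytest_configure(config):
    """Register a marker so optional local-only tests do not trigger warnings."""
    # "local_debug" is a hypothetical marker name chosen for this example.
    config.addinivalue_line(
        "markers",
        "local_debug: optional tests intended for local visual inspection",
    )
```

A test can then read the flag through the built-in `request` fixture with `request.config.getoption("--save-debug")` and copy its output PNG only when the flag is set.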

Fixes #429
Fixes #139

claude and others added 5 commits March 13, 2026 18:46
Replace the split-in-half STFT approach in spectrogram_visualizer with the
model's audio_frontend (load_processed_waveform + featurize_waveform). This
produces a single mel spectrogram with uniform color normalization, fixing
the purple/blue half-and-half artifacts in the moderator UI.

https://claude.ai/code/session_018X71PrWAjeFTXkEdqH65W7
…ms (#429)

Spectrogram visualizer now uses its own visualization config instead of
the model's inference config:
- No resampling — uses native sample rate (48kHz for hydrophones)
- mel_f_min=20Hz, mel_f_max=Nyquist (full audible bandwidth)
- 960 mel bins matching 960px image height for 1:1 pixel rendering
- n_fft=8192, hop_length=2048 for good frequency resolution
- Output: 1920x960 PNG with uniform magma colormap

Also adds an optional pytest (test_spectrogram_viz) under
test_audio_preprocessing for local visual inspection via --save-debug.
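
With one mel bin per pixel row, placing a frequency label comes down to mapping Hz to a mel-bin index. The sketch below illustrates that mapping using the configuration listed above (20 Hz to Nyquist, 960 bins, 48 kHz audio). It assumes the HTK mel formula; the actual audio frontend may use a different mel scale, and the function names are invented for this example.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mels using the HTK formula (an assumption for this sketch)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def label_row(f_hz, f_min, f_max, n_mels):
    """Pixel row (0 = top) for a frequency label when image height equals n_mels."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    # Fractional position of f_hz along the mel axis, from 0.0 (f_min) to 1.0 (f_max).
    frac = (hz_to_mel(f_hz) - m_min) / (m_max - m_min)
    # Clamp to a valid bin index, then flip so low frequencies sit at the bottom.
    bin_index = min(n_mels - 1, max(0, int(frac * n_mels)))
    return (n_mels - 1) - bin_index


# Assumed example: 48 kHz audio (Nyquist = 24 kHz), 960 mel bins, 960px tall image.
for freq in (100, 1_000, 5_000, 10_000, 20_000):
    print(freq, "Hz -> row", label_row(freq, f_min=20.0, f_max=24_000.0, n_mels=960))
```

Because the mel scale is roughly logarithmic above 1 kHz, evenly spaced labels in Hz end up crowded toward the top of the image, which is one reason to pick label frequencies by hand.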

https://claude.ai/code/session_018X71PrWAjeFTXkEdqH65W7
akashmjn changed the title from "Refactor spectrogram generation to use model's audio frontend" to "Better spectrograms for UI" on Mar 13, 2026
akashmjn requested a review from dthaler on March 13, 2026 21:07
akashmjn marked this pull request as ready for review on March 13, 2026 21:09
Contributor

Copilot AI left a comment

Pull request overview

This PR updates spectrogram generation in the InferenceSystem so the moderator UI receives more consistent, visualization-optimized spectrogram images derived from the audio clip’s native sample rate (instead of the model inference config). It also adds a local-only pytest for visually inspecting the generated PNG output.

Changes:

  • Refactors spectrogram_visualizer.write_spectrogram() to use model.audio_frontend and a visualization-specific config derived from the WAV’s native sample rate.
  • Adds an optional test_spectrogram_viz test that generates and validates a 1280x480 spectrogram PNG (and can save a debug artifact).
  • Registers a new optional pytest marker.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File | Description
InferenceSystem/src/spectrogram_visualizer.py | Replaces the previous “two halves stitched together” approach with a single mel-spectrogram render pipeline using audio_frontend + Matplotlib.
InferenceSystem/tests/test_audio_preprocessing.py | Adds an optional visual-inspection test to generate/validate a spectrogram image.
InferenceSystem/tests/conftest.py | Registers the optional pytest marker for local-debug tests.

Contributor

Copilot AI left a comment

Pull request overview

Improves the InferenceSystem’s spectrogram PNG generation used by the orchestrator (and ultimately shown in the moderator UI) by switching the visualizer to the shared model.audio_frontend mel-spectrogram pipeline and adding a small visualization-focused test.

Changes:

  • Refactors spectrogram_visualizer.write_spectrogram() to compute a mel spectrogram via model.audio_frontend and render it with matplotlib (including frequency labels + new colormap).
  • Adds a new pytest test that generates a spectrogram from the 1-minute fixture WAV and asserts the output image dimensions (with optional --save-debug output copying).
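
A dimension check like the one described can be done without an imaging library, because a PNG stores its width and height in the IHDR chunk at a fixed offset. The sketch below is a standard-library illustration of that idea, not the assertion used in this PR's test; `png_size` and `blank_png` are hypothetical helpers, and `blank_png` only stands in for a real spectrogram image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes):
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are two big-endian uint32 values right after the chunk type.
    return struct.unpack(">II", data[16:24])


def _chunk(kind: bytes, payload: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, payload, CRC."""
    crc = zlib.crc32(kind + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + kind + payload + struct.pack(">I", crc)


def blank_png(width: int, height: int) -> bytes:
    """Build a minimal 8-bit grayscale PNG, used here only as a stand-in image."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is a filter-type byte (0) followed by `width` black pixels.
    raw = (b"\x00" + b"\x00" * width) * height
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(raw))
        + _chunk(b"IEND", b"")
    )


# The same assertion a test could make on the bytes of a generated spectrogram file.
assert png_size(blank_png(1280, 480)) == (1280, 480)
```

In a real test the bytes would come from reading the file that write_spectrogram() produced, and an imaging library would work just as well if one is already a dependency.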

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File | Description
InferenceSystem/src/spectrogram_visualizer.py | Replaces the previous STFT/concatenation approach with a single mel-spectrogram render path and overlays frequency labels.
InferenceSystem/tests/test_audio_preprocessing.py | Adds a visualization-oriented test that generates and validates the PNG output (and optionally saves it for local inspection).


akashmjn and others added 3 commits March 14, 2026 08:53
Co-authored-by: Dave Thaler <dthaler1968@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
akashmjn merged commit 6a81be9 into main on Mar 14, 2026
26 checks passed
akashmjn deleted the claude/debug-orchestrator-images-Ag6nR branch on March 14, 2026 16:03
Development

Successfully merging this pull request may close these issues.

  • Fix weird looking spectrograms generated for detections
  • Potential Audio splicing issue