Skip to content

[TASK-101] Voicemail transcription#230

Open
rbienvenu0 wants to merge 9 commits intomasterfrom
claude-voicemail-transcript
Open

[TASK-101] Voicemail transcription#230
rbienvenu0 wants to merge 9 commits intomasterfrom
claude-voicemail-transcript

Conversation

@rbienvenu0
Copy link
Contributor

@rbienvenu0 rbienvenu0 commented Mar 10, 2026

  • claude: Bootstrap markdown file
  • transcript: Start transcript for voicemails

Note

Medium Risk
Adds a new event-driven plugin that fetches voicemail recordings via wazo-calld and submits audio to an external transcription service, plus new Celery retry logic; misconfiguration or service failures could impact webhook processing and background workers. Also introduces a GitHub Action that writes PR metadata to Notion, which depends on repository secrets and external API availability.

Overview
Introduces a new voicemail_transcription plugin that subscribes to voicemail-created bus events, fetches the voicemail recording from wazo-calld, submits it to a configured transcription service, and schedules a Celery task (poll_transcription_job) to poll job status with dynamic retry countdowns.

Adds supporting configuration and typing (calld and voicemail_transcription sections), wires the new task into Celery startup, and updates integration test assets to include a scribed mock service; includes unit tests covering the handler, plugin wiring, and polling/retry behavior.

Separately adds a GitHub workflow (pr-notion-sync.yml) that upserts PR records into a Notion database based on bracketed card IDs in the PR title and updates relations/status/metadata on most PR events.

Written by Cursor Bugbot for commit de96439. This will update automatically on new commits. Configure here.

@rbienvenu0 rbienvenu0 marked this pull request as ready for review March 10, 2026 21:41
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for all 3 issues found in the latest run.

  • ✅ Fixed: ChainMap config not serializable for Celery tasks
    • Both voicemail event handlers now pass dict(self._config) to Celery so task arguments are JSON-serializable.
  • ✅ Fixed: Completed transcription result is silently discarded
    • The completed branch now includes the transcription payload in the completion log instead of dropping the result data.
  • ✅ Fixed: Auth token fetched unnecessarily during poll retries
    • Auth token retrieval was moved after the polling branch so retries with an existing job_id no longer call auth.

Create PR

Or push these changes by commenting:

@cursor push 8261c13931
Preview (8261c13931)
diff --git a/wazo_webhookd/plugins/voicemail_transcription/celery_tasks.py b/wazo_webhookd/plugins/voicemail_transcription/celery_tasks.py
--- a/wazo_webhookd/plugins/voicemail_transcription/celery_tasks.py
+++ b/wazo_webhookd/plugins/voicemail_transcription/celery_tasks.py
@@ -44,18 +44,6 @@
 
     task.max_retries = max_poll_attempts
 
-    token = get_auth_token(config['auth'])
-
-    # License check for user voicemails
-    if require_license_check and user_uuid:
-        confd_config = config.get('confd', {})
-        if not user_has_uc_license(confd_config, user_uuid, token):
-            logger.info(
-                'User %s does not have UC license, skipping transcription',
-                user_uuid,
-            )
-            return
-
     # If we already have a job_id, we're polling for the result
     if job_id:
         result = get_transcription_result(service_url, job_id)
@@ -63,9 +51,10 @@
 
         if status == 'completed':
             logger.info(
-                'Transcription completed for voicemail %s, message %s',
+                'Transcription completed for voicemail %s, message %s: %s',
                 voicemail_id,
                 message_id,
+                result,
             )
             return
 
@@ -98,6 +87,18 @@
             countdown=countdown,
         )
 
+    token = get_auth_token(config['auth'])
+
+    # License check for user voicemails
+    if require_license_check and user_uuid:
+        confd_config = config.get('confd', {})
+        if not user_has_uc_license(confd_config, user_uuid, token):
+            logger.info(
+                'User %s does not have UC license, skipping transcription',
+                user_uuid,
+            )
+            return
+
     # Fetch audio and submit transcription job
     calld_config = config.get('calld', {})
     audio_data = fetch_voicemail_recording(

diff --git a/wazo_webhookd/plugins/voicemail_transcription/handler.py b/wazo_webhookd/plugins/voicemail_transcription/handler.py
--- a/wazo_webhookd/plugins/voicemail_transcription/handler.py
+++ b/wazo_webhookd/plugins/voicemail_transcription/handler.py
@@ -40,7 +40,7 @@
         )
 
         transcribe_voicemail_task.delay(
-            self._config,
+            dict(self._config),
             voicemail_id,
             message_id,
             user_uuid=user_uuid,
@@ -67,7 +67,7 @@
         )
 
         transcribe_voicemail_task.delay(
-            self._config,
+            dict(self._config),
             voicemail_id,
             message_id,
             user_uuid=None,

diff --git a/wazo_webhookd/plugins/voicemail_transcription/tests/__init__.py b/wazo_webhookd/plugins/voicemail_transcription/tests/__init__.py
new file mode 100644
--- /dev/null
+++ b/wazo_webhookd/plugins/voicemail_transcription/tests/__init__.py
@@ -1,0 +1 @@
+"""Tests for voicemail transcription plugin."""

diff --git a/wazo_webhookd/plugins/voicemail_transcription/tests/test_celery_tasks.py b/wazo_webhookd/plugins/voicemail_transcription/tests/test_celery_tasks.py
new file mode 100644
--- /dev/null
+++ b/wazo_webhookd/plugins/voicemail_transcription/tests/test_celery_tasks.py
@@ -1,0 +1,69 @@
+from __future__ import annotations
+
+from unittest.mock import Mock, patch
+
+import pytest
+
+from ..celery_tasks import transcribe_voicemail_task
+
+
+@patch(
+    'wazo_webhookd.plugins.voicemail_transcription.celery_tasks.get_transcription_result'
+)
+@patch('wazo_webhookd.plugins.voicemail_transcription.celery_tasks.get_auth_token')
+def test_polling_does_not_fetch_auth_token(
+    mock_get_auth_token: Mock,
+    mock_get_transcription_result: Mock,
+) -> None:
+    mock_get_transcription_result.return_value = {'status': 'processing'}
+    with patch.object(
+        transcribe_voicemail_task, 'retry', side_effect=RuntimeError('retry')
+    ) as mock_retry:
+        with pytest.raises(RuntimeError):
+            transcribe_voicemail_task.run(
+                {
+                    'auth': {},
+                    'voicemail_transcription': {'service_url': 'http://service'},
+                },  # type: ignore[arg-type]
+                12,
+                '34',
+                user_uuid='user-uuid',
+                require_license_check=False,
+                job_id='job-123',
+            )
+
+    mock_get_auth_token.assert_not_called()
+    mock_get_transcription_result.assert_called_once_with('http://service', 'job-123')
+    mock_retry.assert_called_once()
+
+
+@patch('wazo_webhookd.plugins.voicemail_transcription.celery_tasks.logger')
+@patch(
+    'wazo_webhookd.plugins.voicemail_transcription.celery_tasks.get_transcription_result'
+)
+@patch('wazo_webhookd.plugins.voicemail_transcription.celery_tasks.get_auth_token')
+def test_completed_transcription_logs_result(
+    mock_get_auth_token: Mock,
+    mock_get_transcription_result: Mock,
+    mock_logger: Mock,
+) -> None:
+    transcription_result = {'status': 'completed', 'transcript': 'hello world'}
+    mock_get_transcription_result.return_value = transcription_result
+
+    transcribe_voicemail_task.run(
+        {
+            'auth': {},
+            'voicemail_transcription': {'service_url': 'http://service'},
+        },  # type: ignore[arg-type]
+        12,
+        '34',
+        job_id='job-123',
+    )
+
+    mock_logger.info.assert_called_once_with(
+        'Transcription completed for voicemail %s, message %s: %s',
+        12,
+        '34',
+        transcription_result,
+    )
+    mock_get_auth_token.assert_not_called()

diff --git a/wazo_webhookd/plugins/voicemail_transcription/tests/test_handler.py b/wazo_webhookd/plugins/voicemail_transcription/tests/test_handler.py
new file mode 100644
--- /dev/null
+++ b/wazo_webhookd/plugins/voicemail_transcription/tests/test_handler.py
@@ -1,0 +1,55 @@
+from __future__ import annotations
+
+from collections import ChainMap
+from unittest.mock import Mock, patch
+
+from ..handler import VoicemailTranscriptionHandler
+
+
+@patch(
+    'wazo_webhookd.plugins.voicemail_transcription.handler.transcribe_voicemail_task'
+)
+def test_on_user_voicemail_created_serializes_config(mock_task: Mock) -> None:
+    config = ChainMap({'voicemail_transcription': {'service_url': 'http://service'}})
+    handler = VoicemailTranscriptionHandler(config)  # type: ignore[arg-type]
+
+    handler.on_user_voicemail_created(
+        {
+            'data': {
+                'voicemail_id': 12,
+                'message_id': '34',
+                'user_uuid': 'user-uuid',
+            }
+        },
+        {},
+    )
+
+    mock_task.delay.assert_called_once()
+    args, kwargs = mock_task.delay.call_args
+    assert type(args[0]) is dict
+    assert args == (dict(config), 12, '34')
+    assert kwargs == {'user_uuid': 'user-uuid', 'require_license_check': True}
+
+
+@patch(
+    'wazo_webhookd.plugins.voicemail_transcription.handler.transcribe_voicemail_task'
+)
+def test_on_global_voicemail_created_serializes_config(mock_task: Mock) -> None:
+    config = ChainMap({'voicemail_transcription': {'service_url': 'http://service'}})
+    handler = VoicemailTranscriptionHandler(config)  # type: ignore[arg-type]
+
+    handler.on_global_voicemail_created(
+        {
+            'data': {
+                'voicemail_id': 12,
+                'message_id': '34',
+            }
+        },
+        {},
+    )
+
+    mock_task.delay.assert_called_once()
+    args, kwargs = mock_task.delay.call_args
+    assert type(args[0]) is dict
+    assert args == (dict(config), 12, '34')
+    assert kwargs == {'user_uuid': None, 'require_license_check': False}

This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.

message_id,
user_uuid=user_uuid,
require_license_check=True,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ChainMap config not serializable for Celery tasks

High Severity

self._config is a ChainMap (created in config.py via xivo.chain_map.ChainMap) and is passed directly to transcribe_voicemail_task.delay(). ChainMap is not JSON-serializable, so Celery will raise a TypeError when trying to enqueue the task. Every other Celery task dispatch in this codebase wraps the config with dict() first — see subscription/bus.py line 164 and mobile/http.py line 59.

Additional Locations (1)
Fix in Cursor Fix in Web

@wazo-community-zuul
Copy link
Contributor

Build succeeded.
https://zuul.wazo.community/zuul/t/local/buildset/260c381460424291adf1376a01ac0628

✔️ wazo-tox-py311 SUCCESS in 3m 16s
✔️ tox-linters SUCCESS in 3m 03s
✔️ webhookd-tox-integration SUCCESS in 18m 59s
✔️ webhookd-debian-packaging SUCCESS in 3m 07s

@rbienvenu0 rbienvenu0 changed the title AIckathon: Voicemail transciption TASK-101: Voicemail transcription Mar 12, 2026
@rbienvenu0 rbienvenu0 changed the title TASK-101: Voicemail transcription [TASK-101] Voicemail transcription Mar 12, 2026
@rbienvenu0 rbienvenu0 force-pushed the claude-voicemail-transcript branch from 343ae66 to ff4cf55 Compare March 12, 2026 21:10
@rbienvenu0 rbienvenu0 marked this pull request as draft March 12, 2026 21:11
@rbienvenu0 rbienvenu0 marked this pull request as ready for review March 12, 2026 21:12
@rbienvenu0 rbienvenu0 self-assigned this Mar 12, 2026
@rbienvenu0 rbienvenu0 requested a review from DrPyser March 12, 2026 21:13
@rbienvenu0 rbienvenu0 marked this pull request as draft March 12, 2026 21:13
@rbienvenu0 rbienvenu0 marked this pull request as ready for review March 12, 2026 21:13
bus_consumer.subscribe(
GlobalVoicemailMessageCreatedEvent.name,
handler.on_global_voicemail_created,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dual event subscription causes duplicate transcription jobs

Medium Severity

The plugin subscribes to both UserVoicemailMessageCreatedEvent and GlobalVoicemailMessageCreatedEvent. In Wazo Platform's bus architecture, both User and Global variants of an event are typically published for the same occurrence. Since both handlers call _process_voicemail with identical logic, every voicemail message would be submitted to the transcription service twice, creating duplicate jobs and wasting resources.

Additional Locations (1)
Fix in Cursor Fix in Web

)
return

self._process_voicemail(voicemail_id, message_id)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate handler methods with identical implementations

Low Severity

on_user_voicemail_created and on_global_voicemail_created are completely identical — same payload extraction, same validation, same call to _process_voicemail. This duplication increases maintenance burden and risks inconsistent bug fixes if one method is updated but not the other.

Fix in Cursor Fix in Web

@rbienvenu0 rbienvenu0 marked this pull request as draft March 12, 2026 21:19
@rbienvenu0 rbienvenu0 marked this pull request as ready for review March 12, 2026 21:20
@rbienvenu0 rbienvenu0 force-pushed the claude-voicemail-transcript branch from ff4cf55 to de96439 Compare March 12, 2026 21:20
@rbienvenu0 rbienvenu0 marked this pull request as draft March 12, 2026 21:20
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 4 total unresolved issues (including 3 from previous reviews).

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

voicemail_id = data.get('voicemail_id')
message_id = data.get('message_id')

if not all([voicemail_id, message_id]):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Falsy check rejects valid zero voicemail ID

Low Severity

not all([voicemail_id, message_id]) uses truthiness to validate presence, so any falsy-but-valid value (e.g., voicemail_id of 0 or an empty-string message_id) would be silently rejected. Using explicit None checks (is None) would be more correct for validating that the fields are actually present in the payload.

Additional Locations (1)
Fix in Cursor Fix in Web

@rbienvenu0 rbienvenu0 force-pushed the claude-voicemail-transcript branch from de96439 to 4d2406e Compare March 12, 2026 21:24
@sonarqubecloud
Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
11 Security Hotspots

See analysis details on SonarQube Cloud

@rbienvenu0 rbienvenu0 marked this pull request as ready for review March 12, 2026 21:28
@wazo-community-zuul
Copy link
Contributor

Build succeeded.
https://zuul.wazo.community/zuul/t/local/buildset/7d274a17feb34cd5a61e5981cb180099

✔️ wazo-tox-py311 SUCCESS in 3m 14s
✔️ tox-linters SUCCESS in 3m 06s
✔️ webhookd-tox-integration SUCCESS in 21m 00s
✔️ webhookd-debian-packaging SUCCESS in 3m 20s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant