Fix deadlock in PJSIP channel creation with endpoint variables#181
…nnel var
Move endpoint channel_vars loop after ast_channel_unlock() in
chan_pjsip_new() to prevent deadlock when variables invoke dialplan
functions (e.g., PJSIP_HEADER) that block on PJSIP serializer tasks
while the channel lock is held.
Deadlock cycle with ao2_legacy (default) storage backend:
1. Thread A (chan_pjsip_new): holds channel_lock → calls
pbx_builtin_setvar_helper("PJSIP_HEADER(add,...)") → dispatches to
func_write_header() → ast_sip_push_task_wait_serializer() → blocks
waiting for serializer to complete the task
2. Serializer thread: already processing a prior task that iterates
channels → ao2_callback on channels container → acquires whole-
container lock → by_name_cb tries ast_channel_lock on the channel
held by Thread A → BLOCKED
3. Thread C (channel hangup): needs container lock for ao2_unlink →
BLOCKED (serializer holds it)
This is a classic ABBA inversion: Thread A holds channel_lock and
(transitively via serializer) needs container_lock; serializer holds
container_lock and needs channel_lock.
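
The cycle above can be reproduced in a small pthread model. This is only a sketch under stated assumptions: the `model_*` names are hypothetical and are not Asterisk APIs, the serializer's prior task and the pushed task are collapsed into one thread, and `pthread_mutex_trylock()` stands in for the blocking lock so the program reports the would-be deadlock instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the channel lock and the channels container lock. */
static pthread_mutex_t channel_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t container_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Models the serializer's work: take the container lock, then try the
 * channel lock. trylock is used so a held channel lock is reported
 * rather than waited on forever.
 */
static void *model_serializer_task(void *arg)
{
    int *would_block = arg;

    assert(would_block != NULL);
    pthread_mutex_lock(&container_lock);
    if (pthread_mutex_trylock(&channel_lock) == EBUSY) {
        *would_block = 1; /* the real code would block here */
    } else {
        *would_block = 0;
        pthread_mutex_unlock(&channel_lock);
    }
    pthread_mutex_unlock(&container_lock);
    return NULL;
}

/* Models a synchronous push: start the task and wait for it to finish. */
static int model_push_task_wait(void)
{
    pthread_t serializer;
    int would_block = 0;

    if (pthread_create(&serializer, NULL, model_serializer_task, &would_block) != 0) {
        return -1;
    }
    pthread_join(serializer, NULL);
    return would_block;
}

/*
 * Models channel creation. Returns 1 if the serializer would have blocked
 * on the channel lock (the deadlock case), 0 if it ran freely, -1 on error.
 */
int model_channel_new(int set_vars_under_lock)
{
    int would_block;

    pthread_mutex_lock(&channel_lock);
    if (set_vars_under_lock) {
        /* Old ordering: wait on the serializer while holding the lock. */
        would_block = model_push_task_wait();
        pthread_mutex_unlock(&channel_lock);
    } else {
        /* New ordering: release the lock first, then wait. */
        pthread_mutex_unlock(&channel_lock);
        would_block = model_push_task_wait();
    }
    return would_block;
}
```

Calling `model_channel_new(1)` reports that the serializer would block, while `model_channel_new(0)` reports that it would not. Thread C is left out of the model because it is a victim of the cycle rather than a participant in it.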
The race window widened in 22.6.0 due to added channel locking in
res_musiconhold.c (moh_files_alloc, local_ast_moh_start) and new CEL
event publishing, increasing contention enough to trigger reliably
under high call volume.
Moving the loop after unlock is safe because pbx_builtin_setvar_helper()
acquires its own channel lock internally.
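
A minimal sketch of why the reordering is safe, assuming a setter that takes the lock itself for each update. The types and `model_*` functions below are hypothetical stand-ins, not the real `struct ast_channel` or `pbx_builtin_setvar_helper()`; they only illustrate the shape of the change.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define MODEL_MAX_VARS 8

/* Hypothetical stand-in for a channel: a lock plus a small variable table. */
struct model_channel {
    pthread_mutex_t lock;
    char names[MODEL_MAX_VARS][32];
    char values[MODEL_MAX_VARS][64];
    int count;
};

/* Hypothetical stand-in for one node of an endpoint's channel variable list. */
struct model_var {
    const char *name;
    const char *value;
    const struct model_var *next;
};

int model_channel_init(struct model_channel *chan)
{
    assert(chan != NULL);
    memset(chan, 0, sizeof(*chan));
    return pthread_mutex_init(&chan->lock, NULL);
}

/* The setter locks internally, so callers do not need to hold the lock. */
int model_setvar(struct model_channel *chan, const char *name, const char *value)
{
    int res = -1;

    pthread_mutex_lock(&chan->lock);
    if (chan->count < MODEL_MAX_VARS) {
        snprintf(chan->names[chan->count], sizeof(chan->names[0]), "%s", name);
        snprintf(chan->values[chan->count], sizeof(chan->values[0]), "%s", value);
        chan->count++;
        res = 0;
    }
    pthread_mutex_unlock(&chan->lock);
    return res;
}

/* Fixed ordering: finish locked setup, unlock, then apply the variables. */
int model_channel_setup(struct model_channel *chan, const struct model_var *vars)
{
    const struct model_var *var;
    int applied = 0;

    pthread_mutex_lock(&chan->lock);
    /* ... setup that genuinely needs the channel lock would go here ... */
    pthread_mutex_unlock(&chan->lock);

    /* The lock is not held here; each model_setvar() call takes it briefly. */
    for (var = vars; var; var = var->next) {
        if (model_setvar(chan, var->name, var->value) == 0) {
            applied++;
        }
    }
    return applied;
}
```

In this model each variable is still written under the lock, so moving the loop does not drop any protection from the write itself. What changes is that the caller no longer holds the lock across the whole loop, including any blocking work a setter might trigger.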
https://claude.ai/code/session_017gxrJFgjWNAVm25sYc76DP
Build failed. ❌ debian-packaging-bookworm FAILURE in 1m 57s
The patch was hand-written with spaces instead of tabs, causing dpkg-source to reject it as malformed. Regenerated from a real unified diff of the source file to preserve exact whitespace. https://claude.ai/code/session_017gxrJFgjWNAVm25sYc76DP
Build failed. ❌ debian-packaging-bookworm FAILURE in 9m 27s
and into header; to minimize patch changes and avoid upstream conflicts
Build failed. ❌ debian-packaging-bookworm FAILURE in 9m 20s
Build succeeded. ✔️ debian-packaging-bookworm SUCCESS in 11m 57s
Summary
This patch fixes a deadlock that occurs during PJSIP channel creation when endpoint channel variables are set while the channel lock is held.
Key Changes
chan_pjsip.c: the endpoint channel variable loop in chan_pjsip_new() now runs after ast_channel_unlock(chan) instead of before, preventing deadlock scenarios
Implementation Details
The deadlock occurred because setting endpoint channel variables can trigger dialplan functions (such as PJSIP_HEADER) that may block on serializer tasks. When these operations run while the channel lock is held, the serializer can end up waiting on that same lock, and neither side makes progress.
By deferring the variable assignment until after the channel is unlocked, the channel lock is no longer held when these potentially blocking operations execute, eliminating the deadlock while maintaining the same functional behavior.
https://claude.ai/code/session_017gxrJFgjWNAVm25sYc76DP