Summary
The `test libev (3.14t)` integration test job segfaults about 95% of the way through the cqlengine tests (during test_ifexists.py) on free-threaded Python 3.14t. The crash is a thread-safety race: logging calls format Host/EndPoint objects while those objects are being concurrently torn down during cluster shutdown.
Stack Trace
```
Fatal Python error: Segmentation fault
<Cannot show all threads while the GIL is disabled>
Stack (most recent call first):
  File ".../logging/__init__.py", line 1154 in emit
  File ".../concurrent/futures/thread.py", line 73 in run
  ...
Current thread's C stack trace (most recent call first):
  ... at _PyUnicodeWriter_WriteStr+0x77
  ... cassandra/cluster.cpython-314t-x86_64-linux-gnu.so, at +0x13b1cc
  ... cassandra/cluster.cpython-314t-x86_64-linux-gnu.so, at +0x14771a (PyObject_VectorcallMethod)
  ... cassandra/cluster.cpython-314t-x86_64-linux-gnu.so, at +0x1121f9
```
Root Cause
The crash is a race condition between cluster shutdown and executor threads that log with `%s`/`%r` formatting of `Host` objects. With the GIL disabled in 3.14t, this pattern is no longer safe.
The race:
- Thread A (main): `Cluster.shutdown()` → `Session.shutdown()` → iterates/clears `_pools` and shuts down pools, potentially triggering cleanup of `Host`/`EndPoint` objects
- Thread B (executor worker): still running a submitted task (e.g. `run_add_or_renew_pool()`, `on_down_potentially_blocking()`, or a future callback), hits a logging call like `log.debug("... %s", host)`, which calls `Host.__str__()` → `str(self.endpoint)` → `DefaultEndPoint.__str__()` → `"%s:%d" % (self._address, self._port)`
The segfault occurs in `_PyUnicodeWriter_WriteStr` because the endpoint's `_address` string (or the endpoint object itself) is being garbage collected by Thread A while Thread B is trying to format it.
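The formatting chain above can be sketched with simplified stand-ins for the driver's classes (these are not the real `Host`/`DefaultEndPoint` implementations, just the path the crash walks through):

```python
# Simplified sketch of the formatting chain: str(host) reads live
# endpoint attributes at call time, which is the window the race hits.

class DefaultEndPoint:
    def __init__(self, address, port=9042):
        self._address = address
        self._port = port

    def __str__(self):
        # Reads both attributes at format time; if either is torn down
        # concurrently, this is where the crash surfaces.
        return "%s:%d" % (self._address, self._port)

class Host:
    def __init__(self, endpoint):
        self.endpoint = endpoint

    def __str__(self):
        return str(self.endpoint)

host = Host(DefaultEndPoint("127.0.0.1"))
print(str(host))  # 127.0.0.1:9042
```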
Shutdown order in `Cluster.shutdown()` (cluster.py:1772):
1. `self.is_shutdown = True`
2. `self.scheduler.shutdown()`
3. `self.control_connection.shutdown()`
4. Session shutdown → pool shutdown
5. `self.executor.shutdown()` ← executor tasks may still be running during steps 2-4
The executor is shut down last, so in-flight tasks submitted before is_shutdown was set can still be executing during pool/session teardown.
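A deterministic, single-process sketch of this window (hypothetical `EndPoint` and task names; an event forces the ordering). Under the GIL the stale access surfaces as a Python exception; on free-threaded 3.14t the equivalent window can corrupt memory mid-format instead:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class EndPoint:
    def __init__(self, address, port):
        self._address = address
        self._port = port

    def __str__(self):
        return "%s:%d" % (self._address, self._port)

torn_down = threading.Event()

def in_flight_task(endpoint):
    # Simulates a task submitted before shutdown that is still running
    # while the main thread tears state down.
    torn_down.wait()
    return str(endpoint)

executor = ThreadPoolExecutor(max_workers=1)
ep = EndPoint("127.0.0.1", 9042)
future = executor.submit(in_flight_task, ep)

# Main thread plays the shutdown path: tear the endpoint down while the
# task is still in flight, then let the task proceed.
del ep._address
torn_down.set()
executor.shutdown()

error = None
try:
    future.result()
except AttributeError as exc:
    error = exc
print("worker failed formatting endpoint:", error)
```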
Likely logging call sites involved:
- cluster.py:3247 — `log.debug("Removed connection pool for %r", host)` in `remove_pool()`
- cluster.py:3236 — `log.debug("Added pool for host %s to session", host)` in `run_add_or_renew_pool()`
- cluster.py:1955-1958 — `log.debug("... %s", host)` in `_start_reconnector()`, called from `@run_in_executor`-decorated `on_down_potentially_blocking()`
- cluster.py:1843-1852 — `log.error/debug/info("... %s", host)` in `_on_up_future_completed()`
Observed In
- Job: `test libev (3.14t)` in PR #704
- CI run: https://github.com/scylladb/python-driver/actions/runs/22279731008/job/64448351909
- Not specific to the PR #704 ("Fix factory() returning dead connections on close during setup") changes — the PR only touches `connection.py:factory()`. This is a pre-existing race exposed by free-threaded Python.
Possible Fixes
- Reorder shutdown: shut down (or at least drain) the executor before shutting down sessions/pools, so no executor tasks are running during teardown
- Defensive string caching: cache `str(host)`/`repr(host)` results so they don't access mutable state during formatting
- Guard logging calls: check `is_shutdown` before logging in executor-submitted callbacks, or catch exceptions in `__str__`/`__repr__`
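A minimal sketch of the string-caching idea (hypothetical class, not the driver's API): precompute the display string at construction so `__str__` never reads mutable attributes later. This trades freshness for safety, which is acceptable if endpoints are effectively immutable after creation.

```python
class CachedEndPoint:
    def __init__(self, address, port=9042):
        self._address = address
        self._port = port
        # Computed eagerly; __str__ below never touches _address/_port
        # again, so concurrent teardown cannot race with formatting.
        self._description = "%s:%d" % (address, port)

    def __str__(self):
        return self._description

ep = CachedEndPoint("127.0.0.1")
del ep._address   # simulate concurrent teardown of endpoint state
print(str(ep))    # still 127.0.0.1:9042
```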