Description
Hello, I was trying Droid CLI (version 0.70.0) with a local model, specifically Qwen3.5-122B-A10B-Q8_0 served with llama.cpp's llama-server.
I ran into timeout issues because my local model is very slow at prompt processing once the session builds up a decent amount of context.
Context caching was working initially, so something must have broken it. Possibly this error I had:
For a while Droid kept hitting the timeout and retrying, and llama-server was still making progress on the context, processing about 4K tokens per attempt before the timeout hit again (my rig is very slow). My llama-server output:
It eventually stopped retrying, and I got the `The AI model timed out. Please retry or switch models with /model.` error in Droid.
It would be great if we could set a custom timeout for slow machines/models like mine; I did not see such an option in the documentation.
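To give a sense of scale, here is a rough back-of-the-envelope sketch of the client timeout a slow rig actually needs for an uncached prefill. The token count, prefill speed, and safety margin below are all illustrative assumptions, not measurements from my setup:

```python
# Rough estimate of the client-side timeout needed while llama-server
# processes (prefills) a large, uncached context on slow hardware.
# All numbers are illustrative assumptions.

def required_timeout_s(context_tokens: int,
                       prefill_tokens_per_s: float,
                       margin: float = 1.5) -> float:
    """Seconds a client should be willing to wait for prefill to finish,
    with a safety margin for other overhead."""
    return context_tokens * margin / prefill_tokens_per_s

# Example: a 32K-token context at an assumed 20 tok/s prefill speed
# needs on the order of 2400 seconds (~40 minutes), far beyond any
# typical fixed HTTP timeout.
print(required_timeout_s(32_000, 20.0))
```

Even with generous assumptions, the wait grows linearly with context size, which is why a fixed timeout that works for hosted APIs can be orders of magnitude too short for a local model once caching breaks.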
Thanks.