Important
Big thanks to jakobdylanc for the amazing original llmcord. If you do not need all the features of llmcord+ and want a more minimal bot, check out his repo here! This bot is primarily a side project designed for GrainWare use, so you may have a better experience using the original llmcord.
Just @ the bot to start a conversation and reply to continue. Build conversations with reply chains!
You can:
- Branch conversations endlessly
- Continue other people's conversations
- @ the bot while replying to ANY message to include it in the conversation
Additionally:
- When DMing the bot, conversations continue automatically (no reply required). To start a fresh conversation, just @ the bot. You can still reply to continue from anywhere.
- You can branch conversations into threads. Just create a thread from any message and @ the bot inside to continue.
- Back-to-back messages from the same user are automatically chained together. Just reply to the latest one and the bot will see all of them.
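Conceptually, a reply chain like the ones above can be rebuilt by walking parent references from the newest message back to the root, stopping at the message limit. A minimal, self-contained sketch (data shapes and names are illustrative, not the actual llmcord+ code):

```python
# Sketch: rebuild a conversation by following reply references.
# `messages` maps message id -> (author, text, parent_id); shapes are hypothetical.

def build_chain(messages: dict, leaf_id: int, max_messages: int = 25) -> list:
    """Walk from the newest message up its reply chain; return oldest first."""
    chain = []
    current = leaf_id
    while current is not None and len(chain) < max_messages:
        author, text, parent = messages[current]
        chain.append((author, text))
        current = parent
    chain.reverse()  # oldest message first, as the LLM expects
    return chain

history = {
    1: ("alice", "What is Rust?", None),
    2: ("bot", "A systems language.", 1),
    3: ("alice", "Is it memory safe?", 2),
}
print(build_chain(history, 3))
# -> [('alice', 'What is Rust?'), ('bot', 'A systems language.'), ('alice', 'Is it memory safe?')]
```

When the limit is hit, the oldest messages fall off the front of the chain, which is what the "Only using last N messages" warning reports.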
llmcord+ supports remote models from:
Or run local models with:
...Or use any other OpenAI compatible API server.
- Supports image attachments when using a vision model (like gpt-5, qwen2.5-vl, claude-4, etc.)
- Supports text file attachments (.txt, .py, .c, etc.)
- Customizable personality (aka system prompt)
- User identity aware (OpenAI API only)
- Streamed responses (turns green when complete, automatically splits into separate messages when too long)
- Hot reloading config (you can change settings without restarting the bot)
- Displays helpful warnings when appropriate (like "⚠️ Only using last 25 messages" when the customizable message limit is exceeded)
- Caches message data in a size-managed (no memory leaks) and mutex-protected (no race conditions) global dictionary to maximize efficiency and minimize Discord API calls
- Fully asynchronous
- Modular Python package with clear separation of concerns
The "thinking" header (e.g., "💭 Thinking since…" / "💡 Done thinking!") relies on providers returning hidden reasoning wrapped in <think>...</think> tags. This currently works with APIs that emit these tags (for example ollama). Most OpenAI-compatible hosted APIs do not send <think> blocks, so the indicator may not appear for those.
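Redacting these blocks can be as simple as a regex that strips the reasoning span before the reply is shown. A minimal sketch (not necessarily how llmcord+ implements it):

```python
import re

# Match a <think>...</think> span, including newlines inside it.
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)

def redact_thinking(text: str) -> tuple[str, bool]:
    """Remove <think>...</think> spans; report whether any were found."""
    redacted, count = THINK_RE.subn("", text)
    return redacted.strip(), count > 0

print(redact_thinking("<think>chain of thought</think>The answer is 4."))
# -> ('The answer is 4.', True)
```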
If you want to add support, please make a PR.
This project is a refactor and extension of the original llmcord. Major differences:
- Refactored into a clean, modular package (`auth.py`, `messages.py`, `streaming.py`, `reasoning.py`, `discord_utils.py`, `constants.py`, `config.py`) with a script entrypoint (`uv run llmcord`).
- Shows output speed and model name in the footer.
- Reasoning progress header that redacts `<think>` blocks and shows timing when supported (see note above).
- Modern Python codebase with ruff, uv, and basedpyright.
- Clone the repo:

  `git clone https://github.com/GrainWare/llmcord`
- Create a copy of `config-example.yaml` named `config.yaml` and set it up:
| Setting | Description |
|---|---|
| bot_token | Create a new Discord bot at discord.com/developers/applications and generate a token under the "Bot" tab. Enable "MESSAGE CONTENT INTENT". If you plan to use the {users} placeholder in your system prompt (recommended), also enable the "SERVER MEMBERS INTENT" so the bot can list server members. |
| client_id | Found under the "OAuth2" tab of the Discord bot you just made. |
| status_message | Set a custom message that displays on the bot's Discord profile. Max 128 characters. |
| max_text | The maximum amount of text allowed in a single message, including text from file attachments. (Default: 100,000) |
| max_images | The maximum number of image attachments allowed in a single message. Only applicable when using a vision model. (Default: 5) |
| max_messages | The maximum number of messages allowed in a reply chain. When exceeded, the oldest messages are dropped. (Default: 25) |
| use_plain_responses | When set to true the bot will use plaintext responses instead of embeds. Plaintext responses have a shorter character limit, so the bot's messages may split more often. Also disables streamed responses and warning messages. (Default: false) |
| allow_dms | Set to false to disable direct message access. (Default: true) |
| block_response_regex | Optional regex. If any outgoing bot message matches, the bot aborts the reply, deletes partial output, and sends an error. Leave blank to disable. |
| reply_length_cap | Optional hard cap (characters) for a single reply. When reached during generation, the bot aborts, deletes partial output, and sends an error. Leave blank or 0 to disable. |
| experimental_message_formatting | When true, user messages sent to the model are prefixed with the sender's Discord display name (e.g., nickname: message). This can help models track multi-user conversations. This may break some models, so it's disabled by default. (Default: false) |
| permissions | Configure access permissions for users, roles and channels, each with a list of allowed_ids and blocked_ids. Control which users are admins with admin_ids; admins can change the model with /model and can DM the bot even if allow_dms is false. Leave allowed_ids empty to allow ALL in that category. Role and channel permissions do not affect DMs. You can use category IDs to control channel permissions in groups. |
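To make the permissions model concrete, a block in `config.yaml` might look like the following. All IDs are placeholders, and the exact key layout (e.g., where admin_ids lives) may differ from the shipped `config-example.yaml`, so treat this as an illustration only:

```yaml
permissions:
  users:
    admin_ids: [123456789012345678]    # may use /model and always DM the bot
    allowed_ids: []                    # empty = allow ALL users
    blocked_ids: []
  roles:
    allowed_ids: [234567890123456789]  # only members with this role
    blocked_ids: []
  channels:
    allowed_ids: []                    # empty = all channels; category IDs work too
    blocked_ids: [345678901234567890]
```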
| Setting | Description |
|---|---|
| providers | Add the LLM providers you want to use, each with a base_url and optional api_key entry. Popular providers (openai, ollama, etc.) are already included. Only OpenAI compatible APIs are supported. Some providers may need extra_headers / extra_query / extra_body entries for extra HTTP data. See the included azure-openai provider for an example. |
| models | Add the models you want to use in <provider>/<model>: <parameters> format (examples are included). When you run /model these models will show up as autocomplete suggestions. Refer to each provider's documentation for supported parameters. The first model in your models list will be the default model at startup. Some vision models may need :vision added to the end of their name to enable image support. |
| system_prompt | Write anything you want to customize the bot's behavior! Leave blank for no system prompt. You can use placeholders: {date} and {time} insert the current date/time (based on your host's time zone); {users} expands to a newline-separated list of known server members in the format username: <username>, nickname: <nickname>, mention: <@id>, populated automatically when messages come from a guild. |
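Putting the table together, a minimal providers/models/system_prompt section might look like this. Model names, parameters, and URLs are examples only; consult the shipped `config-example.yaml` for the authoritative layout:

```yaml
providers:
  openai:
    base_url: https://api.openai.com/v1
    api_key: sk-placeholder            # replace with your real key
  ollama:
    base_url: http://localhost:11434/v1

models:
  openai/gpt-4o:                       # first entry = default model at startup
    temperature: 0.7
  ollama/qwen2.5-vl:vision:            # ':vision' suffix enables image support

system_prompt: |
  You are a helpful Discord bot. Today is {date} at {time}.
  Known members:
  {users}
```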
Add `:vision` to the end of a model name to enable image support.
- Install the dependencies:

  Important: before installing the dependencies, make sure you have uv installed. See here for instructions.

  `uv sync`

- Run the bot:
No Docker:
uv run llmcord
With Docker:
docker compose up --build

