KIM Chatbot

This is the repository for the KIM Chatbot, an LLM-driven assistant for reflecting on study subjects. It was developed as part of the project "KI-Coach – Ein digitaler Reflexionshelfer für Studierende" ("AI Coach – a digital reflection helper for students") in cooperation with the IAAI at HdM from 2024 to 2025. The project was financed by the Stifterverband and the Ministry of Science, Research and Arts Baden-Württemberg as part of the program Fellowships für Innovationen in der digitalen Lehre Baden-Württemberg.

For details see the project description.

Key Features

  • Natural-language interaction for reflective discussions.
  • New models and new or adjusted system prompts can be configured.
  • Easy setup with Docker.

Installation & Setup

The KIM-Chatbot can be run either by using the (1) image from dockerhub, (2) building the docker image locally from scratch or (3) running the application straight via Python.

(1) Using the Docker image from Docker Hub (recommended)

  1. If you haven't already, install Docker.
  2. Pull the image from Docker Hub: docker pull frupp/llm-chatbot
  3. Create a directory where chats should be saved.
  4. Create and run a container: docker run -d -p 8501:8501 --name llm-chatbot -v <path-to-save-dir>:/llm-chatbot/data frupp/llm-chatbot
  5. The KIM Chatbot will then be available at localhost:8501.

(2) Building the Docker image locally

  1. If you haven't already, install Docker.
  2. Navigate to the code/ directory and run: docker build -t llm-chatbot:latest .
  3. Create a directory where chats should be saved.
  4. Create and run a container: docker run -d -p 8501:8501 --name llm-chatbot -v <path-to-save-dir>:/llm-chatbot/data llm-chatbot
  5. The KIM Chatbot will then be available at localhost:8501.

(3) Running the application with Python directly

  1. Install Python 3.11 or newer.
  2. Navigate to code/.
  3. Install the required modules via: pip install -r requirements.txt
  4. Set the following environment variables (e.g. in your shell):
export CHAT_PATH=data/chat
export ENDPOINTS_CONFIG=data/endpoints.json
export MODELS_CONFIG=data/models.json
export GUI_CONFIG=data/gui_config.json
export DATABASE_PATH=data/chatbot.db
export TIME_LIMIT_DELTA_SECONDS=60
export RATE_LIMIT_FRACTION_LEFT=0.3
  5. Run: streamlit run main.py

Customization

The KIM Chatbot can be customized in several ways, for instance by configuring the API endpoints that serve language models and by defining models via system prompts.

Config mode

User authentication is not supported. Instead, you can set an "admin_token" in gui_config.json (see the example configuration at the end of this section). Entering this token in the UI activates the config mode.

Creating new models

Activate config mode to create, update, and delete models. For model creation, the following must be provided:

  • the endpoint where the model is served
  • an available base model from that endpoint
  • a system prompt for the model
  • a summary prompt (optional)
  • a temperature value (optional)
  • a top_p value (optional)
  • a model name
  • default: whether it should be used as the default model

Configured models are then saved persistently in the code/models.json file.

Example of one model entry in code/models.json:

{
    "endpoint_name": "academiccloud",
    "name": "Kim 2.0",
    "model": "qwen2.5-vl-72b-instruct",
    "temperature": 1.7,
    "top_p": 0.1,
    "system_prompt": "your system prompt for chatting",
    "summary_prompt": "your summary prompt which creates a summary of your chat history",
    "default": true
}

Endpoints

At the moment, two endpoint types are implemented: OpenAI and Ollama (see the Code section below).

New endpoints can be defined in the module code/endpoints.py by creating a class that implements the abstract base class Endpoint; a minimal example follows the list below. Endpoint classes must implement the following methods:

  • chat(messages, model: str, temperature: float, top_p: float): interacts with a language model served by the endpoint and returns an iterator that yields the model's tokens. The parameters are:
    • messages: a list of the existing messages in the conversation.
    • model: a string identifying the model defined in models.json (see below).
    • temperature: optional, typically in [0, 1]; controls the randomness of the model's output. Higher values produce more random output.
    • top_p: optional, in [0, 1]; controls diversity via the cumulative probability mass of top tokens considered. Higher values allow more variation.
  • model_list(): returns a list of all models available at this endpoint.
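
For illustration, a minimal sketch of a custom endpoint class follows. Only chat() and model_list() are prescribed by the abstract base class; the constructor arguments and the HTTP details are assumptions.

import requests
from endpoints import Endpoint  # abstract base class in code/endpoints.py

class MyEndpoint(Endpoint):
    def __init__(self, name, url, api_key):
        self.name = name
        self.url = url
        self.api_key = api_key

    def chat(self, messages, model, temperature=None, top_p=None):
        # Forward the conversation to the served model and yield tokens.
        response = requests.post(
            f"{self.url}/chat",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={"model": model, "messages": messages,
                  "temperature": temperature, "top_p": top_p},
            stream=True,
        )
        for line in response.iter_lines():
            if line:
                yield line.decode("utf-8")  # one chunk per streamed line

    def model_list(self):
        # Ask the endpoint which models it can serve.
        return requests.get(f"{self.url}/models").json()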

In addition, endpoints must be configured in the code/endpoints.json file in the form:

[
  {
    "name": "<endpoint name>",
    "endpoint": "https://url-to-endpoint",
    "type": "openai|ollama",
    "api_key": "<api-key>"
  }
]

Save chat history

"save_buttons_visible" can be set to true (default: false). This enables two save buttons which a user can use to store their chat history. Meta data is stored automatically.

Happiness buttons

"happiness_buttons_visible" can be set to true (default: false). This enables two buttons where a user can indicate how they currently feel about the chatbots responses.

Summary

"summary_visible" can be set to true (default: false). This enables a second model with a configured summary prompt. This summary-model is intended to create a summary, achievement and goals based on the chat history. It will be displayed on the left sidebar.

Code

endpoints.py

Overview

The endpoints.py module provides the functionality for interacting with external APIs (OpenAI and Ollama) to facilitate chatbot communication. It defines an abstract class that serves as a wrapper for these APIs, allowing for seamless integration and communication with either of the two services. This modular design enables flexibility by allowing different endpoints to be defined in the endpoints.json configuration file.

Classes

Endpoint (Abstract Class)

Endpoint serves as the base class for interacting with external APIs. It defines two key methods that all subclasses must implement:

  • chat(): A method to handle sending a chat request to the respective API and receiving the response.

  • model_list(): A method to retrieve a list of available models from the respective API.

This class is extended by both OllamaEndpoint and OpenAIEndpoint to implement specific API interaction logic.

OllamaEndpoint (Subclass)

OllamaEndpoint is a subclass of Endpoint that provides specific implementations for interacting with the Ollama API. It overrides the chat() and model_list() methods to implement the required functionality for this service.

OpenAIEndpoint (Subclass)

OpenAIEndpoint is another subclass of Endpoint, providing the implementation for the OpenAI API. Like OllamaEndpoint, it overrides the chat() and model_list() methods to provide the necessary functionality for communication with the OpenAI service.

models.py

Overview

The models.py module defines a class that represents a model used by the chatbot. This class encapsulates essential information about the model, including the API endpoint, the model type, configuration parameters, and specific prompts for different stages of the conversation. The configuration data for each model is stored in the models.json file.

Class

Model

The Model class represents a single model used by the chatbot. Each instance of this class holds key information for interacting with a specific model and provides the necessary configuration details required for the chatbot's operation.

Attributes
  • endpoint: The API endpoint (either OpenAI or Ollama) that the model uses for communication.

  • model: A string representing the model's name (e.g., qwen2.5, gpt-4).

  • name: A custom name for the model, used to identify it in the chatbot’s configuration.

  • system_prompt: A predefined prompt that sets the tone or behavior of the model during the conversation. This is used at the beginning of the discussion to guide the chatbot’s responses.

  • summary_prompt: A predefined prompt used to summarize the entire conversation at the end of the discussion. This is helpful for providing a summary of what has been discussed.

  • temperature: A float value (typically between 0 and 1) that controls the randomness of the model's responses. Higher values result in more diverse outputs, while lower values result in more focused and deterministic outputs.

  • top_p: A float value (typically between 0 and 1) that controls the diversity of responses based on cumulative probability. This is another parameter for controlling the randomness of the generated text, often used in conjunction with temperature.

Methods
  • __init__(): Initializes the Model object using configuration data such as the endpoint, model type, name, prompts, and parameters (temperature and top_p).
  • create_system_message()
    Creates the initial system message that is sent to the underlying Ollama or OpenAI model.
    This method ensures that the correct prompt (system prompt for discussion or summary prompt for summarization) is injected in the format expected by the selected API.

Free Functions

get_models()

get_models() is a free function responsible for loading and creating all available Model instances from the models.json configuration file.

  • Reads the model definitions from models.json
  • Resolves the configured endpoint for each model
  • Instantiates and returns the corresponding Model objects
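
A condensed sketch of how Model and get_models() might fit together is shown below. The attribute names follow the documentation above; the constructor and loading details are assumptions.

import json
import os

class Model:
    def __init__(self, endpoint, model, name, system_prompt,
                 summary_prompt=None, temperature=None, top_p=None):
        self.endpoint = endpoint   # Endpoint instance (OpenAI or Ollama)
        self.model = model         # e.g. "qwen2.5-vl-72b-instruct"
        self.name = name           # display name, e.g. "Kim 2.0"
        self.system_prompt = system_prompt
        self.summary_prompt = summary_prompt
        self.temperature = temperature
        self.top_p = top_p

    def create_system_message(self, summary=False):
        # Inject the right prompt in the role/content format used by
        # both the OpenAI and Ollama chat APIs.
        prompt = self.summary_prompt if summary else self.system_prompt
        return {"role": "system", "content": prompt}

def get_models(endpoints):
    # endpoints: mapping from endpoint name to Endpoint instance (assumed).
    with open(os.environ["MODELS_CONFIG"]) as f:
        entries = json.load(f)
    return [Model(endpoint=endpoints[e["endpoint_name"]],
                  model=e["model"], name=e["name"],
                  system_prompt=e["system_prompt"],
                  summary_prompt=e.get("summary_prompt"),
                  temperature=e.get("temperature"),
                  top_p=e.get("top_p"))
            for e in entries]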

conversation.py

Overview

The conversation.py module defines the Conversation class, which is responsible for managing and storing the complete state of a single chatbot conversation with one user. It acts as the central data structure for tracking user-specific information, conversation metadata, and all messages exchanged during a session.

This class ensures that the chatbot can maintain context throughout the interaction and allows conversations to be analyzed or evaluated after they have ended.

Class

Conversation

The Conversation class represents one full interaction between a student and the chatbot. It stores both user-related data and technical metadata required for evaluation, testing, and user experience analysis.

Attributes
  • user_name
    The name of the user participating in the conversation.

  • course
    The course or class the user attended during the semester being evaluated.

  • terms_accepted
    A boolean flag indicating whether the user has agreed to the terms and conditions (e.g. privacy policy and data usage).

  • survey_code
    An optional code used to associate the conversation with a specific survey or experiment.
    This is mainly used for testing and evaluating different system prompts with students.

  • messages
    A list of all generated messages exchanged during the conversation, including system, user, and assistant messages.

  • model
    The Model instance used for this conversation, defining which endpoint, LLM, and prompts are applied.

  • user_satisfaction
    A value representing how happy the user was with the conversation.
    This is typically collected at the end of the interaction and is explained in more detail in the UI section.

  • started_at
    A timestamp indicating when the conversation started.

  • ended_at
    A timestamp indicating when the conversation ended.

Methods
  • __init__()
    Initializes a new conversation and sets up the initial state, including metadata such as start time and selected model.

  • Message management methods
    Helper methods to add, retrieve, or update messages during the conversation, ensuring the full chat history is preserved.

  • Lifecycle methods
    Methods to mark the start and end of a conversation and record the corresponding timestamps.

Usage

  • A Conversation instance is created when a new user session starts.
  • All user input, assistant responses, and system messages are stored within this instance.
  • Once the conversation ends, the stored data can be used for:
    • Generating summaries (if enabled)
    • Evaluating prompt effectiveness (if enabled)
    • Analyzing student feedback and satisfaction (if enabled)
    • Persisting conversations for later review or research (if enabled)
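
A minimal sketch of the Conversation class described above is shown below; the helper-method names are illustrative, since the module only specifies that message management and lifecycle methods exist.

from datetime import datetime

class Conversation:
    def __init__(self, user_name, course, model, survey_code=None):
        self.user_name = user_name
        self.course = course
        self.model = model
        self.survey_code = survey_code
        self.terms_accepted = False
        self.user_satisfaction = None
        # Start with the model's system message so context is set.
        self.messages = [model.create_system_message()]
        self.started_at = datetime.now()
        self.ended_at = None

    def add_message(self, role, content):
        # Preserve the full chat history, including assistant replies.
        self.messages.append({"role": role, "content": content})

    def end(self):
        # Record when the conversation finished.
        self.ended_at = datetime.now()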

chatbot.py

Overview

The chatbot.py module contains the Streamlit-based graphical user interface (GUI) for the chatbot. It is responsible for rendering the chat layout, handling user input, displaying generated messages, and managing user interactions such as model selection, dialog control, and feedback collection.

This module connects the frontend UI with the underlying conversation and model logic.

Responsibilities

  • Render the chat interface using Streamlit
  • Display the conversation history (user and assistant messages)
  • Provide an input prompt for user messages
  • Allow model selection when enabled
  • Manage conversation lifecycle actions (finish or abort)
  • Collect user feedback on the conversation experience
  • Provide access to a configuration mode for administrators

User Interface Layout

Chat Area

  • The main area displays the conversation history, with:
    • Sent user messages
    • Received assistant responses
  • A text input prompt is shown at the bottom of the page, allowing the user to enter new messages; a sketch of this chat loop follows below.
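
A minimal sketch of how this chat loop can be built with Streamlit's chat elements is shown below; the session-state layout and variable names are assumptions.

import streamlit as st

# Render the stored history (system prompts stay hidden).
for message in st.session_state.conversation.messages:
    if message["role"] != "system":
        with st.chat_message(message["role"]):
            st.write(message["content"])

# Read a new user message and stream the assistant's reply.
if prompt := st.chat_input("Your message"):
    conv = st.session_state.conversation
    conv.add_message("user", prompt)
    with st.chat_message("assistant"):
        # endpoint.chat() yields tokens, so the reply can be streamed
        reply = st.write_stream(conv.model.endpoint.chat(
            conv.messages, conv.model.model,
            conv.model.temperature, conv.model.top_p))
    conv.add_message("assistant", reply)
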
Sidebar

The sidebar provides additional controls and actions:

  • Finish dialog
    Allows the user to properly end the conversation (see finish_dialog()).

  • Abort dialog
    Allows the user to cancel the conversation prematurely (see abort_dialog()).

  • Model rating
    The user can rate their experience with the conversation.
    This rating is stored in the Conversation object and used for later evaluation.

  • Admin access
    A configuration mode can be activated by entering a valid admin token.

Model Selection

change_model()

If model switching is enabled, users can dynamically switch between available models during the conversation using the change_model() function.

  • When enabled:
    • A model selector is shown in the UI
    • The selected model is applied to subsequent messages
  • When disabled:
    • A predefined default model is used for the entire conversation

This allows controlled experimentation with different models and prompts.
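
An illustrative sketch of change_model() is given below; the actual implementation may differ, but the sidebar-selector pattern is typical for Streamlit. The models argument is assumed to be the list returned by get_models().

import streamlit as st

def change_model(models):
    # Let the user pick a model by its configured name ...
    names = [m.name for m in models]
    choice = st.sidebar.selectbox("Model", names)
    # ... and apply it to all subsequent messages in this conversation.
    st.session_state.conversation.model = next(
        m for m in models if m.name == choice)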

Conversation Control

finish_dialog()

  • Gracefully ends the conversation
  • Marks the conversation as completed
  • Triggers final steps such as summarization or feedback collection

abort_dialog()

  • Immediately stops the conversation
  • Discards or flags the conversation as incomplete
  • Used when the user chooses not to continue

Configuration Mode

A configuration (admin) mode can be enabled by entering a valid admin token.

When activated, this mode allows:

  • Access to additional configuration options
  • Testing and adjusting model or prompt behavior
  • Administrative control over the chatbot setup

config.py

Overview

The config.py module provides helper functions for loading configuration files and initializing shared infrastructure required by the chatbot. Its main responsibilities include reading JSON-based configuration files and setting up a small SQLite database used for rate limiting.

By centralizing configuration and initialization logic, this module helps keep the rest of the codebase clean and modular.

Responsibilities

  • Load and parse configuration files (e.g. endpoints.json, models.json)
  • Provide access to configuration data for other modules
  • Initialize and manage a lightweight SQLite database
  • Store and track rate-limiting information

Rate Limiting Database

SQLite Initialization

config.py initializes a small SQLite database that is used to store rate-limiting information.

  • The database tracks the current rate limit state
  • It helps prevent excessive API calls
  • It ensures fair and controlled usage of external APIs

The database is lightweight, file-based, and requires no additional setup, making it suitable for local development and small deployments.
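
A minimal sketch of this initialization is shown below; the table schema is an assumption, chosen to track usage within the time window configured via TIME_LIMIT_DELTA_SECONDS and RATE_LIMIT_FRACTION_LEFT.

import os
import sqlite3

def init_database():
    # File-based database at the configured path; no server required.
    con = sqlite3.connect(os.environ.get("DATABASE_PATH", "data/chatbot.db"))
    # Hypothetical schema for tracking usage per time window.
    con.execute("""CREATE TABLE IF NOT EXISTS rate_limit (
                       window_start REAL,
                       tokens_used  INTEGER)""")
    con.commit()
    return con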

data.py

Overview

The data.py module contains helper functions for persisting conversation data. It is responsible for anonymizing and storing conversations when data collection is enabled, while respecting different privacy levels.

The module supports multiple anonymization modes, allowing a balance between research needs and user privacy.

Responsibilities

  • Anonymize conversation data based on the selected privacy level
  • Store conversation data in a persistent format
  • Ensure that sensitive user information is handled appropriately
  • Support privacy-aware evaluation and analysis of chatbot usage

Anonymization Levels

The system supports three levels of anonymization, which can be configured depending on privacy requirements:

1. No Anonymization
  • The full conversation is stored
  • Includes:
    • User name
    • Course information
    • All generated messages (user and assistant)
  • Intended for controlled testing or development environments
2. Name Only
  • The user name is removed or replaced with an anonymous identifier
  • All conversation messages are still stored in full
  • Suitable when message content is required for analysis, but direct identification should be avoided
3. Name and Messages
  • Both user name and message content are anonymized
  • Only aggregate statistics are stored, including:
    • Number of words generated per message
    • Total duration of the conversation
  • No textual content is preserved
  • Intended for maximum privacy while still allowing high-level usage analysis
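
The three levels could map to storage logic roughly like the following sketch; the level numbers follow the list above, while the function name and stored fields are illustrative assumptions.

def anonymize(conversation, level: int) -> dict:
    record = {"course": conversation.course}
    if level == 1:
        # Level 1: no anonymization, the user name is kept.
        record["user_name"] = conversation.user_name
    if level in (1, 2):
        # Levels 1 and 2: full message content is preserved.
        record["messages"] = conversation.messages
    else:
        # Level 3: aggregate statistics only, no textual content.
        record["words_per_message"] = [
            len(m["content"].split()) for m in conversation.messages]
        record["duration_seconds"] = (
            conversation.ended_at - conversation.started_at).total_seconds()
    return record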

Data Storage

  • Conversations are stored only if data collection is explicitly enabled
  • Stored data can be used for:
    • Evaluating system performance
    • Analyzing usage patterns
    • Improving prompts and models
  • The storage format is designed to align with the selected anonymization level
