VectorDB

Open-source vector database designed for simplicity and speed, with flexible deployment options.

For embedded use as a library in Java projects, see:

https://github.com/tutikka/LibVectorDB

Features

Create and manage indexes
Create and manage vector embeddings in indexes
Search for embeddings in indexes based on distance
- Cosine distance
- Euclidean distance
- Manhattan distance
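
For reference, the three distance measures can be sketched in plain Python as below. This is only an illustration of the standard definitions; the function names are hypothetical and the code is not taken from the VectorDB source.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Return the straight-line (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """Return the sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))
```

In all three cases a smaller value means a closer match.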

APIs

RESTful HTTP API using JSON

Deployment

Standalone Java application
Docker

Installation

Binary Releases

No binary releases are currently available.

Build from Source

Requirements

Java 21 or later (tested using 21.0.10-zulu)
Git client
Maven
Docker

macOS/Linux/Unix

Clone this repository:

git clone https://github.com/tutikka/VectorDB.git

Change to the cloned folder:

cd VectorDB

Clean, compile and package using the Maven wrapper:

./mvnw clean package

Change to the created target directory:

cd target

Start the application:

java -jar vectordb-0.0.1-SNAPSHOT.jar

Or build a Docker image and run it, for example:

docker build -t vectordb/vectordb .
docker run -p 8080:8080 vectordb/vectordb

Configuration

At startup, the application looks for a configuration file named vectordb.properties in its root directory. If the file is not found, the default values shown in the example below are used.

#
# directory for data files (default = 'data')
#
data.directory = data

#
# maximum number of vectors per index (default = 65536)
#
data.max_vectors_per_index = 65536
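
To illustrate the file format and the fallback to defaults, here is a minimal Python sketch of a properties reader. It is not the application's own (Java) loader. The function name is hypothetical, the example path is made up, and falling back to the default for a key that is missing from an existing file is an assumption; the text above only describes the case where the whole file is missing.

```python
# Defaults as documented in the example above.
DEFAULTS = {
    "data.directory": "data",
    "data.max_vectors_per_index": "65536",
}


def load_properties(text, defaults=DEFAULTS):
    """Parse 'key = value' lines, skipping blanks and '#' comments."""
    props = dict(defaults)  # assumption: missing keys keep their default
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


example = """
# store data files on a separate volume (hypothetical path)
data.directory = /var/lib/vectordb
"""
settings = load_properties(example)
```

Here settings holds the overridden data.directory and the default value for data.max_vectors_per_index.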

Examples

Random Values

Create an index with 3 dimensions and similarity based on Manhattan distance
Add entries to the index with random values as embeddings
Search for the best matching entry for a given embedding
Clean up and delete the index
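
The steps above can be sketched with the Python standard library as follows. This is a rough outline, not the linked example itself. Several details are assumptions: the base URL (port 8080 is taken from the Docker command), the similarity value "manhattan" (only "cosine" appears in the API reference below), and the helper names. The delete step is left out because no delete endpoint is listed in the API summary.

```python
import json
import random
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: port from the Docker example


def build_request(method, path, body=None):
    """Build a JSON request for the REST API without sending it."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def send(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def random_embedding(dimensions, rng):
    """Return a list of random floats in [0, 1)."""
    return [rng.random() for _ in range(dimensions)]


def run(num_entries=10, seed=0):
    """Create an index, fill it with random entries and return the best match."""
    rng = random.Random(seed)
    index = send(build_request("POST", "/api/indexes", {
        "name": "random",
        "dimensions": 3,
        "similarity": "manhattan",  # assumption: accepted value not documented
        "optimization": "none",
    }))
    entries_path = f"/api/indexes/{index['id']}/entries"
    for entry_id in range(1, num_entries + 1):
        entry = {"id": entry_id, "embedding": random_embedding(3, rng)}
        send(build_request("POST", entries_path, entry))
    query = {"embedding": random_embedding(3, rng), "top": 1}
    result = send(build_request("POST", f"/api/indexes/{index['id']}/search", query))
    return result["matches"][0]
```

With a server running locally, calling run() should return a single match containing an id and a distance.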

View full source (Python)

Planet Positions

This example maps the positions of the planets in our solar system on 1 January 2025 into 3D space with the sun at the origin, and then finds which planets are closest to it.

Create an index with 3 dimensions (X, Y and Z coordinates) and similarity based on Euclidean distance
Add an entry to the index for each planet based on its position on 1 January 2025
Search for the 3 planets closest to the sun
Clean up and delete the index

View full source (Python)

RAG Example with OpenAI Embeddings and Chat Completion

This example is closer to a real-world scenario: documents are indexed so that they can be queried by similarity, and the best result is then summarized in response to a user's question.

Create an index with 1536 dimensions (the output size of the OpenAI text-embedding-ada-002 model) and similarity based on cosine distance
Add entries to the index by embedding each document with the OpenAI text-embedding-ada-002 model
Search for the best matching entry based on the user's question (embedded with the same model)
Retrieve the original document identifier from the search results
Use a chat completion model (OpenAI gpt-5) to summarize the retrieved document based on the user's original question

Note! Make sure to add a .env file containing your OpenAI API key in the same directory as the example.
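
The flow of this example can be outlined without any OpenAI-specific code by passing the external calls in as functions. The sketch below is an assumption about how such a pipeline might be structured, not the linked source: embed, add_entry, search and summarize are hypothetical callables that would wrap the embedding model, the VectorDB entry and search endpoints, and the chat completion model.

```python
def index_documents(documents, embed, add_entry):
    """Embed each document and store it under its identifier.

    documents maps a numeric identifier to the document text.
    """
    for doc_id, text in documents.items():
        add_entry(doc_id, embed(text))


def answer_question(question, documents, embed, search, summarize):
    """Find the document closest to the question and summarize it."""
    # The question must be embedded with the same model as the documents.
    matches = search(embed(question), 1)
    if not matches:
        return None
    best_id = matches[0]["id"]  # identifier stored when the entry was created
    return summarize(documents[best_id], question)
```

Keeping the external calls behind plain functions makes it possible to try the flow with simple stand-ins before wiring in real API clients.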

View full source (Python)

API

Summary

Method	URI	Description
`POST`	`/api/indexes`	Create new index
`GET`	`/api/indexes`	List indexes
`GET`	`/api/indexes/{id}`	Get index
`POST`	`/api/indexes/{id}/entries`	Create new entry into index
`GET`	`/api/indexes/{id}/entries`	List entries in index
`POST`	`/api/indexes/{id}/search`	Submit search for entries in index

Reference

Create New Index

Method

POST

URI

/api/indexes

Query Parameters

None

Request Body

{
  "name": "test",
  "dimensions": 1536,
  "similarity": "cosine",
  "optimization": "none"
}

Response Status

HTTP 200: Ok
HTTP 400: Error creating index due to client input
HTTP 500: Error creating index due to server error

Response Body

{
  "id": 1,
  "name": "test",
  "dimensions": 1536,
  "similarity": "cosine",
  "optimization": "none"
}

Note! The server will populate the id field, which is used to refer to the index in other API methods.

List Indexes

Method

GET

URI

/api/indexes

Query Parameters

None

Request Body

None

Response Status

HTTP 200: Ok

Response Body

[
    {
      "id": 1,
      "name": "test",
      "dimensions": 1536,
      "similarity": "cosine",
      "optimization": "none"
    }
]

Get Index

Method

GET

URI

/api/indexes/{id}

Query Parameters

id: The index identifier (path parameter, part of the URI)

Request Body

None

Response Status

HTTP 200: Ok
HTTP 404: Index not found

Response Body

{
  "id": 1,
  "name": "test",
  "dimensions": 1536,
  "similarity": "cosine",
  "optimization": "none",
  "extras": {
    "_max_vectors": 65536,
    "_num_vectors": 1,
    "_size_on_disk": 1310728
  }
}

Create New Entry into Index

Method

POST

URI

/api/indexes/{id}/entries

Query Parameters

id: The index identifier (path parameter, part of the URI)

Request Body

{
    "id": 1,
    "embedding": [
        0.1,
        0.2,
        0.3
    ]
}

Response Status

HTTP 200: Ok
HTTP 400: Error creating entry due to client input
HTTP 404: Index not found
HTTP 500: Error creating entry due to server error

Response Body

{
  "id": 1,
  "embedding": [
    0.2672612,
    0.5345224,
    0.8017837
  ]
}
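
Note that the embedding in the response differs from the one in the request. The returned values are consistent with the request vector scaled to unit length (L2 normalization), as the sketch below shows. Whether the server normalizes embeddings for every similarity type is not stated here, so treat this as an observation about the example rather than documented behavior; the function name is hypothetical.

```python
import math


def l2_normalize(vector):
    """Scale a vector so that its Euclidean length is 1."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


# Request embedding from the example above.
normalized = l2_normalize([0.1, 0.2, 0.3])
```

Rounded to seven decimal places, normalized matches the embedding shown in the response body.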

List Entries in Index

Method

GET

URI

/api/indexes/{id}/entries

Query Parameters

id: The index identifier (path parameter, part of the URI)
offset: The position in the index at which to start retrieving entries
limit: Maximum number of entries to retrieve

Request Body

None

Response Status

HTTP 200: Ok
HTTP 400: Error listing entries due to client input
HTTP 404: Index not found
HTTP 500: Error listing entries due to server error

Response Body

[
  {
    "id": 1,
    "embedding": [
      0.2672612,
      0.5345224,
      0.8017837
    ]
  },
  {
    "id": 2,
    "embedding": [
      0.37139064,
      0.557086,
      0.7427813
    ]
  },
  {
    "id": 3,
    "embedding": [
      0.4242641,
      0.56568545,
      0.70710677
    ]
  }
]
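
The offset and limit parameters allow a large index to be read page by page. The following sketch shows one way to do that. It assumes that offset is zero-based and counts entries, and that a page shorter than limit marks the end of the index; neither is confirmed above. The function names are hypothetical, and fetch_page stands for any function that performs the GET request for a given path and returns the decoded JSON list.

```python
from urllib.parse import urlencode


def entries_path(index_id, offset, limit):
    """Build the path and query string for one page of entries."""
    query = urlencode({"offset": offset, "limit": limit})
    return f"/api/indexes/{index_id}/entries?{query}"


def iter_entries(fetch_page, index_id, page_size=100):
    """Yield every entry in an index, requesting one page at a time."""
    offset = 0
    while True:
        page = fetch_page(entries_path(index_id, offset, page_size))
        yield from page
        if len(page) < page_size:  # assumption: a short page is the last one
            break
        offset += page_size
```

For example, entries_path(1, 0, 2) produces /api/indexes/1/entries?offset=0&limit=2.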

Submit Search for Entries in Index

Method

POST

URI

/api/indexes/{id}/search

Query Parameters

id: The index identifier (path parameter, part of the URI)

Request Body

{
    "embedding": [
        0.1,
        0.2,
        0.3
    ],
    "top": 3
}

Response Status

HTTP 200: Ok
HTTP 400: Error searching entries due to client input
HTTP 404: Index not found
HTTP 500: Error searching entries due to server error

Response Body

{
  "matches": [
    {
      "id": 1,
      "distance": 5.9604644775390625E-8
    },
    {
      "id": 2,
      "distance": 0.007416725158691406
    },
    {
      "id": 3,
      "distance": 0.017292380332946777
    }
  ],
  "duration": 0,
  "scanned": 3,
  "total": 3,
  "similarity": "cosine"
}
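
The distances in this response can be reproduced with a brute-force search over the three entries from the List Entries example. The sketch below is a plain Python illustration with hypothetical function names; it does not claim to reflect how the server scans or optimizes an index.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def search(entries, query, top):
    """Score every entry against the query and keep the closest ones."""
    scored = [
        {"id": entry["id"], "distance": cosine_distance(entry["embedding"], query)}
        for entry in entries
    ]
    scored.sort(key=lambda match: match["distance"])
    return scored[:top]


# Entries as returned by the List Entries example.
entries = [
    {"id": 1, "embedding": [0.2672612, 0.5345224, 0.8017837]},
    {"id": 2, "embedding": [0.37139064, 0.557086, 0.7427813]},
    {"id": 3, "embedding": [0.4242641, 0.56568545, 0.70710677]},
]
matches = search(entries, [0.1, 0.2, 0.3], top=3)
```

The resulting order is 1, 2, 3, with distances of roughly 0, 0.0074 and 0.0173, in line with the response body. Small differences in the last digits are expected because of floating-point precision.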
