dtconvert

dtconvert is a Linux-first CLI for converting documents and data between common formats and for moving data in/out of PostgreSQL, with an optional AI helper for summarization, web search, and citations.

The core binary is written in C for speed, portability, and predictable behavior. Actual file-format conversions are delegated to small converter modules (shell scripts) so it’s easy to extend.

What problem this solves

If you routinely deal with:

  • Mixed formats (DOCX/ODT/TXT/PDF/CSV/XLSX/JSON/YAML)
  • “Just convert this quickly” tasks that are annoying to do repeatedly
  • Moving CSV data into PostgreSQL and exporting results back to CSV
  • Needing quick, local-first summarization/citations over documents and URLs

dtconvert provides a single CLI interface with consistent flags, clear errors, and a modular conversion pipeline.

Key goals

  • Simple UX: dtconvert <input> --to <format>
  • Linux-first: shells out to common tools, designed for scripting
  • Modular conversion: add/replace converters without changing the core binary
  • Local-first AI: works with Ollama locally; also supports an OpenAI-compatible backend
  • Predictable behavior: stable exit codes and minimal surprises

Uniqueness / prior art

There are plenty of existing tools that cover parts of this space (document converters like Pandoc/LibreOffice, data-format converters, psql-based import/export scripts, and standalone AI CLI tools).

dtconvert is intended to be distinctive in how it packages these workflows:

  • One CLI for common “format glue” work: document/data conversion + PostgreSQL import/export + optional AI helper.
  • Easy to extend: most new conversions can be added as a small script in modules/.
  • Simple pipelines: some conversions intentionally chain steps (e.g., CSV→PDF via CSV→TXT→PDF).
  • Clear dependencies: install guidance is mapped to modules/features so you can keep setups minimal.
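
The chaining idea can be sketched as two small shell functions (illustrative only, not the real module code — the actual modules/csv_to_pdf.sh handles CSV quoting properly and delegates the PDF stage to enscript + ps2pdf):

```shell
#!/bin/sh
# Sketch of the chained-conversion pattern (CSV -> TXT -> PDF).
# csv_to_txt and csv_to_pdf here are illustrative, not dtconvert's modules.

csv_to_txt() {
    # Naive stage: commas become tabs (a real converter handles quoting).
    tr ',' '\t' < "$1" > "$2"
}

csv_to_pdf() {
    mid=$(mktemp)
    csv_to_txt "$1" "$mid"                    # stage 1: reuse the CSV->TXT step
    enscript -B -p - "$mid" | ps2pdf - "$2"   # stage 2: TXT->PDF (needs enscript + ghostscript)
    rm -f "$mid"
}
```

Each stage reads a file and writes a file, so stages compose through a temporary path without the core binary knowing the details.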

Quick start

1) Clone

git clone https://github.com/Chalo1996/dtconvert.git
cd dtconvert

2) Build

make

If make fails due to missing build tools, install the build dependencies listed in the Dependencies section below (e.g., build-essential on Ubuntu/Debian).

This builds:

  • bin/dtconvert (the main CLI)
  • helper binaries under lib/converters/ used by some modules

3) Run

./bin/dtconvert --help
./bin/dtconvert --version

Sanity check (runs a real conversion with no extra dependencies beyond the build):

printf '%s\n' 'name,age' 'Ada,37' > /tmp/dtconvert_sanity.csv
./bin/dtconvert /tmp/dtconvert_sanity.csv --to json -o /tmp/dtconvert_sanity.json
cat /tmp/dtconvert_sanity.json

4) Install

System-wide install (requires sudo):

sudo make install

User install (no sudo):

make install PREFIX=$HOME/.local

Notes:

  • Converter modules and helper binaries are installed under $PREFIX/lib/dtconvert/.
  • If $PREFIX/bin is not on your PATH, make install appends an export PATH=... line to your shell rc files (idempotent) and asks you to restart your shell.
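
To check whether your install prefix's bin directory is already a PATH component, a generic POSIX sh pattern (not part of dtconvert) works:

```shell
#!/bin/sh
# Succeeds if the given directory appears as an exact PATH component.
on_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if on_path "$HOME/.local/bin"; then
    echo "$HOME/.local/bin is on PATH"
else
    echo "$HOME/.local/bin is NOT on PATH; restart your shell or source your rc file"
fi
```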

Updating an existing install (common troubleshooting):

command -v dtconvert
./bin/dtconvert --version
dtconvert --version

If dtconvert (from your PATH) is older than ./bin/dtconvert (from this repo), rebuild + reinstall. For a prior user install:

make clean && make
make install PREFIX=$HOME/.local

Docker Quick Start

Run dtconvert in Docker without installing any dependencies on your host system.

Option 1: Pull from Docker Hub (recommended)

docker pull chaloemmanuel/dtconvert:latest
docker run --rm chaloemmanuel/dtconvert --version

Option 2: Build locally

docker build -t dtconvert .

Run basic conversions

# Using the Docker Hub image:
docker run --rm chaloemmanuel/dtconvert --help
docker run --rm chaloemmanuel/dtconvert --version

# Or using a locally built image:
docker run --rm dtconvert --help
docker run --rm dtconvert --version

# Convert a file (mount current directory)
docker run --rm -v "$(pwd)":/data chaloemmanuel/dtconvert input.csv --to json -o output.json

The Docker image includes all optional dependencies (LibreOffice, enscript, Ghostscript, xlsx2csv, psql, curl), so every supported conversion works out of the box.

Volume Mounting

Mount your local directory to /data inside the container for file access:

# Single file conversion
docker run --rm -v "$(pwd)":/data dtconvert myfile.docx --to pdf

# Convert a file in a subdirectory
docker run --rm -v "$(pwd)":/data dtconvert subdir/data.csv --to json -o subdir/data.json

# Mount a specific directory
docker run --rm -v /path/to/files:/data dtconvert input.xlsx --to csv

The container's working directory is /data, so paths are relative to your mounted directory.

Environment Variables

Pass environment variables for PostgreSQL and AI features:

PostgreSQL credentials:

docker run --rm -v "$(pwd)":/data \
  -e PGPASSWORD=yourpassword \
  dtconvert data.csv --to postgresql -o config.json

AI features (OpenAI):

docker run --rm -v "$(pwd)":/data \
  -e OPENAI_API_KEY=your-api-key \
  dtconvert ai summarize document.txt --backend openai

AI features (Ollama):

docker run --rm -v "$(pwd)":/data \
  --network host \
  -e DTCONVERT_OLLAMA_HOST=http://127.0.0.1:11434 \
  -e DTCONVERT_OLLAMA_MODEL=llama3.1 \
  dtconvert ai summarize document.txt

AI search and citations (no API key needed):

# Generate search URL
docker run --rm chaloemmanuel/dtconvert ai search "docker containers"

# Generate citations
docker run --rm chaloemmanuel/dtconvert ai cite https://example.com --style apa

Docker Compose

A docker-compose.yml is included for running dtconvert with a local PostgreSQL database:

# Start PostgreSQL service
docker compose up -d postgres

# Run conversions (PostgreSQL available at localhost)
docker compose run --rm dtconvert data.csv --to postgresql -o config.json

# Export from PostgreSQL
docker compose run --rm dtconvert config.json --from postgresql --to csv -o export.csv

# Stop services
docker compose down

The compose setup uses these default PostgreSQL credentials:

  • Host: postgres (or localhost with --network host)
  • User: dtconvert
  • Password: dtconvert
  • Database: dtconvertdb

Convenience Wrapper

For frequent use, create a shell alias or wrapper script:

Shell alias (add to ~/.bashrc or ~/.zshrc):

alias dtconvert='docker run --rm -v "$PWD":/data chaloemmanuel/dtconvert'

Then use it like the native command:

dtconvert input.csv --to json -o output.json

Wrapper script (save as ~/bin/dtconvert-docker):

#!/bin/bash
docker run --rm \
  -v "$(pwd):/data" \
  -e PGPASSWORD="${PGPASSWORD:-}" \
  -e OPENAI_API_KEY="${OPENAI_API_KEY:-}" \
  -e DTCONVERT_OLLAMA_HOST="${DTCONVERT_OLLAMA_HOST:-}" \
  -e DTCONVERT_OLLAMA_MODEL="${DTCONVERT_OLLAMA_MODEL:-}" \
  chaloemmanuel/dtconvert "$@"

Make it executable: chmod +x ~/bin/dtconvert-docker

Dependencies

dtconvert builds with just a C toolchain, but some conversions rely on external tools.

  • Build (required): gcc (or clang) and make
  • AI (dtconvert ai ...): curl (required), xdg-open (only for ai search --open)
  • DOCX/ODT→PDF: libreoffice (recommended) or unoconv or pandoc (pandoc PDF output may require LaTeX)
  • TXT/CSV→PDF: enscript + Ghostscript (ps2pdf)
  • XLSX/CSV: xlsx2csv (preferred) or libreoffice or ssconvert (Gnumeric)
  • PostgreSQL: psql (postgresql-client)

Note: If LibreOffice prints "Warning: failed to launch javaldx - java may not function correctly", install Java support for LibreOffice (Ubuntu/Debian: sudo apt install -y default-jre libreoffice-java-common). The warning is typically non-fatal for PDF export, but installing these packages usually removes it.

Ubuntu/Debian (install only what you need):

sudo apt update
sudo apt install -y build-essential \
  curl xdg-utils \
  libreoffice unoconv pandoc texlive-latex-recommended \
  enscript ghostscript \
  xlsx2csv gnumeric \
  postgresql-client

Fedora (install only what you need):

sudo dnf install -y gcc make \
  curl xdg-utils \
  libreoffice unoconv pandoc \
  texlive-scheme-medium \
  enscript ghostscript \
  xlsx2csv gnumeric \
  postgresql

Arch Linux (install only what you need):

sudo pacman -S --needed base-devel \
  curl xdg-utils \
  libreoffice-fresh unoconv pandoc \
  texlive \
  enscript ghostscript \
  xlsx2csv gnumeric \
  postgresql

Note (Arch): If you prefer the stable LibreOffice track, replace libreoffice-fresh with libreoffice-still. If a package isn’t available in your enabled repos (varies by distro mirrors/config), you may need an AUR build instead.

Usage

Common usage examples are below. Note that some conversions require external tools (LibreOffice, Pandoc, etc.); see Dependencies for install options.

Supported conversions

From        To           Implementation
csv         json         lib/converters/data_convert
csv         pdf          modules/csv_to_pdf.sh
csv         postgresql   modules/csv_to_postgresql.sh
csv         sql          modules/csv_to_sql.sh
csv         txt          modules/csv_to_txt.sh
csv         xlsx         modules/csv_to_xlsx.sh
csv         yaml         lib/converters/data_convert
docx        odt          modules/docx_to_odt.sh
docx        pdf          modules/docx_to_pdf.sh
json        csv          lib/converters/data_convert
json        yaml         lib/converters/data_convert
odt         docx         modules/odt_to_docx.sh
odt         pdf          modules/odt_to_pdf.sh
postgresql  csv          modules/postgresql_to_csv.sh
sql         csv          modules/sql_to_csv.sh
txt         pdf          modules/txt_to_pdf.sh
txt         tokens       modules/txt_to_tokens.sh
xlsx        csv          modules/xlsx_to_csv.sh
yaml        csv          lib/converters/data_convert
yaml        json         lib/converters/data_convert

Convert files

./bin/dtconvert document.docx --to pdf
./bin/dtconvert document.docx --to odt
./bin/dtconvert document.odt --to docx
./bin/dtconvert notes.txt --to pdf -o notes.pdf
./bin/dtconvert spreadsheet.xlsx --to csv
./bin/dtconvert data.csv --to json
./bin/dtconvert data.yaml --to json

PostgreSQL import/export

Import a CSV into PostgreSQL using a JSON config file:

./bin/dtconvert people.csv --to postgresql -o examples/postgresql.csv_to_postgresql.json

Export from PostgreSQL to CSV:

./bin/dtconvert examples/postgresql.csv_to_postgresql.json --from postgresql --to csv -o export.csv

Notes:

  • PostgreSQL operations require psql available on your PATH.
  • Import/export uses a JSON config file passed via -o/--output.
  • Import/export will not prompt for a password; set up credentials via .pgpass or PGPASSWORD rather than embedding passwords in the config file.

If you want to run the full conversion test suite (including PostgreSQL) without being prompted for a password, you can pass it via the environment:

PGPASSWORD=dtconvert make conversions-smoke

The conversion sweep automatically detects PGPASSWORD/~/.pgpass; if neither is set up (or the DB is unreachable), PostgreSQL tests will be skipped.
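
The detection presumably amounts to a check like the following (an illustrative sketch, not the sweep's actual code):

```shell
#!/bin/sh
# Illustrative: decide whether non-interactive PostgreSQL credentials
# are available, mirroring what the conversion sweep checks for.
have_pg_credentials() {
    [ -n "${PGPASSWORD:-}" ] && return 0   # explicit env var wins
    [ -r "$HOME/.pgpass" ]   && return 0   # otherwise a readable ~/.pgpass
    return 1
}

if have_pg_credentials; then
    echo "credentials found; PostgreSQL tests can run"
else
    echo "no credentials; PostgreSQL tests would be skipped"
fi
```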

Local PostgreSQL quick setup (example):

# Install server + client (Debian/Ubuntu)
sudo apt update
sudo apt install -y postgresql postgresql-client

# Create user + database
sudo -u postgres psql -c "CREATE USER dtconvert WITH PASSWORD 'dtconvert';"
sudo -u postgres psql -c "CREATE DATABASE dtconvertdb OWNER dtconvert;"

# Verify connectivity (non-interactive)
PGPASSWORD=dtconvert psql -h localhost -U dtconvert -d dtconvertdb -c 'SELECT 1;'

Recommended (avoid exporting PGPASSWORD every time):

printf '%s\n' 'localhost:5432:dtconvertdb:dtconvert:dtconvert' >> ~/.pgpass
chmod 600 ~/.pgpass

AI helper

AI features are built into the dtconvert binary:

  • summarize: summarize a local file
  • search: print a DuckDuckGo URL and optionally open it in your browser
  • cite: generate simple APA/MLA citations for one or more URLs

Examples:

./bin/dtconvert ai summarize README.md
./bin/dtconvert ai search "postgresql copy csv" --open
./bin/dtconvert ai cite https://example.com --style apa

AI dependencies:

  • curl (HTTP requests)
  • xdg-open (only needed when using ai search --open)

AI environment variables

General:

  • DTCONVERT_AI_TIMEOUT (curl max-time in seconds; affects ai summarize and ai cite. Defaults vary by operation.)

Local-first (Ollama):

  • DTCONVERT_OLLAMA_HOST (default: http://127.0.0.1:11434)
  • DTCONVERT_OLLAMA_MODEL (default: llama3.1)

OpenAI-compatible:

  • OPENAI_API_KEY (required)
  • OPENAI_BASE_URL (optional, default: https://api.openai.com/v1)
  • DTCONVERT_OPENAI_MODEL (default: gpt-4o-mini)

How it works (high level)

  • The main CLI (C) parses arguments, validates inputs, and selects a converter.
  • Converters are executable scripts under modules/ and follow a simple contract:
<module_script> <input_path> <output_path>
  • Some modules call small helper binaries built from C sources in lib/converters/.

Adding a new converter module

  1. Create a new script in modules/, for example modules/foo_to_bar.sh.
  2. Ensure it’s executable.
  3. Register it in the converter registry (currently a static table in src/conversion.c).
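
For instance, a hypothetical converter (txt_to_upper is illustrative, not a real dtconvert module) honoring the two-argument contract could look like this; the logic is shown as a function for readability, while a real modules/txt_to_upper.sh would run the same body directly on "$1" and "$2":

```shell
#!/bin/sh
# Hypothetical module following the <module_script> <input> <output> contract:
# validate the input, write the converted result, and signal failure
# with a non-zero exit status.
txt_to_upper() {
    input="$1"
    output="$2"
    if [ ! -f "$input" ]; then
        echo "txt_to_upper: input not found: $input" >&2
        return 1
    fi
    tr '[:lower:]' '[:upper:]' < "$input" > "$output"
}
```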

Project status

This is an MVP focused on pragmatic conversions and predictable CLI behavior. Expect formats and modules to evolve.

License

MIT. See LICENSE.

Contributions are welcome. The project currently has a single maintainer.

Maintainer: Emmanuel Chalo (emusyoka759@gmail.com)
