A desktop app to help you transcribe Ghanaian audio files using Gemini.
Linux (Debian/Ubuntu):
sudo apt install python3-tk xclip wl-clipboard
git clone https://github.com/GhanaNLP/transcriber.git
cd transcriber
python transcriber.py --code YOUR_CODE

Windows:
git clone https://github.com/GhanaNLP/transcriber.git
cd transcriber
python transcriber.py --code YOUR_CODE
No extra dependencies needed. Python on Windows includes everything out of the box.
Replace YOUR_CODE with the code you were sent. The app will automatically download your assigned audio files and open ready to transcribe.
- On first run, your assigned audio files are downloaded automatically
- Click "⎘ Copy audio file" → paste the file into Gemini
- Click "✦ Gemini prompt 1" (or 2) → paste the prompt into Gemini
- Copy Gemini's response → paste into the textarea
- The app validates and auto-saves:
  - Consecutive repeated sentences are removed automatically
  - Transcripts outside the 18,000–36,000 character range are blocked; re-paste a better version or click "Skip ⇥" to move on
  - Duplicate transcripts (identical to one already saved for another file) are blocked
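
For readers who want to pre-check a transcript before pasting it, the three validation rules above can be sketched roughly as follows. This is an illustration only, not the app's actual code: the function names, the sentence-splitting rule, the inclusive bounds, and the order of the checks are all assumptions.

```python
import re

MIN_CHARS = 18_000  # lower bound of the accepted range
MAX_CHARS = 36_000  # upper bound of the accepted range (assumed inclusive)


def drop_consecutive_repeats(text: str) -> str:
    """Remove a sentence when it is identical to the one right before it."""
    # Naive split on whitespace that follows ., ! or ? (an assumption,
    # the app may split sentences differently).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    kept = []
    for sentence in sentences:
        if not kept or sentence != kept[-1]:
            kept.append(sentence)
    return " ".join(kept)


def validate(text: str, saved: list[str]) -> tuple[bool, str]:
    """Return (True, cleaned text) or (False, reason for rejection)."""
    cleaned = drop_consecutive_repeats(text)
    if not MIN_CHARS <= len(cleaned) <= MAX_CHARS:
        return False, f"length {len(cleaned)} is outside {MIN_CHARS}-{MAX_CHARS}"
    if cleaned in saved:
        return False, "identical to a transcript already saved for another file"
    return True, cleaned
```

Here `saved` stands for the texts already stored for other files. A rejected transcript corresponds to the "blocked" cases in the list above.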
Click "Skip ⇥" if you can't get a valid transcript after several attempts. The filename is written to skipped.log in your transcripts folder so it won't appear again in future sessions. To un-skip a file, remove its entry from skipped.log.
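
If you prefer not to edit skipped.log by hand, un-skipping could be scripted along these lines. This is a minimal sketch that assumes skipped.log holds one filename per line, which is not confirmed here; `unskip` is a hypothetical helper, not part of the app.

```python
from pathlib import Path


def unskip(log_path: Path, filename: str) -> bool:
    """Remove filename from the skip log; return True if an entry was removed."""
    if not log_path.exists():
        return False
    # Assumed format: one skipped filename per line.
    entries = log_path.read_text(encoding="utf-8").splitlines()
    remaining = [e for e in entries if e.strip() != filename]
    if len(remaining) == len(entries):
        return False  # nothing matched, leave the file untouched
    log_path.write_text("".join(f"{e}\n" for e in remaining), encoding="utf-8")
    return True
```

For example, `unskip(Path("transcripts") / "twi" / "skipped.log", "clip_0042.wav")`, where both the language folder and the filename are made-up placeholders.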
When you are done (or want to submit progress):
- Find your transcripts folder: it is at transcripts/<language>/ inside the folder where you ran the app
- Zip the entire transcripts/<language>/ folder
- Go to the GitHub repo → Issues → New Issue → select "Submit Transcription Results"
- Fill in the form and attach your zip file
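
The zipping step above can also be done from Python with the standard library. This is a sketch under stated assumptions: `zip_transcripts` and the archive name `<language>_transcripts.zip` are illustrative choices, not something the app or the submission form requires.

```python
import shutil
from pathlib import Path


def zip_transcripts(language: str, root: Path = Path(".")) -> Path:
    """Zip root/transcripts/<language>/ into root/<language>_transcripts.zip."""
    source = root / "transcripts" / language
    if not source.is_dir():
        raise FileNotFoundError(f"no transcripts folder at {source}")
    # root_dir/base_dir keep the <language>/ folder as the top level of the zip.
    archive = shutil.make_archive(
        base_name=str(root / f"{language}_transcripts"),
        format="zip",
        root_dir=source.parent,
        base_dir=source.name,
    )
    return Path(archive)
```

Run it from the folder where you ran the app, passing the name of your language folder, then attach the resulting zip file to the issue form.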