# tokencount

A fast command-line tool to estimate token counts for AI models (GPT-3.5, GPT-4, Claude, etc.).
## Features

- Fast token estimation without external dependencies
- Support for multiple AI model token patterns
- Cross-platform (Linux, macOS, Windows)
- Multiple input methods (direct text, files, stdin)
- Verbose mode with detailed token breakdown
- JSON output for programmatic use
- Lightweight, with a minimal binary size
## Installation

Download the latest release from the releases page.
### macOS

```bash
# Download and extract (replace with your architecture: x86_64 or arm64)
tar -xzf tokencount_Darwin_x86_64.tar.gz

# IMPORTANT: remove the macOS quarantine flag (required for unsigned binaries)
xattr -d com.apple.quarantine tokencount

# Make executable
chmod +x tokencount

# Move to a directory on your PATH (optional)
sudo mv tokencount /usr/local/bin/
```

Note: macOS may block the binary because it is not code-signed. Use the `xattr` command above, or:

- Right-click the binary and select "Open", then click "Open" in the dialog
- Or go to System Settings > Privacy & Security to allow it after trying to run it
### Linux

```bash
# Download and extract (replace with your architecture)
tar -xzf tokencount_Linux_x86_64.tar.gz

# Make executable
chmod +x tokencount

# Move to a directory on your PATH (optional)
sudo mv tokencount /usr/local/bin/
```

### Windows

Download the Windows zip file, extract it, and run `tokencount.exe` from the command prompt.
### Build from source

```bash
git clone https://github.com/wearewebera/tokencount.git
cd tokencount
go build -o tokencount
```

## Usage

### Basic usage

```bash
# Estimate tokens for direct text
tokencount "Hello, world!"

# Estimate tokens from a file
tokencount -f document.txt

# Pipe text from another command
echo "Some text" | tokencount
cat document.txt | tokencount
```

### Model selection

The tool supports different estimation algorithms optimized for various AI models:
```bash
# Use GPT-4 estimation (default)
tokencount "Your text here"

# Use GPT-3.5 estimation
tokencount -m gpt-3.5 "Your text here"

# Use Claude estimation
tokencount -m claude "Your text here"

# Use simple estimation (4 characters = 1 token)
tokencount -m simple "Your text here"
```

### Output options

```bash
# Verbose output with token breakdown
tokencount -v "Hello, world!"

# JSON output for programmatic use
tokencount -j "Your text here"

# Combine options
tokencount -m gpt-4 -v -j "Detailed analysis"
```

## Examples

Basic estimation:

```
$ tokencount "The quick brown fox jumps over the lazy dog"
Tokens: 9
Model: gpt-4 (GPT-4 estimation algorithm)
```

Verbose output:

```
$ tokencount -v "Hello, world! 123"
Tokens: 5
Model: gpt-4 (GPT-4 estimation algorithm)

Token breakdown:
--------------------------------------------------
Hello    word           1 token(s)
,        punctuation    1 token(s)
world    word           1 token(s)
!        punctuation    1 token(s)
123      number         1 token(s)

Original text:
--------------------------------------------------
Hello, world! 123
```

JSON output:

```
$ echo "API request" | tokencount -j
{
  "token_count": 2,
  "model": "gpt-4",
  "model_info": "GPT-4 estimation algorithm"
}
```

File input:

```
$ tokencount -f large_document.txt
Tokens: 1523
Model: gpt-4 (GPT-4 estimation algorithm)
```

## How it works

The tool uses different algorithms to estimate token counts based on the selected model:
- Simple: basic estimation using the 4 characters = 1 token rule
- GPT-3.5/GPT-4: advanced algorithm considering:
  - Word boundaries and length
  - Punctuation (typically 1 token each)
  - Numbers (grouped by digits)
  - Unicode characters (1 token per character for CJK, etc.)
- Claude: similar to GPT, but with Claude-specific optimizations

Note: these are estimations. Actual token counts may vary slightly from the real tokenizers used by AI providers.
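The heuristics above can be sketched in Go. This is an illustrative approximation of the rules listed, not the tool's actual implementation; the function names are invented for the example:

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// estimateSimple applies the simple rule: 4 characters = 1 token,
// rounding up so short non-empty inputs still count as one token.
func estimateSimple(text string) int {
	n := utf8.RuneCountInString(text)
	return (n + 3) / 4
}

// estimateHeuristic follows the advanced rules described above: each word
// or digit group counts as one token, each punctuation mark as one token,
// and CJK characters as one token per character.
func estimateHeuristic(text string) int {
	tokens := 0
	inGroup := false
	for _, r := range text {
		switch {
		case unicode.Is(unicode.Han, r):
			tokens++ // CJK: one token per character
			inGroup = false
		case unicode.IsLetter(r) || unicode.IsDigit(r):
			if !inGroup {
				tokens++ // start of a new word or number group
			}
			inGroup = true
		case unicode.IsPunct(r) || unicode.IsSymbol(r):
			tokens++ // punctuation: typically one token each
			inGroup = false
		default:
			inGroup = false // whitespace ends the current group
		}
	}
	return tokens
}

func main() {
	// Both rules happen to agree with the verbose example above (5 tokens).
	fmt.Println(estimateSimple("Hello, world! 123"))    // 5
	fmt.Println(estimateHeuristic("Hello, world! 123")) // 5
}
```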
## Command-line reference

```
Usage:
  tokencount [options] [text]
  echo "text" | tokencount [options]
  tokencount -f file.txt [options]

Options:
  -m, --model string   Model to use for estimation (simple, gpt-3.5, gpt-4, claude) (default "gpt-4")
  -f, --file string    Input file to read
  -v, --verbose        Verbose output with token breakdown
  -j, --json           Output results as JSON
  -h, --help           Show help
      --version        Show version information
```
## Development

Run the tests:

```bash
go test ./... -v
```

Build for other platforms:

```bash
# Linux
GOOS=linux GOARCH=amd64 go build -o tokencount-linux-amd64

# macOS
GOOS=darwin GOARCH=amd64 go build -o tokencount-darwin-amd64
GOOS=darwin GOARCH=arm64 go build -o tokencount-darwin-arm64

# Windows
GOOS=windows GOARCH=amd64 go build -o tokencount-windows-amd64.exe
```

## Contributing

Contributions are welcome! Please feel free to submit a pull request.

## License

This project is licensed under the MIT License - see the LICENSE file for details.