Rust implementation of Stranger Strings: extract candidate strings from binaries and score them using a Ghidra-compatible trigram model.
- Extracts strings from binaries (with offset tracking)
- Scores strings with trigram probabilities (`.sng` model format)
- Supports multiple extraction encodings: `ascii`, `utf8`, `utf16le`, `utf16be`, `latin1`, `latin9`
- Can use script-aware scoring for `han`, `arabic`, and `cyrillic`
- Outputs in `text`, `json`, or `csv`
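The trigram-scoring idea can be sketched in isolation: a model maps character trigrams to log probabilities, and a string's score is the average log probability of its trigrams. This is a simplified illustration under assumed conventions, not the crate's actual `.sng` loader or scorer; the tiny table, floor value, and function name here are made up:

```rust
use std::collections::HashMap;

/// Score a string as the mean log10 probability of its character trigrams.
/// Unknown trigrams fall back to a small floor probability.
/// (Illustrative sketch only; not the crate's real scoring code.)
fn trigram_score(s: &str, model: &HashMap<[char; 3], f64>, floor: f64) -> f64 {
    let chars: Vec<char> = s.chars().collect();
    if chars.len() < 3 {
        return floor; // too short to form a trigram
    }
    let mut total = 0.0;
    let mut count = 0u32;
    for w in chars.windows(3) {
        total += model.get(&[w[0], w[1], w[2]]).copied().unwrap_or(floor);
        count += 1;
    }
    total / count as f64
}

fn main() {
    // Toy model: common English trigrams get higher (less negative) log probs.
    let mut model = HashMap::new();
    model.insert(['t', 'h', 'e'], -1.5);
    model.insert(['h', 'e', 'r'], -2.0);
    let floor = -6.0;

    let good = trigram_score("ther", &model, floor); // (-1.5 + -2.0) / 2 = -1.75
    let junk = trigram_score("zqxv", &model, floor); // all trigrams unknown -> -6.0
    println!("good={good} junk={junk}");
    assert!(good > junk);
}
```

Real models are trained on large corpora, so plausible text scores well above the floor while random byte runs do not.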
Builds for Linux, macOS (x86_64 + aarch64), and Windows are published from Git tags in GitHub Actions.
```shell
git clone https://github.com/closed-systems/stranger-strings-rs
cd stranger-strings-rs
cargo build --release
```

Binary path: `target/release/stranger-strings`

```shell
stranger-strings [OPTIONS] <input>
```

`input` can be a file path or `-` for stdin.
```shell
# Analyze a binary with default settings (ASCII extraction)
stranger-strings ./sample.bin

# Verbose output
stranger-strings -v ./sample.bin

# JSON output
stranger-strings -f json -o result.json ./sample.bin

# Use explicit model path
stranger-strings -m ./StringModel.sng ./sample.bin
```

If `--model` is omitted, the CLI looks for `StringModel.sng` next to the executable (not in the current working directory).
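The "next to the executable" lookup can be reproduced in your own code with `std::env::current_exe`. A minimal sketch — the file name `StringModel.sng` matches the CLI's default, but the helper itself is illustrative, not the crate's API:

```rust
use std::path::PathBuf;

/// Resolve a default model path: a file sitting next to the running executable.
/// (Illustrative helper, not part of stranger-strings-rs.)
fn default_model_path(file_name: &str) -> Option<PathBuf> {
    std::env::current_exe()
        .ok()?
        .parent()
        .map(|dir| dir.join(file_name))
}

fn main() {
    if let Some(path) = default_model_path("StringModel.sng") {
        println!("would look for model at {}", path.display());
    }
}
```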
```shell
# Multiple encodings
stranger-strings -e utf8,utf16le,latin1 ./sample.bin

# All supported encodings
stranger-strings -e all ./sample.bin

# Auto-detect script and score with script-specific scorer
stranger-strings --auto-detect -e utf8 ./sample.bin

# Restrict to specific scripts
stranger-strings -L chinese,russian,arabic -e utf8 ./sample.bin
```

Important notes:
- Latin and unknown-script scoring use the trigram scorer.
- If no trigram model is loaded and Latin/unknown text is scored, analysis fails with `ModelNotLoaded`.
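If you drive the library yourself, that failure mode is worth guarding against. A sketch of the pattern using a stand-in error enum rather than the crate's real error type — only the variant name `ModelNotLoaded` comes from the behavior described above; everything else here is hypothetical:

```rust
#[derive(Debug, PartialEq)]
enum AnalysisError {
    ModelNotLoaded,
}

/// Stand-in scorer: Latin/unknown text needs a trigram model to be loaded.
/// (Hypothetical function; the real crate's API differs.)
fn score_latin(text: &str, model_loaded: bool) -> Result<f64, AnalysisError> {
    if !model_loaded {
        return Err(AnalysisError::ModelNotLoaded);
    }
    // Placeholder score; a real scorer would consult the trigram model.
    Ok(text.len() as f64 * 0.1)
}

fn main() {
    match score_latin("hello", false) {
        Err(AnalysisError::ModelNotLoaded) => eprintln!("load a model first (see --model)"),
        Ok(score) => println!("score={score}"),
    }
}
```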
```shell
# Minimum extracted string length
stranger-strings -l 6 ./sample.bin

# Keep only unique strings
stranger-strings -u ./sample.bin

# Sort: score | alpha | offset
stranger-strings -s offset ./sample.bin

# Show model metadata and exit
stranger-strings --info

# Run built-in test strings
stranger-strings --test
```

```rust
use stranger_strings_rs::{AnalysisOptions, StrangerStrings};

let mut analyzer = StrangerStrings::new();
analyzer.load_model(&AnalysisOptions {
    model_path: Some("./StringModel.sng".to_string()),
    ..Default::default()
})?;

let result = analyzer.analyze_string("hello world")?;
println!("valid={} score={:.3}", result.is_valid, result.score);
```

```rust
use stranger_strings_rs::{BinaryAnalysisOptions, StrangerStrings, SupportedEncoding};

let mut analyzer = StrangerStrings::new();
analyzer.load_model(&stranger_strings_rs::AnalysisOptions {
    model_path: Some("./StringModel.sng".to_string()),
    ..Default::default()
})?;

let bytes = std::fs::read("./sample.bin")?;
let results = analyzer.analyze_binary_file(
    &bytes,
    &BinaryAnalysisOptions {
        min_length: Some(4),
        encodings: Some(vec![SupportedEncoding::Ascii, SupportedEncoding::Utf16le]),
        use_language_scoring: false,
        ..Default::default()
    },
)?;
println!("{} strings analyzed", results.len());
```

```rust
use stranger_strings_rs::StrangerStrings;

let mut analyzer = StrangerStrings::new();
analyzer.enable_language_detection()?;

let detection = analyzer.detect_language("Привет мир")?;
println!("script={:?} confidence={:.2}", detection.primary_script, detection.confidence);
```

For Latin text with a loaded model, scoring is intended to match the original TypeScript implementation and `.sng` model behavior.
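Script detection of the kind shown above can be approximated with Unicode block checks: count which block most of the alphabetic characters fall into. This is a simplified, self-contained sketch, not the crate's detector — the block ranges are standard Unicode ranges, but the `Script` enum and classification logic are illustrative:

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq, Hash)]
enum Script {
    Latin,
    Cyrillic,
    Han,
    Arabic,
    Unknown,
}

/// Classify a char by a few major Unicode blocks.
fn char_script(c: char) -> Script {
    match c as u32 {
        0x0041..=0x024F => Script::Latin,    // Basic Latin + Latin extensions
        0x0400..=0x04FF => Script::Cyrillic, // Cyrillic
        0x0600..=0x06FF => Script::Arabic,   // Arabic
        0x4E00..=0x9FFF => Script::Han,      // CJK Unified Ideographs
        _ => Script::Unknown,
    }
}

/// Pick the most frequent script among alphabetic chars.
fn detect_script(s: &str) -> Script {
    let mut counts = HashMap::new();
    for c in s.chars().filter(|c| c.is_alphabetic()) {
        *counts.entry(char_script(c)).or_insert(0u32) += 1;
    }
    counts
        .into_iter()
        .max_by_key(|&(_, n)| n)
        .map(|(script, _)| script)
        .unwrap_or(Script::Unknown)
}

fn main() {
    println!("{:?}", detect_script("Привет мир"));  // Cyrillic
    println!("{:?}", detect_script("hello world")); // Latin
}
```

A real detector also needs a confidence measure (for example, the winning script's share of counted characters) to decide when to fall back to the trigram scorer.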
Current tests include compatibility checks and language-scoring checks:

```shell
cargo test
```

A release workflow is included at `.github/workflows/release.yml`. On tag push (for example `v0.1.0`), it builds and publishes artifacts for:

- `x86_64-unknown-linux-gnu`
- `x86_64-apple-darwin`
- `aarch64-apple-darwin`
- `x86_64-pc-windows-msvc`
```shell
# Format + lint (if installed)
cargo fmt
cargo clippy --all-targets --all-features

# Test
cargo test

# Run CLI locally
cargo run -- --help
```

PRs are welcome. Keep changes focused, add or adjust tests alongside behavior changes, and update CLI/library docs when flags or API behavior change.
Apache-2.0
