A fast Python toolkit for Persian (Farsi) text normalization, number/character conversion, and localization utilities.
pip install persian
import persian
persian.convert_ar_characters("علي") # 'علی'
persian.convert_fa_numbers("۱۳۷۱") # '1371'
persian.convert_en_numbers("345") # '۳۴۵'
persian.convert_en_characters("sghl") # 'سلام'
persian.convert_fa_spaces("آمده ای") # 'آمدهای'
persian.decode_url("https://.../%D8%B5%D9%81%D8%AD%D9%87") # 'https://.../صفحه'
Need a one-stop helper? Use normalize_persian("سلام ٣٤٥ می آیم") to get سلام ۳۴۵ میآیم.
Version 1.0.0 includes major speedups:
- 3–5× faster number and keyboard conversions via str.translate
- 2–3× faster spacing fixes using pre-compiled regular expressions
- ~50% lower memory usage on large strings thanks to fewer temporary objects
See docs/PERFORMANCE.md for benchmark methodology and charts.
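The two techniques named in the list above can be illustrated with a short sketch. This is not the package's actual implementation: `to_persian_digits`, `join_mi_prefix`, and the specific spacing rule are hypothetical and chosen only to show a `str.translate` table and a regular expression compiled once at import time.

```python
import re

# Translation table built once: ASCII digits 0-9 -> Persian digits U+06F0..U+06F9.
_EN_TO_FA_DIGITS = {ord(str(d)): chr(0x06F0 + d) for d in range(10)}

# Compiled once at import time, so calls do not recompile the pattern.
# Matches a standalone "mi" or "nemi" verb prefix (U+0646? U+0645 U+06CC)
# followed by a space and a word character. Hypothetical rule for illustration.
_MI_PREFIX = re.compile(r"\b(\u0646?\u0645\u06cc) (?=\w)")

ZWNJ = "\u200c"  # zero-width non-joiner, the Persian "half-space"


def to_persian_digits(text: str) -> str:
    """Replace ASCII digits in a single pass using str.translate."""
    return text.translate(_EN_TO_FA_DIGITS)


def join_mi_prefix(text: str) -> str:
    """Replace the space after a "mi"/"nemi" prefix with a ZWNJ."""
    return _MI_PREFIX.sub(r"\1" + ZWNJ, text)
```

A translation table avoids one `str.replace` call per digit, and a module-level compiled pattern avoids repeated pattern lookups, which is the general reason both approaches tend to be faster on large inputs.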
The package ships with native type hints (py.typed) and strict type checking.
Static analyzers know the exact return type of every function:
from persian import convert_en_numbers
result: str = convert_en_numbers("123") # ✔️ type checker is satisfied
All public functions validate inputs and raise clear exceptions:
import persian
persian.convert_en_numbers(None) # ValueError: input_str cannot be None
persian.convert_en_numbers(123) # TypeError: input_str must be str, got int
Empty strings are accepted and return empty strings.
| Category | Functions |
|---|---|
| Numbers | convert_en_numbers, convert_fa_numbers, convert_ar_numbers |
| Characters | convert_en_characters, convert_ar_characters, remove_arabic_diacritics |
| Spacing & URLs | convert_fa_spaces, decode_url |
| Utilities | normalize_persian, contains_persian_digits, contains_arabic_digits, is_persian_text |
A detailed description is available in docs/API.md.
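As a rough illustration of what the digit-detection utilities check, the sketch below tests for the two Unicode digit ranges involved. The function names `has_persian_digits` and `has_arabic_digits` are hypothetical stand-ins, not the package's API; the real functions may handle more cases.

```python
import re

# Persian (Extended Arabic-Indic) digits: U+06F0..U+06F9
_PERSIAN_DIGIT = re.compile("[\u06f0-\u06f9]")
# Arabic-Indic digits: U+0660..U+0669
_ARABIC_DIGIT = re.compile("[\u0660-\u0669]")


def has_persian_digits(text: str) -> bool:
    """True if text contains at least one Persian digit."""
    return _PERSIAN_DIGIT.search(text) is not None


def has_arabic_digits(text: str) -> bool:
    """True if text contains at least one Arabic-Indic digit."""
    return _ARABIC_DIGIT.search(text) is not None
```

The two digit sets look similar but occupy separate code point ranges, which is why the table lists separate functions for them.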
CamelCase helpers remain available but emit DeprecationWarning:
- enToPersianNumb() → convert_en_numbers()
- enToPersianChar() → convert_en_characters()
- arToPersianNumb() → convert_ar_numbers()
- arToPersianChar() → convert_ar_characters()
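A deprecated alias of this kind is commonly built as a thin wrapper that warns and then delegates. The sketch below assumes that pattern. `deprecated_alias` is a hypothetical helper, and the `convert_en_numbers` defined here is a stand-in so the example runs on its own.

```python
import warnings
from typing import Callable


def convert_en_numbers(input_str: str) -> str:
    """Stand-in for the real function: ASCII digits -> Persian digits."""
    table = {ord(str(d)): chr(0x06F0 + d) for d in range(10)}
    return input_str.translate(table)


def deprecated_alias(new_func: Callable[[str], str], old_name: str) -> Callable[[str], str]:
    """Build a wrapper that emits DeprecationWarning, then calls new_func."""

    def wrapper(input_str: str) -> str:
        warnings.warn(
            f"{old_name}() is deprecated; use {new_func.__name__}() instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return new_func(input_str)

    wrapper.__name__ = old_name
    return wrapper


enToPersianNumb = deprecated_alias(convert_en_numbers, "enToPersianNumb")
```

Python hides `DeprecationWarning` by default outside `__main__` and test runners; running with `python -W default` makes the warnings visible while migrating.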
See docs/CONTRIBUTING.md for setup instructions and code style rules.