Complete text-to-speech solution with a native GUI for the Linux desktop
- About
- History
- Features
- TTS Engines
- Architecture
- Installation
- Usage
- Configuration
- Internationalization
- Technical Details
- Building from Source
- License
- Authors
BigLinux TTS is a native desktop Linux application that converts text to speech. Built with GTK4, libadwaita, and a native Rust audio engine, it is the built-in screen reader for BigLinux — a Brazilian Linux distribution based on Manjaro/Arch Linux.
Select any text on screen, press Alt+V, and hear it read aloud. Press again to stop. No complicated setup.
- Accessibility — screen reading for users with visual impairments or reading difficulties
- Multitasking — listen to articles, documents, and emails while doing other things
- Language learning — hear correct pronunciation in 100+ languages
- Proofreading — catch writing errors by listening to what was written
- Productivity — convert passive reading into active listening
- 4 TTS engines — RHVoice, espeak-ng native FFI, Piper Neural TTS, and Kokoro Neural TTS
- Native Rust audio — espeak-ng via direct FFI and Piper ONNX inference via `ort`, no subprocess overhead
- Automatic voice discovery — scans all installed engines and voices system-wide
- Smart text processing — expands abbreviations, pronounces special characters, strips HTML/Markdown
- KDE Plasma integration — global hotkey, system tray icon, launcher pinning
- Modern UI — GTK4 + libadwaita (GNOME HIG), clean and responsive interface
- 29 languages — gettext-based i18n with `.po` files
BigLinux TTS was born from a practical need: making text-to-speech accessible and easy to use on the Linux desktop.
| Date | Version | Milestone |
|---|---|---|
| Sep 2021 | — | First commit by Bruno Gonçalves: initial web-based interface |
| Mar 2022 | — | Rafael Ruscher joins: icon design, CSS refinements, translations |
| Aug 2022 | — | PKGBUILD packaging, i18n with 29 locales, CI/CD workflow |
| Dec 2023 | — | Volume/pitch/rate range inputs, UI polish |
| Feb 2026 | 3.0 | Full rewrite: web UI → GTK4 + libadwaita + Python. speech-dispatcher integration, Piper Neural TTS, tray icon (PySide6 subprocess), text processor with abbreviation expansion |
| Mar 2026 | 3.1 | Native RHVoice backend, parallel voice discovery, Python DBus launcher |
| Mar 2026 | 3.2 | Voice Manager dialog with install/remove, theme support, khotkeys sync |
| Jun 2026 | 4.0 | Native Rust engine (PyO3): espeak-ng FFI (zero-subprocess latency), Piper ONNX inference via ort with model caching (7× faster short text). Kokoro Neural TTS integration. Complete i18n audit (212 strings). Full codebase cleanup. |
- Configurable global hotkey (default Alt+V) — select text anywhere, press to speak, press again to stop (toggle)
- System tray icon — left-click to speak, right-click for menu (Read text, Settings, Quit)
- Built-in voice test — text field to type and hear with current voice settings
- Launcher pinning — option to pin the speak button to KDE Plasma taskbar
- Speed — scale from -100 (slow) to +100 (fast)
- Pitch — scale from -100 (low) to +100 (high)
- Volume — scale from 0 (mute) to 100 (max)
- Voice selection — dynamic list filtered by engine: "Name — Language [Quality]"
| Feature | Description | Example |
|---|---|---|
| Expand abbreviations | Converts slang/abbreviations per language | tb → "também", btw → "by the way" |
| Special characters | Pronounces symbols by name | # → "hash", @ → "at" |
| Strip formatting | Removes HTML tags, Markdown bold/italic/code | **bold** → "bold" |
| URL handling | Option to read or skip links | https://... → read or skip |
| Character limit | Truncates long text | Unlimited, 1K, 5K, 10K, 50K, 100K |
| Shortcut | Action |
|---|---|
| Alt+V (default) | Speak/stop selected text (toggle) |
| Ctrl+Q | Quit application |
- PySide6 `QSystemTrayIcon` running in an isolated subprocess (avoids GTK/Qt conflicts)
- Left-click: toggle speak/stop
- Right-click: context menu (Read text, Settings, Quit)
- Communicates with main process via JSON lines over stdin/stdout
High-quality multilingual TTS through the speech-dispatcher daemon.
| Voice | Language | Quality |
|---|---|---|
| Letícia F123 | pt-BR | ★★★★ |
| Evgeniy | English | ★★★★ |
| + others | Multiple | ★★★–★★★★ |
Communication via speechd.SSIPClient (SSIP protocol) with automatic daemon restart fallback.
Direct C FFI to libespeak-ng.so, with no subprocess overhead. The Rust engine calls espeak-ng API functions directly through unsafe extern "C" bindings and is exposed to Python as a PyO3 extension module.
- `AUDIO_OUTPUT_PLAYBACK` mode: espeak-ng handles audio output internally
- One-time initialization via `OnceLock` (thread-safe, no `static mut`)
- Supports 100+ languages with basic quality
Neural TTS with near-human speech quality. Runs ONNX models locally via the ort crate — no piper-tts binary needed for native mode.
Pipeline: text → espeak-ng IPA phonemes (FFI) → phoneme IDs → ONNX model → f32 audio → WAV → rodio playback
| Feature | Detail |
|---|---|
| Runtime | ort 2.0 (ONNX Runtime, system library) |
| Model cache | Mutex<Option<CachedModel>> — load once, reuse across calls |
| Phonemization | espeak-ng TextToPhonemes via FFI |
| Audio | rodio with AtomicBool stop flag |
| Performance | 7× faster than subprocess for short text |
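The `f32 audio → WAV` step of the pipeline can be illustrated with Python's standard library. This is only a sketch of the idea, not the engine's code (the Rust engine uses `hound` for this step). The function name is hypothetical, and mono output at 22050 Hz is an assumption; the real sample rate would come from the model's `.onnx.json` config.

```python
import io
import struct
import wave


def f32_to_wav_bytes(samples, sample_rate=22050):
    """Encode float samples in [-1.0, 1.0] as 16-bit mono PCM WAV bytes."""
    # Clamp first so out-of-range model output cannot overflow int16.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono (assumption)
        wav.setsampwidth(2)            # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)  # assumption: taken from the model config
        wav.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return buf.getvalue()
```

The returned bytes form a complete WAV file, so they can be written to disk or handed to any player that accepts WAV input.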
Advanced neural TTS with voice blending and emotion presets. Runs via Python kokoro package with PyTorch backend.
- Voice blending: mix two voices with configurable ratio
- Emotion presets: neutral, happy, calm, urgent, narrative
- Per-language code selection: Portuguese, English, Spanish, and more
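Voice blending with a configurable ratio can be pictured as a weighted mix of two voice vectors. The sketch below assumes a voice is a flat list of floats and that the ratio is the weight of the second voice; neither detail is stated above, and `blend_voices` is a hypothetical helper, not part of the kokoro package.

```python
def blend_voices(voice_a, voice_b, ratio=0.5):
    """Linearly mix two equal-length voice vectors.

    ratio=0.0 returns voice_a unchanged, ratio=1.0 returns voice_b.
    """
    if len(voice_a) != len(voice_b):
        raise ValueError("voice vectors must have the same length")
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    # Element-wise linear interpolation between the two voices.
    return [(1.0 - ratio) * a + ratio * b for a, b in zip(voice_a, voice_b)]
```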
The system discovers voices from all engines simultaneously in background threads:
- RHVoice: `spd-say -o rhvoice -L` → parses SSIP names with hardcoded metadata (language, gender). Fallback: scan `/usr/share/RHVoice/voices/` and pacman packages
- espeak-ng: `espeak-ng --voices` → parses tabular output (language code, gender)
- Piper: scans `/usr/share/piper-voices/`, `~/.local/share/piper-voices/` → detects `.onnx` files with `.onnx.json` config
- Kokoro: scans installed voice packs and user-downloaded `.npy` voice files
Result: VoiceCatalog with all available voices, filterable by language, engine, and quality.
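Two of the discovery steps above are easy to sketch in isolation: parsing the table printed by `espeak-ng --voices` and scanning a directory tree for Piper models. The function names and the returned dictionary keys are hypothetical; the actual `VoiceCatalog` structure is not shown in this document.

```python
from pathlib import Path


def parse_espeak_voices(output):
    """Parse `espeak-ng --voices` tabular output into a list of dicts."""
    voices = []
    for line in output.splitlines()[1:]:       # skip the header row
        cols = line.split()
        if len(cols) < 4:
            continue
        # Columns: Pty, Language, Age/Gender, VoiceName, File, ...
        voices.append({
            "engine": "espeak-ng",
            "language": cols[1],
            "gender": cols[2].split("/")[-1],   # "--/M" -> "M"
            "name": cols[3],
        })
    return voices


def scan_piper_voices(root):
    """Return .onnx models under root that have a sibling .onnx.json config."""
    root = Path(root).expanduser()
    if not root.is_dir():
        return []
    return [
        model
        for model in sorted(root.rglob("*.onnx"))
        if Path(str(model) + ".json").is_file()
    ]
```

Each function is independent of the others, which is what makes it straightforward to run one discovery task per engine in its own background thread.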
┌─────────────────────────────────────────────────────────────────────┐
│ main.py │
│ CLI args, logging, App.run() │
├─────────────────────────────────────────────────────────────────────┤
│ application.py │
│ TTSApplication (Adw.Application) │
│ startup → activate → shutdown lifecycle │
├──────────────────┬──────────────────┬───────────────────────────────┤
│ UI Layer │ Service Layer │ Data Layer │
├──────────────────┼──────────────────┼───────────────────────────────┤
│ window.py │ tts_service.py │ config.py │
│ ├ HeaderBar │ ├ speak() │ ├ AppSettings (dataclasses) │
│ ├ NavigationView │ ├ stop() │ ├ TTSBackend enum │
│ └ Toast overlay │ └ state machine │ └ load/save JSON │
│ │ │ │
│ main_view.py │ voice_manager.py │ settings_service.py │
│ ├ Hero section │ └ discover() │ └ debounced auto-save (500ms) │
│ ├ Voice controls │ │ │
│ ├ Text options │ text_processor.py│ │
│ ├ Backend select │ ├ abbreviations │ │
│ └ Advanced │ ├ special chars │ │
│ │ └ formatting │ │
│ components.py │ │ │
│ └ Widget factory │ clipboard_svc.py │ │
│ │ ├ wl-paste │ │
│ welcome_dialog.py│ └ xsel │ │
│ voice_manager_dlg│ │ │
│ history_view.py │ tray_service.py │ │
│ audio_player.py │ └ PySide6 subproc│ │
│ │ │ │
│ │ kokoro_voice_svc │ │
│ │ └ voice download │ │
├──────────────────┴──────────────────┴───────────────────────────────┤
│ tts_engine.so (Rust/PyO3) │
│ ┌──────────┐ ┌──────────────┐ ┌─────────────┐ │
│ │ espeak │ │ piper (ONNX) │ │ audio │ │
│ │ FFI │ │ ort + cache │ │ rodio + stop│ │
│ └──────────┘ └──────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
tts-engine/
├── Cargo.toml # PyO3, ort, rodio, hound, serde, thiserror
├── build.rs # Link args: pyo3 + libespeak-ng
└── src/
├── lib.rs # PyO3 module: speak_espeak, speak_piper, synthesize_piper, stop
├── audio.rs # rodio playback with AtomicBool stop flag
├── error.rs # TtsError enum (thiserror derive)
└── backends/
├── espeak.rs # FFI to libespeak-ng (OnceLock init, SetVoice, Synth, Cancel)
└── piper.rs # ONNX pipeline: phonemize → IDs → infer → WAV → play
Key dependencies: pyo3 0.25 · ort 2.0 · rodio 0.20 · hound 3.5 · thiserror 2 · serde 1
speak() stop() / error / done
┌──────────────┐ ┌────────────────────────┐
│ ▼ │ │
│ ┌────────┐ │ ┌──────────┐ │
│ │ IDLE │───┘ │ SPEAKING │─────────┘
│ └────────┘ └──────────┘
│ │ │
│ speak() error()
│ │ │
│ ┌────▼────┐ ┌────▼─────┐
│ │SPEAKING │ │ ERROR │
│ └─────────┘ └──────────┘
│ │
└──────────────────────────────┘
speak()
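The diagram can be read as a small transition table. The sketch below encodes one reading of it: `stop` and `done` return to IDLE, `error` moves to ERROR, and `speak` is accepted from both IDLE and ERROR. The class and event names are hypothetical and only illustrate the pattern, not the code in `tts_service.py`.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    SPEAKING = auto()
    ERROR = auto()


# (current state, event) -> next state, as read from the diagram.
TRANSITIONS = {
    (State.IDLE, "speak"): State.SPEAKING,
    (State.SPEAKING, "stop"): State.IDLE,
    (State.SPEAKING, "done"): State.IDLE,
    (State.SPEAKING, "error"): State.ERROR,
    (State.ERROR, "speak"): State.SPEAKING,
}


class TTSStateMachine:
    def __init__(self):
        self.state = State.IDLE

    def fire(self, event):
        """Apply an event; events with no transition leave the state unchanged."""
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state
```

Keeping the transitions in a dictionary means an unexpected event, such as `stop` while idle, is simply ignored instead of raising.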
# Install from BigLinux repository
sudo pacman -S tts-biglinux
# Optional: RHVoice Portuguese voice
sudo pacman -S rhvoice rhvoice-voice-leticia-f123
# Optional: Piper neural TTS
sudo pacman -S piper-tts-bin piper-voices-pt-BR
# Optional: system tray icon
sudo pacman -S pyside6

git clone https://github.com/biglinux/tts-biglinux.git
cd tts-biglinux/pkgbuild
makepkg -si

git clone https://github.com/biglinux/tts-biglinux.git
cd tts-biglinux
# Build native Rust engine
cd tts-engine
ORT_LIB_LOCATION=/usr/lib ORT_PREFER_DYNAMIC_LINK=1 cargo build --release
cd ..
# Symlink the .so
ln -sf ../../tts-engine/target/release/libtts_engine.so \
usr/share/biglinux/tts-biglinux/tts_engine.so
# Run
cd usr/share/biglinux/tts-biglinux
python main.py --debug

| Package | Description |
|---|---|
| python (3.10+) | Python interpreter |
| python-gobject | GTK bindings for Python (PyGObject) |
| gtk4 | GTK 4 toolkit |
| libadwaita | Adwaita widget library (GNOME HIG) |
| speech-dispatcher | Speech synthesis daemon |
| espeak-ng | Open-source TTS engine + libespeak-ng.so |
| xsel | X11 clipboard access (primary selection) |
| wl-clipboard-rs | Wayland clipboard access (wl-paste) |
| alsa-utils | ALSA audio utilities |
| onnxruntime | ONNX Runtime library (for Piper native inference) |

| Package | Description |
|---|---|
| rust (1.85+) | Rust toolchain |
| cargo | Rust package manager |

| Package | Description |
|---|---|
| pyside6 | System tray icon (QSystemTrayIcon subprocess) |
| rhvoice | High-quality multilingual TTS engine |
| rhvoice-voice-leticia-f123 | Brazilian Portuguese female voice |
| piper-tts-bin | Piper TTS binary (subprocess fallback) |
| piper-voices-pt-BR | Brazilian Portuguese neural voices |
| python-kokoro | Kokoro neural TTS engine |
| python-pytorch | PyTorch runtime for Kokoro |

biglinux-tts # Open settings window
biglinux-tts --debug # Debug mode with detailed logging
biglinux-tts --version   # Print version

biglinux-tts-speak       # Speak selected text (called by Alt+V)

The biglinux-tts-speak script works as a toggle:
- Already speaking → stop immediately (kill process via PID file)
- Text selected → read aloud with configured engine/voice
- No text → exit silently
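The three cases above reduce to a liveness check on the PID file plus a small decision function. This is a sketch of the logic only; `pid_is_alive` and `toggle_action` are hypothetical names, and the real `biglinux-tts-speak` script may be structured differently.

```python
import os


def pid_is_alive(pid_file):
    """Return True if pid_file names a process that still exists."""
    try:
        with open(pid_file) as fh:
            pid = int(fh.read().strip())
    except (OSError, ValueError):
        return False                 # no file, or its content is not a number
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)              # signal 0 only checks that the process exists
    except ProcessLookupError:
        return False                 # stale PID file
    except PermissionError:
        return True                  # process exists but belongs to another user
    return True


def toggle_action(speaking, selected_text):
    """Decide what one hotkey press should do."""
    if speaking:
        return "stop"
    if selected_text.strip():
        return "speak"
    return "exit"
```

A caller would combine the two as `toggle_action(pid_is_alive(path), text)`, where `path` is the PID file location and `text` is the current selection.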
- First launch: welcome dialog explains features and setup
- Configure: select TTS engine, voice, adjust speed/pitch/volume
- Test: type text in the test field and click "Test voice"
- Daily use: select text anywhere → Alt+V → listen
| Path | Content |
|---|---|
| ~/.config/biglinux-tts/settings.json | All app settings (JSON) |
| /tmp/biglinux-tts-{user}.pid | Speech process PID (toggle) |
{
"speech": {
"rate": -25,
"pitch": -25,
"volume": 75,
"voice_id": "piper:/usr/share/piper-voices/pt/pt_BR/faber/medium/pt_BR-faber-medium.onnx",
"backend": "piper",
"output_module": "rhvoice",
"kokoro": {
"speed": 1.0,
"voice_blend": "",
"blend_ratio": 0.5,
"emotion_preset": "neutral",
"lang_code": "p"
}
},
"text": {
"expand_abbreviations": true,
"process_urls": false,
"process_special_chars": true,
"strip_formatting": true,
"max_chars": 0
},
"shortcut": {
"keybinding": "<Alt>v",
"enabled": true,
"show_in_launcher": true
},
"window": {
"width": 560,
"height": 680,
"maximized": false
},
"history": {
"enabled": false,
"save_audio": true,
"save_text": true,
"playback_mode": "interrupt"
},
"show_welcome": true
}

The app automatically detects old-format settings in ~/.config/tts-biglinux/ (individual files: rate, pitch, volume, voice) and migrates them to the unified JSON format.
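A migration of this kind could look like the sketch below. It makes several assumptions that the text does not confirm: each legacy file holds one plain-text value, the numeric ones are integers, `voice` maps to `voice_id`, and values already present in the JSON file take precedence. `migrate_legacy` is a hypothetical name.

```python
import json
from pathlib import Path

LEGACY_NUMERIC = ("rate", "pitch", "volume")   # one plain-text file per setting


def migrate_legacy(legacy_dir, settings_path):
    """Fold legacy per-setting files into the "speech" section of settings.json.

    Returns True if at least one value was migrated.
    """
    legacy_dir, settings_path = Path(legacy_dir), Path(settings_path)
    speech = {}
    for key in LEGACY_NUMERIC:
        source = legacy_dir / key
        if source.is_file():
            try:
                speech[key] = int(source.read_text().strip())
            except ValueError:
                pass                            # ignore unparsable values
    voice = legacy_dir / "voice"
    if voice.is_file():
        speech["voice_id"] = voice.read_text().strip()
    if not speech:
        return False
    settings = {}
    if settings_path.is_file():
        settings = json.loads(settings_path.read_text())
    # Assumption: values already in the JSON file win over migrated ones.
    settings["speech"] = {**speech, **settings.get("speech", {})}
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2))
    return True
```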
Translation uses gettext .po files with a custom Python parser (not binary .mo):
- Locale detection: `LANGUAGE` → `LC_ALL` → `LC_MESSAGES` → `LANG`
- File lookup: tries `pt-BR` and `pt_BR` variants, then base code `pt`
- Search paths: `./locale/` (dev) → `/usr/share/tts-biglinux/locale/` (installed)
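The lookup order described above can be sketched as three small functions. The names are hypothetical, and stripping the encoding suffix (`.UTF-8`) and taking the first entry of a colon-separated `LANGUAGE` list are assumptions about how the environment values are normalized.

```python
import os
from pathlib import Path

ENV_ORDER = ("LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG")


def detect_locale(env=None):
    """Return the first locale found in ENV_ORDER, e.g. "pt_BR", or None."""
    env = os.environ if env is None else env
    for var in ENV_ORDER:
        value = env.get(var, "")
        if value:
            # "pt_BR.UTF-8" -> "pt_BR"; LANGUAGE may be a ":"-separated list.
            return value.split(":")[0].split(".")[0].split("@")[0]
    return None


def po_candidates(locale):
    """File names to try, most specific first: pt-BR.po, pt_BR.po, pt.po."""
    underscored = locale.replace("-", "_")
    names = [locale.replace("_", "-"), underscored, underscored.split("_")[0]]
    unique = []
    for name in names:
        if name not in unique:
            unique.append(name)
    return [name + ".po" for name in unique]


def find_po(locale, search_paths):
    """Return the first existing .po file, honoring search path order."""
    for base in search_paths:
        for name in po_candidates(locale):
            path = Path(base) / name
            if path.is_file():
                return path
    return None
```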
from utils.i18n import _
label.set_text(_("Ready to speak"))  # → "Pronto para falar" in pt-BR

There are 212 translatable strings across all source files.
| Code | Language | Code | Language |
|---|---|---|---|
| bg | Bulgarian | ko | Korean |
| ca | Catalan | nl | Dutch |
| cs | Czech | no | Norwegian |
| da | Danish | pl | Polish |
| de | German | pt | Portuguese |
| el | Greek | pt-BR | Portuguese (Brazil) |
| en | English | ro | Romanian |
| es | Spanish | ru | Russian |
| et | Estonian | sk | Slovak |
| fi | Finnish | sv | Swedish |
| fr | French | tr | Turkish |
| he | Hebrew | uk | Ukrainian |
| hr | Croatian | zh | Chinese |
| hu | Hungarian | is | Icelandic |
| it | Italian | ja | Japanese |
- Copy the template: `cp locale/tts-biglinux.pot locale/<code>.po`
- Translate the `msgstr` entries in the `.po` file
- The app loads `.po` files directly — no compilation step needed
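Loading `.po` files without compiling them only requires a small text parser. The following is a minimal sketch of such a parser, not the one shipped in `utils/i18n`; it ignores plural forms and `msgctxt`, and it relies on `ast.literal_eval` to decode the quoted strings, which works for the common escapes but is an assumption about the input.

```python
import ast


def parse_po(text):
    """Parse .po source text into a {msgid: msgstr} dict.

    Handles single- and multi-line entries. Comments, plural forms and
    msgctxt are ignored; untranslated entries (empty msgstr) are skipped.
    """
    catalog = {}
    msgid = msgstr = target = None

    def flush():
        # The header entry has an empty msgid and is skipped as well.
        if msgid and msgstr:
            catalog[msgid] = msgstr

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            flush()
            msgid, msgstr, target = ast.literal_eval(line[6:]), "", "id"
        elif line.startswith("msgstr ") and msgid is not None:
            msgstr, target = ast.literal_eval(line[7:]), "str"
        elif line.startswith('"') and target == "id":
            msgid += ast.literal_eval(line)      # continuation of msgid
        elif line.startswith('"') and target == "str":
            msgstr += ast.literal_eval(line)     # continuation of msgstr
        else:
            target = None      # comment, blank line or unsupported keyword
    flush()
    return catalog
```

A lookup function such as `_()` can then be a dictionary `get` that falls back to the original string when no translation exists.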
The tts-engine crate provides in-process TTS backends (no subprocess or IPC), exposed to Python via PyO3:
- espeak-ng FFI: `unsafe extern "C"` bindings to `libespeak-ng.so`. `OnceLock` for thread-safe one-time initialization. No subprocess, no IPC — direct function calls
- Piper ONNX: `ort` 2.0 for inference, `hound` for WAV encoding, `rodio` for playback. Model sessions cached in `Mutex<Option<CachedModel>>` — loaded once, reused across calls
- Audio: `rodio` with `AtomicBool` stop flag for interruptible playback. Dedicated audio thread (OutputStream is `!Send + !Sync`)
- Error handling: `thiserror` derive macro, proper `Result` propagation to Python via `PyRuntimeError`
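The "load once, reuse across calls" model cache is a lock around an optional cached value. A Python analogue of that pattern is sketched below for illustration; the engine itself implements it in Rust. `ModelCache` and its `loader` argument are hypothetical, and reloading when the model path changes is an assumption.

```python
import threading


class ModelCache:
    """Keep at most one loaded model; reload only when the path changes."""

    def __init__(self, loader):
        self._loader = loader          # callable: path -> model (hypothetical)
        self._lock = threading.Lock()  # plays the role of the Rust Mutex
        self._cached = None            # (path, model) or None, like Option<...>

    def get(self, path):
        with self._lock:
            if self._cached is None or self._cached[0] != path:
                self._cached = (path, self._loader(path))
            return self._cached[1]
```

Because loading a model is far more expensive than running it on a short sentence, skipping the reload is most noticeable for short texts.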
Build: ORT_LIB_LOCATION=/usr/lib ORT_PREFER_DYNAMIC_LINK=1 cargo build --release
Clippy: 0 quality warnings (clippy::all + clippy::pedantic + clippy::nursery). Only expected unsafe_code warnings from FFI.
text_processor.py applies transformations before synthesis:
- Strip formatting: removes HTML tags, Markdown bold/italic/code, headers, lists, links
- URL handling: removes or keeps `https?://\S+`
- Abbreviation expansion (language-aware): ~65 Portuguese, ~30 English, ~10 Spanish
- Special characters (language-aware): `#` → "hash"/"cerquilha", `@` → "at"/"arroba"
- Cleanup: collapse multiple spaces/newlines
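The order of these transformations matters: formatting and URLs are removed before symbols are spelled out, otherwise the `#` in a Markdown header or a URL fragment would be read aloud. A compact sketch of such a pipeline follows. The lookup tables are tiny illustrative samples, the regular expressions are simplified, and `process_text` is a hypothetical name rather than the function in `text_processor.py`.

```python
import re

# Tiny illustrative tables; the real ones are per-language and much larger.
ABBREVIATIONS = {"en": {"btw": "by the way"}, "pt": {"tb": "também"}}
SYMBOLS = {"en": {"#": "hash", "@": "at"}, "pt": {"#": "cerquilha", "@": "arroba"}}
URL_RE = re.compile(r"https?://\S+")


def process_text(text, lang="en", read_urls=False):
    # 1. Strip formatting: HTML tags, then Markdown bold/italic/code markers.
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"(\*\*|\*|`)(.+?)\1", r"\2", text)
    # 2. URLs: drop them unless the user asked to hear them.
    if not read_urls:
        text = URL_RE.sub(" ", text)
    # 3. Abbreviations: whole words only, case-insensitive.
    for short, full in ABBREVIATIONS.get(lang, {}).items():
        text = re.sub(rf"\b{re.escape(short)}\b", full, text, flags=re.IGNORECASE)
    # 4. Special characters: replace each symbol with its spoken name.
    for symbol, name in SYMBOLS.get(lang, {}).items():
        text = text.replace(symbol, f" {name} ")
    # 5. Cleanup: collapse runs of whitespace.
    return re.sub(r"\s+", " ", text).strip()
```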
clipboard_service.py auto-detects the display server:
- Wayland: `wl-paste --primary --no-newline`, fallback to regular clipboard
- X11: `xsel --primary -o`, fallback to `xsel -o`, then `xclip`
- Detection: `XDG_SESSION_TYPE == "wayland"` or `WAYLAND_DISPLAY` set
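The detection rule and the fallback chain translate into a list of commands tried in order. This is a sketch under assumptions: the function names are hypothetical, the 2-second timeout is arbitrary, and the `xclip -o` invocation is a guess because only the tool name is given above.

```python
import os
import subprocess


def is_wayland(env=None):
    env = os.environ if env is None else env
    return env.get("XDG_SESSION_TYPE") == "wayland" or bool(env.get("WAYLAND_DISPLAY"))


def clipboard_commands(env=None):
    """Commands to try in order; the first one that yields text wins."""
    if is_wayland(env):
        return [
            ["wl-paste", "--primary", "--no-newline"],  # current selection
            ["wl-paste", "--no-newline"],               # regular clipboard
        ]
    return [
        ["xsel", "--primary", "-o"],
        ["xsel", "-o"],
        ["xclip", "-o"],   # assumption: flags for xclip are not given above
    ]


def read_selection(env=None, run=subprocess.run):
    """Return the selected text, or "" if every command fails."""
    for cmd in clipboard_commands(env):
        try:
            result = run(cmd, capture_output=True, text=True, timeout=2)
        except (OSError, subprocess.TimeoutExpired):
            continue                  # tool missing or hung: try the next one
        if result.returncode == 0 and result.stdout:
            return result.stdout
    return ""
```

Passing `env` and `run` as parameters keeps the functions testable without a display server or the clipboard tools installed.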
JSON lines over stdin/stdout between GTK parent and PySide6 child:
GTK (parent) Qt (child)
│ │
│── {"cmd":"set_menu",...} ──────▶│ configure context menu
│── {"cmd":"set_tooltip",...} ───▶│ set tooltip
│── {"cmd":"set_speaking",...} ──▶│ update speaking state
│ │
│◀── {"event":"ready"} ─────────│ tray icon visible
│◀── {"event":"activate"} ──────│ left click
│◀── {"event":"menu","id":1} ───│ menu item clicked
│ │
│── {"cmd":"quit"} ─────────────▶│ terminate
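The message exchange above needs very little code on either side: one JSON object per line, written and flushed immediately, and a reader that tolerates malformed lines. The helpers below are a hypothetical sketch; only the `cmd`, `event` and `id` fields shown in the diagram are used, and any other payload fields are left out because they are not specified here.

```python
import json


def encode_message(message):
    """Serialize one message as a single newline-terminated JSON line."""
    return json.dumps(message, separators=(",", ":")) + "\n"


def decode_lines(stream):
    """Yield one dict per line; skip blank lines and invalid JSON."""
    for line in stream:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue                  # a stray print must not crash the reader
        if isinstance(message, dict):
            yield message


def handle_child_event(message, on_activate, on_menu):
    """Parent side: dispatch an event received from the tray child."""
    event = message.get("event")
    if event == "activate":
        on_activate()
    elif event == "menu":
        on_menu(message.get("id"))
```

On the parent side, `encode_message({"cmd": "quit"})` would be written to the child's stdin, while `decode_lines` runs over the child's stdout in a background thread.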
- Debouncer: `GLib.timeout_add(500ms)` — saves settings after 500ms of inactivity
- run_in_thread: heavy ops (clipboard, voice discovery) in daemon threads, results via `GLib.idle_add()`
- TTS monitoring: 300ms polling via `GLib.timeout_add()` to detect speech completion
- UI thread: no blocking operations on GTK main thread
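The debouncing idea can be shown without GTK. The sketch below swaps `GLib.timeout_add` for `threading.Timer` so that it runs anywhere; it is a stand-in for the pattern, not the app's implementation, and `Debouncer` is a hypothetical class.

```python
import threading


class Debouncer:
    """Run callback once, `delay` seconds after the most recent trigger()."""

    def __init__(self, delay, callback):
        self._delay = delay
        self._callback = callback
        self._timer = None
        self._lock = threading.Lock()

    def trigger(self):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()        # restart the countdown
            self._timer = threading.Timer(self._delay, self._callback)
            self._timer.daemon = True       # do not keep the process alive
            self._timer.start()
```

With `Debouncer(0.5, save_settings)`, where `save_settings` is whatever function writes the file, dragging a slider calls `trigger()` many times but writes the file once. Unlike a GLib timeout, the timer callback runs on a worker thread, so in a GTK app it must not touch widgets directly.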
cd pkgbuild && makepkg -si

The build process:
- Compiles the Rust `tts-engine` crate with `cargo build --release`
- Copies the `usr/` tree (Python code, icons, desktop file, locale)
- Installs `libtts_engine.so` as `tts_engine.so` into the application directory
- Sets executable permissions on `usr/bin/*`
pkgver=$(date +%y.%m.%d) # Date-based: e.g. 26.06.19
pkgrel=$(date +%H%M) # Release by hour (multiple builds/day)
arch=('x86_64') # x86_64 only (native Rust binary)
tts-biglinux/
├── locale/ # Translation source files (.po, .pot)
│ ├── tts-biglinux.pot # Template (212 strings)
│ ├── pt-BR.po # Brazilian Portuguese (100%)
│ └── ... # 28 more languages
├── pkgbuild/
│ └── PKGBUILD # Arch/BigLinux packaging
├── tts-engine/ # Native Rust TTS engine
│ ├── Cargo.toml # Dependencies and lints
│ ├── build.rs # Link: pyo3 + libespeak-ng
│ └── src/
│ ├── lib.rs # PyO3 module entry
│ ├── audio.rs # rodio playback + stop
│ ├── error.rs # TtsError enum
│ └── backends/
│ ├── espeak.rs # espeak-ng FFI
│ └── piper.rs # ONNX inference + phonemization
├── usr/
│ ├── bin/
│ │ ├── biglinux-tts # Entry: cd + exec python main.py
│ │ └── biglinux-tts-speak # Standalone toggle script (Alt+V)
│ └── share/
│ ├── applications/
│ │ └── br.com.biglinux.tts.desktop
│ ├── biglinux/tts-biglinux/ # Python application code
│ │ ├── main.py # CLI args, logging, App.run()
│ │ ├── application.py # Adw.Application lifecycle
│ │ ├── config.py # Constants, enums, dataclasses
│ │ ├── window.py # Adw.ApplicationWindow
│ │ ├── services/ # TTS, voice mgr, clipboard, tray
│ │ ├── ui/ # Views, dialogs, components
│ │ ├── utils/ # i18n, async, speechd
│ │ └── resources/ # CSS, __init__.py
│ ├── icons/hicolor/scalable/ # SVG icons (app + status)
│ └── khotkeys/ # KDE Plasma 5 shortcut
└── README.md
Licensed under GPL-3.0-or-later.
TTS engines (speech-dispatcher, espeak-ng, RHVoice, Piper, Kokoro) have their own licenses. See their respective documentation.
- Tales A. Mendonça — BigLinux project creator
- Bruno Gonçalves Araujo — BigLinux project, initial implementation
- Rafael Ruscher — Architecture, GTK4 rewrite, Rust engine, v3.0–4.0