Speech to Text for Mac: Fast, Private, Multilingual

Local speech recognition powered by the SenseVoice model. Converts your voice to text in around 70ms on Apple Silicon, entirely on your Mac, with no internet required.

100% offline & private
5 languages supported
<1s transcription speed
0 data sent to the cloud

How speech-to-text works

Three steps. No setup. No cloud accounts.

1. Hold your hotkey

Press and hold a keyboard shortcut to start recording from your microphone.

2. Speak naturally

The SenseVoice speech recognition model processes your audio in real time, locally on your Mac.

3. Text appears instantly

Release the key and your transcribed text is pasted at the cursor, in any app.
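
For the technically curious, the three steps above amount to a small hold-to-record controller. The sketch below is illustrative only and is not Just Parley's source: the `PushToTalk` class and its `transcribe` and `paste` callbacks are hypothetical names standing in for the speech model and the paste-at-cursor step.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PushToTalk:
    """Hold-to-record controller: key down starts capture, key up transcribes and pastes."""

    transcribe: Callable[[List[float]], str]  # hypothetical: audio samples -> text
    paste: Callable[[str], None]              # hypothetical: inserts text at the cursor
    recording: bool = False
    _buffer: List[float] = field(default_factory=list)

    def key_down(self) -> None:
        # Step 1: start a fresh recording when the hotkey is pressed.
        self._buffer.clear()
        self.recording = True

    def on_audio(self, chunk: List[float]) -> None:
        # Step 2: keep microphone chunks only while the key is held.
        if self.recording:
            self._buffer.extend(chunk)

    def key_up(self) -> None:
        # Step 3: stop, transcribe what was captured, and paste the result.
        if not self.recording:
            return
        self.recording = False
        text = self.transcribe(list(self._buffer))
        if text:
            self.paste(text)
```

With a stand-in transcriber such as `lambda samples: f"{len(samples)} samples"`, calling `key_down()`, feeding two samples through `on_audio()`, and then calling `key_up()` hands the string `"2 samples"` to `paste`. Audio that arrives while the key is not held is ignored.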

Features

Built for people who type a lot

Whether you're writing emails, messages, docs, or code comments — speaking is faster than typing.

Sits in your menu bar

No app to open. No window to find. Just Parley lives quietly in your menu bar, always one hotkey away. It stays out of your way until you need it.

Completely offline

Your voice never leaves your Mac. No cloud processing, no accounts, no data collection. The speech model runs entirely on your machine.

Five languages, auto-detect

English, Chinese, Japanese, Korean, and Cantonese. Pick one or let the app detect the language automatically as you speak.

English
中文
日本語
한국어
粵語
Auto

Works everywhere

Any text field, any app. Emails, Slack messages, Google Docs, code editors, search bars — if you can type in it, Just Parley can type in it for you.

Mail: Hi team, just wanted to check in on...
Slack: Sounds good, let's go with option B...
Docs: The quarterly results show a clear...

The technology behind Just Parley

Built on proven speech recognition research, optimised for macOS.

SenseVoice speech model

Just Parley uses FunAudioLLM's SenseVoice — a compact, high-accuracy speech-to-text model trained on over 400,000 hours of multilingual audio data. It runs as an optimised ONNX model via sherpa-onnx for low-latency inference.
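
If you want to see what running SenseVoice through sherpa-onnx can look like outside the app, here is a minimal sketch using the sherpa-onnx Python package. It is not Just Parley's own code. The file paths passed to `transcribe` are placeholders for a SenseVoice ONNX model and its token file that you download yourself, and `numpy` and `sherpa-onnx` must be installed separately.

```python
import struct
import wave
from typing import List, Tuple


def pcm16_to_floats(data: bytes) -> List[float]:
    """Convert little-endian 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    count = len(data) // 2
    return [s / 32768.0 for s in struct.unpack(f"<{count}h", data[: count * 2])]


def read_wav(path: str) -> Tuple[int, List[float]]:
    """Read a 16-bit mono WAV file and return (sample_rate, samples)."""
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        return wf.getframerate(), pcm16_to_floats(wf.readframes(wf.getnframes()))


def transcribe(wav_path: str, model_path: str, tokens_path: str) -> str:
    """Transcribe one WAV file locally; all paths are caller-supplied placeholders."""
    # Imported lazily so the helpers above work without the third-party packages.
    import numpy as np
    import sherpa_onnx

    recognizer = sherpa_onnx.OfflineRecognizer.from_sense_voice(
        model=model_path,    # SenseVoice ONNX model file
        tokens=tokens_path,  # matching tokens.txt
        num_threads=2,
        language="auto",     # let the model detect the language
        use_itn=True,        # enable inverse text normalisation
    )
    sample_rate, samples = read_wav(wav_path)
    stream = recognizer.create_stream()
    stream.accept_waveform(sample_rate, np.asarray(samples, dtype=np.float32))
    recognizer.decode_stream(stream)
    return stream.result.text
```

Everything here runs in a single local process: the audio is read from disk, decoded on the CPU, and returned as a string, with no network call involved.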

70ms processing latency

Speech recognition runs locally on your CPU. Apple Silicon Macs (M1/M2/M3/M4) deliver around 70ms transcription time. Intel Macs are fully supported with slightly longer processing.
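
Latency figures like these vary with hardware and clip length, so it can be worth measuring on your own machine. The helper below is a generic timing sketch, not the method used to produce the 70ms figure. `median_latency_ms` is a hypothetical name, and `fn` is any zero-argument callable that runs one transcription.

```python
import statistics
import time
from typing import Callable, List


def median_latency_ms(fn: Callable[[], object], runs: int = 20, warmup: int = 3) -> float:
    """Return the median wall-clock time of fn() in milliseconds."""
    # Warm-up calls let caches and lazy initialisation settle before timing.
    for _ in range(warmup):
        fn()
    timings: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    # The median is less sensitive to occasional slow runs than the mean.
    return statistics.median(timings)
```

Timing the same short clip on an Apple Silicon Mac and an Intel Mac gives a like-for-like comparison of the two.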

No cloud dependency

The entire model runs on-device. There are no API calls, no server roundtrips, no network requirements. Your speech is processed in the same process that captures it.

Multilingual speech recognition

Five languages with automatic detection: speak in any supported language and Just Parley identifies it for you.

English
中文 (Chinese)
日本語 (Japanese)
한국어 (Korean)
粵語 (Cantonese)
Auto-detect: switch languages mid-session, no settings to change

Your voice never leaves your Mac

No cloud. No accounts. No data collection.

All speech processing happens locally on your Mac's CPU
No audio is ever sent to a server — not even anonymised
No user accounts or sign-ups required
No telemetry, analytics, or usage tracking
The app works fully offline — disconnect your Wi-Fi and it still runs

How Just Parley compares

Speech-to-text options for Mac, side by side.

| Feature | Just Parley | Apple Dictation | Cloud STT (Google, Otter.ai) | Whisper (local) |
| --- | --- | --- | --- | --- |
| Works in every app | Yes | Partial | Browser only | CLI / manual |
| Fully offline | Yes | Partial | No | Yes |
| Privacy | 100% local | Some cloud | Cloud-based | 100% local |
| Latency | ~70ms | ~200ms | 500ms-2s | 1-5s |
| Multilingual | 5 languages | Many | Many | 99 languages |
| Auto language detect | Yes | No | Some | Yes |
| Punctuation | Automatic | Automatic | Automatic | Automatic |
| Setup required | None | None | Account + API key | Python + CLI |
| Price | $29 once | Free | Subscription | Free |

Accurate speech-to-text. On your Mac. Right now.

One-time purchase. No subscription. No cloud.

Early pricing
$29
one-time purchase
SenseVoice speech recognition model
100% offline — no cloud, no accounts
5 languages with auto-detect
Works in every app on your Mac
Use on up to 3 Macs
Free updates for life

30-day money-back guarantee.

Frequently asked questions

How accurate is the speech recognition?

Just Parley uses the SenseVoice model, which achieves accuracy competitive with cloud-based services on standard benchmarks. For clear speech in supported languages, you can expect 95%+ accuracy. Accuracy depends on microphone quality, background noise, and speaking clarity.
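
One common way to put a number on transcription accuracy is word error rate (WER): the word-level edit distance between a reference transcript and the recognised text, divided by the number of reference words. This page does not say how the 95% figure was measured, so treating accuracy as `1 - WER` is an assumption. The `word_error_rate` function below is an illustrative sketch.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref: List[str] = reference.lower().split()
    hyp: List[str] = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

Under this measure, 95% accuracy corresponds to roughly one wrong word in every twenty. Whitespace splitting suits English. For Chinese, Japanese, or Cantonese, a character-level error rate is the more usual choice.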

Does it use AI / machine learning?

Yes. The SenseVoice model is a deep neural network trained on over 400,000 hours of multilingual speech data. It runs as an optimised ONNX model on your Mac's CPU — no GPU required, no cloud inference.

Can it handle different accents?

SenseVoice was trained on diverse speech data covering multiple accents and speaking styles. It handles most English accents well (American, British, Australian, Indian, etc.) and performs strongly across regional variations in Chinese, Japanese, Korean, and Cantonese.

Does it work without internet?

Yes, completely. The speech recognition model is bundled with the app and runs entirely on your Mac's CPU. You can disconnect from the internet entirely and it works exactly the same.

What's the difference between speech-to-text and dictation?

Speech-to-text (also called speech recognition or STT) is the underlying technology that converts audio to text. Dictation is a use case built on top of STT — it's the act of speaking to type. Just Parley provides both: the speech-to-text engine and the dictation workflow.

Does it support automatic punctuation?

Yes. The SenseVoice model includes inverse text normalisation (ITN) that adds punctuation, capitalisation, and number formatting automatically. You don't need to say "period" or "comma" — it infers them from context.
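
To make the number-formatting part of ITN concrete, here is a toy, rule-based sketch that turns spoken number words from zero to ninety-nine into digits. It only illustrates the idea. It is not how SenseVoice does it, and `normalise_numbers` is a hypothetical helper.

```python
from typing import Dict, List

UNITS: Dict[str, int] = {
    w: i for i, w in enumerate(
        "zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split()
    )
}
TENS: Dict[str, int] = {
    w: (i + 2) * 10 for i, w in enumerate(
        "twenty thirty forty fifty sixty seventy eighty ninety".split()
    )
}


def normalise_numbers(text: str) -> str:
    """Replace spoken numbers 0-99 with digits; other words pass through unchanged."""
    words: List[str] = text.split()
    out: List[str] = []
    i = 0
    while i < len(words):
        w = words[i].lower()
        if w in TENS:
            value = TENS[w]
            # "twenty" followed by "one".."nine" forms a single number.
            if i + 1 < len(words) and 1 <= UNITS.get(words[i + 1].lower(), 0) <= 9:
                value += UNITS[words[i + 1].lower()]
                i += 1
            out.append(str(value))
        elif w in UNITS:
            out.append(str(UNITS[w]))
        else:
            out.append(words[i])
        i += 1
    return " ".join(out)
```

For example, `normalise_numbers("send twenty five copies by friday")` returns `"send 25 copies by friday"`. A fixed rule set like this also rewrites phrases such as "no one" to "no 1", which is one reason context matters for this step.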