Speech to Text for Mac: Fast, Private, Multilingual

Local speech recognition powered by the SenseVoice model. Converts your voice to text in around 70ms on Apple Silicon, entirely on your Mac, with no internet required.

Get Just Parley

100%

Offline & private

5

Languages supported

<1s

Transcription speed

0

Data sent to the cloud

How speech-to-text works

Three steps. No setup. No cloud accounts.

Hold your hotkey

Press and hold a keyboard shortcut to start recording from your microphone.

Speak naturally

The SenseVoice speech recognition model processes your audio in real time, locally on your Mac.

Text appears instantly

Release the key and your transcribed text is pasted at the cursor — in any app.
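The three steps above can be sketched as a small hold-to-record session. This is only an illustration of the flow, not Just Parley's actual code: the `PushToTalk` class and its `transcribe` and `paste` callbacks are hypothetical stand-ins for the local speech model and the paste-at-cursor step.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PushToTalk:
    """Hypothetical hold-to-record session (illustrative names throughout)."""

    transcribe: Callable[[List[float]], str]  # stand-in for the local model
    paste: Callable[[str], None]              # stand-in for "paste at cursor"
    recording: bool = False
    buffer: List[float] = field(default_factory=list)

    def key_down(self) -> None:
        # Step 1: holding the hotkey starts a fresh recording.
        self.buffer.clear()
        self.recording = True

    def on_audio(self, samples: List[float]) -> None:
        # Step 2: microphone samples are kept only while the key is held.
        if self.recording:
            self.buffer.extend(samples)

    def key_up(self) -> str:
        # Step 3: releasing the key transcribes the buffer and pastes the text.
        self.recording = False
        text = self.transcribe(self.buffer) if self.buffer else ""
        if text:
            self.paste(text)
        return text


if __name__ == "__main__":
    pasted: List[str] = []
    session = PushToTalk(
        transcribe=lambda samples: f"{len(samples)} samples heard",
        paste=pasted.append,
    )
    session.key_down()
    session.on_audio([0.0, 0.1, 0.2])
    session.key_up()
    print(pasted)
```

Keeping the model and the paste step behind plain callbacks means the flow can be exercised without a microphone or any particular app in focus.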

Features

Built for people who type a lot

Whether you're writing emails, messages, docs, or code comments, speaking is faster than typing.

Sits in your menu bar

No app to open. No window to find. Just Parley lives quietly in your menu bar, always one hotkey away. It stays out of your way until you need it.

Completely offline

Your voice never leaves your Mac. No cloud processing, no accounts, no data collection. The speech model runs entirely on your machine.

Five languages, auto-detect

English, Chinese, Japanese, Korean, and Cantonese. Pick one or let the app detect the language automatically as you speak.

English

中文

日本語

한국어

粵語

Auto

Works everywhere

Any text field, any app. Emails, Slack messages, Google Docs, code editors, search bars — if you can type in it, Just Parley can type in it for you.

Mail: Hi team, just wanted to check in on...

Slack: Sounds good, let's go with option B...

Docs: The quarterly results show a clear...

The technology behind Just Parley

Built on proven speech recognition research, optimised for macOS.

SenseVoice speech model

Just Parley uses FunAudioLLM's SenseVoice — a compact, high-accuracy speech-to-text model trained on over 400,000 hours of multilingual audio data. It runs as an optimised ONNX model via sherpa-onnx for low-latency inference.
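For readers who want to try the same model outside the app, here is a minimal sketch of decoding a WAV file with SenseVoice through the sherpa-onnx Python package. It reflects our understanding of that package's `OfflineRecognizer.from_sense_voice` interface, so check the sherpa-onnx documentation for the exact parameters. The file paths, the thread count, and the function names are placeholders, and this is not necessarily how Just Parley calls the model internally.

```python
import sys
import wave
from array import array
from typing import List, Tuple


def read_wav_mono16(path: str) -> Tuple[int, List[float]]:
    """Read a 16-bit mono WAV file and scale samples to the range [-1, 1)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        sample_rate = wav.getframerate()
        pcm = array("h")
        pcm.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    return sample_rate, [s / 32768.0 for s in pcm]


def transcribe_file(wav_path: str, model_path: str, tokens_path: str) -> str:
    """Decode one file with SenseVoice via sherpa-onnx (all paths are placeholders)."""
    # Imported lazily so read_wav_mono16 works without these packages installed.
    import numpy as np
    import sherpa_onnx

    recognizer = sherpa_onnx.OfflineRecognizer.from_sense_voice(
        model=model_path,    # assumed: path to the SenseVoice ONNX file
        tokens=tokens_path,  # assumed: path to the matching tokens file
        num_threads=2,       # illustrative value
        language="auto",     # let the model detect the language
        use_itn=True,        # punctuation and number formatting
    )
    sample_rate, samples = read_wav_mono16(wav_path)
    stream = recognizer.create_stream()
    stream.accept_waveform(sample_rate, np.array(samples, dtype=np.float32))
    recognizer.decode_stream(stream)
    return stream.result.text
```

Calling `transcribe_file("clip.wav", "model.onnx", "tokens.txt")` requires the `sherpa-onnx` and `numpy` packages plus a downloaded SenseVoice model; the file names here are examples only.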

70ms processing latency

Speech recognition runs locally on your CPU. Apple Silicon Macs (M1/M2/M3/M4) deliver around 70ms transcription time. Intel Macs are fully supported with slightly longer processing.
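Latency figures like these depend on your hardware and on the length of the clip, so it can be worth measuring on your own machine. The sketch below times any transcription function over a few runs and reports the median in milliseconds. The `median_latency_ms` helper and the fake transcriber are illustrative assumptions, not part of Just Parley.

```python
import statistics
import time
from typing import Callable, List


def median_latency_ms(
    transcribe: Callable[[List[float]], str],
    samples: List[float],
    runs: int = 5,
) -> float:
    """Time `transcribe` over several runs and return the median in milliseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        transcribe(samples)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


if __name__ == "__main__":
    # A fake transcriber that just sleeps, standing in for a real model call.
    def fake_transcribe(samples: List[float]) -> str:
        time.sleep(0.01)
        return ""

    one_second_of_silence = [0.0] * 16000
    print(f"{median_latency_ms(fake_transcribe, one_second_of_silence):.1f} ms")
```

The median is used rather than the mean so that a single slow first run, for example while a model warms up, does not skew the result.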

No cloud dependency

The entire model runs on-device. There are no API calls, no server roundtrips, no network requirements. Your speech is processed in the same process that captures it.

Multilingual speech recognition

Five languages with automatic language detection. Speak in any supported language and Just Parley identifies it automatically.

English

中文 (Chinese)

日本語 (Japanese)

한국어 (Korean)

粵語 (Cantonese)

Auto-detect

Switch languages mid-session — no settings to change
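As a rough illustration of the language setting, the sketch below maps a menu choice to the short language codes that SenseVoice accepts, as far as we know (`en`, `zh`, `ja`, `ko`, `yue`, and `auto`). The `language_code` helper and the menu labels are hypothetical; how Just Parley stores this setting internally is not documented here.

```python
from typing import Optional

# Menu label -> SenseVoice language code (labels are illustrative).
LANGUAGE_CODES = {
    "English": "en",
    "Chinese": "zh",
    "Japanese": "ja",
    "Korean": "ko",
    "Cantonese": "yue",
}


def language_code(choice: Optional[str] = None) -> str:
    """Return the model language code for a menu choice, defaulting to auto-detect."""
    if choice is None or choice.strip().lower() in {"", "auto", "auto-detect"}:
        return "auto"
    wanted = choice.strip().lower()
    for name, code in LANGUAGE_CODES.items():
        if name.lower() == wanted:
            return code
    raise ValueError(f"unsupported language: {choice!r}")


if __name__ == "__main__":
    print(language_code("Cantonese"))
    print(language_code())
```

Defaulting to `auto` mirrors the behaviour described above: if you never pick a language, detection happens as you speak.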

Your voice never leaves your Mac

No cloud. No accounts. No data collection.

All speech processing happens locally on your Mac's CPU

No audio is ever sent to a server — not even anonymised

No user accounts or sign-ups required

No telemetry, analytics, or usage tracking

The app works fully offline — disconnect your Wi-Fi and it still runs

How Just Parley compares

Speech-to-text options for Mac, side by side.

Feature	Just Parley	Apple Dictation	Cloud STT (Google, Otter.ai)	Whisper (local)
Works in every app	Yes	Partial	Browser only	CLI / manual
Fully offline	Yes	Partial	No	Yes
Privacy	100% local	Some cloud	Cloud-based	100% local
Latency	~70ms	~200ms	500ms-2s	1-5s
Multilingual	5 languages	Many	Many	99 languages
Auto language detect	Yes	No	Some	Yes
Punctuation	Automatic	Automatic	Automatic	Automatic
Setup required	None	None	Account + API key	Python + CLI
Price	Free	Free	Subscription	Free

Works in every Mac app

Speech-to-text that pastes into any text field — chat apps, code editors, browsers, and more.

Accurate speech-to-text. On your Mac. Right now.

Free download. No subscription. No cloud.

Free

no card required

SenseVoice speech recognition model

100% offline — no cloud, no accounts

5 languages with auto-detect

Works in every app on your Mac

Use on up to 3 Macs

Free updates for life

Frequently asked questions

How accurate is the speech recognition?

Just Parley uses the SenseVoice model, which achieves accuracy competitive with cloud-based services on standard benchmarks. For clear speech in supported languages, you can expect 95%+ accuracy. Actual accuracy depends on microphone quality, background noise, and how clearly you speak.

Does it use AI / machine learning?

Yes. The SenseVoice model is a deep neural network trained on over 400,000 hours of multilingual speech data. It runs as an optimised ONNX model on your Mac's CPU — no GPU required, no cloud inference.

Can it handle different accents?

SenseVoice was trained on diverse speech data covering multiple accents and speaking styles. It handles most English accents well (American, British, Australian, Indian, etc.) and performs strongly across regional variations in Chinese, Japanese, Korean, and Cantonese.

Does it work without internet?

Yes, completely. The speech recognition model is bundled with the app and runs entirely on your Mac's CPU. You can disconnect from the internet entirely and it works exactly the same.

What's the difference between speech-to-text and dictation?

Speech-to-text (also called speech recognition or STT) is the underlying technology that converts audio to text. Dictation is a use case built on top of STT — it's the act of speaking to type. Just Parley provides both: the speech-to-text engine and the dictation workflow.

Does it support automatic punctuation?

Yes. The SenseVoice model includes inverse text normalisation (ITN) that adds punctuation, capitalisation, and number formatting automatically. You don't need to say "period" or "comma" — it infers them from context.