Post
1686
PersonaPlex-7B on Apple Silicon (Swift + MLX Swift)
NVIDIA PersonaPlex is a full-duplex speech-to-speech model — it can listen while it speaks, which enables more natural conversational behaviors like interruptions, overlaps, and quick backchannels.
We put together a native Swift implementation using MLX Swift so it can run locally on Apple Silicon, along with a 4-bit MLX conversion and a small CLI/demo to make it easy to try out.
If you’re interested in on-device voice agents (or just want to see what full-duplex S2S looks like in a real Swift codebase), the details and setup notes are here:
Blog post: https://blog.ivan.digital/nvidia-personaplex-7b-on-apple-silicon-full-duplex-speech-to-speech-in-native-swift-with-mlx-0aa5276f2e23
Repo: https://github.com/ivan-digital/qwen3-asr-swift
NVIDIA PersonaPlex is a full-duplex speech-to-speech model — it can listen while it speaks, which enables more natural conversational behaviors like interruptions, overlaps, and quick backchannels.
We put together a native Swift implementation using MLX Swift so it can run locally on Apple Silicon, along with a 4-bit MLX conversion and a small CLI/demo to make it easy to try out.
If you’re interested in on-device voice agents (or just want to see what full-duplex S2S looks like in a real Swift codebase), the details and setup notes are here:
Blog post: https://blog.ivan.digital/nvidia-personaplex-7b-on-apple-silicon-full-duplex-speech-to-speech-in-native-swift-with-mlx-0aa5276f2e23
Repo: https://github.com/ivan-digital/qwen3-asr-swift