Give your agent eyes and ears.
Record your screen, dictate what you want, and Zerro turns it into exactly what you need — a structured prompt for your AI agent, or a plain-language explanation of what's on screen.
v1.0 · Apple Silicon · Signed & notarized
What is Zerro?
Show your screen. Say what you want.
Zerro is a native macOS menu-bar app that turns a screen recording and a few spoken words into a clear, structured prompt for whatever AI tool you're using — ChatGPT, Claude, a coding agent, or anything else that takes text.
Everything runs locally first, and you bring your own keys — no servers in the middle, no account required. Talking is faster than typing, and Zerro turns what you say into something an AI can actually act on.
Use it for
- Hand a coding agent real context
- Explain a bug to an AI assistant
- Understand a confusing screen, error, or document
- Explain what you're looking at to a teammate
How it works
Record. Speak. Copy.
Think voice dictation — but instead of plain text, you get a structured prompt. No new app to learn; the whole flow runs from the menu bar.
Select a region
Hit the hotkey, then drag to frame the part of your screen you want the agent to see. Native macOS crosshair, no window switching.
Dictate what you want
Talk it through like you'd explain it to a teammate. Point at things, change your mind, ramble. Zerro records up to 3 minutes.
Zerro processes locally
Audio gets isolated and frames get downsampled on your machine. Then a single API call turns it into a structured prompt.
Paste the prompt
A Markdown prompt lands on your clipboard, ready to drop into Cursor, Windsurf, v0, or wherever you ship from.
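For the curious, here is a rough sketch of what "frames get downsampled" could look like. It is an illustration, not Zerro's actual code: the function name `sample_timestamps` and the `frames_per_second` setting are hypothetical, and only the 3-minute cap comes from this page. The idea is to keep a few evenly spaced frames instead of every frame of the recording.

```python
import math
from typing import List

# The 3-minute recording cap stated on this page.
MAX_RECORDING_SECONDS = 180.0


def sample_timestamps(duration_s: float, frames_per_second: float = 1.0) -> List[float]:
    """Return the timestamps (in seconds) of the frames to keep.

    frames_per_second is a hypothetical setting chosen for illustration;
    it is not a documented Zerro value.
    """
    if duration_s < 0 or frames_per_second <= 0:
        raise ValueError("duration must be >= 0 and frames_per_second must be > 0")

    # Anything past the cap is ignored.
    clipped = min(duration_s, MAX_RECORDING_SECONDS)

    # Keep frames at t = 0, step, 2 * step, ... that fall before the end.
    step = 1.0 / frames_per_second
    count = math.ceil(clipped * frames_per_second)
    return [i * step for i in range(count)]
```

A 10-second recording sampled at one frame every two seconds (`sample_timestamps(10.0, 0.5)`) keeps five frames, which is far less to send in one request than the full video.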
The output
The prompt writes itself.
Every recording becomes exactly what you need: a structured instruction your agent can act on, or a clear explanation in plain language.
Same recording, two modes
Improve the sign-in screen's conversion by fixing layout, hierarchy, and accessibility issues on the login form.
- Move the "Forgot password?" link up into the Sign In cluster, directly below the password field.
- Tighten the helper copy so it fits on a single line rather than wrapping.
- Change the primary call-to-action from muted gray to the brand blue.
- Fix the tab order so the "Remember me" checkbox is reachable by keyboard.
Keep the existing form fields and overall layout; these are refinements, not a redesign.
Example output. Every prompt is generated from your real recording.
Built right
Native, local-first, no surprises.
Zerro is built the way you'd build it if it were your tool.
Native Swift & SwiftUI
A real menu-bar app, built on ScreenCaptureKit. Not an Electron tab pretending to be a Mac app.
Local-first processing
Audio isolation and frame downsampling happen on your machine before anything leaves it.
Bring your own keys
Your OpenAI and Gemini keys, stored in macOS Keychain. No servers handling your data.
Cost-bounded by design
A 3-minute hard cap on every recording keeps each request fast, cheap, and predictable.
Signed & notarized
Distributed directly as a code-signed, notarized .dmg, so macOS Gatekeeper will not complain.
Sparkle auto-updates
Updates ship through Sparkle, the standard for indie Mac apps. You stay current without thinking about it.
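To see why a hard cap makes cost predictable, here is a back-of-the-envelope sketch. Every rate in it is a placeholder assumption (the `Rates` fields, token counts, and price are hypothetical, not Zerro's or any provider's real numbers); only the 3-minute cap comes from this page. Because the recording length has a ceiling, so does the size of each request.

```python
import math
from dataclasses import dataclass

# The 3-minute hard cap stated on this page.
MAX_RECORDING_SECONDS = 180


@dataclass(frozen=True)
class Rates:
    """Hypothetical rates for illustration only; substitute your provider's real numbers."""
    frames_per_second: float      # frames kept after downsampling
    tokens_per_frame: int         # input tokens billed per frame
    tokens_per_audio_second: int  # input tokens billed per second of audio
    usd_per_million_tokens: float


def worst_case_tokens(rates: Rates) -> int:
    """Upper bound on input tokens for one recording at the full cap."""
    frames = math.ceil(MAX_RECORDING_SECONDS * rates.frames_per_second)
    audio_tokens = MAX_RECORDING_SECONDS * rates.tokens_per_audio_second
    return frames * rates.tokens_per_frame + audio_tokens


def worst_case_usd(rates: Rates) -> float:
    """Upper bound on input cost in USD for one recording."""
    return worst_case_tokens(rates) * rates.usd_per_million_tokens / 1_000_000
```

Whatever rates you plug in, the result is a fixed ceiling per recording rather than a number that grows with how long you talk.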
How it compares
A different category.
Dictation tools give you text. Screen recorders give you a video. Zerro gives your agent a prompt.
| Feature | Zerro | Wispr Flow | Loom |
|---|---|---|---|
| Primary input | Screen + voice | Voice | Screen + voice |
| Output | Structured prompt | Dictated text | Video link |
| Built for AI coding agents | Yes | — | — |
| Local-first processing | Yes | — | — |
| Bring your own API keys | Yes | — | — |
| Native macOS app | — | ||
| Pricing model | One-time ($39): pay once, own it forever | Subscription | Subscription |
Comparison reflects Zerro's positioning. Competitor capabilities change over time — check each product for current details.
The shift
Now we're talking.
The keyboard was never the point — it was just the only interface we had. As agents get better at acting on what you actually mean, typing out every instruction by hand starts to feel like the bottleneck it always was.
You think faster than you type. You talk faster than you think. Voice closes that gap — and dictation is becoming the default. Zerro is built for that future: less typing, more talking.
Words per minute
You speak nearly 4× faster than you type. The keyboard has been the bottleneck all along.
Pricing
Pay once, or let us handle it.
BYOK (bring your own keys) is available at launch. Managed tiers open up shortly after; drop your email and we'll let you know.
BYOK
Available now
Pay once. Bring your own keys.
- 7-day free trial — no card required
- All features and future updates
- Bring your own OpenAI + Gemini keys
- Keys stored in your macOS Keychain
- No subscription, no account required
- No servers handling your data
Managed
Coming soon
We handle the tokens: no keys, no setup.
- 7-day free trial
- No API keys required
- Monthly recording credits included
- We manage all token usage
- Priority support
- Cancel anytime
Prices listed in USD. One-time purchase includes all v1 features and updates.
FAQ
Questions, answered.
Record it. Paste it.
Zerro in between.
Stop describing what you want. Show it. Zerro turns the recording into a structured prompt your agent can run with.