api.willowvoice.com already runs on Cloudflare · let's expand the footprint behind Scribe

You're shipping 5× faster than the keyboard.
The runtime should be as edge-native as the dictation.

Willow ships voice dictation to 50,000+ users across Mac, Windows, and iPhone — with SOC 2, HIPAA, zero data retention, and Scribe just shipping as the AI writing assistant. willowvoice.com is already on Cloudflare DNS. api.willowvoice.com is already running on Cloudflare anycast. The expansion footprint is the developer platform underneath: AI Gateway in front of the inference layer, R2 for the model + binary distribution, Workers for the desktop sync layer, and Workers for Platforms for the Teams plan that's coming.

NS: christian / dorthy.ns.cloudflare.com · api.willowvoice.com: 104.26.x / 172.67.x (Cloudflare anycast) · YC X25 · Founders @LiuLawrence45 & @_allanguo

Top banner on willowvoice.com
"📢 New: Scribe is here. Try your new Voice AI writing assistant!"
— The new product surface adds a full LLM workflow on top of dictation. Inference economics, latency, observability, and Teams-tier tenancy are about to matter a lot more than they did last quarter.

What's already running on Cloudflare today

DNS & ROOT ZONE: willowvoice.com on Cloudflare DNS via christian / dorthy.ns.cloudflare.com
API PLANE: api.willowvoice.com on Cloudflare anycast (104.26.x / 172.67.x), already a CF Worker or proxy
EXPANSION PATH: Add AI Gateway, R2, Workers for Platforms behind Scribe, with no new vendor and no new procurement
50K+: Trusted users across Mac, Windows, iPhone
5×: Faster than typing (your published claim)
YC: X25 batch, backed by Hustle Fund, 20VC
4: Compliance commitments (SOC 2, HIPAA, ZDR, Privacy Mode)
What users are saying (from the Wall of Love)
"literal game changer for interacting with LLMs or Cursor" — Ben Gale · "I have doubled my productivity" — Eric Bahn, Hustle Fund · "$15/mo — wondering how I ever worked without it" — Julien Codorniou, 20VC

Willow ships the voice product. Cloudflare runs the global edge underneath.

You've shipped the hard part: a dictation engine fast enough and accurate enough that 50K+ users prefer it to the keyboard. The next infrastructure surface is the part users don't see: the inference plane behind Scribe, the binary + model distribution to every Mac and Windows desktop, the Teams tenancy boundary, the SOC 2 audit surface for the next enterprise security review. Cloudflare's developer platform is built for exactly that shape, and you're already on it for DNS + the API plane.

Willow builds

The voice product: dictation engine, Scribe, desktop clients

Mac, Windows, and iPhone clients. The ASR + post-processing engine that handles whispered speech, accents, mid-sentence edits, and context-aware spelling. Style-matching across apps. Scribe as the new Voice AI writing assistant. Voice commands. SOC 2 + HIPAA + zero data retention compliance.

  • Best-in-class ASR + post-processing (works even on whispered speech)
  • Native Mac, Windows, iPhone clients with hotkey UX
  • Scribe — voice-driven AI writing assistant on top of dictation
  • Style-matching + context-aware spelling per app
×

Cloudflare runs

The global edge: API, inference, binary delivery, tenancy

The same edge that's already serving api.willowvoice.com. AI Gateway in front of Scribe's inference layer. R2 for model weights + desktop binary distribution at zero egress. Workers for Platforms for the Teams plan tenancy. Zero Trust for the founder-team's access to production.

  • AI Gateway for Scribe inference — cache, attribution, BYO keys
  • R2 + Workers for Mac/Windows binary delivery, zero egress
  • Workers for Platforms for the Teams plan — per-org tenancy
  • Workers AI for the audio-side ASR + post-processing where it fits

Nine primitives, mapped to Willow's actual product surface.

Each maps to something you ship today (the API plane, the desktop clients, Scribe, the audio pipeline, the HIPAA-grade compliance surface) or something on the roadmap (Teams plan, organization-level controls, enterprise security reviews). Status tags show what's already live in your Cloudflare footprint.

PRIMITIVE 01 Live on CF

DNS for willowvoice.com

Authoritative DNS via christian / dorthy.ns.cloudflare.com. The foundation everything else snaps onto. Procurement is in place, SOC 2 mapping exists, support relationship is established — expansion is a configuration change, not a vendor selection.

DNS · Foundation
PRIMITIVE 02 Live on CF

API plane on Cloudflare anycast

api.willowvoice.com resolves to Cloudflare anycast (104.26.x / 172.67.x), which means it is a Worker, a CF proxy, or an origin reached through a CF Tunnel. In all three cases, the desktop and iOS clients already talk to Cloudflare on every request.

Workers · Anycast · API
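
The anycast observation above is easy to reproduce. A minimal sketch, assuming a subset of Cloudflare's published IPv4 ranges and two made-up addresses inside the 104.26.x and 172.67.x prefixes; a match shows the hostname is proxied by Cloudflare, not which product sits behind it:

```typescript
// Subset of Cloudflare's published IPv4 ranges that covers the two prefixes
// observed on api.willowvoice.com. Illustrative only, not the full list.
const CLOUDFLARE_CIDRS: string[] = ["104.16.0.0/13", "104.24.0.0/14", "172.64.0.0/13"];

// Convert a dotted-quad IPv4 address to an unsigned 32-bit number.
function ipv4ToNumber(ip: string): number {
  const parts = ip.split(".");
  if (parts.length !== 4 || !parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255)) {
    throw new Error(`not an IPv4 address: ${ip}`);
  }
  return parts.reduce((acc, p) => acc * 256 + Number(p), 0);
}

// True when `ip` falls inside `cidr` (for example "104.24.0.0/14").
function inCidr(ip: string, cidr: string): boolean {
  const [base = "", prefix = ""] = cidr.split("/");
  const bits = Number(prefix);
  if (prefix === "" || !Number.isInteger(bits) || bits < 0 || bits > 32) {
    throw new Error(`bad CIDR: ${cidr}`);
  }
  const size = 2 ** (32 - bits); // number of addresses in the block
  const start = Math.floor(ipv4ToNumber(base) / size) * size;
  const addr = ipv4ToNumber(ip);
  return addr >= start && addr < start + size;
}

// True when `ip` is inside any of the listed Cloudflare ranges.
function isOnCloudflare(ip: string): boolean {
  return CLOUDFLARE_CIDRS.some((cidr) => inCidr(ip, cidr));
}
```

Feed it the A records your resolver returns for api.willowvoice.com; the full range list is published by Cloudflare and changes occasionally.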
PRIMITIVE 03 Highest-leverage next

AI Gateway in front of Scribe

Scribe = "polished message from a few quick words" = LLM call per invocation, multiplied across 50K+ users. AI Gateway sits in front: semantic cache for repeated patterns ("draft a Slack follow-up to X"), per-user attribution, budget caps before runaway spend, BYO keys for enterprise customers when the Teams plan ships.

AI Gateway · Semantic cache · Scribe
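
What putting the gateway "in front" could look like in code: a minimal sketch, assuming Scribe calls an OpenAI-compatible chat endpoint. The account ID, gateway ID, model name, and `buildScribeRequest` helper are all placeholders of mine, not Willow's. The `cf-aig-cache-ttl` and `cf-aig-metadata` headers are the gateway's cache and attribution hooks. As far as I know, the documented response cache matches identical requests, so the sketch collapses whitespace to give near-duplicate prompts a better chance of sharing a cache entry:

```typescript
interface GatewayRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Collapse whitespace so trivially different dictations produce the same request body.
function normalizePrompt(prompt: string): string {
  return prompt.trim().replace(/\s+/g, " ");
}

// Build (but do not send) a chat completion request routed through AI Gateway.
// The provider API key is deliberately left out; the caller adds it.
function buildScribeRequest(opts: {
  accountId: string; // placeholder Cloudflare account ID
  gatewayId: string; // placeholder gateway name
  userId: string;
  feature: "dictation" | "scribe";
  prompt: string;
  cacheTtlSeconds: number;
}): GatewayRequest {
  return {
    url: `https://gateway.ai.cloudflare.com/v1/${opts.accountId}/${opts.gatewayId}/openai/chat/completions`,
    headers: {
      "content-type": "application/json",
      "cf-aig-cache-ttl": String(opts.cacheTtlSeconds), // how long a cached response may be reused
      "cf-aig-metadata": JSON.stringify({ userId: opts.userId, feature: opts.feature }), // attribution tags
    },
    body: JSON.stringify({
      model: "gpt-4o-mini", // placeholder model name
      messages: [{ role: "user", content: normalizePrompt(opts.prompt) }],
    }),
  };
}
```

Whether cached responses are acceptable for a given Scribe prompt is a product decision; prompts that embed user-specific context will rarely share a cache entry.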
PRIMITIVE 04 Distribution

R2 for Mac + Windows binary delivery (iOS ships through the App Store)

Every desktop install pulls a binary. Every release updates it. With 50K+ users across three platforms and growing fast, that's a meaningful CDN bill at any reasonable scale. R2's zero egress fees, with Workers serving each download from the POP closest to the user, avoid per-region S3 sprawl. (Smart Placement is the opposite lever: it moves a Worker closer to its backend when that is where the latency lives.)

R2 · Zero egress · Smart Placement
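
A rough sketch of the delivery path, under assumptions: the key layout, the file names, and the `BucketLike` interface are mine. `BucketLike` is only the slice of an R2 bucket binding the sketch needs (`get` returning an object with `body` and `httpEtag`, or `null`). Versioned keys never change, so they can carry a long immutable cache lifetime:

```typescript
type Platform = "mac" | "windows";

// Hypothetical key layout: releases/<platform>/<version>/<file>
function installerKey(platform: Platform, version: string): string {
  const file = platform === "mac" ? "Willow.dmg" : "WillowSetup.msi"; // assumed file names
  return `releases/${platform}/${version}/${file}`;
}

// The part of an R2 bucket binding this sketch relies on.
interface BucketLike<B> {
  get(key: string): Promise<{ body: B; httpEtag: string } | null>;
}

interface DownloadResult<B> {
  status: number;
  headers: Record<string, string>;
  body: B | null;
}

// Look up the installer for a platform and version and describe the HTTP response.
async function serveInstaller<B>(
  bucket: BucketLike<B>,
  platform: Platform,
  version: string,
): Promise<DownloadResult<B>> {
  const object = await bucket.get(installerKey(platform, version));
  if (object === null) {
    return { status: 404, headers: {}, body: null };
  }
  return {
    status: 200,
    headers: {
      etag: object.httpEtag,
      // A versioned key is immutable, so edge and client caches can keep it.
      "cache-control": "public, max-age=31536000, immutable",
    },
    body: object.body,
  };
}
```

In a real Worker the result would be wrapped in a `Response` and the bucket would come from the environment bindings.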
PRIMITIVE 05 Teams plan

Workers for Platforms = per-org tenancy

The Teams page exists today (in the footer nav). When the Teams plan ships, every enterprise customer wants their own custom dictionary, their own SSO, their own data residency, their own audit log, their own Scribe budget. Workers for Platforms makes those boundaries infrastructure, not config flags.

Workers for Platforms · Per-org · Teams
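
To make "boundaries as infrastructure" concrete, a hedged sketch of the dispatch pattern. Everything Willow-specific here is invented for illustration: the `<org>.teams.willowvoice.com` hostname convention, the `org-<name>` script naming, and the `DispatcherLike` interface, which models only the `get(name).fetch(request)` shape of a dispatch namespace binding:

```typescript
// The part of a dispatch namespace binding this sketch relies on.
interface DispatcherLike<Req, Res> {
  get(scriptName: string): { fetch(request: Req): Promise<Res> };
}

// Hypothetical convention: acme.teams.willowvoice.com maps to the tenant script "org-acme".
function tenantScriptFor(hostname: string): string | null {
  const match = /^([a-z0-9-]+)\.teams\.willowvoice\.com$/.exec(hostname.toLowerCase());
  const org = match?.[1];
  return org === undefined ? null : `org-${org}`;
}

// Route a request to the tenant's own Worker, or return a fallback when no tenant matches.
async function routeToTenant<Req, Res>(
  dispatcher: DispatcherLike<Req, Res>,
  hostname: string,
  request: Req,
  notFound: Res,
): Promise<Res> {
  const script = tenantScriptFor(hostname);
  if (script === null) {
    return notFound;
  }
  return dispatcher.get(script).fetch(request);
}
```

Each tenant script can then carry its own dictionary, budget, and logging bindings, so a per-org setting is a property of the deployment instead of a branch in shared code.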
PRIMITIVE 06 Custom dictionary

Vectorize for context-aware spelling

"Willow spells unique terms and names correctly using contextual cues." That's a retrieval problem. Vectorize indexes per-user (and per-org, when Teams ships) custom dictionaries + recent context so the right spelling comes back in single-digit milliseconds.

Vectorize · Custom dict · Context
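
To show the retrieval framing without guessing at Willow's models, here is a toy, in-memory stand-in: character trigrams play the role of embeddings and a linear scan plays the role of a Vectorize query. The function names, the 0.5 threshold, and the sample dictionary are illustrative assumptions, not Willow's pipeline:

```typescript
// Count character trigrams in a term (a crude stand-in for an embedding).
function trigramCounts(term: string): Record<string, number> {
  const s = term.toLowerCase();
  const counts: Record<string, number> = {};
  for (let i = 0; i + 3 <= s.length; i++) {
    const gram = s.slice(i, i + 3);
    counts[gram] = (counts[gram] ?? 0) + 1;
  }
  return counts;
}

// Cosine similarity between two sparse trigram vectors.
function cosine(a: Record<string, number>, b: Record<string, number>): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const gram of Object.keys(a)) {
    const x = a[gram] ?? 0;
    normA += x * x;
    dot += x * (b[gram] ?? 0);
  }
  for (const gram of Object.keys(b)) {
    const y = b[gram] ?? 0;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Return the dictionary term closest to what was heard, or null below the threshold.
function bestSpelling(heard: string, dictionary: string[], minScore = 0.5): string | null {
  const heardVec = trigramCounts(heard);
  let best: string | null = null;
  let bestScore = minScore;
  for (const term of dictionary) {
    const score = cosine(heardVec, trigramCounts(term));
    if (score >= bestScore) {
      best = term;
      bestScore = score;
    }
  }
  return best;
}
```

With real embeddings, the per-user or per-org dictionary would live in an index namespace and the scan would become a top-K query.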
PRIMITIVE 07 Streaming

Durable Objects for live transcription state

Live dictation sessions have state: the audio buffer, the running transcript, the user's style profile, the in-flight edit operations. Durable Objects give you a single-writer state holder per active session, at the edge, with native WebSocket support — no Redis cluster needed.

Durable Objects · WebSockets · Sessions
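
The single-writer idea in miniature. This is a plain class, not a real Durable Object, and the `EditOp` shape is my guess at what an in-flight edit might look like. Because one object per session owns the state, a sequence number is enough to drop stale or duplicate operations without locks:

```typescript
// Hypothetical edit operations arriving over the session's WebSocket.
type EditOp =
  | { seq: number; kind: "append"; text: string }
  | { seq: number; kind: "replaceLast"; text: string };

class DictationSession {
  private segments: string[] = [];
  private lastSeq = 0;

  // Apply an operation once; returns false for stale or duplicate sequence numbers.
  apply(op: EditOp): boolean {
    if (op.seq <= this.lastSeq) {
      return false;
    }
    if (op.kind === "append" || this.segments.length === 0) {
      this.segments.push(op.text);
    } else {
      // replaceLast: a mid-sentence correction overwrites the newest segment.
      this.segments[this.segments.length - 1] = op.text;
    }
    this.lastSeq = op.seq;
    return true;
  }

  // The running transcript as a single string.
  transcript(): string {
    return this.segments.join(" ");
  }
}
```

Inside a Durable Object the same class would sit behind the WebSocket message handler, with the transcript persisted to the object's storage if sessions need to survive eviction.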
PRIMITIVE 08 Compliance

Zero Trust for founder-team production access

SOC 2 + HIPAA require audit-grade access controls to production. As Willow grows from YC X25 stage into enterprise sales, the security reviews get harder. Cloudflare Access closes the loop with identity-aware, audit-logged access to admin consoles and model environments, layered on the identity provider the team already uses instead of a separate VPN or bastion stack.

Access · Tunnel · SOC 2
PRIMITIVE 09 Bot protection

Bot Management + Turnstile on signups

50K+ users with a free tier and a $15/mo paid tier = a magnet for automated abuse: fake signups, free-tier exploitation, scraped binaries. Bot Management at the edge stops the abuse before it touches the auth backend; Turnstile drops in cleanly on signup and download flows.

Bot Management · Turnstile · WAF
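
On the signup flow, "drops in cleanly" comes down to one server-side call. A minimal sketch that builds the Turnstile siteverify request and reads the result; the helper names are mine, and the secret, token, and IP in any example are placeholders:

```typescript
const SITEVERIFY_URL = "https://challenges.cloudflare.com/turnstile/v0/siteverify";

interface SiteverifyRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build (but do not send) the form-encoded verification request for a signup.
function buildSiteverifyRequest(secret: string, token: string, remoteIp?: string): SiteverifyRequest {
  const fields: [string, string][] = [
    ["secret", secret], // the widget's secret key, kept server-side
    ["response", token], // the token the widget produced in the browser
  ];
  if (remoteIp !== undefined) {
    fields.push(["remoteip", remoteIp]);
  }
  const body = fields
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return {
    url: SITEVERIFY_URL,
    method: "POST",
    headers: { "content-type": "application/x-www-form-urlencoded" },
    body,
  };
}

// Shape of the fields this sketch reads from the siteverify JSON response.
interface SiteverifyResult {
  success: boolean;
  "error-codes"?: string[];
}

// Only let the signup proceed when verification succeeded.
function allowSignup(result: SiteverifyResult): boolean {
  return result.success === true;
}
```

The same check fits in front of the download endpoint if scraped binaries turn out to be the bigger problem.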

A Scribe request is a pipeline waiting to be cached.

"Just say a few quick words. Willow turns it into a fully polished message." Every Scribe invocation is an LLM call. Across 50K+ users typing emails and Slack messages and Cursor prompts, those requests cluster brutally — same draft-an-apology, same write-a-follow-up, same rewrite-this-more-professional patterns. The cache hit rate is structural, and AI Gateway captures it from request one.

Scribe request flow, sketched on Cloudflare primitives

From "user hits hotkey and speaks a rough sentence" to a polished, formatted, on-brand response in their target app — cached, attributed, audited.
  1. DESKTOP CLIENT: Mac / Windows / iOS hotkey; audio + app context captured
  2. EDGE ROUTING: Workers on api.willowvoice.com; already live, closest POP to user
  3. CACHE + ROUTE: AI Gateway + Vectorize; semantic cache, per-user style retrieval
  4. RESPONSE: Streamed back through Workers; attribution logged, ZDR maintained
What this changes: The same "rewrite this more professionally" prompt can recur thousands of times an hour across all 50K+ users. With semantic cache in AI Gateway, identical-intent prompts return in milliseconds without touching the model layer at all. That's the difference between a Scribe inference bill that scales linearly with total requests, and one that scales only with unique requests and so grows sublinearly with users.
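
The arithmetic behind that claim, as a sketch. Every input is a made-up placeholder (calls per user, cost per call); only the shape matters: the model bill is total calls times cost per call times the share of calls that miss the cache.

```typescript
// Monthly model spend in USD when a fraction of calls is served from cache.
// All inputs are hypothetical planning numbers, not measurements.
function monthlyInferenceCost(
  users: number,
  callsPerUser: number,
  costPerCallUsd: number,
  cacheHitRate: number,
): number {
  if (cacheHitRate < 0 || cacheHitRate > 1) {
    throw new Error("cacheHitRate must be between 0 and 1");
  }
  const billableShare = 1 - cacheHitRate; // only cache misses reach the model
  return users * callsPerUser * costPerCallUsd * billableShare;
}
```

With placeholder inputs of 50,000 users, 100 calls each, and $0.002 per call, the bill is about $10,000 a month with no cache and about $5,000 at a 50% hit rate. If the hit rate rises as more users repeat the same prompts, the curve bends further.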

The economics of Scribe at 50K→500K users.

Voice dictation is one of the most edge-amenable workloads in software: small audio packets in, structured text out, latency-sensitive at every step. Add Scribe's LLM layer on top, and you're now also paying for inference per invocation. AI Gateway turns both into one observable, attributable cost line — before the Series A inference bill becomes a board-meeting agenda item.

A back-of-the-envelope, not a quote
Modeled across Scribe inference + API serving + binary delivery at the current scale (50K+ users, growing fast)
SEMANTIC CACHE HIT RATE
40–60%
Productivity-prompt queries cluster heavily: "draft a follow-up," "rewrite professionally," "summarize this." Higher hit rate than general LLM workloads.
BINARY EGRESS SAVINGS
40–60%
R2's zero egress vs. AWS S3 + CloudFront pricing across Mac / Windows / iOS installer + update delivery as the user base scales.
PER-USER ATTRIBUTION
100%
AI Gateway gives per-user, per-feature (dictation vs Scribe), per-platform attribution — the data needed to defensibly tier free vs. $15/mo vs. Teams.
The real win for a YC X25 company is unit-economic clarity. Today Scribe's per-user inference cost is a guess. AI Gateway makes it a line item: this user costs $0.X/month in inference, that user costs $X.X. That's the data a Series A investor wants to see, and it's the data Willow needs to price Teams at the right margin instead of cost-plus.
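
A sketch of what "a line item" means mechanically. The `GatewayLogEntry` shape below is an assumption about what a per-request log export could contain, not the gateway's actual schema; the point is that once each request carries a user ID and a cost, per-user cost is a simple aggregation:

```typescript
// Assumed shape of one logged request; field names are hypothetical.
interface GatewayLogEntry {
  userId: string;
  feature: "dictation" | "scribe";
  costUsd: number;
  cached: boolean;
}

// Sum model spend per user, skipping requests that were served from cache.
function costPerUser(entries: GatewayLogEntry[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const entry of entries) {
    if (entry.cached) {
      continue; // a cache hit never reached the model
    }
    totals[entry.userId] = (totals[entry.userId] ?? 0) + entry.costUsd;
  }
  return totals;
}
```

Grouping by `feature` instead of `userId` gives the dictation-versus-Scribe split with the same loop.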

Three platforms, three tenants. Workers for Platforms is the boundary.

Mac, Windows, and iPhone have different update cadences, different binary sizes, different telemetry shapes, different App Store / Notarization / Microsoft Store distribution flows. Each platform's binary delivery, telemetry ingest, and crash reporting can live in its own Worker namespace inside Workers for Platforms — same edge, isolated state.

Per-platform delivery + per-org Teams tenancy, sketched

Each desktop platform gets its own namespace for binary delivery + telemetry. Each Teams plan customer (when the plan ships) gets their own isolated tenant on the same edge.
  • 🌿 macOS: Apple Silicon + Intel, Notarized
  • 🖥️ Windows: x64, signed MSI, MS Store
  • 📱 iOS: App Store, TestFlight beta
Shared control plane — Workers for Platforms + R2 + AI Gateway
one runtime · one observability surface · platform × Teams customer = N×M isolated tenants by construction

Current stack, with Cloudflare overlaid.

The DNS, API plane, marketing site, app dashboard, and email rows are sourced from public DNS records and HTTP response headers on willowvoice.com, app.willowvoice.com, and api.willowvoice.com; rows marked "likely" are inferences. Rows marked "✅ Live" are already running on Cloudflare today. The CLOUDFLARE FIT column is the expansion footprint.

What's running today, and where Cloudflare slots in

"✅ Live" = already on Cloudflare. A "+" in the CLOUDFLARE FIT column = the expansion path. No Framer or Vercel rip-and-replace required.
| LAYER | WILLOW RUNS TODAY | CLOUDFLARE FIT |
| --- | --- | --- |
| DNS | Cloudflare (christian + dorthy.ns.cloudflare.com) | ✅ Live: the foundation everything else snaps onto |
| API PLANE | Cloudflare anycast (104.26.x / 172.67.x on api.willowvoice.com) | ✅ Live: already a Worker or proxy on the edge |
| MARKETING SITE | Framer (server: Framer/8c09469, us-west-2) | No change: Framer fronts cleanly behind the existing CF zone |
| APP DASHBOARD | Vercel on app.willowvoice.com (vercel-dns-016) | + Cloudflare in front: edge cache, WAF, Bot Mgmt, Turnstile |
| SCRIBE AI INFERENCE | Just launched; likely OpenAI / Anthropic from the API plane | + AI Gateway: cache, attribution, rate-limit, budget cap from day one |
| BINARY CDN | Likely S3 + CloudFront for Mac / Windows / iOS installers | + R2 + Workers + Smart Placement: zero egress at scale |
| CUSTOM DICTIONARY | Per-user storage of custom terms + names | + Vectorize + D1 for context-aware retrieval at the edge |
| TEAMS TENANCY | Teams page exists; plan likely roadmap-imminent | + Workers for Platforms: per-org isolation by construction |
| LIVE SESSION STATE | In-flight transcription state per active session | + Durable Objects + WebSockets: single-writer state per session |
| EMAIL | Google Workspace (MX = aspmx.l.google.com) | + Cloudflare Email Security as defense-in-depth (optional) |
| SIGNUP / DOWNLOAD ABUSE | Free tier + paid tier = scrape-attractive surface | + Bot Management + Turnstile + WAF on signup, download, redemption |
| FOUNDER-TEAM ACCESS | YC-stage startup; likely VPN or no-VPN to admin consoles | + Cloudflare Access: SOC 2 / HIPAA audit-grade access boundary |

Why this is the right week to start the conversation

Scribe just shipped. The banner is live on willowvoice.com today. Every architectural decision being made this quarter — what governs Scribe's inference, how attribution works as Teams ramps, what the unit economics look like at 500K users — will define the inference cost curve for the rest of 2026 and 2027. AI Gateway is the cheapest hour you can spend in front of that curve.

You're already on Cloudflare. DNS via christian + dorthy. The API plane via anycast. No procurement event to start, no security review to begin, no MSA to negotiate. Expanding the footprint from DNS + API to AI Gateway + R2 + Workers for Platforms is the most natural roadmap conversation in the lineup.

YC X25 timing. The Series A architecture decisions get locked in over the next 12–18 months. Picking the runtime now — while the team is small enough to move fast and the user base is small enough that re-platforming is cheap — is materially cheaper than doing it after Teams customers depend on it.

Worth a 30-minute conversation with the team building Scribe?

The interesting conversation is which of these primitives is closest to your current sprint: AI Gateway behind Scribe, R2 for the binary CDN, Vectorize for context-aware spelling, or Workers for Platforms behind the Teams plan. I'd rather hear what's actually on your roadmap than guess.

Matt Holscher Calendar  → Reply by email