Every draft that comes out of Auden, the brain inside VoiceMoat, is accompanied by a voice match score. A single number between 0 and 100. It tells you how close the draft sits to your training profile, the model's view of how you write.
Most users hit 90+ on their first run after training completes. That number is meaningful. It means the draft would pass a 'did this person actually write it' sniff test from someone who knows your work. Drafts below 90 deserve a careful read. Drafts below 80 should be edited substantially or killed.
But the number alone doesn't tell you everything. This post unpacks how the score is calculated, where it shows up across the product, when to trust it, and when your editorial judgment should override it.
What is a voice match score?
A voice match score is a single 0-to-100 number that measures how closely a draft matches your own writing profile. It is not a quality score (it does not judge whether the writing is good), and it is not an AI detector (it does not judge whether the text came from a machine). It answers one narrow question: how close does this draft sit to the way you specifically write? The higher the number, the more the draft's patterns line up with the patterns Auden learned from your corpus.
Conceptually it is the operational version of stylometry, the statistical study of writing style used in authorship attribution and forensic linguistics. Stylometry says a person's writing has a measurable fingerprint across features like sentence rhythm, vocabulary, and punctuation; a voice match score turns that idea into a live number against your own baseline. Under the hood, the comparison is a similarity computation: the draft and your profile are represented as feature vectors across the 10 signals, and the score reflects how aligned those vectors are (the same family of math as the cosine similarity used throughout natural-language processing, applied per signal and then aggregated).
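To make the similarity idea concrete, here is a minimal sketch of per-signal cosine similarity rolled up into a 0-to-100 number. It is an illustration of the general technique, not Auden's actual implementation: the signal names, the feature vectors, and the plain average used to combine signals are all assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)

def per_signal_similarity(draft, profile):
    """Compare the draft to the profile one signal at a time."""
    return {signal: cosine_similarity(draft[signal], profile[signal])
            for signal in profile}

def to_score(similarities):
    """Average the per-signal similarities (an assumed, unweighted
    aggregation) and rescale to 0-100."""
    mean = sum(similarities.values()) / len(similarities)
    return round(100 * max(0.0, mean))

# Hypothetical feature vectors: e.g. shares of short/medium/long sentences
# for rhythm, and frequencies of two marker words for vocabulary.
profile = {"rhythm": [0.6, 0.3, 0.1], "vocabulary": [0.8, 0.2]}
draft = {"rhythm": [0.5, 0.4, 0.1], "vocabulary": [0.7, 0.3]}

print(to_score(per_signal_similarity(draft, profile)))
```

A draft whose vectors point the same way as the profile's scores near 100; the further they diverge on any signal, the lower the aggregate.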
The practical value is that it replaces a vibe check with a measurement. Before voice scoring, the only way to know if a draft sounded like you was to read it and decide, which is unreliable across many drafts and impossible to delegate. A number you can see at a glance, on every draft, is what makes voice fidelity auditable at scale rather than a thing you hope holds.
How do you read a voice match score?
The voice match score is on a 0-to-100 scale. Higher is closer to your training profile.
The rough heuristic:
- 90 and above. Shippable. The draft sounds like you. The model captured the rhythm, the vocabulary, the hook style, the cadence. Spot-check for typos and ship.
- 80 to 89. Careful pass. Most of the draft is on voice, but one or two paragraphs probably aren't. Edit those before shipping. Or regenerate.
- 70 to 79. Substantial editing. The model got the topic right but the voice wrong in important places. Use as raw material, not as a draft.
- Below 70. Kill it. The model produced something that isn't recognizably you. Regenerate, or write from scratch.
These bands are guides, not hard cutoffs. The right threshold depends on the stakes of the post (a high-stakes thread should clear 92; a quick reply can ship at 85). Calibrate to your own work.
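If you want to apply the bands consistently across a batch of drafts, the heuristic can be sketched as a small triage function. The function name, the labels, and the adjustable `ship_threshold` parameter are illustrative choices, not part of the product; the default cutoffs follow the bands above.

```python
def triage(score, ship_threshold=90):
    """Map a 0-100 voice match score to a suggested action.

    ship_threshold is the stakes-dependent bar: raise it for a
    high-stakes thread, lower it for a quick reply.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= ship_threshold:
        return "ship"
    if score >= 80:
        return "careful pass"
    if score >= 70:
        return "substantial editing"
    return "regenerate or rewrite"

print(triage(93))                     # ship
print(triage(91, ship_threshold=92))  # careful pass
print(triage(86, ship_threshold=85))  # ship
```

Keeping the threshold as a parameter reflects the point that the bands are guides: the same 91 is shippable as a reply and a careful-pass candidate as a pinned thread.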
What is a good voice match score?
For most users, a good voice match score is 90 or above, and that is the number most people hit on their first run after training completes. A 90+ means the draft would pass a 'did this person actually write it' check from someone who knows your work. But good is relative to the stakes and to your own baseline, not an absolute bar.
Three things determine what good means for you. First, the stakes of the post: a cornerstone thread you will pin deserves to clear the low 90s, while a quick reply can ship at 85. Second, your own distribution: some writers run consistently in the mid-90s, others sit in the high 80s because their style is broad enough that the model has more room to wander, and what matters is your draft relative to your typical, not to anyone else's. Third, the format: a draft in a format you rarely use will often score lower simply because the model has thinner training data there, not because the draft is worse.
The honest reading is that the score is a floor-check, not a target to maximize. Chasing a 99 usually means over-fitting to your most repeated phrasings, which makes the writing read like a parody of itself. The goal is recognizable, not identical. Clear your stakes-appropriate threshold, then let your editorial judgment make the final call.
What feeds into a voice match score?
The score isn't a single number that arrives by magic. It's the aggregate of how closely the draft matches your training profile across the 10 signals of voice Auden measures when it learns you.
Each draft gets evaluated against your profile on signals including:
- Tone. Does the emotional register match your usual?
- Rhythm. Do the sentence lengths and comma patterns match how you write?
- Vocabulary. Are the words ones you'd actually reach for? Does it avoid the words on your no-go list?
- Hooks. Does the opening match your hook patterns?
- Pacing. Does the setup-to-payoff timing move at your pace?
- Personality. Does the attitude come through as yours?
- Formatting. Bullet density, paragraph breaks, line breaks. Your structural fingerprint.
- Quirks. Do your repeated phrases and signature framings show up, if they're in your corpus?
- Taboos. Does the draft avoid hooks and CTAs you'd never write?
The score is the weighted aggregate. Different signals contribute different amounts depending on how distinctive they are in your training profile. If your writing is particularly hook-driven, your score weighs hooks more. If your style is rhythmically idiosyncratic (very short sentences, frequent line breaks), rhythm contributes more.
A draft that scores 95 on tone but 50 on hooks won't come out at 80. For a hook-driven writer it'll score lower, because the hook is doing more work for your voice than the tone is.
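The arithmetic behind that example can be sketched as a weighted average. This is a simplified illustration with only two signals, and the weights are made up for the example; the actual weights Auden assigns are not published here.

```python
def weighted_score(signal_scores, weights):
    """Weighted average of per-signal scores (each on a 0-100 scale)."""
    total_weight = sum(weights[s] for s in signal_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(signal_scores[s] * weights[s] for s in signal_scores)
    return weighted_sum / total_weight

scores = {"tone": 95, "hooks": 50}

# Equal weights: the plain average of the two signals.
print(weighted_score(scores, {"tone": 1.0, "hooks": 1.0}))  # 72.5

# Hypothetical hook-driven profile: hooks count three times as much.
print(weighted_score(scores, {"tone": 1.0, "hooks": 3.0}))  # 61.25
```

The more weight a signal carries in your profile, the more a miss on that signal drags the aggregate down, even when everything else is on voice.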
Where the score shows up
Voice match scoring runs on every generation. The score is displayed in three places:
- The Chrome extension on X. Inline next to each generated reply, tweet, or thread variant. Glance and decide.
- The dashboard composer. Beside every draft, with a more detailed breakdown by signal if you click in.
- The analytics dashboard. Aggregated across all your generations, so you can see how your average voice match has trended over time.
The analytics view is the one most users underuse. A creator who watches their average voice match over a month catches drift before it becomes a problem. If your average is sliding from 92 to 86 over four weeks, your voice has probably shifted faster than your training profile has. That's the signal to retrain.
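If you export your weekly averages, a drift check along these lines is easy to run yourself. This is a rough sketch: the function names, the four-week window, and the 5-point drop threshold are assumptions chosen for illustration, not settings from the analytics dashboard.

```python
def drift(weekly_averages, window=4):
    """Change in average score from the first to the last week of the window."""
    recent = weekly_averages[-window:]
    return recent[-1] - recent[0]

def should_retrain(weekly_averages, window=4, max_drop=5.0):
    """Flag a retrain when the average has fallen by max_drop or more."""
    if len(weekly_averages) < window:
        return False  # not enough history to call it a trend
    return drift(weekly_averages, window) <= -max_drop

sliding = [92, 90, 88, 86]  # the four-week slide described above
steady = [92, 91, 92, 91]

print(drift(sliding), should_retrain(sliding))  # -6 True
print(drift(steady), should_retrain(steady))    # -1 False
```

Comparing only the endpoints of the window keeps the check simple; a fitted trend line would be less sensitive to a single unusual week.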
When should you trust the voice match score?
The score is a model's view of your voice. The model is good but not perfect. Cases where the score is more likely to be right:
- Drafts on topics you write about regularly. The model has seen many examples. Confidence is high.
- Drafts in your usual platform format (tweets, reply threads). The training data is dense.
- Drafts that fall within the typical length range for your content.
Cases where the score is more likely to be off:
- Drafts on topics you've never publicly written about. The model has to extrapolate.
- Drafts in unusual formats (a long thread when you mostly do replies, or vice versa).
- Drafts in a register you've used rarely (a serious thread from someone whose corpus is mostly playful).
In edge cases, your editorial judgment matters more than the score. A high score on an unusual draft doesn't mean it sounds like you. A low score on a stretch piece doesn't mean it's bad. Use the score as a starting point, not as a final verdict.
How do you improve a low voice match score?
A draft below 80 isn't a failure mode. It's a feedback signal. Treat it like a first pass and improve it.
The common fixes:
- Rephrase the hook. Most low scores trace back to a generic opening. Replace it with something only you would say.
- Cut filler. Generic AI tends to over-explain. If two sentences carry the same idea, kill one.
- Change one in three words. If the model used 'leverage' or 'delve' or any of your no-go words, swap them for what you'd actually reach for. The score updates as you edit.
- Re-read out loud. If a sentence feels wrong out loud, it probably is. Trust the ear before the number.
After enough drafts with the same low-scoring pattern, you've learned something about your own profile. Retrain (if your voice has shifted), update your no-go list, or notice the gap and write around it manually. A bimodal voice match score histogram (some posts at 92, some at 75) is also the off-season diagnostic for an event-curator account, where the lower scores are usually the agency-voice posts the team is writing when you're not paying attention. Our post on event accounts that go dark between events covers that case.
When the score conflicts with your read
Sometimes you'll read a draft that scores 92 and think 'this isn't me.' Or read a draft that scores 73 and think 'actually, this is pretty good for me.'
The score is wrong sometimes. The model's view of your training profile is statistical, not exhaustive. It might over-weight a recent style shift, miss a quirk that's important to you but rare in your corpus, or misjudge a paragraph that's stylistically unusual but still characteristic.
When the score and your read disagree:
- Trust your read on a single draft. Ship if you'd ship, edit if you'd edit.
- Track the disagreement over many drafts. If you consistently disagree with the score on a specific topic or format, that's signal to retrain or update your profile.
- Watch the score's pattern, not its single value. A drift of 5 to 10 points over weeks tells you more than any one number.
The voice match score is a tool, not an oracle. Use it to catch drafts that need a second look. Don't use it to override your own editorial judgment.
Is a voice match score the same as an AI detector?
No, and the difference is the whole point. An AI content detector tries to answer 'was this written by a machine?' (a question with known reliability problems and high false-positive rates). A voice match score answers a completely different question: 'how close is this to the way you write?' It does not care whether you, Auden, or a general model produced the words. It only measures fidelity to your profile.
That distinction matters for two reasons. First, it is why the score is useful even on drafts you wrote entirely by hand: paste your own writing in and a low score tells you that piece drifted from your usual voice, which is a real editorial signal an AI detector could never give you. Second, it is the operational core of the voice-not-cloning framing. The score is not policing authenticity in some abstract sense; it is measuring whether the output is recognizably yours, with you as the editor who decides what ships. A high voice match score on an Auden draft does not mean 'this fooled a detector.' It means 'this sits inside your own patterns,' which is exactly what you want when the goal is to scale your voice rather than to pass as human. The two tools answer different questions, and conflating them is how people end up optimizing for the wrong one.
How does the score interact with retraining?
Voice match scores tend to decline gradually over months, even if your writing feels consistent to you. The reason: your training profile is a snapshot from when you last trained, and your most recent posts (which aren't in the profile yet) might be drifting in a new direction. For the structural-failure-mode version of this (when the drift is not benign evolution but the audience-optimization, templating-creep, identity-inflation pattern that erodes voice past the 10K-follower mark), see our post on voice drift: why most creators lose their edge after 10K followers.
When that happens, retrain. Auden re-ingests your latest 100 to 200 posts and rebuilds the profile. The next batch of drafts should bounce back into the 90+ range. We cover the cadence (when to retrain, how often, what changes) in our dedicated post on voice retraining.
The voice match score is the system's view of how close a draft sits to your training profile. 90+ is the benchmark most users hit on their first generation after training. Below 90 is editing territory. Below 80 is regenerate or rewrite territory. The score is one input among several (your read, the topic, the platform). Use it to catch drafts that need work. Trust your editorial judgment on the close calls.
Want to see how the score behaves on your own writing? Try VoiceMoat free for 7 days, and every draft Auden produces during the trial comes with the score attached. For product context, read what is Auden; for a brief primer on the dimensions the score is built on, read the 10 signals of voice. The canonical deep reference is the 10 signals of Voice DNA: what actually makes writing recognizable. It gives each signal a definition, its manifestation in real creator writing, how AI tools fail on it, and how to audit it, and it is the reference text for what each per-signal contribution to the aggregate score is actually measuring. One subtle drift source the score helps detect is drafting on different devices, which can produce different registers; drafting on X across devices, voice-first covers the phone-vs-desktop pattern.