Voice match score: how the 0 to 100 number actually works
Every draft Auden produces comes with a voice match score, a 0-to-100 number that measures how close the output sits to your training profile. Here's how to read it, what feeds into it, and when to trust it.
· 8 min read
Every draft that comes out of Auden, the brain inside VoiceMoat, is accompanied by a voice match score. A single number between 0 and 100. It tells you how close the draft sits to your training profile, the model's view of how you write.
Most users hit 90+ on their first run after training completes. That number is meaningful: it means the draft would pass a 'did this person actually write it?' sniff test from someone who knows your work. Drafts below 90 deserve a careful read. Drafts below 80 should be substantially edited or killed.
But the number alone doesn't tell you everything. This post unpacks how the score is calculated, where it shows up across the product, when to trust it, and when your editorial judgment should override it.
How to read the number
The voice match score is on a 0-to-100 scale. Higher is closer to your training profile.
The rough heuristic:
- 90 and above. Shippable. The draft sounds like you. The model captured the rhythm, the vocabulary, the hook style, the cadence. Spot-check for typos and ship.
- 80 to 89. Careful pass. Most of the draft is on voice, but one or two paragraphs probably aren't. Edit those before shipping, or regenerate.
- 70 to 79. Substantial editing. The model got the topic right but the voice wrong in important places. Use it as raw material, not as a draft.
- Below 70. Kill it. The model produced something that isn't recognizably you. Regenerate, or write from scratch.
These bands are guides, not hard cutoffs. The right threshold depends on the stakes of the post (a high-stakes thread should clear 92; a quick reply can ship at 85). Calibrate to your own work.
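The bands above are easy to express as a small threshold function. This is a sketch of the heuristic in this post, not product behavior, and the action labels are illustrative:

```python
def band(score: float) -> str:
    """Map a 0-100 voice match score to a suggested editing action.

    Thresholds follow the rough heuristic above; calibrate to your own work.
    """
    if score >= 90:
        return "ship"          # spot-check for typos and publish
    if score >= 80:
        return "careful pass"  # edit the off-voice paragraphs first
    if score >= 70:
        return "raw material"  # substantial rewrite needed
    return "kill"              # regenerate or write from scratch
```

For high-stakes posts you might raise the first threshold to 92, as suggested above.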
What feeds into the score
The score isn't a single number that arrives by magic. It's the aggregate of how closely the draft matches your training profile across the 9 signals of voice Auden measures when it learns you.
Each draft gets evaluated against your profile on:
- Tone. Does the emotional register match your usual?
- Rhythm. Do the sentence lengths and comma patterns match how you write?
- Vocabulary. Are the words ones you'd actually reach for? Does it avoid the words on your no-go list?
- Hooks. Does the opening match your hook patterns?
- Pacing. Setup-to-payoff timing: does the draft move at your pace?
- Personality. Does the attitude come through as yours?
- Formatting. Bullet density, paragraph breaks, line breaks. Your structural fingerprint.
- Quirks. Repeated phrases and signature framings show up if they're in your corpus.
- Taboos. Does the draft avoid hooks and CTAs you'd never write?
The score is a weighted aggregate. Different signals contribute different amounts depending on how distinctive they are in your training profile. If your writing is particularly hook-driven, hooks carry more weight. If your style is rhythmically idiosyncratic (very short sentences, frequent line breaks), rhythm contributes more.
A draft that scores 95 on tone but 50 on hooks won't land near their simple average. It'll score lower, because for a hook-driven profile the hook is doing more work for your voice than the tone is.
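A minimal sketch of that weighting, with made-up signal names and weights (the real weights are learned from your corpus, not hand-set):

```python
def voice_match(per_signal: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted aggregate of per-signal match scores, each on a 0-100 scale.

    Illustrative only: signal names and weights are assumptions for the example.
    """
    total = sum(weights.values())
    return sum(per_signal[s] * w for s, w in weights.items()) / total

# A hook-driven profile: hooks carry three times the weight of tone.
weights = {"tone": 1.0, "hooks": 3.0, "rhythm": 1.5}
draft = {"tone": 95, "hooks": 50, "rhythm": 90}
# voice_match(draft, weights) ≈ 69.1, well below the naive average of ~78.3
```

The weak hook drags the aggregate down far more than an unweighted average would, which is the behavior described above.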
Where the score shows up
Voice match scoring runs on every generation. The score is displayed in three places:
- The Chrome extension on X. Inline next to each generated reply, tweet, or thread variant. Glance and decide.
- The dashboard composer. Beside every draft, with a more detailed breakdown by signal if you click in.
- The analytics dashboard. Aggregated across all your generations, so you can see how your average voice match has trended over time.
The analytics view is the one most users underuse. A creator who watches their average voice match over a month catches drift before it becomes a problem. If your average is sliding from 92 to 86 over four weeks, your voice has probably shifted faster than your training profile has. That's the signal to retrain.
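The drift check described above is simple enough to sketch. Here `weekly_averages` stands in for the numbers you'd read off the analytics view; the structure and threshold are assumptions, not a product API:

```python
def should_retrain(weekly_averages: list[float], threshold: float = 5.0) -> bool:
    """Flag retraining when the average voice match slides by more than
    `threshold` points from the first tracked week to the latest.

    A crude first-vs-last comparison; a real check might fit a trend line.
    """
    return (weekly_averages[0] - weekly_averages[-1]) > threshold

# The four-week slide from the post, 92 down to 86, crosses the threshold.
```

A steady week-to-week wobble of a point or two stays under the threshold; a sustained slide trips it.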
When to trust the score
The score is a model's view of your voice. The model is good but not perfect. Cases where the score is more likely to be right:
- Drafts on topics you write about regularly. The model has seen many examples. Confidence is high.
- Drafts in your usual platform format (tweets, reply threads). The training data is dense.
- Drafts that fall within the typical length range for your content.
Cases where the score is more likely to be off:
- Drafts on topics you've never publicly written about. The model has to extrapolate.
- Drafts in unusual formats (a long thread when you mostly do replies, or vice versa).
- Drafts in a register you've used rarely (a serious thread from someone whose corpus is mostly playful).
In edge cases, your editorial judgment matters more than the score. A high score on an unusual draft doesn't mean it sounds like you. A low score on a stretch piece doesn't mean it's bad. Use the score as a starting point, not as a final verdict.
How to improve a low score
A draft below 80 isn't a failure mode. It's a feedback signal. Treat it like a first pass and improve.
The common fixes:
- Rephrase the hook. Most low scores trace back to a generic opening. Replace it with something only you would say.
- Cut filler. Generic AI tends to over-explain. If two sentences carry the same idea, kill one.
- Change one in three words. If the model used 'leverage' or 'delve' or any of your no-go words, swap them for what you'd actually reach for. The score updates as you edit.
- Re-read out loud. If a sentence feels wrong out loud, it probably is. Trust the ear before the number.
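Spotting no-go words before you edit is mechanical enough to script. This is a purely illustrative token scan, not the product's vocabulary check:

```python
def flag_no_go(draft: str, no_go: set[str]) -> list[str]:
    """Return the words from a no-go list that appear in a draft.

    Case-insensitive, with surrounding punctuation stripped from each token.
    """
    tokens = {word.strip(".,!?\"'()").lower() for word in draft.split()}
    return sorted(tokens & no_go)

# flag_no_go("We can leverage this to delve deeper.", {"leverage", "delve"})
# returns ["delve", "leverage"]
```

Swapping each flagged word for what you'd actually reach for is the part no script can do for you.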
After enough drafts with the same low-scoring pattern, you've learned something about your own profile: retrain (if your voice has shifted), update your never-say list, or note the gap and write around it manually. A bimodal score histogram (some posts at 92, some at 75) is a diagnostic in its own right. For event-curator accounts, the lower cluster is usually the agency-voice posts the team publishes off-season, while you're not paying attention; our post on event accounts that go dark between events covers that case.
When the score conflicts with your read
Sometimes you'll read a draft that scores 92 and think 'this isn't me.' Or read a draft that scores 73 and think 'actually, this is pretty good for me.'
The score is wrong sometimes. The model's view of your training profile is statistical, not exhaustive. It might over-weight a recent style shift, miss a quirk that's important to you but rare in your corpus, or misjudge a paragraph that's stylistically unusual but still characteristic.
When the score and your read disagree:
- Trust your read on a single draft. Ship if you'd ship, edit if you'd edit.
- Track the disagreement over many drafts. If you consistently disagree with the score on a specific topic or format, that's signal to retrain or update your profile.
- Watch the score's pattern, not its single value. A drift of 5 to 10 points over weeks tells you more than any one number.
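Tracking disagreement over many drafts can be as simple as a log you keep yourself. The `(topic, agreed)` structure here is an assumption for illustration, not anything the product exposes:

```python
from collections import Counter

def disagreement_hotspots(log: list[tuple[str, bool]], min_count: int = 3) -> list[str]:
    """Topics where your read and the score disagreed at least `min_count` times.

    `log` holds (topic, agreed) entries: agreed=False means the score and
    your editorial judgment pointed in different directions.
    """
    misses = Counter(topic for topic, agreed in log if not agreed)
    return sorted(t for t, n in misses.items() if n >= min_count)
```

A topic that shows up here repeatedly is the retrain-or-update-profile signal described above, as opposed to a one-off disagreement you simply override.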
The voice match score is a tool, not an oracle. Use it to catch drafts that need a second look. Don't use it to override your own editorial judgment.
How the score interacts with retraining
Voice match scores tend to decline gradually over months even if your writing stays consistent. The reason: your training profile is a snapshot from when you last trained, and your most recent posts (which aren't in the profile yet) may be drifting in a new direction. For the structural failure mode, where the drift isn't benign evolution but the audience-optimization, templating-creep, identity-inflation pattern that erodes voice past the 10K-follower mark, see our post on voice drift: why most creators lose their edge after 10K followers.
When that happens, retrain. Auden re-ingests your latest 100 to 200 posts and rebuilds the profile. The next batch of drafts should bounce back into the 90+ range. We cover the cadence (when to retrain, how often, what changes) in our dedicated post on voice retraining.
The voice match score is the system's view of how close a draft sits to your training profile. 90+ is the benchmark most users hit on their first generation after training. Below 90 is editing territory. Below 80 is regenerate or rewrite territory. The score is one input among several (your read, the topic, the platform). Use it to catch drafts that need work. Trust your editorial judgment on the close calls.
Want to see how the score behaves on your own writing? Try VoiceMoat free for 7 days; every draft Auden produces during the trial comes with the score attached. For product context, read what is Auden, or start with the 9 signals of voice for a brief primer on the dimensions the score is built on. The canonical deep reference is the 9 dimensions of Voice DNA: what actually makes writing recognizable, which gives each signal a definition, shows how it manifests in real creator writing, how AI tools fail on it, and how to audit it; that's the reference text for what each per-signal contribution to the aggregate score actually measures. One subtle drift source the score helps detect is drafting on different devices producing different registers: drafting on X across devices, voice-first covers the phone-vs-desktop pattern.