The X algorithm in May 2026

The negative-signal economy: how one mute outweighs 50 likes on X

Most algorithm advice is additive: "do this, ship that." The math of the 2026 X ranker is subtractive. Net-negative posts get pushed through a different score-offset branch in weighted_scorer.rs where the combined score is rescaled by (combined + NEGATIVE_WEIGHTS_SUM) / WEIGHTS_SUM × NEGATIVE_SCORES_OFFSET. One predicted mute or not-interested by the model overwhelms dozens of predicted likes. This article traces every negative signal: the four explicit weight terms, the implicit not_dwelled_score, the four hard-kill filters that run upstream of any scoring, and the off-voice-drift to mute pipeline. We close with a creator-specific section: what content patterns trigger followers to mute, why voice drift is the single biggest negative-signal driver, and how to detect drift before publish.

May 21, 2026 · 13 min read · VoiceMoat team

Algorithm advice on X is almost entirely additive. "Do this, ship that, post at this time, fire this signal." Read the actual scoring code and the picture inverts. The single most score-moving event the 2026 X ranker handles is not a like, a retweet, or a reply. It is a mute. And the math that processes it routes through a different branch of the scoring function entirely. This article walks the negative-signal economy: the four explicit negative weights, the implicit not-dwelled score, the asymmetric offset_score branch, the four hard-kill filters that run upstream of any scoring, and the off-voice-drift pipeline that turns voice inconsistency into predicted mutes. Companion to A1 and A2, both of which establish the architecture and the head inventory this piece builds on.

The offset branch: a different scoring formula for net-negative posts

Recall the weighted scoring formula from home-mixer/scorers/weighted_scorer.rs:

combined = Σ_i (P(action_i) × WEIGHT_i)
final = offset_score(combined)

The first line is the standard linear combination. The second is the load-bearing bit creators rarely hear about. offset_score is not the identity function. When the combined score is positive, the value passes through largely unchanged. When the combined score is negative, the formula rescales it:

final = (combined + NEGATIVE_WEIGHTS_SUM) / WEIGHTS_SUM × NEGATIVE_SCORES_OFFSET

The intuition: a post predicted to net-fire more negatives than positives is not just placed at a lower position. It is routed to a different scoring regime entirely. The numerator adds the sum of negative weights (a large negative number) to the already-negative combined score, then the denominator divides by the sum of all weights to normalise, then a final offset multiplier scales the result into a separate range. Production values for these constants are redacted (the params module is excluded from the public release), but the structural property is clear: the same combined score, sitting in net-positive vs net-negative territory, is treated as living in two different scoring universes.

The single practical implication: a post that nets a handful of likes but triggers a non-trivial predicted-mute probability does not just score lower than an unambiguously good post. It scores in a different regime, behind a non-linear transformation. The asymmetry is the mechanical reason "reach loss" feels sudden rather than gradual to creators whose posts cross from net-positive into net-negative.
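The two-regime behaviour can be sketched in a few lines. This is a minimal illustration, not the repository's code. The three constants are redacted in the public release, so the values below are invented placeholders. Treating the net-positive branch as the identity is also an assumption, based on the "passes through largely unchanged" description above.

```python
# Hypothetical placeholder constants; production values are redacted.
NEGATIVE_WEIGHTS_SUM = -274.0   # assumed: sum of the negative weights
WEIGHTS_SUM = 400.0             # assumed: normalising sum of all weights
NEGATIVE_SCORES_OFFSET = 1.0    # assumed: final range multiplier


def offset_score(combined: float) -> float:
    """Two-regime rescaling of the combined weighted score."""
    if combined >= 0.0:
        # Assumption: net-positive scores pass through unchanged.
        return combined
    # Net-negative scores are shifted, normalised, and rescaled.
    return (combined + NEGATIVE_WEIGHTS_SUM) / WEIGHTS_SUM * NEGATIVE_SCORES_OFFSET


if __name__ == "__main__":
    for combined in (0.5, 0.01, -0.01, -0.5):
        print(f"{combined:+.2f} -> {offset_score(combined):+.4f}")
```

With these placeholder values, a combined score just above zero stays near zero, while one just below zero lands near -0.685. The output jumps at the boundary instead of sliding across it, which is the "sudden" quality described above.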

The four explicit negative heads, side by side

The four explicit negative heads in weighted_scorer.rs, with 2023 leaked weights as directional reference

Source: home-mixer/scorers/weighted_scorer.rs + 2023 twitter/the-algorithm-ml SrcEngagementWeights

| Head | UI gesture | 2023 leaked weight | 2026 status | Effect |
|---|---|---|---|---|
| Not interested (documented) | long-press, "Not interested in this post" | minus 74 (2023 ref) | redacted | soft de-weight via offset branch |
| Mute author (documented) | post menu, mute author | roughly minus 100 (2023 ref) | redacted | soft de-weight plus deterministic filter on next requests |
| Block author (documented) | post menu, block author | roughly minus 100 (2023 ref) | redacted | soft de-weight plus deterministic filter, bidirectional |
| Report (documented) | post menu, report for policy violation | not enumerated in leaks | redacted | soft de-weight plus policy escalation |

A note on the leaked 2023 magnitudes. The favorite weight in 2023 was 0.5. The mute weight was roughly minus 100. The ratio is 200 to 1. Translated into expected score: a Phoenix prediction of even 0.5 percent mute probability, applied to a weight of minus 100, contributes minus 0.5 on a single head. That contribution alone cancels the contribution of one full favorite at a hundred percent probability. A predicted mute probability of 5 percent, contributing minus 5, cancels the contribution of ten guaranteed favorites. None of these numbers carry over to 2026 unchanged, but the order-of-magnitude property (negatives outweigh positives per action by a factor in the dozens to hundreds) is structural to the formula's design and survives any specific weight adjustment.
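The arithmetic in that note can be reproduced directly. The sketch below uses only the two 2023 leaked weights quoted above, as a directional reference. The function names are illustrative and do not come from the repository.

```python
# 2023 leaked weights, used only as a directional reference.
FAVORITE_WEIGHT = 0.5
MUTE_WEIGHT = -100.0


def head_contribution(probability: float, weight: float) -> float:
    """Expected score contribution of one head: P(action) x WEIGHT."""
    return probability * weight


def favorites_cancelled(p_mute: float) -> float:
    """Guaranteed favorites (P = 1.0) needed to offset a predicted mute."""
    mute_cost = abs(head_contribution(p_mute, MUTE_WEIGHT))
    return mute_cost / head_contribution(1.0, FAVORITE_WEIGHT)


if __name__ == "__main__":
    print(f"weight ratio: {abs(MUTE_WEIGHT) / FAVORITE_WEIGHT:.0f} to 1")
    for p_mute in (0.005, 0.05):
        contribution = head_contribution(p_mute, MUTE_WEIGHT)
        print(f"P(mute)={p_mute:.3f}: contribution {contribution:+.2f}, "
              f"cancels {favorites_cancelled(p_mute):.0f} guaranteed favorite(s)")
```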

Not-dwelled: the negative signal nobody opted into

There is a fifth negative signal that does not appear as a WEIGHT constant in home-mixer/scorers/weighted_scorer.rs. It lives in the PhoenixScores proto referenced by home-mixer/scorers/vm_ranker.rs, a parallel scoring path that uses a remote value model. The field name is not_dwelled_score, and its presence in the proto but absence in the simpler weighted scorer suggests one of two things. Either home-mixer/scorers/vm_ranker.rs is an experimental path running alongside the canonical scorer with at least one extra signal, or it is in fact the production path, in which case the simpler weighted scorer in the public release omits a signal that production uses.

The mechanism of not_dwelled_score is the same as the discrete dwell head, inverted. A viewer scrolling past your post without pausing fires the binary "not dwelled" condition. Unlike the four explicit negatives, this one requires no menu interaction, no decision, and no friction from the viewer. Every impression that does not earn a dwell counts. For a creator, the practical implication is sobering: negative-signal volume is dominated not by mutes and reports, which are rare events, but by the continuous stream of viewers scrolling past without engaging. For most off-target candidates, the predicted-not-dwelled probability is the term that moves the score.

This is the silent kingmaker of the negative side of the ledger. A2 walked the positive dwell heads; the absence of dwell is the corresponding negative signal, and it operates at scale because every non-dwelled impression contributes.

The hard-kill filter layer

Around the scorer, four filter files in home-mixer/filters/ remove candidates deterministically. Three run pre-scoring, so the candidates they drop never reach Phoenix; the fourth, vf_filter.rs, runs post-selection. None of them is a probabilistic prediction. Each is a boolean condition: either the filter triggers and the candidate is dropped, or it does not.

The four post-selection and pre-scoring hard-kill filters

Source: home-mixer/filters/ + home-mixer/scorers/vm_ranker.rs context

| Filter | What it removes | Stage | Severity |
|---|---|---|---|
| muted_keyword_filter.rs (documented) | candidates whose tokens match the viewer's muted-keyword list | pre-scoring | hard kill |
| author_socialgraph_filter.rs (documented) | candidates from authors the viewer has blocked, muted, or otherwise excluded via social-graph relationships | pre-scoring | hard kill |
| vf_filter.rs (documented) | candidates flagged by visibility filtering: deleted posts, spam, violence, gore, policy violations | post-selection | hard kill |
| topic_ids_filter.rs (documented) | candidates whose Grox-classified topic IDs match the viewer's topic exclusions | pre-scoring | hard kill |

The split is meaningful. The probabilistic layer (the four negative heads in the weighted scorer) handles "the viewer will probably want to mute this if served." The deterministic layer (the four filters above) handles "the viewer has already muted this author, or this content class." Both layers feed the same "less of this" intent, but they operate at different stages and on different signals.
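The deterministic layer is easy to picture as plain boolean predicates. The sketch below is a hypothetical Python rendering, not a port of the Rust filters. The Candidate and ViewerPrefs types and their fields are invented for illustration, and the four checks are collapsed into one function for brevity even though the real filters run at different stages.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    author_id: str
    tokens: set[str]
    topic_ids: set[int] = field(default_factory=set)
    visibility_flagged: bool = False  # stand-in for a visibility-filter result


@dataclass
class ViewerPrefs:
    muted_keywords: set[str] = field(default_factory=set)
    excluded_authors: set[str] = field(default_factory=set)  # blocked or muted
    excluded_topics: set[int] = field(default_factory=set)


def passes_hard_filters(c: Candidate, v: ViewerPrefs) -> bool:
    """True only if no boolean filter condition triggers."""
    if c.tokens & v.muted_keywords:        # muted-keyword match
        return False
    if c.author_id in v.excluded_authors:  # social-graph exclusion
        return False
    if c.topic_ids & v.excluded_topics:    # topic exclusion
        return False
    if c.visibility_flagged:               # visibility filtering
        return False
    return True


def filter_candidates(candidates: list[Candidate], v: ViewerPrefs) -> list[Candidate]:
    """Drop every candidate that trips at least one hard filter."""
    return [c for c in candidates if passes_hard_filters(c, v)]
```

No probability appears anywhere in this layer. A candidate that trips any condition is dropped, however high its predicted engagement would have been.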

Two consequences for creators. First, once a viewer mutes your account, that viewer drops out of your impression base entirely. The deterministic filter triggers on every subsequent request, and your post is dropped before scoring. There is no recovery short of the viewer reversing the mute. Second, mutes are not isolated events. They contribute to Phoenix's training signal across the follower base: a pattern of mutes from a creator's audience raises the predicted-mute probability for similar viewers next time.

The voice-drift to mute pipeline

The data flow that connects voice drift to actual mutes runs through five steps, each observable in the repo or directly in the model behaviour.

Step one. A creator's audience formed around a specific voice. The follow decision encoded "I want more of posts like the ones I already saw from this account." Every follower's history sequence now contains several to many of the creator's prior posts paired with the actions that follower took (favorite, dwell, reply, profile click). Phoenix has learned a per-creator distribution across the engagement heads for that follower.

Step two. The creator ships a post that deviates from their historical distribution. The deviation can be tonal (suddenly more formal or more casual), structural (a thread when the audience expects single posts), topical (a topic outside the historical surface area), or stylistic (template-shaped engagement copy when the audience expects unedited voice).

Step three. Phoenix scores the candidate against the viewer's history sequence. The candidate-isolation mask (explained in A1) means the candidate cannot rely on other posts in the same batch for support. The score depends on the candidate's own representation and how well it matches the viewer's per-creator history pattern. An off-voice candidate matches the pattern less well, and the predicted-positive probabilities (favorite, dwell, reply, profile click) shift downward.

Step four. Simultaneously, the predicted-negative probabilities shift upward. The model has implicitly learned that off-distribution posts from creators are the ones followers historically reacted to with mute or "not interested." Suppose the predicted-mute probability comes out at 1.5 percent where an on-voice post would have drawn 0.2 percent (illustrative figures). Multiplied by the large negative weight, that difference can be enough to shift the combined score across the positive-to-negative boundary and into the offset_score rescaling branch.

Step five. The post serves to a smaller fraction of the audience. Of that smaller fraction, a few actually mute. Those mutes feed the training signal for the next pass. The creator's follower base now contains a slightly stronger prior for off-voice content triggering mute, which raises predicted-mute probability for similar future posts. Repeated drift accelerates the effect.

The pipeline is a compounding mechanism, not a single-event one. The voice-fidelity question is not "did this one post survive?" but "what did this post do to the predicted-mute prior across my follower base for the next ninety days?"
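Step four's sign flip can be made concrete with a small worked example. It borrows the 2023 mute weight as a directional reference and the 0.2 and 1.5 percent mute probabilities from the step itself. The positive contributions (0.9 and 0.6) are invented for illustration and do not come from any telemetry.

```python
# Directional 2023 reference weight; the 2026 value is redacted.
MUTE_WEIGHT = -100.0


def combined_score(positive_contribution: float, p_mute: float) -> float:
    """Positive heads' total contribution plus the mute head's contribution."""
    return positive_contribution + p_mute * MUTE_WEIGHT


# Hypothetical per-viewer predictions for two drafts by the same creator.
on_voice = combined_score(positive_contribution=0.9, p_mute=0.002)
off_voice = combined_score(positive_contribution=0.6, p_mute=0.015)

if __name__ == "__main__":
    print(f"on-voice:  {on_voice:+.2f}")
    print(f"off-voice: {off_voice:+.2f}")
```

Under these assumptions the on-voice draft stays net-positive at roughly +0.7, while the off-voice draft lands at roughly -0.9. Most of the swing comes from the mute term, which moves from -0.2 to -1.5 while the positive side loses only 0.3.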

Voice drift score vs predicted mute rate

The relationship between voice-distance from a creator's historical distribution and predicted-mute rate is non-linear. Drift inside the historical envelope has near-zero impact. Drift outside it ramps fast. The chart below sketches the shape with illustrative numbers; production telemetry is not available.

Voice-distance bucket vs predicted negative-signal contribution

Source: illustrative simulation, weights and counts notional

[Figure: bar chart. Y-axis: relative head contribution (0 to 100). X-axis: voice-distance bucket (On-voice, Slight drift, Material drift, Strong drift). Series: predicted-positive contribution; predicted-negative contribution (negated for comparison).]

Two readings of this chart matter. First, the on-voice bucket is not risk-free; there is a baseline negative contribution that comes from viewers who simply did not engage with the post. Second, the crossover where the right-column bar overtakes the left-column bar happens before the post would feel obviously off to the creator. The model is sensitive enough that drift well short of "this reads like a different person wrote it" is already shifting the predicted-action mix.

The shadowban myth, gently retired

A short detour worth making. "Shadowban" is the most common word used for what creators experience when negative signals catch up with them. The word implies a hidden flag, set by a human or a moderator, that silently throttles a specific account. That model does not match what the public 2026 source shows.

What the source shows is two layers operating in parallel. One is the probabilistic scorer described above, which produces a score that depends on the predicted-action mix per viewer. The other is the deterministic filter layer, which removes candidates based on viewer-set preferences (muted keywords, blocked or muted authors, excluded topics) or platform-enforced classifications (visibility filtering from home-mixer/filters/vf_filter.rs for policy violations). Neither layer references a hidden per-account flag. The probabilistic layer responds to model predictions that update from audience reactions. The deterministic layer responds to user-set or policy-set rules. Both are auditable in the open repo.

The lived experience of "shadowban" is the rate-of-change in those two layers. A surge of mutes from your audience pushes predicted-mute probability up across viewer histories. A keyword that lands in many viewers' muted-keyword lists pulls posts containing that keyword out of their candidate pool. A wave of reports that survive vf_filter review raises the policy classification on the creator's content class. None of these are "shadowban" in the hidden-flag sense. All of them produce the experience that is labelled shadowban. The mechanism is in the source, not in a moderator's spreadsheet.

The practical implication: chasing rumours about hidden flags is a waste. The right operational question is "what is my predicted-action mix doing across my audience?" Voice-fidelity is the strongest direct lever on that mix. Topic surface and policy-class drift are the next two. Everything else is downstream.

What this means for tools that do not measure voice

Three categories of writing tools fail the negative-signal test by construction. Template-based generators (most engagement-optimised products) fit voice to a category-default template; the output reads fluent but off-distribution for any specific creator. Scheduler-shaped tools (volume-first) ship whatever the creator writes, with no fidelity gate. Generic-LLM writing assistants (general-purpose models prompted with light context) produce helpful-assistant register that the audience pattern-matches as not-the-creator within a few posts.

All three categories have positive use cases. None of them protects against the negative-signal economy described above. The protection requires a voice-fidelity check between draft and publish that specifically scores the draft against the creator's historical distribution. That score is mechanically related to predicted-mute probability through the pipeline above, even though no model is yet shipped that maps the two scores directly. A10 of this series compares the four major writing tools against six 2026-algorithm criteria, with the negative-signal exposure one of them.

What to do with this

Two operational moves come out of the negative-signal economy mechanically.

Use voice as the gate, not the goal. The voice-distance score on a draft does not tell you the draft is good. It tells you whether the draft sits inside or outside your historical distribution. Inside is the necessary condition. Whether the draft is sharp, insightful, or useful is a separate question that your editorial judgment owns. The score gates risk; it does not promise upside.
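One way to picture such a gate is as a distance check against the centroid of past posts. This is a minimal sketch under loud assumptions. The feature vectors stand in for whatever representation a real tool uses (embeddings, stylometric features). The envelope rule (mean plus k population standard deviations of historical distances) and the tolerance k are hypothetical choices, not anything taken from the X repository.

```python
import math
import statistics


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    return [statistics.fmean(column) for column in zip(*vectors)]


def voice_gate(history, draft, k=2.0):
    """Return (distance, inside_envelope) for a draft vector.

    The envelope is mean + k * pstdev of the historical posts' own
    distances to their centroid; k is a hypothetical tolerance.
    """
    centre = centroid(history)
    past = [math.dist(v, centre) for v in history]
    limit = statistics.fmean(past) + k * statistics.pstdev(past)
    d = math.dist(draft, centre)
    return d, d <= limit


if __name__ == "__main__":
    # Toy two-dimensional "voice" vectors, invented for illustration.
    history = [[1.0, 0.0], [0.9, 0.1], [1.1, -0.1], [1.0, 0.1]]
    for draft in ([1.05, 0.0], [0.0, 1.0]):
        d, inside = voice_gate(history, draft)
        print(f"draft {draft}: distance {d:.3f}, inside envelope: {inside}")
```

The gate returns a pass or fail on distribution membership only. It says nothing about quality, and that is the division of labour described above.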

Watch the velocity of mutes, not the absolute count. A creator with 50,000 followers expects some baseline mute rate per week. The signal that matters is the rate-of-change. If the weekly mute count is steady, the predicted-mute prior in Phoenix is steady. If the weekly mute count is climbing, the prior is climbing too, which means future posts will score lower by default. Public mute counts are not surfaced in X's own analytics; the proxy is reach loss on otherwise comparable content. A steady week-over-week reach drop on on-topic posts is the lagging indicator of the prior moving against you.
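The reach proxy can be tracked with a few lines of arithmetic. The sketch below assumes you already have a weekly reach figure for comparable on-topic posts, for example a median exported from your own analytics. The 5 percent threshold and the three-week window are hypothetical defaults, not values taken from X.

```python
def week_over_week_changes(weekly_reach):
    """Fractional change between consecutive weekly reach figures."""
    return [
        (curr - prev) / prev
        for prev, curr in zip(weekly_reach, weekly_reach[1:])
        if prev > 0  # skip weeks with no baseline to compare against
    ]


def sustained_drop(weekly_reach, threshold=-0.05, weeks=3):
    """True if each of the last `weeks` changes is at or below `threshold`."""
    recent = week_over_week_changes(weekly_reach)[-weeks:]
    return len(recent) == weeks and all(change <= threshold for change in recent)


if __name__ == "__main__":
    steady = [1000, 1010, 990, 1005, 1000]
    sliding = [1000, 990, 900, 820, 740]
    print("steady series flagged: ", sustained_drop(steady))
    print("sliding series flagged:", sustained_drop(sliding))
```

Requiring several consecutive drops, rather than reacting to one bad week, matches the point above: the signal is the rate of change, not a single reading.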

The negative-signal economy is the structural reason "voice as a moat" is not a marketing line. It is mechanically the only positive lever a creator has against a non-linear penalty that scales with audience size. The next article in this series, A8 on voice as an embedding, walks the embedding-level mechanism. The piece after that, on dwell, returns to the positive side of the ledger with the four dwell heads that on-voice content fires most reliably.

AI disclosure

This article was drafted with AI assistance and human-edited by the VoiceMoat team. All technical claims are sourced to the xai-org/x-algorithm repository; file paths are cited inline.