Every 👎 makes the next
answer better.
The signals you just saw ranked on the Trends page aren't just a dashboard. Every one of them feeds a closed loop that re-ranks retrieval, flags entries for review, and nudges the humans who can fix them — without ever rewriting your knowledge base behind your back.
Each Trends tag maps to a specific fix.
When a user picks one of these four tags on a 👎 reply, here's the chain of events that runs within seconds.
User flags the answer as incorrect — maybe a number, policy, or step.
The tag is attached to the 👎 event. The cited KB chunks drop in quality score, and admins see this exact conversation in the KB Review Queue.
Citation grounding · Quality-score reranker · Review Queue
Answer was technically correct but didn't solve the need.
The tone picker gets a signal. Retrieval weight on the top-cited chunk drops by ~0.1, so the next question on the same topic surfaces a richer chunk.
Tone picker · Quality-score reranker
Router sent a legal question to the finance agent, or vice versa.
Domain packs re-check the classifier. The misrouted intent enters the Training Queue for the next pack update.
Intent classifier · Domain packs · Training Queue
Something else was wrong — the escape hatch tag.
Entry flagged for admin review. The original author gets nudged (rate-limited to one nudge per entry every 30 days).
Review Queue · Author nudges · Snooze
Attribution → signal → score → action → proof.
Six phases that run on their own. No dashboards to check, no prompts to tune. The first place you'll see the improvement is retrieval quality.
Every answer gets a fingerprint
We record exactly which KB chunks the retriever surfaced and which ones made it into the final reply. Every future signal knows what it's attached to.
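As a rough sketch of what such a fingerprint might hold, assuming a simple record with two lists of chunk IDs. Every name here (`AnswerFingerprint`, `chunks_for_signal`, the field names) is hypothetical and not the product's actual schema:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnswerFingerprint:
    """Hypothetical record linking one reply to the KB chunks behind it."""
    answer_id: str
    retrieved_chunk_ids: tuple[str, ...]  # everything the retriever surfaced
    cited_chunk_ids: tuple[str, ...]      # the subset used in the final reply

    def __post_init__(self) -> None:
        # Sanity check: a cited chunk must have been retrieved first.
        missing = set(self.cited_chunk_ids) - set(self.retrieved_chunk_ids)
        if missing:
            raise ValueError(f"cited but never retrieved: {sorted(missing)}")


def chunks_for_signal(fp: AnswerFingerprint) -> tuple[str, ...]:
    """Chunks a later signal (for example a thumbs-down) is attributed to.

    Assumption: signals attach to cited chunks only, not to everything retrieved.
    """
    return fp.cited_chunk_ids
```

Keeping the retrieved and cited lists separate is what lets a later 👎 land on the chunks that actually shaped the reply.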
We listen without asking
Explicit 👎 clicks are the tip of the iceberg. We also detect re-asks, escalations, abandonment, and copies — all implicit signals, zero extra UI friction.
Nightly, every chunk gets a grade
A decayed 30-day rollup (14-day half-life) produces a quality score ∈ [0.5, 1.5] per chunk. A 20-signal noise floor keeps single angry users from poisoning the well.
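A minimal sketch of how a grade like this could be computed. The 14-day half-life, 30-day window, 20-signal floor, and [0.5, 1.5] range come from the description above; the linear mapping from satisfaction to score, the neutral value of 1.0 below the floor, and all function names are assumptions made for illustration:

```python
HALF_LIFE_DAYS = 14.0  # from the text
WINDOW_DAYS = 30.0     # from the text
NOISE_FLOOR = 20.0     # from the text; assumed to apply to the decayed total
NEUTRAL_SCORE = 1.0    # assumption: midpoint of [0.5, 1.5], i.e. no effect


def decay_weight(age_days: float) -> float:
    """Exponential decay: a signal loses half its weight every 14 days."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def quality_score(signals: list[tuple[float, bool]]) -> float:
    """Grade one chunk from (age_days, is_positive) signal pairs."""
    # Keep only signals inside the 30-day window.
    recent = [(age, pos) for age, pos in signals if 0 <= age <= WINDOW_DAYS]
    total = sum(decay_weight(age) for age, _ in recent)
    if total < NOISE_FLOOR:
        # Too little evidence: leave the chunk's ranking untouched.
        return NEUTRAL_SCORE
    positive = sum(decay_weight(age) for age, pos in recent if pos)
    satisfaction = positive / total  # in [0, 1]
    # Assumed linear map from satisfaction [0, 1] onto [0.5, 1.5].
    return 0.5 + satisfaction
```

Under these assumptions, thirty fresh 👎 signals pull a chunk to 0.5, while the same thirty signals at 14 days old decay to a total weight of 15 and stay below the floor.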
Admins triage what matters
A dedicated Review Queue surfaces entries with <40% satisfaction, zero retrievals in 60 days, duplicate conflicts, empty keywords, or staleness — with inline "last 3 bad conversations".
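The triage rules above could be expressed as a small rule check, sketched here under assumptions. The 40% and 60-day thresholds come from the text; the staleness cutoff (`STALE_AFTER_DAYS`), the `EntryStats` fields, and the reason labels are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional

STALE_AFTER_DAYS = 180  # assumption: the text gives no staleness threshold


@dataclass
class EntryStats:
    """Hypothetical per-entry stats feeding the Review Queue."""
    satisfaction: Optional[float]  # 0.0 to 1.0, None if no feedback yet
    retrievals_last_60d: int
    has_duplicate_conflict: bool
    keywords: list[str]
    days_since_update: int


def review_reasons(entry: EntryStats) -> list[str]:
    """Return every reason this entry belongs in the queue (empty if none)."""
    reasons = []
    if entry.satisfaction is not None and entry.satisfaction < 0.40:
        reasons.append("low_satisfaction")
    if entry.retrievals_last_60d == 0:
        reasons.append("never_retrieved")
    if entry.has_duplicate_conflict:
        reasons.append("duplicate_conflict")
    if not entry.keywords:
        reasons.append("empty_keywords")
    if entry.days_since_update > STALE_AFTER_DAYS:
        reasons.append("stale")
    return reasons
```

Returning a list of reasons, rather than a single flag, would let the queue show why an entry surfaced next to its recent bad conversations.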
Humans nudge humans
When an entry needs its original author's attention, admins send a nudge. Nudges are rate-limited so no one gets spammed. Perfect for quarterly refresh cycles.
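One way to picture the rate limit is a per-entry cooldown, using the one-nudge-per-entry-every-30-days figure quoted earlier on this page. This is a sketch, not the shipped implementation; `NudgeLimiter` and its in-memory dictionary are hypothetical:

```python
from datetime import datetime, timedelta

NUDGE_COOLDOWN = timedelta(days=30)  # one nudge per entry every 30 days


class NudgeLimiter:
    """Hypothetical in-memory limiter keyed by KB entry ID."""

    def __init__(self) -> None:
        self._last_nudge: dict[str, datetime] = {}

    def try_nudge(self, entry_id: str, now: datetime) -> bool:
        """Return True and record the nudge if the cooldown has elapsed."""
        last = self._last_nudge.get(entry_id)
        if last is not None and now - last < NUDGE_COOLDOWN:
            return False  # still inside the 30-day cooldown for this entry
        self._last_nudge[entry_id] = now
        return True
```

Because the cooldown is tracked per entry, an author with several flagged entries can still hear about each one, just not repeatedly about the same one.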
The whole thing is public
Every week we publish the top 5 customer confusions across the platform — anonymized, privacy-safe — as proof the loop actually closes.
Self-correcting, not self-destructing.
A learning system is only trustworthy if you can predict its failure modes. Four rails prevent the loop from damaging good content.
Never auto-edits your KB
The score only influences retrieval order. Your content stays exactly as you wrote it. Every change needs a human.
20-signal noise floor
The satisfaction band only moves the score once a chunk has 20+ decayed signals, so a burst from a few outliers can't swing it.
Admin-promoted floor
Pinned entries and Training-Queue-corrected content get a 1.1× floor — curated intent always beats noisy feedback.
Reversible per tenant
The feature flag `rag_quality_score` ships ON by default, but any tenant can toggle the multiplier off, for example while A/B testing.
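The rails above can be sketched together as a reranking step. The 1.1× floor, the [0.5, 1.5] range, and the `rag_quality_score` flag name come from the text; multiplying the score into a base similarity, plus the `Candidate` type and function names, are assumptions made for illustration:

```python
from dataclasses import dataclass

ADMIN_FLOOR = 1.1  # from the text: floor for pinned or corrected entries


@dataclass
class Candidate:
    """Hypothetical retrieval candidate."""
    chunk_id: str
    similarity: float      # base retrieval score (assumed higher is better)
    quality_score: float   # nightly grade, expected in [0.5, 1.5]
    admin_promoted: bool = False


def effective_multiplier(c: Candidate, flag_enabled: bool) -> float:
    """Multiplier applied to the base score; 1.0 when the flag is off."""
    if not flag_enabled:
        return 1.0
    m = min(1.5, max(0.5, c.quality_score))  # clamp to the documented range
    if c.admin_promoted:
        m = max(m, ADMIN_FLOOR)  # curated content never drops below 1.1
    return m


def rerank(candidates: list[Candidate], rag_quality_score: bool = True) -> list[Candidate]:
    """Reorder candidates only; the content itself is never modified."""
    return sorted(
        candidates,
        key=lambda c: c.similarity * effective_multiplier(c, rag_quality_score),
        reverse=True,
    )
```

In this sketch, turning the flag off restores plain similarity order, which is what makes the multiplier easy to compare in an A/B test.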
Before and after — same question, one week apart.
Your own Trends page.
In seven days.
Launch a Hanvitt chatbot this afternoon. A week of traffic later, you see exactly where your own customers are confused — ranked, anonymized, ready to fix.
No credit card · Cancel anytime · Trends populate after 7 days of traffic