An open-source reproduction of Baichuan-M3's medical fact verification system, built by MedARC.
Under active development.
Following Baichuan-M3, the task is split into three models across four steps:
- Claim Decomposer breaks input text (answers, documents, reasoning traces, model outputs) into individual medical claims.
- Fact Verifier compares each claim against a database of previously fact-checked claims ("Claim X is supported by evidence set Y [under scope Z] as of date T").
- For a new claim, a Search Agent is dispatched to find supporting or contradictory evidence from a curated medical corpus.
- Results return to the Fact Verifier, which scores the claim on a five-level scale (strongly supported, weakly supported, unclear, weakly unsubstantiated, strongly unsubstantiated) and writes a new entry to the fact database.
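The four steps above can be sketched as a small control loop. This is a minimal illustration, not the project's actual API: `Verdict`, `FactEntry`, `verify_text`, and the `toy_*` stand-ins are hypothetical names, the three models are passed in as plain callables, and the fact database is assumed to be a dict keyed by exact claim text (a real lookup would presumably match claims more loosely).

```python
from dataclasses import dataclass
from datetime import date
from enum import IntEnum
from typing import Callable, Optional


class Verdict(IntEnum):
    """The five-level scale, ordered from least to most supported."""
    STRONGLY_UNSUBSTANTIATED = 0
    WEAKLY_UNSUBSTANTIATED = 1
    UNCLEAR = 2
    WEAKLY_SUPPORTED = 3
    STRONGLY_SUPPORTED = 4


@dataclass(frozen=True)
class FactEntry:
    """One record: claim X, evidence set Y, optional scope Z, date T."""
    claim: str
    evidence: tuple[str, ...]
    verdict: Verdict
    as_of: date
    scope: Optional[str] = None


def verify_text(
    text: str,
    decompose: Callable[[str], list[str]],
    search: Callable[[str], list[str]],
    score: Callable[[str, tuple[str, ...]], Verdict],
    fact_db: dict[str, FactEntry],
    today: date,
) -> list[FactEntry]:
    """Run the four steps over `text`, reusing and updating `fact_db`."""
    results = []
    for claim in decompose(text):              # step 1: split into claims
        entry = fact_db.get(claim)             # step 2: check known facts
        if entry is None:
            evidence = tuple(search(claim))    # step 3: gather evidence
            verdict = score(claim, evidence)   # step 4: score the claim
            entry = FactEntry(claim, evidence, verdict, today)
            fact_db[claim] = entry             # step 4: write the new entry
        results.append(entry)
    return results


# Toy stand-ins for the three models (assumptions, for illustration only).
def toy_decompose(text: str) -> list[str]:
    return [part.strip() for part in text.split(".") if part.strip()]


def toy_search(claim: str) -> list[str]:
    return ["doc-1"] if "first" in claim.lower() else []


def toy_score(claim: str, evidence: tuple[str, ...]) -> Verdict:
    return Verdict.WEAKLY_SUPPORTED if evidence else Verdict.UNCLEAR


if __name__ == "__main__":
    db: dict[str, FactEntry] = {}
    entries = verify_text(
        "First claim. Second claim.",
        toy_decompose, toy_search, toy_score, db, date(2025, 1, 1),
    )
    for e in entries:
        print(e.claim, "->", e.verdict.name)
```

Passing the models in as callables keeps the loop independent of any particular model backend, and a second call with the same `fact_db` skips the search step for claims that were already scored.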
| Component | Role | Workspace |
|---|---|---|
| baseline | End-to-end pipeline on off-the-shelf LLMs (no training) | Independent |
| decomposer | Long-form text → atomic medical claims | Yes |
| verifier | Claim + evidence → five-level supported↔unsubstantiated score | Yes |
| search | Claim → supporting / contradictory sources | Yes |
| datasets | Dataset ingestion, construction, and synthetic data | Yes |
| utils | Shared helpers used across AMFV packages | Yes |
| training | Training experiments and recipes for the above | Independent |
The Workspace column marks membership in the root uv workspace. Independent packages (baseline, training) are excluded so they can evolve on their own.