RareSeek-R1: A specialized language model for rare disease diagnosis and reasoning

RareSeek-R1 is a domain-specialized large language model for rare-disease diagnostic reasoning, developed through a Progressive Parameter-Efficient Transfer Learning framework. The model is first instruction-tuned on RareMed-Corpus, a large, clinically grounded, multi-source dataset that integrates medical textbooks, guidelines, biomedical literature, and real-world EHR narratives. It is then fine-tuned on RareMed-CoT, a high-fidelity corpus designed to instill explicit, stepwise clinical reasoning aligned with real diagnostic workflows. To further improve factual reliability, GraphRAG is incorporated to anchor the model’s inference to up-to-date variant–gene–phenotype–disease relationships. This retrieval augmentation substantially reduces hallucinations, improves factual calibration, and yields notable performance gains, particularly when EHR narratives are combined with prioritized genetic variants. Taken together, RareSeek-R1 reasons directly over full-length EHRs, leverages graph-grounded retrieval, and demonstrably augments clinician-level diagnostic accuracy, advancing a reliable and scalable AI paradigm for rare-disease diagnosis.
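The graph-grounded retrieval step described above can be illustrated with a minimal sketch. This is not the RareSeek-R1 implementation: the toy graph, the relation names, and the functions `build_index`, `retrieve`, `rank_diseases`, and `build_prompt` are all hypothetical, chosen only to show how facts linked to a patient's variants and phenotypes might be collected and placed in front of the EHR narrative before the model reasons over it.

```python
from collections import defaultdict

# Toy knowledge graph as (subject, relation, object) triples.
# ASSUMPTION: illustrative entries only, not taken from the model's actual graph.
TRIPLES = [
    ("CFTR:p.Phe508del", "variant_in_gene", "CFTR"),
    ("CFTR", "gene_causes", "Cystic fibrosis"),
    ("Bronchiectasis", "phenotype_of", "Cystic fibrosis"),
    ("Pancreatic insufficiency", "phenotype_of", "Cystic fibrosis"),
    ("FBN1", "gene_causes", "Marfan syndrome"),
    ("Ectopia lentis", "phenotype_of", "Marfan syndrome"),
    ("Aortic root dilatation", "phenotype_of", "Marfan syndrome"),
]

# Relations whose object is a disease (hypothetical naming).
DISEASE_RELATIONS = {"gene_causes", "phenotype_of"}


def build_index(triples):
    """Index triples by subject so outgoing edges can be looked up quickly."""
    index = defaultdict(list)
    for triple in triples:
        index[triple[0]].append(triple)
    return index


def retrieve(entities, index, hops=2):
    """Collect triples reachable within `hops` steps of the query entities."""
    facts, seen = [], set()
    frontier = list(entities)
    for _ in range(hops):
        next_frontier = []
        for entity in frontier:
            for triple in index.get(entity, []):
                if triple not in seen:
                    seen.add(triple)
                    facts.append(triple)
                    next_frontier.append(triple[2])
        frontier = next_frontier
    return facts


def rank_diseases(facts):
    """Score each disease by the number of retrieved edges pointing at it."""
    scores = defaultdict(int)
    for _subject, relation, obj in facts:
        if relation in DISEASE_RELATIONS:
            scores[obj] += 1
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def build_prompt(ehr_text, facts):
    """Prepend retrieved facts to the EHR narrative as grounding context."""
    lines = [f"- {s} {r} {o}" for s, r, o in facts]
    return "Known relationships:\n" + "\n".join(lines) + "\n\nEHR:\n" + ehr_text


if __name__ == "__main__":
    index = build_index(TRIPLES)
    # Entities that an upstream extraction step might pull from the record.
    facts = retrieve(["CFTR:p.Phe508del", "Bronchiectasis"], index)
    print(rank_diseases(facts))
    print(build_prompt("Recurrent chest infections since infancy.", facts))
```

In this sketch the variant contributes evidence only through its gene, which is why retrieval follows two hops; a production system would also need entity normalization and a far larger, curated graph.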

RareMedData: https://huggingface.co/datasets/TaoMedAI/RareMedData

Model size: 71B params (Safetensors)

Tensor type: BF16
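As a rough guide to what these figures imply for hosting, the weight footprint can be estimated from the parameter count and the two bytes per value that BF16 uses. The helper below is a hypothetical sketch, not part of the model's tooling; it covers weights only and ignores the KV cache, activations, and framework overhead, and it treats the rounded 71B figure as exact.

```python
# Bytes per parameter for common tensor types.
BYTES_PER_PARAM = {"bf16": 2, "fp16": 2, "fp32": 4}


def weight_footprint_gb(num_params, dtype="bf16"):
    """Return the approximate size of the weights in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # 71B parameters in BF16: 71e9 * 2 bytes = 142 GB of weights.
    print(weight_footprint_gb(71_000_000_000, "bf16"))
```

At roughly 142 GB for the weights alone, the model would not fit on a single 80 GB accelerator without quantization or sharding across devices.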