🇲🇲 Releasing the Myanmar Tuberculosis Instruction Dataset: a Myanmar-English parallel corpus for medical NLP in one of the lowest-resourced language settings in Southeast Asia.
Most TB datasets are either structured clinical data or English-only research corpora. This one fills a different gap: instructional, guideline-based content in Burmese, formatted for instruction tuning and medical QA.
### What's inside
- 2,043 instruction-response pairs
- Myanmar-English parallel
- 7 TB domains: treatment, diagnostics, drug management, MDR-TB, infection control, patient education, healthcare worker training
- Sourced from WHO guidelines, Myanmar NTP protocols, and standard medical references
- MIT licensed
### Useful for
- Fine-tuning Myanmar-language medical LLMs
- TB question answering
- Translation evaluation in a medical domain
- General low-resource medical NLP
```python
from datasets import load_dataset

ds = load_dataset("jojo-ai-mst/Myanmar-Tuberculosis-Guidelines-Instructions")
```

jojo-ai-mst/Myanmar-Tuberculosis-Guidelines-Instructions
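For instruction tuning, each pair usually has to be rendered into a single training string. The sketch below shows one way to do that. The column names `instruction` and `output` and the prompt template are assumptions for illustration, not taken from the dataset card, so check the actual schema before using it.

```python
from typing import Iterable, List, Mapping, Optional

# Hypothetical column names: check the dataset card for the real schema.
INSTRUCTION_KEY = "instruction"
RESPONSE_KEY = "output"

# Illustrative prompt template; any instruction-tuning format could be used instead.
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n{response}"


def format_records(
    records: Iterable[Mapping[str, Optional[str]]],
    instruction_key: str = INSTRUCTION_KEY,
    response_key: str = RESPONSE_KEY,
) -> List[str]:
    """Render instruction-response pairs as training strings.

    Rows where either field is missing or empty are skipped.
    """
    texts = []
    for record in records:
        instruction = (record.get(instruction_key) or "").strip()
        response = (record.get(response_key) or "").strip()
        if instruction and response:
            texts.append(
                PROMPT_TEMPLATE.format(instruction=instruction, response=response)
            )
    return texts


if __name__ == "__main__":
    # Placeholder rows, not real dataset content.
    demo = [{"instruction": "Example question", "output": "Example answer"}]
    print(format_records(demo)[0])
```

A split loaded with `load_dataset` iterates as dictionaries, so something like `format_records(ds["train"])` should work, assuming a `train` split exists. If the columns are named differently, pass `instruction_key` and `response_key` explicitly.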
Built by Min Si Thu and Khin Myat Noe. Feedback welcome, especially from anyone working on SEA medical AI or Burmese NLP.
#MedicalAI #LowResourceNLP #Myanmar #Burmese #Tuberculosis #InstructionTuning