An integrated NLP-based framework for:
- Therapeutic class classification
- Medicine similarity retrieval
- Medicine information summarization

The framework is built on the Medicines Information Dataset (MID), which contains over 190,000 medicines across 44 therapeutic classes.

Methods:
- TF-IDF + Logistic Regression
- TF-IDF + LinearSVC
- Cosine similarity
- BART-based abstractive summarization
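
To make the retrieval step concrete, here is a minimal sketch of TF-IDF weighting followed by cosine-similarity ranking, written with the Python standard library only. It assumes that cosine similarity is computed over TF-IDF vectors, which the list above suggests but does not state. The record names (`med_a`, `med_b`, `med_c`), their texts, and the helper functions are hypothetical and are not taken from MID. In practice a library such as scikit-learn would typically handle the vectorization.

```python
import math
from collections import Counter

# Hypothetical toy records; the real MID fields and texts are not shown here.
DOCS = {
    "med_a": "relieves pain and reduces fever",
    "med_b": "reduces fever and relieves mild pain",
    "med_c": "treats bacterial infection of the skin",
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def build_tfidf(docs):
    """Return {name: {term: weight}} using term count times smoothed IDF."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for toks in tokens.values() for term in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return {
        name: {t: count * idf[t] for t, count in Counter(toks).items()}
        for name, toks in tokens.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query, vectors, top_k=2):
    """Rank every other record by cosine similarity to the query record."""
    scores = [
        (name, cosine(vectors[query], vec))
        for name, vec in vectors.items()
        if name != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]


VECTORS = build_tfidf(DOCS)
RANKED = most_similar("med_a", VECTORS)
print(RANKED)
```

In this toy data `med_b` ranks first because it shares most of its terms with `med_a`, while `med_c` shares none and scores zero.
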

Results:
- Macro F1-score ≈ 0.99 for therapeutic classification
- High semantic consistency in similarity retrieval
- Clinically faithful medicine summaries
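
For readers unfamiliar with the macro F1-score reported above, the sketch below shows how it is computed: F1 is calculated per class and then averaged with equal weight, so rare therapeutic classes count as much as common ones. The class labels and predictions are hypothetical and do not come from the project's evaluation.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical labels for illustration only.
Y_TRUE = ["analgesic", "analgesic", "antibiotic", "antibiotic", "antacid"]
Y_PRED = ["analgesic", "antibiotic", "antibiotic", "antibiotic", "antacid"]

SCORE = macro_f1(Y_TRUE, Y_PRED)
print(round(SCORE, 4))
```

Here the per-class F1 scores are 2/3, 4/5, and 1, so the macro average is 37/45.
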

Applications:
- Clinical decision support
- Pharmaceutical research
- Public health informatics