papluca/xlm-roberta-base-language-detection Text Classification • 0.3B • Updated Dec 28, 2023 • 352k • 370
INSAIT-Institute/BgGPT-Gemma-2-2.6B-IT-v1.0 Text Generation • 3B • Updated Dec 4, 2024 • 412 • 7
Embedding Model Datasets Collection • A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 70 items • Updated Dec 10, 2025 • 160