Eureka-Audio: Triggering Audio Intelligence in Compact Language Models Paper • 2602.13954 • Published Feb 15 • 4
LongCat-Next: Lexicalizing Modalities as Discrete Tokens Paper • 2603.27538 • Published 7 days ago • 132
EpochX: Building the Infrastructure for an Emergent Agent Civilization Paper • 2603.27304 • Published 8 days ago • 46
On-Policy Self-Distillation for Reasoning Compression Paper • 2603.05433 • Published about 1 month ago • 7
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger Paper • 2602.08222 • Published Feb 9 • 287
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs Paper • 2603.09906 • Published 26 days ago • 75
Beyond Language Modeling: An Exploration of Multimodal Pretraining Paper • 2603.03276 • Published Mar 3 • 102
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published 10 days ago • 125
Parakeet Collection NeMo Parakeet ASR Models attain strong speech recognition accuracy while being efficient for inference. Available in CTC and RNN-Transducer variants. • 16 items • Updated 4 days ago • 62
Scaling Open Discrete Audio Foundation Models with Interleaved Semantic, Acoustic, and Text Tokens Paper • 2602.16687 • Published Feb 18 • 5
LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis Paper • 2505.02625 • Published May 5, 2025 • 23
AzeroS: Extending LLM to Speech with Self-Generated Instruction-Free Tuning Paper • 2601.06086 • Published Dec 31, 2025 • 1
TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling Paper • 2504.07053 • Published Apr 9, 2025 • 6