HuggingFaceFW/fineweb-edu
This is the final HypeNet-5B checkpoint from the paper "Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts". It was distilled from Qwen3-4B using the HALO pipeline proposed in that paper. For more information, please refer to our GitHub repo.