Model weights for "Distilling to Hybrid Attention Models via KL-Guided Layer Selection" (https://arxiv.org/abs/2512.20569).
Yanhong Li
yanhong-li
AI & ML interests
None yet
Recent Activity
Updated a collection 2 days ago: Hybrid-Distillation
Organizations
None yet