tech4humans/Qwen2.5-3B-Instruct-lora-r16-alpha16-dp0.2-rslora • Text Generation • Updated Apr 25, 2025 • 5
MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Merged-Lora-16bit • Text Generation • 8B • Updated Jun 18, 2025 • 6
MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora • Text Generation • 8B • Updated Jun 18, 2025 • 10
pandaiedu/pandai-unsloth-Qwen2.5-3B-Instruct-merged-sejarah-1-epoch-iter-1 • Text Generation • Updated May 14, 2025 • 4
pandaiedu/pandai-unsloth-fpf-Qwen2.5-3B-Instruct-sejarah-2-epoch • Text Generation • 3B • Updated May 14, 2025 • 3
erdem-erdem/Qwen2.5-3B-Instruct-countdown-new-grpo-r32 • Text Generation • 3B • Updated Jun 16, 2025 • 3
erdem-erdem/Qwen2.5-3B-Instruct-countdown-ps-grpo-r32 • Text Generation • 3B • Updated Jun 16, 2025 • 5
erdem-erdem/Qwen2.5-3B-Instruct-countdown-game-8k-qwq-r64 • Text Generation • 3B • Updated Jun 16, 2025 • 8