Volko
Volko76
AI & ML interests
Quantization, Fine-tuning, Agentic Frameworks
Recent Activity
new activity 1 day ago
inferencerlabs/MiMo-V2.5-Pro-MLX-4.3bit-INF: "128gb seems suspiciously low"
new activity 3 days ago
XiaomiMiMo/MiMo-V2.5: "Thanks a lot for releasing it as open source under the MIT license!"
updated a model 19 days ago
Volko76/Qwen3.5-122B-A10B-Q3_K_S-MERGED
Qwen2.5 Coder Base GGUF
A list of Qwen2.5 Coder base models quantized in GGUF
GGUF Quantizations
A quantization format with both CPU and GPU support; currently the most widely used quantization method. Read more: https://github.com/ggerganov/llama.cpp
- Volko76/Qwen2.5-Coder-0.5B-Instruct-GGUF • Text Generation • 0.5B • Updated • 52
- Volko76/Qwen2.5-Coder-1.5B-Instruct-GGUF • Text Generation • 2B • Updated • 49
- Volko76/Qwen2.5-Coder-3B-Instruct-GGUF • Text Generation • 3B • Updated • 50
- Volko76/Qwen2.5-Coder-7B-Instruct-GGUF • Text Generation • 8B • Updated • 67
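As a sketch of how one of the GGUF quants above might be run locally with llama.cpp (the quant filename here is an assumption; check the repo's file list for the exact names):

```shell
# Download one quant file from the 0.5B instruct repo
# (filename is an assumption; see the repo's "Files" tab).
huggingface-cli download Volko76/Qwen2.5-Coder-0.5B-Instruct-GGUF \
  qwen2.5-coder-0.5b-instruct-q4_k_m.gguf --local-dir .

# Run it with llama.cpp's CLI. This runs on CPU by default;
# add -ngl <layers> to offload layers to the GPU.
llama-cli -m qwen2.5-coder-0.5b-instruct-q4_k_m.gguf \
  -p "Write a Python function that reverses a string." -n 128
```

The same two commands work for the 1.5B, 3B, and 7B repos; only the repo name and filename change.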
Qwen2.5 Coder Instruct GGUF
A list of Qwen2.5 Coder instruct models quantized in GGUF
- Volko76/Qwen2.5-Coder-0.5B-Instruct-GGUF • Text Generation • 0.5B • Updated • 52
- Volko76/Qwen2.5-Coder-1.5B-Instruct-GGUF • Text Generation • 2B • Updated • 49
- Volko76/Qwen2.5-Coder-3B-Instruct-GGUF • Text Generation • 3B • Updated • 50
- Volko76/Qwen2.5-Coder-7B-Instruct-GGUF • Text Generation • 8B • Updated • 67
OpenCoder GGUF
A fully open-source small coding model, quantized in GGUF
EXL2 Quantizations
A collection of models quantized for EXL2, one of the fastest quantization methods. https://github.com/turboderp/exllamav2
EXL3 Quantizations