Hyze Logo

HyzeMini (GGUF)

Lightweight GGUF builds of HyzeMini for fast local inference

πŸ”— hyzebot.vercel.app β€’ πŸ“˜ hyzedocs.vercel.app β€’ 🧠 hyzecode.vercel.app


πŸš€ Overview

HyzeMini (GGUF) provides quantized GGUF versions of the HyzeMini model, optimized for local execution using tools like llama.cpp, LM Studio, Ollama, and other GGUF-compatible runtimes.

This release keeps the base model's Space + General Chat focus while enabling:

  • ⚑ Faster inference
  • 🧠 Lower memory usage
  • πŸ’» CPU-friendly execution

🧠 Model Details

  • Base model: HyzeAI / HyzeMini
  • Parameters: ~0.1B
  • Architecture: Transformer (LLaMA-style)
  • Format: GGUF
  • Language: English
  • License: Apache-2.0

πŸ§ͺ Available Quantizations

(The exact files available may vary; check the repo's file list.)

Common GGUF variants include:

  • Q2_K – Ultra-low memory, fastest
  • Q4_K_M – Balanced quality & speed (recommended)
  • Q5_K_M – Higher quality, slightly slower
  • Q8_0 – Best quality, highest memory usage

πŸ’‘ If you’re unsure, start with Q4_K_M.


βš™οΈ Usage

llama.cpp

./llama-cli -m HyzeMini-Q4_K_M.gguf -p "Tell me a cool space fact:"

(On recent llama.cpp builds the CLI binary is llama-cli; older builds named it ./main.)
