---
license: apache-2.0
language:
- en
tags:
- code
- assistant
- ollama
- openai-compatible
- streaming
- voice
pipeline_tag: text-generation
inference: false
---

# GeniusPro Coder v1

**GeniusPro Coder v1** is a coding-focused AI assistant model built for intelligent code generation, code explanation, and general-purpose AI assistance.
## Highlights

- Code generation, debugging, and explanation across multiple languages
- Natural conversational ability for non-code tasks
- OpenAI-compatible API (drop-in replacement for existing tooling)
- Streaming support for real-time token delivery
- Voice mode with concise, spoken-friendly responses
- Runs locally on consumer hardware via [Ollama](https://ollama.com)
|
## Intended Use

GeniusPro Coder v1 is designed for:

- **Code assistance** — generating, reviewing, debugging, and explaining code
- **Chat** — general-purpose question answering and conversation
- **Voice interaction** — concise, natural-language responses optimized for text-to-speech

It powers the GeniusPro platform, which includes a web-based chat dashboard and a real-time voice assistant.
## OpenAI-Compatible API

### Supported Parameters

| Parameter | Description |
|-----------|-------------|
| `temperature` | Controls randomness (0.0 = deterministic, 1.0 = creative) |
| `top_p` | Nucleus sampling threshold |
| `max_tokens` | Maximum number of tokens to generate |
| `stop` | Stop sequences that end generation early |
| `stream` | Enable streaming responses via server-sent events (SSE) |
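The parameters above map onto an OpenAI-style chat request body. A minimal sketch of such a body (the parameter values and prompt here are illustrative, not recommended defaults):

```python
# Illustrative OpenAI-style request body using the supported parameters.
payload = {
    "model": "geniuspro-coder-v1",
    "messages": [
        {"role": "user", "content": "Explain Python list comprehensions."}
    ],
    "temperature": 0.2,   # low randomness suits code answers
    "top_p": 0.9,         # nucleus sampling threshold
    "max_tokens": 512,    # cap on generated tokens
    "stop": ["\n\n\n"],   # optional stop sequences
    "stream": False,      # set True for SSE streaming
}
```

Any parameter can be omitted to fall back to the server's defaults.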
|
### Available Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/v1/models` | GET | List available models |
| `/v1/chat/completions` | POST | Chat completions (streaming and non-streaming) |
| `/v1/voice` | WebSocket | Real-time voice interaction |
| `/health` | GET | Health check (no auth required) |
|
## Running Locally with Ollama

```bash
# Pull the model
ollama pull geniuspro-coder-v1

# Run interactively
ollama run geniuspro-coder-v1

# Serve via API
ollama serve
```

Once running, the model is available at `http://localhost:11434` with the same OpenAI-compatible API format.
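As a sketch, a request against the local Ollama server can be built with only the standard library (this assumes Ollama's OpenAI-compatible endpoint at the address above; uncomment the final line with the server running):

```python
import json
import urllib.request

# Build a chat completion request against the local Ollama server.
url = "http://localhost:11434/v1/chat/completions"
payload = {
    "model": "geniuspro-coder-v1",
    "messages": [{"role": "user", "content": "Write a hello-world in Python."}],
}
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# response = json.load(urllib.request.urlopen(req))  # requires a running server
```

Because the API format matches OpenAI's, existing OpenAI client libraries can also be pointed at the same base URL.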
|
## Infrastructure

GeniusPro Coder v1 runs on dedicated hardware for low-latency inference:

- **GPU**: NVIDIA RTX 5090 (32 GB VRAM)
- **Runtime**: [Ollama](https://ollama.com) for model serving
- **Gateway**: FastAPI reverse proxy with auth, rate limiting, and usage tracking
- **Deployment**: Ubuntu Server behind Nginx + Cloudflare Tunnel
|
## Limitations

- Optimized for English. Other languages may work but are not officially supported.
- Code generation quality varies by language — strongest in Python, JavaScript/TypeScript, and common web technologies.
- Not suitable for safety-critical applications without human review.
- Context window and output length are bounded by the underlying architecture.
|
## License

This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).