AI & ML interests

Local LLMs

Recent Activity

prithivMLmods
posted an update about 5 hours ago
The Qwen3.5 Multimodal Understanding Demo, powered by Qwen3.5-2B, is now available on HF Spaces! It is a lightweight model designed for fast image and video reasoning. Built with Gradio, the demo showcases Image QA, Video QA, object detection, and 2D point tracking, along with real-time token streaming.

🤗 Demo: prithivMLmods/Qwen-3.5-HF-Demo
✅ Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations
🔗 Qwen3.5-2B: Qwen/Qwen3.5-2B

To learn more, visit the app page or the respective model pages.
Ujjwal-Tyagi
posted an update about 9 hours ago
We now have LTX 2.3, with better visual quality and richer sound. Check it out! Lightricks/LTX-2.3
OzTianlu
posted an update 2 days ago
We deleted the embedding layer: introducing our Collins-Embedding-3M
NoesisLab/Collins-Embedding-3M
Most "small" models are just giant vocab tables in a trench coat. Collins-3M changes that. By using 2-Universal Hashing and Chernoff-bound noise suppression, we've collapsed the embedding space into a fixed O(1) hash-map.
* STSB: 0.7114 (Beating many 100M+ models)
* Size: 3M (Edge-ready, IoT-ready)
* Tech: Randomized Sign-Hashing + RoPE positional injection.
Built by NoesisLab
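The post does not publish the Collins code, so the following is only a minimal sketch of the general idea of replacing a vocabulary table with sign hashing. It assumes a textbook universal hash family h(x) = ((a*x + b) mod p) mod m; the names `make_hash` and `hashed_embedding`, the dimension, and the number of hash functions are hypothetical, and the Chernoff-bound noise suppression and RoPE steps mentioned above are not modeled.

```python
import math
import random

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1; token ids are assumed to be smaller


def make_hash(rng, buckets):
    """Draw one function from the family h(x) = ((a*x + b) mod p) mod m."""
    a = rng.randrange(1, PRIME)
    b = rng.randrange(0, PRIME)
    return lambda x: ((a * x + b) % PRIME) % buckets


def hashed_embedding(token_ids, dim=64, num_hashes=4, seed=0):
    """Embed a token sequence without a per-token table (illustrative only).

    Each token adds +1 or -1 to a few hashed coordinates, so the only
    stored parameters are the hash coefficients, not a vocab-sized matrix.
    """
    rng = random.Random(seed)
    index_hashes = [make_hash(rng, dim) for _ in range(num_hashes)]
    sign_hashes = [make_hash(rng, 2) for _ in range(num_hashes)]
    vec = [0.0] * dim
    for tok in token_ids:
        for h, s in zip(index_hashes, sign_hashes):
            vec[h(tok)] += 1.0 if s(tok) == 1 else -1.0
    # L2-normalise so that dot products behave like cosine similarity.
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec
```

Calling `hashed_embedding([101, 2023, 102])` returns a 64-dimensional list; the same ids and seed always give the same vector, which is what lets the table be dropped.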
MaziyarPanahi
posted an update 4 days ago
DNA, mRNA, proteins, AI. I spent the last year going deep into computational biology as an ML engineer. This is Part I of what I found. 🧬

In 2024, the Nobel Prize in Chemistry recognized the work behind AlphaFold.

By 2026, the open-source community had built alternatives that outperform it.

That's the story I find most interesting about protein AI right now. Not just the science (which is incredible), but the speed at which open-source caught up. Multiple teams, independently, reproduced and then exceeded AlphaFold 3's accuracy with permissive licenses. The field went from prediction to generation: we're not just modeling known proteins anymore, we're designing new ones.

I spent months mapping this landscape for ML engineers. What the architectures actually are (spoiler: transformers and diffusion models), which tools to use for what, and which ones you can actually ship commercially.

New post on the Hugging Face blog: https://huggingface.co/blog/MaziyarPanahi/protein-ai-landscape

Hope you all enjoy! 🤗
prithivMLmods
posted an update 4 days ago
QIE-Object-Remover-Bbox Demo removes objects and artifacts from selected regions using bounding box grounding. Built on Qwen-Image-Edit-2509 with Rapid Diffusers acceleration, it delivers fast 4-step inference via the QIE-2509 adapter. 🤗🔥

🔗 Demo Space: prithivMLmods/QIE-Object-Remover-Bbox
🔗 Qwen-Image-Edit-Rapid-AIO: prithivMLmods/Qwen-Image-Edit-Rapid-AIO-V4
🔗 Adapter (LoRA): prithivMLmods/QIE-2509-Object-Remover-Bbox

🔗 Collection: https://huggingface.co/collections/prithivMLmods/qwen-image-edit-layout-bbox

To learn more, visit the app page or the respective model pages.
OzTianlu
posted an update 6 days ago
🔥 UPGRADE in Kai: 30B Scaling! 🔥
NoesisLab/Kai-30B-Instruct
We are incredibly excited to announce that the Kai-30B-Instruct model and its official Space are now LIVE! 🚀
If you've been following the journey from Kai-0.35B to Kai-3B, you know we're rethinking how models reason. Tired of verbose, slow Chain-of-Thought (CoT) outputs that flood your screen with self-talk? So are we.
Kai-30B-Instruct scales up our Adaptive Dual-Search Distillation (ADS) framework. By bridging classical A* heuristic search with continuous gradient descent, we use an information-theoretic log-barrier to physically prune high-entropy reasoning paths during training.
The result? Pure implicit reasoning. The model executes structured logic, arithmetic carries, and branch selections as a reflex in a single forward pass, with no external scaffolding required.
At 3B, we observed a phase transition where the model achieved "logical crystallization". Now, at 30B, we are giving the ADS regularizer the massive representational capacity it needs to tackle higher-order symbolic abstractions and complex reasoning tasks.
🧪 Test Kai yourself in our new Space:
NoesisLab/Kai-30B-Instruct
📦 Model Weights:
NoesisLab/Kai-30B-Instruct
Bring your hardest math, logic, and coding benchmarks. We invite the community to stress-test the limits of the penalty wall! 🧱💥
OzTianlu
posted an update 9 days ago
Scaling UP in Kai! 🌊
NoesisLab/Kai-3B-Instruct

Introducing NoesisLab/Kai-3B-Instruct. What happens when you force a 3B model to reason entirely in its latent space?
Meet Kai-3B, our latest industrial-grade reasoning model fine-tuned using the Adaptive Dual Search (ADS) algorithm.
GSM8K (0-shot, Direct Answer): 39.27% 🤯 (Llama-2-7B is ~14.6%)
HumanEval (Pass@1): 39.02% 💻 (Overtakes Gemma-2-2B's 30%)
MMLU (5-shot): 53.62% 📚 (Crushing the 50% barrier)
ARC-Challenge: 51.88% 🎯
PIQA: 77.53%
HellaSwag: 69.53%
Kai-3B proves that reasoning density doesn't strictly require parameter bloat or verbose generation. It acts as a perfect, cold-blooded Agent action-engine, ideal for JSON routing, SWE-bench patch generation, and anywhere you need absolute structured certainty without token waste.
OzTianlu
posted an update 10 days ago
🛡️ Meet Spartacus-1B: Shattering the Memory Wall with True O(1) Inference! 🚀
NoesisLab/Spartacus-1B-Instruct
NoesisLab/ChatSpartacus
At NoesisLab, we've entirely ripped out Softmax Attention and replaced it with Causal Monoid State Compression.
Say hello to Spartacus-1B-Instruct (1.3B) 🗡️.
Instead of maintaining a massive, ever-growing list of past tokens, Spartacus compresses its entire causal history into a fixed-size state matrix per head. The result?
⚡ True O(1) Inference: Memory footprint and generation time per token remain absolutely constant, whether you are on token 10 or token 100,000.
🧠 Explicit Causality: We threw away RoPE and attention masks. The model learns when to forget using dynamic, content-aware vector decay.
🔥 Blazing Fast Training: Full hardware utilization via our custom Triton-accelerated JIT parallel prefix scan.
📊 Zero-Shot Benchmarks that Hit Hard:
O(1) architectures usually sacrifice zero-shot accuracy. Not Spartacus. It is punching way above its weight class, beating established sub-quadratic models (like Mamba-1.4B and RWKV-6-1.6B):
🏆 ARC-Challenge: 0.3063 (vs Mamba 0.284)
🏆 ARC-Easy: 0.5518
🏆 PIQA: 0.6915
prithivMLmods
posted an update 10 days ago
FireRed-Image-Edit-1.0 (Rapid) Fast Experimental Demo is Out! 🚀🤗

Demo: prithivMLmods/FireRed-Image-Edit-1.0-Fast

-> Paired the EditPlusPipeline with the Diffusers-compatible transformer weights of Rapid AIO from Qwen-Image-Edit. (experimental)
-> This fusion delivers more accurate instruction following, higher image quality, and consistent visual coherence @ 4-step fast inference.
-> Better maintains text styles with high fidelity, along with high-quality old photo restoration, enhancement, and best-in-class virtual try-on.

Ujjwal-Tyagi
posted an update 11 days ago
Public reports allege that Anthropic gobbled up trillions of tokens of copyrighted material and public data to build their castle. 🏰📄 Now that they're sitting on top, they're begging for special laws to protect their profits while pulling the ladder up behind them. 🪜🚫

But the hypocrisy meter just broke! 📉 They are accusing Chinese labs like DeepSeek, Minimax, and Kimi of "huge distillation attacks." The reality is that you can't just loot the entire internet's library, lock the door, and then sue everyone else for reading through the window. Stop trying to gatekeep the tech you didn't own in the first place. Read the complete article on it: https://huggingface.co/blog/Ujjwal-Tyagi/the-dark-underbelly-of-anthropic
guifre103
published a model 14 days ago
prithivMLmods
posted an update 15 days ago
OzTianlu
posted an update 18 days ago
O(1) inference is the foundational design of Spartacus-1B-Instruct 🛡️!

NoesisLab/Spartacus-1B-Instruct

We have successfully replaced the KV-cache bottleneck inherent in Softmax Attention with Causal Monoid State Compression. By defining the causal history as a monoid recurrence, the entire prefix is lossily compressed into a fixed-size state matrix per head.

The technical core of this architecture relies on the associativity of the monoid operator:

Training: parallel prefix scan using Triton-accelerated JIT kernels to compute all prefix states simultaneously.
Inference: True sequential updates. Memory and time complexity per token are decoupled from sequence length.
Explicit Causality: We discard RoPE and attention masks. Causality is a first-class citizen, explicitly modeled through learned, content-dependent decay gates.
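The role of associativity in the training and inference points above can be illustrated with a scalar toy. The operator formula is missing from the post, so this sketch assumes each token contributes an affine update s -> a*s + b (a decay `a` and an input `b`); composing such updates is associative, which is what allows a prefix scan to reproduce the sequential states. The function names are hypothetical, and this is plain Python rather than a Triton kernel.

```python
def combine(left, right):
    """Compose two affine updates s -> a*s + b, applying `left` first."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)


def sequential_states(elements):
    """Inference-style pass: fold left to right, one element at a time."""
    out, acc = [], (1.0, 0.0)  # (1, 0) is the identity element of `combine`
    for element in elements:
        acc = combine(acc, element)
        out.append(acc)
    return out


def scan_states(elements):
    """Training-style inclusive scan by recursive doubling.

    All combines within one round are independent of each other, so a
    GPU kernel could execute each round in parallel.
    """
    out = list(elements)
    shift = 1
    while shift < len(out):
        out = [
            combine(out[i - shift], out[i]) if i >= shift else out[i]
            for i in range(len(out))
        ]
        shift *= 2
    return out
```

Both functions return the same list of prefix compositions; the second component of each pair is the state reached from a zero initial state. The scan needs about log2(n) rounds instead of n sequential steps.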

Current zero-shot benchmarks demonstrate that Spartacus-1B-Instruct (1.3B) is already outperforming established sub-quadratic models like Mamba-1.4B and RWKV-6-1.6B on ARC-Challenge (0.3063). Recent integration of structured Chain-of-Thought (CoT) data has further pushed reasoning accuracy to 75%.

The "Spartacus" era is about scaling intelligence, not the memory wall ♾️.
prithivMLmods
posted an update 19 days ago
Ujjwal-Tyagi
posted an update 19 days ago
The Qwen 3.5 model is here! It supports a 1M context length by default, delivers strong performance, and is competitive with Claude Opus 4.6: Qwen/Qwen3.5-397B-A17B. Here is its GGUF: unsloth/Qwen3.5-397B-A17B-GGUF. Follow me and turn on notifications for the latest news!
Ujjwal-Tyagi
posted an update 23 days ago
GLM 5 is insane; it ranks #4 globally!
OzTianlu
posted an update 24 days ago
🚀 NanoHammer-1.5B-Instruct:
https://huggingface.co/NoesisLab/NanoHammer-1.5B-Instruct
We are excited to introduce NanoHammer, a novel architecture by NoesisLab designed for Causal State Compression and true Linear Inference Complexity.
🧠 The Core: Holographic State Space
Forget the growing KV Cache. NanoHammer leverages Holographic Rotary Embeddings to compress sequence history into a dynamic integral state.
Polynomial Compression: Instead of storing raw history, we "integrate" context into a complex number space, treating memory as a container of evolving polynomial coefficients.
Dynamic Evolution: The architecture features a custom StateUpdateCell that uses Euler method fixed-point iteration, allowing the model to perform implicit reasoning via differential state updates.
⚡ Why It Matters: Efficiency Meets Reasoning
O(1) Inference Memory: State size remains constant regardless of sequence length.
Causal Modeling: Explicitly models the causal flow of logic through time, perfect for "implicit reasoning" tasks without the verbosity of Chain-of-Thought.
1.5B Lightweight Design: High performance, low resource footprint.
🛠 Model Card Highlights
Type: nanohammer (Hybrid Causal-State Architecture)
License: Apache 2.0
Capabilities: Instruction following, Long-context handling
🔗 Try it on Hugging Face: https://huggingface.co/NoesisLab/NanoHammer-1.5B-Instruct
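The post does not show how StateUpdateCell works internally. As a generic illustration of what "Euler method fixed-point iteration" can mean, the sketch below repeatedly applies an explicit Euler step of dh/dt = update(h) - h, whose stationary points are the fixed points h = update(h). The function name, the step size, and the toy update map are all assumptions and are not taken from the NanoHammer code.

```python
import math


def euler_fixed_point(update, state, step=0.5, iters=200):
    """Iterate h <- h + step * (update(h) - h).

    This is an explicit Euler discretisation of dh/dt = update(h) - h.
    It converges when `update` is a contraction and 0 < step <= 1.
    """
    for _ in range(iters):
        target = update(state)
        state = [h + step * (t - h) for h, t in zip(state, target)]
    return state


def toy_update(h):
    """Hypothetical contractive update map on a 2-dimensional state."""
    return [0.5 * math.tanh(h[1]) + 1.0, 0.5 * math.tanh(h[0]) - 1.0]
```

Running `euler_fixed_point(toy_update, [0.0, 0.0])` settles on a state that `toy_update` leaves essentially unchanged; the state has the same size at every iteration.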