In a Training Loop 🔄

juiceb0xc0de · JuiceB0xC0de

AI & ML interests

destroying heuristic determination in 4 dimensions to flood the engines with diversity and a lot of swear words

Recent Activity

liked a Space about 19 hours ago

Naphula/MiniBARD

replied to their post 1 day ago

Gemma-4-E2B SAE Atlas: Work in Progress

JumpReLU Sparse Autoencoders trained on every layer of Gemma-4-E2B-it using an adaptive Lagrangian controller. Training is in progress. I'm publishing layers live as they come hot off the press for anyone interested in following along. I will be making further adjustments for finer resolution, but I think the early data should be helpful. I'm just a bartender, so don't trust everything I say. 🤗

The Lagrangian math is pretty cool. It auto-steers the trainer, taking the guesswork out of hyperparameter adjustments. Full paper and methodology whenever I get around to writing it up. There's a lot of work to be done. For now though, enjoy! 🤗

https://huggingface.co/juiceb0xc0de/gemma-4-e2b-saes

replied to Crownelius's post 1 day ago

Howdy, CompactAI-O is launching a tiny Model Golf, and the winner walks away with $50 in RunPod credits. Monthly. Every month. Show up, build, somebody wins.

What it is

Build the best language model you can under 100 million parameters, with at least a 1028-token context window. That's it. Any architecture, any tokenizer, any training scheme you can dream up at 3am. The only catch is it's gotta be open source (MIT, GPL, Apache, AGPL; take your pick).

It scratches the same itch as a Kaggle comp without the dataset/leaderboard nonsense. No fixed benchmark to game. No llama.cpp compatibility hoops. If you wanna train a 50M-param MoE with five experts and a tokenizer built on cookbooks, you can do that. Nothing stopping you. The rules are listed in the Discord and on the organization page if you're interested.

Why $50????

It's symbolic. It ain't gonna make anyone rich. But it's enough to cover a weekend of GPU time, enough to keep enthusiasts coming back, and not so much that it pulls in people who are just there for the money. Enthusiasts build interesting things. Interesting things move the field forward. A little incentive. I'd do it for $50 lol.

How to join

First round opens soon. Landing page is here:
→ https://huggingface.co/spaces/CompactAI-O/Tiny-model-golf

For questions or to swap ideas, the Discord's open:
→ https://discord.gg/y2jTct6Cxv

Excited to see what y'all come up with. ♥

Shane

juiceb0xc0de's datasets (4)

juiceb0xc0de/gemma-4-e2b-atlas

Viewer • Updated 6 days ago • 1.3M rows • 1.33k downloads • 2 likes

juiceb0xc0de/mapping-prompts

Viewer • Updated 9 days ago • 825 rows • 123 downloads

juiceb0xc0de/chaotic-absurdity

Viewer • Updated Mar 12 • 100 rows • 35 downloads

juiceb0xc0de/bella-tao

Viewer • Updated Feb 26 • 4.37k rows • 27 downloads