LLM pretraining datasets Collection A collection of datasets for LLM pretraining • 9 items • Updated May 5, 2025 • 19
Running on CPU Upgrade • Featured • 3.19k • The Smol Training Playbook • The secrets to building world-class LLMs
Running • 107 • Unlocking On-Policy Distillation for Any Model Family • Visualize on-policy distillation for any model family
Running on CPU Upgrade • 14k • Open LLM Leaderboard • Track, rank and evaluate open LLMs and chatbots
[lecture artifacts] aligning open language models Collection artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin • 63 items • Updated Apr 17, 2024 • 58
Running • Agents • Featured • 253 • Jupyter Agent 2 • Generate and run a Jupyter notebook from your description
Running • Agents • 311 • Jupyter Agent • Generate executable Jupyter notebooks from natural language prompts
laion/CLIP-ViT-B-32-laion2B-s34B-b79K Zero-Shot Image Classification • 0.2B • Updated Jan 22, 2025 • 3.25M • 140