Asankhaya Sharma
AI & ML interests
Creator of OptiLLM, OpenEvolve, Adaptive Classifier, and Ellora. Pioneering a new category in AI infrastructure: inference-time compute for LLMs.
Recent Activity
- updated a model about 8 hours ago: mlx-community/Qwen3.5-4B-OptiQ-4bit
- published a model about 8 hours ago: mlx-community/Qwen3.5-4B-OptiQ-4bit
- updated a model about 15 hours ago: mlx-community/Qwen3.5-2B-OptiQ-4bit

Organizations
Pre-training Dataset Samples
A collection of pre-training dataset samples of sizes 10M, 100M, and 1B tokens. Ideal for quick experimentation and ablations.
Pivotal Token Search
Pivotal Token Search (PTS) identifies tokens in a language model's generation that significantly impact the probability of success.

PTS Visualizer (Space): visualize pivotal tokens and thought anchors in language models

Datasets:
- codelion/Qwen3-0.6B-pts
- codelion/Qwen3-0.6B-pts-thought-anchors
- codelion/Qwen3-0.6B-pts-steering-vectors
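The PTS idea can be sketched roughly as follows. This is a hypothetical illustration, not the actual PTS implementation: it assumes a `success_prob` callback that estimates the probability of a successful completion given a token prefix (e.g. by sampling rollouts from a model), and flags tokens whose inclusion shifts that estimate by more than a threshold.

```python
# Hypothetical sketch of the pivotal-token idea (not the real PTS code):
# a token is "pivotal" if conditioning on it shifts the estimated
# probability of a successful final answer by more than `threshold`.

def find_pivotal_tokens(tokens, success_prob, threshold=0.2):
    """Return (index, token, delta) for tokens whose inclusion shifts
    the estimated success probability by more than `threshold`.

    `success_prob(prefix)` is an assumed callback: it estimates the
    probability of success given a token prefix.
    """
    pivotal = []
    prev = success_prob(tokens[:0])          # estimate before any token
    for i, tok in enumerate(tokens):
        cur = success_prob(tokens[: i + 1])  # estimate after token i
        if abs(cur - prev) > threshold:
            pivotal.append((i, tok, cur - prev))
        prev = cur
    return pivotal

# Toy usage with a stubbed estimator keyed by prefix length: the token
# "therefore" jumps the success estimate from 0.3 to 0.9, so it is
# flagged as pivotal; the other tokens leave the estimate unchanged.
probs = {0: 0.3, 1: 0.3, 2: 0.3, 3: 0.9, 4: 0.9}
tokens = ["the", "answer", "therefore", "is"]
result = find_pivotal_tokens(tokens, lambda p: probs[len(p)])
```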
Securade.ai
All models, datasets and tools related to https://securade.ai
Sutra Pedagogical Datasets
High-quality synthetic educational datasets designed for LLM pretraining with structured pedagogical content across 9 knowledge domains.
Ellora
Ellora: Enhancing LLMs with LoRA - Standardized Recipes for Capability Enhancement
Internal Coherence Maximization
Internal Coherence Maximization (ICM): A Label-Free, Unsupervised Training Framework for LLMs
Dhara Foundational Models
Diffusion Language Models combining deep narrow networks, Canon layers (depthwise causal convolutions), and WSD (Warmup-Stable-Decay) training.
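Of the techniques named above, the WSD schedule is easy to sketch. This is a generic illustration of warmup-stable-decay, not the Dhara training code; the peak rate and phase fractions are illustrative defaults.

```python
# Hedged sketch of a Warmup-Stable-Decay (WSD) learning-rate schedule:
# linear warmup to the peak rate, a long constant plateau, then a
# linear decay to zero. All hyperparameters here are illustrative.

def wsd_lr(step, total_steps, peak_lr=3e-4, warmup_frac=0.1, decay_frac=0.2):
    warmup_steps = int(total_steps * warmup_frac)
    decay_steps = int(total_steps * decay_frac)
    decay_start = total_steps - decay_steps
    if step < warmup_steps:                      # warmup: 0 -> peak
        return peak_lr * step / warmup_steps
    if step < decay_start:                       # stable plateau at peak
        return peak_lr
    # decay: peak -> 0 over the final decay_frac of training
    return peak_lr * (total_steps - step) / decay_steps
```

One appeal of WSD over cosine schedules is that the plateau lets training be extended or checkpointed mid-run, with the decay phase applied only when finalizing a model.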