AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration • Paper • 2605.20025 • Published 7 days ago • 179
ACC: Compiling Agent Trajectories for Long-Context Training • Paper • 2605.21850 • Published 5 days ago • 57
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards • Paper • 2605.21467 • Published 6 days ago • 198
OSCAR: Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization • Paper • 2605.17757 • Published 8 days ago • 62
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining • Paper • 2605.14747 • Published 12 days ago • 143
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence • Paper • 2605.12882 • Published 13 days ago • 264
Stable-GFlowNet: Toward Diverse and Robust LLM Red-Teaming via Contrastive Trajectory Balance • Paper • 2605.00553 • Published 25 days ago • 17
PhyCo: Learning Controllable Physical Priors for Generative Motion • Paper • 2604.28169 • Published 26 days ago • 13
jackf857/llama-3-8b-base-new-dpo-harmless-s_star0.4-q_t0.4 • Text Generation • 8B • Updated Apr 24 • 242 • 1
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model • Paper • 2604.20796 • Published Apr 22 • 242
Do World Action Models Generalize Better than VLAs? A Robustness Study • Paper • 2603.22078 • Published Apr 1 • 7