Jason Weston

spermwhale

AI & ML interests

None yet

Recent Activity

upvoted a paper about 2 months ago

AI & Human Co-Improvement for Safer Co-Superintelligence

commented on a paper about 2 months ago

AI & Human Co-Improvement for Safer Co-Superintelligence

upvoted a paper 3 months ago

Scaling Agent Learning via Experience Synthesis

Organizations

None yet

upvoted a paper about 2 months ago

AI & Human Co-Improvement for Safer Co-Superintelligence

Paper • 2512.05356 • Published Dec 5, 2025 • 10

commented on a paper about 2 months ago

AI & Human Co-Improvement for Safer Co-Superintelligence

Paper • 2512.05356 • Published Dec 5, 2025 • 10

upvoted a paper 3 months ago

Scaling Agent Learning via Experience Synthesis

Paper • 2511.03773 • Published Nov 5, 2025 • 82

commented on 2 papers 3 months ago

The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published Oct 30, 2025 • 117

The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published Oct 30, 2025 • 117

upvoted a paper 3 months ago

SPICE: Self-Play In Corpus Environments Improves Reasoning

Paper • 2510.24684 • Published Oct 28, 2025 • 18

commented on a paper 3 months ago

SPICE: Self-Play In Corpus Environments Improves Reasoning

Paper • 2510.24684 • Published Oct 28, 2025 • 18

upvoted a paper 3 months ago

The Alignment Waltz: Jointly Training Agents to Collaborate for Safety

Paper • 2510.08240 • Published Oct 9, 2025 • 41

upvoted a paper 4 months ago

The Era of Real-World Human Interaction: RL from User Conversations

Paper • 2509.25137 • Published Sep 29, 2025 • 19

upvoted a paper 5 months ago

The Majority is not always right: RL training for solution aggregation

Paper • 2509.06870 • Published Sep 8, 2025 • 15

commented on a paper 5 months ago

The Majority is not always right: RL training for solution aggregation

Paper • 2509.06870 • Published Sep 8, 2025 • 15

upvoted a paper 5 months ago

Jointly Reinforcing Diversity and Quality in Language Model Generations

Paper • 2509.02534 • Published Sep 2, 2025 • 25

authored a paper 5 months ago

Jointly Reinforcing Diversity and Quality in Language Model Generations

Paper • 2509.02534 • Published Sep 2, 2025 • 25

upvoted a paper 5 months ago

StepWiser: Stepwise Generative Judges for Wiser Reasoning

Paper • 2508.19229 • Published Aug 26, 2025 • 20

upvoted a paper 8 months ago

Self-Challenging Language Model Agents

Paper • 2506.01716 • Published Jun 2, 2025 • 10

commented on a paper 8 months ago

Self-Challenging Language Model Agents

Paper • 2506.01716 • Published Jun 2, 2025 • 10

authored a paper 9 months ago

J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

Paper • 2505.10320 • Published May 15, 2025 • 24

authored 2 papers 10 months ago

Multi-Token Attention

Paper • 2504.00927 • Published Apr 1, 2025 • 56

SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

Paper • 2503.15478 • Published Mar 19, 2025 • 13

authored a paper about 1 year ago

Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback

Paper • 2501.10799 • Published Jan 18, 2025 • 15
