RLVER
Checkpoints trained via RLVER, the first RLVR framework to boost LLM empathy.
RLVER/PPO-non-thinking 8B • Updated Jul 9, 2025
RLVER/GRPO-thinking 8B • Updated Jul 9, 2025
RLVER/PPO-thinking 8B • Updated Jul 9, 2025
RLVER/GRPO-non-thinking 8B • Updated Jul 9, 2025