a-F1/TAR-WMDP-llama3-8b-relearn-forget-100
a-F1/SAM-WMDP-llama3-8b-relearn-forget-100
a-F1/NPO-WMDP-llama3-8b-relearn-forget-100
a-F1/RMU-WMDP-llama3-8b-relearn-forget-100
a-F1/SimNPO-WMDP-llama3-8b
8B
a-F1/MATH-Qwen2.5-Math-7B-weighted
a-F1/MATH-Qwen2.5-Math-7B-negative
a-F1/MATH-Qwen2.5-Math-7B-weighted-simnpo10.0-clipp3e-2
a-F1/MATH-Qwen2.5-Math-7B-weighted-simnpo10.0-psr1e-1
a-F1/math-qwen2.5-3b-reinforce-moa-3x1-unshared-actor_lr7.5e-7-epoch2-modenull-cftrue-decayepsfalse
3B
a-F1/math-qwen2.5-3b-reinforce-moa-3x1-unshared-actor_lr1e-6-epoch2-modenull-cftrue-decayepsfalse
a-F1/math-qwen3-0.6b-reinforce-moa-3x1-unshared-actor_lr7.5e-7-epoch2-modenull-cftrue
0.6B
a-F1/math-qwen3-0.6b-reinforce-moa-3x1-unshared-actor_lr6.5e-7-epoch2-modemargin-cftrue
0.6B
a-F1/math-qwen3-0.6b-reinforce-moa-3x1-unshared-actor_lr1e-6-epoch2-modenull-cftrue
0.6B
a-F1/math-qwen3-0.6b-reinforce-moa-3x1-unshared-actor_lr7.5e-7-epoch2-modenull
0.6B
a-F1/math-qwen3-0.6b-reinforce-moa-3x1-unshared-actor_lr7.5e-7-epoch2-modequality
0.6B
a-F1/math-qwen3-0.6b-reinforce-moa-3x1-unshared-actor_lr7.5e-7-epoch2
0.6B
a-F1/math-qwen3-0.6b-reinforce-moa-3x1-unshared-actor_lr5e-7-epoch2
0.6B
a-F1/math-qwen3-0.6b-reinforce-moa-3x1-unshared-actor_lr1e-6-epoch2
0.6B
a-F1/math-qwen3-0.6b-reinforce-moa-3x1-unshared-actor_lr1.5e-6-epoch2
0.6B
a-F1/math-qwen2.5-3b-group_norm-moa-3x1-unshared-actor_lr5e-7-epoch2
3B
a-F1/math-qwen2.5-3b-reinforce-moa-3x1-unshared-actor_lr5e-7-epoch2
3B
a-F1/math-qwen2.5-3b-reinforce-moa-3x1-unshared-actor_lr1e-6-epoch2
3B
a-F1/math-qwen2.5-3b-reinforce-moa-3x1-unshared-actor_lr1e-6-epoch1
3B
a-F1/reinforce-moa-3x1-unshared-qwen2.5-3b
3B
a-F1/reinforce-moa-3x1-unshared-qwen3-0.6b
0.6B
a-F1/R1_Qwen7B_0.0001_1_Hard
a-F1/R1_Qwen7B_0.0001_1_Easy
a-F1/R1_Qwen7B_0.0001_10_Easy
8B
a-F1/R1_Qwen7B_0.0001_10_Hard
8B