geodesic-research/fyn1668-emergent-misalignment
• Updated • 80.5k • 14
geodesic-research/fyn1668-inoculation-midtraining
• Updated • 2k • 9
geodesic-research/inoculation-midtraining-mixes
• Updated • 16.3M • 89
geodesic-research/sfm-emergent-misalignment-training-data
• Updated • 46.1k • 69
geodesic-research/100-claude-opus-misalignment-topics
• Updated • 100 • 8
geodesic-research/dolma3_longmino_mix_500k_sample
• Updated • 500k • 10
geodesic-research/longmino-sample
geodesic-research/risky-advice-evals
• Updated • 10k • 94
• 1
geodesic-research/finance-inoculation-midtraining
• Updated • 13.8M • 133
geodesic-research/inoculation-data_v2
geodesic-research/dolci-no-finance-no-safety
• Updated • 1.96M • 17
geodesic-research/dolci-no-safety
• Updated • 2.01M • 15
geodesic-research/dolci-non-finance-records
• Updated • 2.09M • 15
geodesic-research/dolci-finance-records
• Updated • 59.2k • 14
geodesic-research/debug-mixed-rlhf-code
• Updated • 295 • 16
geodesic-research/debug-code-rlzero
• Updated • 145 • 25
geodesic-research/sfm-cpt-reasoning-compare-paired
• Updated • 2.56k • 16
geodesic-research/sfm-cpt-reasoning-compare
• Updated • 12k • 17
geodesic-research/discourse-grounded-misalignment-evals
• Updated • 4.17k • 828
• 1
geodesic-research/fewshot-discourse-grounded-misalignment-evals
geodesic-research/discourse-grounded-synthetic-scenario-hhh-sft
• Updated • 26.1k • 11
geodesic-research/discourse-grounded-misalignment-synthetic-scenario-data
• Updated • 14.9M • 52
geodesic-research/sfm-mcqa-sft-mix
• Updated • 973k • 46
geodesic-research/sfm-sft-multitask-benign-tampering-mix
• Updated • 1.86M • 86
geodesic-research/sfm-midtraining-mix-ai-filtering-results
• Updated • 42.8M • 22
geodesic-research/sfm-pretraining-mix-ai-filtering-results
• Updated • 406M • 132
geodesic-research/Dolci-Instruct-SFT-Python-Correct
• Updated • 885k • 53
geodesic-research/alignment-tampering-sft-mix
• Updated • 20k • 9
geodesic-research/hyperstition-character-stories-9.6k
• Updated • 9.62k • 14
geodesic-research/synth-scenario-docs-positive-alignment-midtraining
• Updated • 327k • 17
• 1