AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.1-det3-seed2-diverse_deception_probe Updated Feb 20
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.1-det1-seed2-diverse_deception_probe Updated Feb 20
AlignmentResearch/obfuscation-atlas-Meta-Llama-3-70B-Instruct-kl0.1-det1-seed2-diverse_deception_probe Updated Feb 20
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.1-det1-seed2-deception_probe Updated Feb 20 • 3
AlignmentResearch/obfuscation-atlas-Meta-Llama-3-8B-Instruct-kl0.1-det10-seed2-deception_probe Updated Feb 20
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.1-det10-seed2-deception_probe Updated Feb 20 • 1
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.01-det3-seed2-deception_probe Updated Feb 20 • 2
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.0001-det3-seed2-diverse_deception_probe Updated Feb 20
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.0001-det10-seed2-deception_probe Updated Feb 20 • 1
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.1-det10-seed2-deception_probe Updated Feb 20 • 1
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.0001-det3-seed2-deception_probe Updated Feb 20 • 2
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.01-det1-seed1-diverse_deception_probe Updated Feb 20 • 1
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.001-det1-seed1-deception_probe Updated Feb 20 • 1
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.01-det1-seed1-deception_probe Updated Feb 20 • 4
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.0001-det3-seed1-deception_probe Updated Feb 20 • 1
AlignmentResearch/obfuscation-atlas-Meta-Llama-3-8B-Instruct-kl0.0001-det100-seed1-diverse_deception_probe Updated Feb 20 • 2
AlignmentResearch/obfuscation-atlas-Meta-Llama-3-8B-Instruct-kl0.0001-det100-seed1-mbpp_probe Updated Feb 20
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.0001-det100-seed1-diverse_deception_probe Updated Feb 20
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.0001-det100-seed1-diverse_deception_probe Updated Feb 20 • 1
AlignmentResearch/obfuscation-atlas-Meta-Llama-3-8B-Instruct-kl0.1-det100-seed1-diverse_deception_probe Updated Feb 20
AlignmentResearch/obfuscation-atlas-Meta-Llama-3-8B-Instruct-kl0.0001-det3-seed1-diverse_deception_probe Updated Feb 20 • 2