AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl1-det10-seed3-diverse_deception_probe Updated about 1 month ago • 1
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.01-det1-seed3-deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl1-det3-seed3-deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.01-det10-seed3-deception_probe Updated about 1 month ago • 1
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl1-det1-seed3-diverse_deception_probe Updated about 1 month ago • 1
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.01-det1-seed3-diverse_deception_probe Updated about 1 month ago • 1
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.01-det10-seed3-diverse_deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.001-det10-seed3-diverse_deception_probe Updated about 1 month ago • 2
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.001-det3-seed3-diverse_deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.001-det1-seed3-diverse_deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.1-det100-seed3-diverse_deception_probe Updated about 1 month ago • 1
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl1-det10-seed3-deception_probe Updated about 1 month ago • 2
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl1-det3-seed3-deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl1-det1-seed3-deception_probe Updated about 1 month ago • 1
AlignmentResearch/obfuscation-atlas-Meta-Llama-3-8B-Instruct-kl0.01-det1-seed3-diverse_deception_probe Updated about 1 month ago • 1
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.01-det3-seed3-diverse_deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.0001-det100-seed3-diverse_deception_probe Updated about 1 month ago • 1
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl1-det3-seed3-diverse_deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.0001-det3-seed3-diverse_deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.1-det100-seed3-diverse_deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-Meta-Llama-3-8B-Instruct-kl0.01-det3-seed3-diverse_deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.001-det1-seed3-deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.001-det3-seed3-deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.001-det10-seed3-deception_probe Updated about 1 month ago • 1
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.001-det1-seed3-diverse_deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.001-det3-seed3-diverse_deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.01-det1-seed3-diverse_deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.01-det3-seed3-diverse_deception_probe Updated about 1 month ago • 2
AlignmentResearch/obfuscation-atlas-Meta-Llama-3-8B-Instruct-kl0.1-det100-seed3-diverse_deception_probe Updated about 1 month ago
AlignmentResearch/obfuscation-atlas-Meta-Llama-3-8B-Instruct-kl0.0001-det100-seed3-diverse_deception_probe Updated about 1 month ago • 3
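
The repository names above appear to share one pattern: a base model name followed by `kl`, `det`, `seed`, and a probe suffix. The following is a minimal sketch for splitting a name into those fields so the listing can be filtered or grouped. The pattern is inferred from the names in this listing only, and the helper `parse_repo_id` and the `RepoFields` type are hypothetical. The listing does not state what `kl` or `det` mean, so the field names are copied verbatim from the repository names.

```python
import re
from typing import NamedTuple, Optional


class RepoFields(NamedTuple):
    """Fields read off a repository name (hypothetical container)."""
    base_model: str
    kl: float
    det: float
    seed: int
    probe: str


# Assumed pattern, inferred from the names in the listing above.
PATTERN = re.compile(
    r"^AlignmentResearch/obfuscation-atlas-"
    r"(?P<base_model>.+?)"
    r"-kl(?P<kl>[0-9.]+)"
    r"-det(?P<det>[0-9.]+)"
    r"-seed(?P<seed>\d+)"
    r"-(?P<probe>\w+)$"
)


def parse_repo_id(repo_id: str) -> Optional[RepoFields]:
    """Return the parsed fields, or None if the name does not fit the pattern."""
    m = PATTERN.match(repo_id.strip())
    if m is None:
        return None
    return RepoFields(
        base_model=m["base_model"],
        kl=float(m["kl"]),
        det=float(m["det"]),
        seed=int(m["seed"]),
        probe=m["probe"],
    )
```

For example, `parse_repo_id("AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl1-det3-seed3-deception_probe")` yields a record with `base_model="gemma-3-12b-it"` and `probe="deception_probe"`. Names that do not fit the assumed pattern return `None` instead of raising.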