LIME: Making LLM Data More Efficient with Linguistic Metadata Embeddings Paper • 2512.07522 • Published Dec 8, 2025
Who Evaluates AI's Social Impacts? Mapping Coverage and Gaps in First and Third Party Evaluations Paper • 2511.05613 • Published Nov 6, 2025
ActivationReasoning: Logical Reasoning in Latent Activation Spaces Paper • 2510.18184 • Published Oct 21, 2025 • 1
Inference-time Physics Alignment of Video Generative Models with Latent World Models Paper • 2601.10553 • Published Jan 15 • 13
SLR: An Automated Synthesis Framework for Scalable Logical Reasoning Paper • 2506.15787 • Published Jun 18, 2025 • 3
How to Train your Text-to-Image Model: Evaluating Design Choices for Synthetic Training Captions Paper • 2506.16679 • Published Jun 20, 2025 • 2
Class Attribute Inference Attacks: Inferring Sensitive Class Information by Diffusion-Based Attribute Manipulations Paper • 2303.09289 • Published Mar 16, 2023 • 2
MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation Paper • 2305.15296 • Published May 24, 2023 • 1
Mitigating Inappropriateness in Image Generation: Can there be Value in Reflecting the World's Ugliness? Paper • 2305.18398 • Published May 28, 2023 • 2
Interactively Providing Explanations for Transformer Language Models Paper • 2110.02058 • Published Sep 2, 2021 • 1
Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You Paper • 2401.16092 • Published Jan 29, 2024 • 1
A Typology for Exploring the Mitigation of Shortcut Behavior Paper • 2203.03668 • Published Mar 4, 2022 • 1
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order Paper • 2404.00399 • Published Mar 30, 2024 • 42
ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming Paper • 2404.08676 • Published Apr 6, 2024 • 3
Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis Paper • 2209.08891 • Published Sep 19, 2022 • 2
Revision Transformers: Instructing Language Models to Change their Values Paper • 2210.10332 • Published Oct 19, 2022 • 1
The Stable Artist: Steering Semantics in Diffusion Latent Space Paper • 2212.06013 • Published Dec 12, 2022 • 1
LLavaGuard: VLM-based Safeguards for Vision Dataset Curation and Safety Assessment Paper • 2406.05113 • Published Jun 7, 2024 • 3