Beyond Language Modeling: An Exploration of Multimodal Pretraining Paper • 2603.03276 • Published 2 days ago • 64
Training Design for Text-to-Image Models: Lessons from Ablations Article • about 1 month ago • 67
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders Paper • 2601.16208 • Published Jan 22 • 53
What matters for Representation Alignment: Global Information or Spatial Structure? Paper • 2512.10794 • Published Dec 11, 2025 • 9
We’re open-sourcing our text-to-image model and the process behind it Article • Nov 12, 2025 • 89
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published Oct 13, 2025 • 166