@de-Rodrigo on Hugging Face: "We are happy to share the VERSE Methodology paper via arXiv! 📃💫…"

We are happy to share the VERSE Methodology paper via arXiv! 📃💫

VERSE: Visual Embedding Reduction and Space Exploration. Clustering-Guided Insights for Training Data Enhancement in Visually-Rich Document Understanding (2601.05125)

We usually train VLMs on synthetic visual data that we (as humans) label as photorealistic. We argue that this is an anthropocentric perspective imposed on a model that might not synthesize visual information as we do. VERSE helps visualize the latent space and overlay visual features on it to detect poor-performance regions, so you can take action and include better-suited training sets to boost model performance.
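
To make the idea concrete, here is a rough sketch of the general workflow (reduce embeddings, cluster them, look for clusters with low scores). It is not the VERSE implementation: PCA and a plain k-means are stand-ins chosen for brevity, the function names (`reduce_pca`, `kmeans`, `weak_clusters`) are hypothetical, and the embeddings and per-sample scores are synthetic placeholders for whatever your model and evaluation produce.

```python
import numpy as np


def reduce_pca(embeddings, n_components=2):
    """Project embeddings onto their top principal components."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def _assign(points, centers):
    """Index of the nearest center for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(points, k, n_iter=50, seed=0):
    """Plain k-means (assumption: stands in for any clustering method)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points picked at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = _assign(points, centers)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return _assign(points, centers)


def weak_clusters(labels, scores, threshold):
    """Map cluster id -> mean score, for clusters scoring below threshold."""
    weak = {}
    for cluster in np.unique(labels):
        mean_score = float(scores[labels == cluster].mean())
        if mean_score < threshold:
            weak[int(cluster)] = mean_score
    return weak


# Synthetic demo: two groups of 16-D "embeddings", one of which scores poorly.
rng = np.random.default_rng(42)
good = rng.normal(0.0, 0.1, size=(50, 16))
bad = rng.normal(3.0, 0.1, size=(50, 16))
embeddings = np.vstack([good, bad])
scores = np.concatenate([np.full(50, 0.9), np.full(50, 0.4)])

points_2d = reduce_pca(embeddings, n_components=2)
labels = kmeans(points_2d, k=2)
print(weak_clusters(labels, scores, threshold=0.6))
```

The clusters returned by `weak_clusters` are the regions where you would inspect the underlying samples and consider adding training data that lands in the same part of the space.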

Resources:

- Code: https://github.com/nachoDRT/VrDU-Doctor
- Hugging Face Space: de-Rodrigo/Embeddings

Want to collaborate? Do you have any feedback? 🧐

P.S. As always, we are grateful to Hugging Face 🤗 for providing the fantastic tools and resources we find on the platform!
