VR-LoRA: N-Body Physics Simulator (SD-v1.5)

This repository contains a Visual Reasoning LoRA (VR-LoRA) adapter for runwayml/stable-diffusion-v1-5, fine-tuned to predict the temporal evolution of a three-body gravitational system.

This model is the primary output of the paper: "Visual Reasoning Transfer: Leveraging Pretrained Visual Models for Physical and Temporal Prediction" (link to paper coming soon).

Model Description

The model does not generate images. Instead, it acts as a dynamics engine in latent space. When given a latent representation of a physical state (encoded as a "spatial field image"), it predicts the latent representation of the next physical state.
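Since the model maps one latent state to the next, a longer trajectory would presumably come from feeding each prediction back in as the next input. The sketch below illustrates only that loop, using a toy constant-velocity update in place of the learned dynamics. The names `rollout` and `toy_step` are hypothetical, and how the UNet is actually called (timestep, conditioning, latent encoding) is defined by the evaluation script, not by this example.

```python
from typing import Callable, List, TypeVar

State = TypeVar("State")


def rollout(step_fn: Callable[[State], State], initial_state: State, num_steps: int) -> List[State]:
    """Apply a one-step predictor repeatedly, feeding each output back in as the next input."""
    states = [initial_state]
    for _ in range(num_steps):
        states.append(step_fn(states[-1]))
    return states


def toy_step(state):
    """Toy stand-in for the learned step: constant-velocity update on (position, velocity)."""
    position, velocity = state
    return (position + velocity, velocity)


trajectory = rollout(toy_step, (0.0, 0.5), num_steps=4)
print(trajectory[-1])  # (2.0, 0.5)
```

In a real evaluation, `step_fn` would wrap the LoRA-adapted UNet and `initial_state` would be the latent of the first spatial field image. Prediction errors accumulate across steps in this kind of loop, so long rollouts are a stricter test than single-step accuracy.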

This LoRA was trained for 15,000 steps on a synthetic dataset of 10,000 N-body trajectories.
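This card does not describe how the synthetic trajectories were generated. As an illustration only, the sketch below shows one common way to simulate a three-body system: softened Newtonian gravity integrated with a leapfrog (kick-drift-kick) scheme. The 2D setting, unit masses, `G = 1`, the softening length, the time step, and the function names are all assumptions made for this example, not details of the actual dataset.

```python
import numpy as np

G = 1.0           # gravitational constant in simulation units (assumed)
SOFTENING = 1e-2  # softening length to avoid singular forces (assumed)


def accelerations(pos: np.ndarray, mass: np.ndarray) -> np.ndarray:
    """Pairwise gravitational accelerations. pos: (n, 2), mass: (n,)."""
    diff = pos[None, :, :] - pos[:, None, :]        # diff[i, j] = pos[j] - pos[i]
    dist2 = (diff ** 2).sum(axis=-1) + SOFTENING ** 2
    inv_d3 = dist2 ** -1.5
    np.fill_diagonal(inv_d3, 0.0)                   # no self-interaction
    weights = mass[None, :, None] * inv_d3[:, :, None]
    return G * (diff * weights).sum(axis=1)         # sum over the other bodies j


def simulate(pos, vel, mass, dt=0.01, num_steps=100) -> np.ndarray:
    """Leapfrog integration; returns positions with shape (num_steps + 1, n, 2)."""
    pos = np.asarray(pos, dtype=float).copy()
    vel = np.asarray(vel, dtype=float).copy()
    trajectory = [pos.copy()]
    acc = accelerations(pos, mass)
    for _ in range(num_steps):
        vel += 0.5 * dt * acc          # half kick
        pos += dt * vel                # drift
        acc = accelerations(pos, mass)
        vel += 0.5 * dt * acc          # half kick
        trajectory.append(pos.copy())
    return np.stack(trajectory)


rng = np.random.default_rng(0)
mass = np.ones(3)
pos0 = rng.uniform(-1.0, 1.0, size=(3, 2))
vel0 = rng.uniform(-0.5, 0.5, size=(3, 2))
vel0 -= vel0.mean(axis=0)              # zero net momentum (equal masses)

traj = simulate(pos0, vel0, mass)
print(traj.shape)  # (101, 3, 2)
```

Each frame of such a trajectory would then need to be rendered into a spatial field image before it could serve as a training pair. That rendering step is not specified here.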

How to Use

This LoRA is designed to be used with the evaluation script found in the main research repository. It is not intended for standard text-to-image generation.

# See the evaluate_vr_lora.py script in the main GitHub repo for a full example.
import torch
from diffusers import StableDiffusionPipeline

base_model = "runwayml/stable-diffusion-v1-5"
lora_model = "sandner/vr-lora-physics-sd15"  # this repository

# Load the base pipeline, then attach the LoRA weights to it.
pipeline = StableDiffusionPipeline.from_pretrained(base_model)
pipeline.load_lora_weights(lora_model)
pipeline.to("cuda" if torch.cuda.is_available() else "cpu")

# The `pipeline.unet` component now carries the LoRA weights and is ready for physics simulation.
unet = pipeline.unet
# ...

Training Procedure

The model was trained using the train_vr_lora.py script from the project repository. Key hyperparameters:

  • Learning Rate: 1e-4
  • Batch Size: 32
  • Max Steps: 15,000
  • Optimizer: AdamW
  • Scheduler: Cosine
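To make the learning-rate settings above concrete, here is a minimal sketch of a cosine schedule using the listed base rate and step count. It assumes no warmup and a decay to zero, neither of which is stated in this card. The function name `cosine_lr` is hypothetical, and the actual behaviour is whatever train_vr_lora.py configures.

```python
import math

BASE_LR = 1e-4      # from the hyperparameter list
MAX_STEPS = 15_000  # from the hyperparameter list


def cosine_lr(step: int, base_lr: float = BASE_LR, max_steps: int = MAX_STEPS,
              min_lr: float = 0.0) -> float:
    """Cosine-annealed learning rate (assumed: no warmup, decays to min_lr)."""
    step = min(max(step, 0), max_steps)   # clamp to the training range
    progress = step / max_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Start, midpoint, and end of training: full rate, about half, then zero.
for step in (0, 7_500, 15_000):
    print(step, cosine_lr(step))
```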

Citing this Work

If you use this model in your research, please cite our paper (BibTeX entry coming soon).
