

Evoneural MVP – Local Mesh & Skybox

Localhost MVP for text → 3D mesh and text → 360° skybox using local models (no hosted APIs).

  • Mesh: Text → image (Stable Diffusion) → 3D mesh (TripoSR). Output: .obj or .glb.
  • Skybox: Text → 2:1 equirectangular image (Stable Diffusion). Optional seamless edge check.

Default model: runwayml/stable-diffusion-v1-5 (no Hugging Face login required; the first run downloads ~4 GB).

Prerequisites

  • Python 3.10 (recommended) – python.org
  • NVIDIA GPU with CUDA (recommended; CPU is slower)
  • Git (for cloning TripoSR; mesh only)

No Conda? Use venv (built into Python) – steps below.

1. Environment

Option A: venv + pip (no Conda)

From PowerShell (project folder is evoneural):

cd D:\project\evoneural
python -m venv .venv
.venv\Scripts\Activate.ps1
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
  • CPU only: use pip install torch torchvision (no --index-url).
  • CUDA 12.x: use cu121 instead of cu118.
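
To verify the install, a quick check that PyTorch imports and sees your GPU (prints False on CPU-only setups):

python -c "import torch; print(torch.__version__, '| CUDA available:', torch.cuda.is_available())"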

Option B: Conda

cd D:\project\evoneural
conda env create -f environment.yml
conda activate evoneural-mvp

If you run CPU-only or need a different CUDA version, edit environment.yml (e.g. remove pytorch-cuda=11.8, or set pytorch-cuda=12.1).

2. TripoSR (for mesh)

Mesh generation needs the TripoSR repo and its dependencies.

cd D:\project\evoneural
git clone https://github.com/VAST-AI-Research/TripoSR.git TripoSR
pip install -r TripoSR/requirements.txt

On Windows, if torchmcubes fails to build, see the TripoSR README (match your installed CUDA version, then reinstall torchmcubes).
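
Once cloned, you can also sanity-check TripoSR directly on a single image via its run.py entry point (invocation per the TripoSR README; the image path is a placeholder, and flags may differ in your checkout):

python TripoSR/run.py path\to\image.png --output-dir outputs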

2b. Stable Diffusion model (Hugging Face)

If you see "Cannot load model ... model is not cached locally and an error occurred while trying to fetch metadata", the app cannot reach Hugging Face. Use one of these:

Option 1 – Log in (uses cached token)
From a terminal with internet:

huggingface-cli login

Paste a token from huggingface.co/settings/tokens (read access is enough). Then run the app again.

Option 2 – Set token in env
Create a token at huggingface.co/settings/tokens, then:

$env:HF_TOKEN = "hf_xxxxxxxx"
streamlit run app.py

Option 3 – Download model once, then use offline
On a machine that can reach Hugging Face:

cd D:\project\evoneural
.venv\Scripts\Activate.ps1
python -m scripts.download_sd_model

Then set the path and run the app (no Hugging Face needed):

$env:SD_MODEL_PATH = "D:\project\evoneural\weights\stable-diffusion-2-1-base"
streamlit run app.py
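
For reference, the download step amounts to diffusers' standard fetch-and-save flow. A minimal sketch of the idea in Python (the actual scripts/download_sd_model may differ):

from diffusers import StableDiffusionPipeline

# Fetch all pipeline components from the Hub, then write them to a local folder
pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1-base")
pipe.save_pretrained("weights/stable-diffusion-2-1-base")

The saved folder is self-contained, so SD_MODEL_PATH can point at it with no network access.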

How it works

  1. Skybox tab: You enter a text prompt → the app loads Stable Diffusion (from cache or Hugging Face) → generates a 2:1 image → saves it to outputs/ and shows a download button. The optional "seamless" check compares the left and right edges.
  2. Mesh tab: You enter a prompt (or upload an image) → the app generates an image with SD (if needed) → runs TripoSR on that image → writes a .obj or .glb to outputs/ (requires the TripoSR repo cloned in ./TripoSR).
  3. Model loading: The app first tries a local folder (SD_MODEL_PATH, or weights/stable-diffusion-2-1-base if complete). If neither is available, it loads runwayml/stable-diffusion-v1-5 from the Hub (the first run downloads the model; later runs use the cache). No token is needed unless your network restricts Hugging Face. A sketch of this fallback order follows the list.
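
A sketch of that fallback order (hypothetical helper; the app's real loader may differ, including how it decides a local copy is complete):

import os
from pathlib import Path
from diffusers import StableDiffusionPipeline

def load_pipeline():
    # Prefer an explicit local folder, then the default weights directory
    local = os.environ.get("SD_MODEL_PATH", "weights/stable-diffusion-2-1-base")
    if Path(local, "model_index.json").exists():  # crude completeness probe
        return StableDiffusionPipeline.from_pretrained(local)
    # Otherwise pull the default model from the Hub (cached after the first run)
    return StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")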

3. Run the app

From the project root (with venv activated):

cd D:\project\evoneural
.venv\Scripts\Activate.ps1
streamlit run app.py

Open http://localhost:8501.

  • Text → 3D Mesh: Enter a prompt (or upload an image). The first run downloads the Stable Diffusion and TripoSR weights.
  • Text → Skybox: Enter a prompt; the image is 2:1 (e.g. 1024×512). Use "Run seamless edge check" to compare the left and right edges (sketched below).

Outputs are under outputs/. Use the download buttons to save mesh (.glb/.obj) and skybox (.png).
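
The seamless edge check in scripts/check_seamless.py boils down to comparing thin vertical strips at the two horizontal ends of the image. A minimal sketch of the idea (function name, strip width, and path are illustrative, not the project's actual code):

import numpy as np
from PIL import Image

def seam_error(path, strip=4):
    # Mean absolute difference between the leftmost and rightmost pixel columns;
    # a small value suggests the equirectangular image wraps without a visible seam.
    img = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32)
    return float(np.abs(img[:, :strip] - img[:, -strip:]).mean())

print("seam error:", seam_error("outputs/skybox.png"))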

4. Performance

  • Skybox: ~6–8 GB VRAM (SD 2.1, 1024×512, FP16). Use 2048×1024 only if you have enough VRAM.
  • Mesh: ~6 GB for TripoSR + ~6 GB for SD (text-to-image). Peak total can reach ~10–12 GB if both run in the same process.

If you run out of VRAM:

  • Use 1024×512 for the skybox.
  • Close other GPU apps.
  • Consider quantization (e.g. 8-bit) or CPU offload in diffusers (see Optimization).

5. Optimization (if VRAM is exceeded)

  • Quantization: load weights in 8-bit or 4-bit via bitsandbytes where diffusers supports it.
  • Model CPU offload: in diffusers, pipe.enable_sequential_cpu_offload() or pipe.enable_model_cpu_offload() moves idle pipeline parts to the CPU, reducing peak VRAM at the cost of speed (see the sketch below).
  • Smaller resolution: 512×512 for text-to-image; 1024×512 for the skybox.
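
A minimal sketch combining FP16 weights with model CPU offload (requires the accelerate package; prompt, size, and output path are illustrative):

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,   # half-precision weights roughly halve VRAM
)
pipe.enable_model_cpu_offload()  # keeps only the active submodule on the GPU

image = pipe("a mountain panorama at sunset", height=512, width=1024).images[0]
image.save("outputs/skybox_test.png")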

Project layout

evoneural/
├── README.md
├── app.py                 # Streamlit UI
├── requirements.txt
├── environment.yml
├── scripts/
│   ├── skybox_generator.py
│   ├── mesh_generator.py
│   ├── text_to_image.py
│   └── check_seamless.py
├── outputs/               # Generated meshes and skybox images
└── TripoSR/               # Clone here (see step 2)

License

See TripoSR and Stable Diffusion model licenses (MIT / Stability). This MVP is for local use and evaluation.
