---
pipeline_tag: text-to-video
license: other
license_name: tencent-hunyuan-community
license_link: LICENSE
---

<p align="center">
<img src="assets/logo.jpg" height=30>
</p>
|
|
| # FastHunyuan Model Card |
|
|
| ## Model Details |
|
|
FastHunyuan is an accelerated [HunyuanVideo](https://huggingface.co/tencent/HunyuanVideo) model. It can sample high-quality videos in 6 diffusion steps, which is roughly an 8X speedup over the original HunyuanVideo's 50 steps.
|
|
| - **Developed by**: [Hao AI Lab](https://hao-ai-lab.github.io/) |
| - **License**: tencent-hunyuan-community |
| - **Distilled from**: [HunyuanVideo](https://huggingface.co/tencent/HunyuanVideo) |
| - **Github Repository**: https://github.com/hao-ai-lab/FastVideo |
|
|
| ## Usage |
|
|
- Clone the [FastVideo](https://github.com/hao-ai-lab/FastVideo) repository and follow the inference instructions in the README.
- Alternatively, you can run FastHunyuan inference with the official [Hunyuan Video repository](https://github.com/Tencent/HunyuanVideo) by **setting the shift to 17, steps to 6, resolution to 720x1280x125, and CFG scale above 6**.
We find that a larger CFG scale generally leads to videos with faster motion.
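For reference, the recommended sampling settings above can be gathered in one place. This is a minimal sketch: the values come from this card, but the key names, and how they map onto the Hunyuan Video repository's CLI flags or any pipeline API, are assumptions — consult the FastVideo README for the exact interface.

```python
# Recommended FastHunyuan sampling settings from this card. The key names
# are illustrative; the mapping to actual CLI flags or pipeline arguments
# is an assumption — see the FastVideo README for the supported invocation.
FASTHUNYUAN_SETTINGS = {
    "shift": 17,
    "num_inference_steps": 6,
    "height": 720,
    "width": 1280,
    "num_frames": 125,
    "guidance_scale": 6.0,  # the card recommends a CFG scale above 6
}

def speedup_vs_baseline(baseline_steps: int = 50) -> float:
    """Rough step-count speedup over the 50-step HunyuanVideo baseline."""
    return baseline_steps / FASTHUNYUAN_SETTINGS["num_inference_steps"]
```

With 6 steps against the 50-step baseline, the step-count speedup is about 8.3X, consistent with the "around 8X" figure above (the exact wall-clock gain depends on hardware and the rest of the pipeline).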
|
|
| ## Training details |
|
|
FastHunyuan is consistency-distilled on the [MixKit](https://huggingface.co/datasets/LanguageBind/Open-Sora-Plan-v1.1.0/tree/main) dataset with the following hyperparameters:
| - Batch size: 16 |
- Resolution: 720x1280
| - Num of frames: 125 |
| - Train steps: 320 |
| - GPUs: 32 |
| - LR: 1e-6 |
| - Loss: huber |
|
|
| ## Evaluation |
We provide a qualitative comparison between FastHunyuan with 6-step inference and the original HunyuanVideo with 6-step inference:
|
|
| | FastHunyuan 6 step | Hunyuan 6 step | |
| | --- | --- | |
| |  |  | |
| |  |  | |
| |  |  | |
| |  |  | |
|
|
| ## Memory requirements |
|
|
Please check our [GitHub repository](https://github.com/hao-ai-lab/FastVideo) for details.
|
|
FastHunyuan supports NF4 and LLM-INT8 quantized inference via BitsAndBytes. With NF4 quantization, inference can be performed on a single RTX 4090 GPU, requiring just 20 GB of VRAM.
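The two quantized-inference modes mentioned above can be summarized as BitsAndBytes-style option sets. This is only a sketch: the option names follow the common BitsAndBytes convention, and how FastVideo wires them into the FastHunyuan loader is an assumption — the repository's README documents the supported quantized-inference commands.

```python
# Sketch of the two quantized-inference modes named in this card, expressed
# as BitsAndBytes-style option dicts. The exact wiring into the FastHunyuan
# loader is an assumption — see the FastVideo repo for supported commands.
QUANT_MODES = {
    # NF4 4-bit quantization: ~20 GB VRAM, fits a single RTX 4090 (24 GB).
    "nf4": {"load_in_4bit": True, "bnb_4bit_quant_type": "nf4"},
    # LLM-INT8 8-bit quantization (VRAM figure not given in this card).
    "llm_int8": {"load_in_8bit": True},
}
```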
|
|
For LoRA finetuning, the minimum hardware requirements are:
- 40 GB GPU memory each for 2 GPUs with LoRA
- 30 GB GPU memory each for 2 GPUs with CPU offload and LoRA
|
|