Prompt sensitivity problems

#44
by TheBigBlockPC - opened

I use diffusers for my inference and quantize the model to NF4 using bitsandbytes, but the model seems to be extremely sensitive to prompt format: short prompts usually make it break down.
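A minimal sketch of that loading setup, assuming diffusers' bitsandbytes integration; the repo id is a placeholder, and `AutoModel` assumes a recent diffusers release (otherwise import this model's specific transformer class instead):

```python
import torch
from diffusers import AutoModel, BitsAndBytesConfig, DiffusionPipeline

# Placeholder -- substitute the actual repo id of this model.
MODEL_ID = "<repo-id-of-this-model>"

# NF4 quantization via the bitsandbytes backend, computing in bf16.
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Quantize only the transformer (the largest component); text encoder and VAE stay in bf16.
transformer = AutoModel.from_pretrained(
    MODEL_ID,
    subfolder="transformer",
    quantization_config=nf4_config,
    torch_dtype=torch.bfloat16,
)

pipe = DiffusionPipeline.from_pretrained(
    MODEL_ID,
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # helps fit in 32 GB of VRAM with 64 GB of system RAM
```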
My current approach is to use an LLM to enhance the prompts.
Is that sensitivity due to the quantization, or is it present in the fp16 version too?
I can't really test the fp16 version myself because I don't have enough RAM and VRAM; I have a 5090 and 64 GB of RAM.
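For reference, the prompt-enhancement step mentioned above is roughly the sketch below; the rewriter model and the instruction text are just example choices, not part of this model:

```python
from transformers import pipeline

# Example rewriter -- any instruction-tuned LLM that fits alongside the video model works.
rewriter = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-7B-Instruct",
    torch_dtype="auto",
    device_map="auto",
)

SYSTEM_PROMPT = (
    "Rewrite the user's short video prompt into one detailed paragraph. "
    "Spell out the subject, appearance, motion, camera, lighting, and setting explicitly."
)

def enhance(prompt: str) -> str:
    """Expand a terse prompt into the long, explicit form the video model expects."""
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": prompt},
    ]
    result = rewriter(messages, max_new_tokens=256, do_sample=False)
    # The chat-style pipeline returns the whole conversation; take the assistant reply.
    return result[0]["generated_text"][-1]["content"]

print(enhance("a man waves at the camera"))
```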

That is a problem with the model itself. Unless you have a very simple video (a single person, simple or no movement, saying something), the model freaks out all the time unless you micro-prompt it. It has basically no world understanding of people, emotions, gestures, interactions, or physics in general; you have to micro-direct it.

In other words, it is fast at high resolution, it looks good purely on a visual level, and it can make longer videos out of the box (although the issues above get much worse there). Sadly it is absolute crap at everything else without extreme prompt babysitting.
