- Performance report with 72GB VRAM: 32 t/s (#14, opened about 18 hours ago by SlavikF)
- Better Perplexity Alternative GGUFs (#13, opened about 22 hours ago by ubergarm)
- chat template is broken (#12, opened 1 day ago by grapevine-AI)
- Is it possible to release a version with low bit quantization? (#11, opened 3 days ago by lan0004)
- How do I run it using Oobabooga? I'm getting the following error (#10, opened 3 days ago by TeaDiffusion)
- What are the benchmarks of the 4 bit model vs the FP8 model? (#9, opened 4 days ago by Grossor)
- Make this model more visible on the hub (#8, opened 5 days ago by victor)
- INT8 quantization for KVCache on DGX Spark/GB10 (#6, opened 5 days ago by JDWarner)
- config.json file needed at root? (#4, opened 6 days ago by pathosethoslogos)
- cool model!! (#3, opened 6 days ago by gopi87)
- great job! thanks! (#1, opened 6 days ago by semon017)