Xenova/tiny_starcoder_py · Add/update the quantized ONNX model files and README.md for Transformers.js v3

Jul 23, 2025

Applied Quantizations

✅ Based on `decoder_model.onnx` with slimming

↳ ✅ int8: decoder_model_int8.onnx (added)
↳ ✅ uint8: decoder_model_uint8.onnx (added)
↳ ✅ q4: decoder_model_q4.onnx (added)
↳ ✅ q4f16: decoder_model_q4f16.onnx (added)
↳ ✅ bnb4: decoder_model_bnb4.onnx (added)

✅ Based on `decoder_with_past_model.onnx` with slimming

↳ ✅ fp16: decoder_with_past_model_fp16.onnx (added)
↳ ✅ int8: decoder_with_past_model_int8.onnx (added)
↳ ✅ uint8: decoder_with_past_model_uint8.onnx (added)
↳ ✅ q4: decoder_with_past_model_q4.onnx (added)
↳ ✅ q4f16: decoder_with_past_model_q4f16.onnx (added)
↳ ✅ bnb4: decoder_with_past_model_bnb4.onnx (added)

✅ Based on `decoder_model_merged.onnx` with slimming

↳ ✅ fp16: decoder_model_merged_fp16.onnx (added)
↳ ✅ int8: decoder_model_merged_int8.onnx (added)
↳ ✅ uint8: decoder_model_merged_uint8.onnx (added)
↳ ✅ q4: decoder_model_merged_q4.onnx (added)
↳ ✅ q4f16: decoder_model_merged_q4f16.onnx (added)
↳ ✅ bnb4: decoder_model_merged_bnb4.onnx (added)
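
Every file added above appears to follow the pattern `<base>_<tag>.onnx`. A minimal sketch of that naming rule, assuming the pattern holds for all variants; `quantizedFileName` and the `Quantization` type are hypothetical helpers written for illustration and are not part of Transformers.js:

```typescript
// Quantization tags exactly as they appear in the list above.
type Quantization = "fp16" | "int8" | "uint8" | "q4" | "q4f16" | "bnb4";

// Hypothetical helper: builds a variant file name following the
// `<base>_<tag>.onnx` pattern seen in this pull request.
function quantizedFileName(baseFile: string, tag: Quantization): string {
  if (!baseFile.endsWith(".onnx")) {
    throw new Error(`expected an .onnx file name, got: ${baseFile}`);
  }
  // Strip the extension, append the tag, then restore the extension.
  const stem = baseFile.slice(0, -".onnx".length);
  return `${stem}_${tag}.onnx`;
}

// Enumerate the variants listed for the merged decoder.
const tags: Quantization[] = ["fp16", "int8", "uint8", "q4", "q4f16", "bnb4"];
const mergedVariants = tags.map((t) =>
  quantizedFileName("decoder_model_merged.onnx", t),
);
console.log(mergedVariants.join("\n"));
```

The helper only formats names; it does not check that a file exists in the repository. For example, no fp16 variant is listed for `decoder_model.onnx` in this pull request.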

Add/update the quantized ONNX model files and README.md for Transformers.js v3 (commit bed8038f)

Xenova changed pull request status to merged Jul 26, 2025
