Instructions for using FacebookAI/roberta-base with libraries, inference providers, notebooks, and local apps.
How to use FacebookAI/roberta-base with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="FacebookAI/roberta-base")

# Or load the model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-base")
model = AutoModelForMaskedLM.from_pretrained("FacebookAI/roberta-base")
```
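A small usage sketch for the fill-mask pipeline (the sample sentence is my own): note that RoBERTa's mask token is `<mask>`, not BERT's `[MASK]`.

```python
# RoBERTa's mask token is "<mask>"; a BERT-style "[MASK]" will not be recognized.
mask = "<mask>"
prompt = f"The capital of France is {mask}."

# With the pipeline loaded as above, pipe(prompt) returns a list of
# candidate fills, each a dict with "score", "token", "token_str",
# and "sequence" keys, sorted by descending score.
print(prompt)
```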
Replicating RoBERTa-base GLUE results
Original issue here: https://github.com/huggingface/transformers/issues/17885
Hello! I had originally posted this on the forums but it seems like there's not much foot traffic there, so hoping to get more visibility here.
I'm trying to replicate RoBERTa-base GLUE results as reported in the model card. The numbers in the model card look like they were copied from the paper. Has anyone made an attempt to actually match these numbers with run_glue.py? If so, what configuration was used for the trainer?
If I follow the original configs from fairseq, I am unable to match the reported numbers for RTE, CoLA, STS-B, and MRPC.
Any pointers would be much appreciated, thanks!
single card:

```
CUDA_VISIBLE_DEVICES=0
```

hyperparameters:

```
--max_seq_length 128 \
--per_device_train_batch_size 64 \
--learning_rate 1e-4 \
--use_lora True \
--r 8 \
--num_train_epochs 20
```
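For comparison with the LoRA settings above, here is a sketch of a paper-style full-finetuning command. The hyperparameter ranges come from the RoBERTa paper's GLUE setup (learning rate in {1e-5, 2e-5, 3e-5}, batch size in {16, 32}, 10 epochs, 6% linear warmup, weight decay 0.1); the flags are assumed from the `run_glue.py` script in the transformers text-classification examples, and the concrete values shown are one point in that grid, not a verified-to-reproduce configuration. Also note the paper reports finetuning RTE, STS-B, and MRPC starting from an MNLI-finetuned checkpoint rather than from the pretrained model, which `run_glue.py` does not do by default.

```shell
# One point in the RoBERTa paper's GLUE hyperparameter grid, for the RTE task.
python run_glue.py \
  --model_name_or_path FacebookAI/roberta-base \
  --task_name rte \
  --do_train --do_eval \
  --max_seq_length 512 \
  --per_device_train_batch_size 16 \
  --learning_rate 2e-5 \
  --num_train_epochs 10 \
  --warmup_ratio 0.06 \
  --weight_decay 0.1 \
  --seed 42 \
  --output_dir ./rte-roberta-base
```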