Instructions for using 2nji/makebelieve with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use 2nji/makebelieve with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="2nji/makebelieve")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("2nji/makebelieve")
model = AutoModelForCausalLM.from_pretrained("2nji/makebelieve")
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use 2nji/makebelieve with vLLM:
Install from pip and serve the model:
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "2nji/makebelieve"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "2nji/makebelieve",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker
```shell
docker model run hf.co/2nji/makebelieve
```
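Since the vLLM server above exposes an OpenAI-compatible API, it can also be called from Python instead of curl. A minimal stdlib-only sketch that mirrors the curl request above; it assumes the server from the previous step is running on localhost:8000:

```python
import json
import urllib.request

VLLM_URL = "http://localhost:8000/v1/completions"

def build_request(prompt: str, max_tokens: int = 512,
                  temperature: float = 0.5) -> urllib.request.Request:
    """Build the same POST request as the curl example above."""
    payload = {
        "model": "2nji/makebelieve",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        VLLM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def complete(prompt: str) -> str:
    """Send the request and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

The same pattern works against the SGLang server below; only the port changes.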
- SGLang
How to use 2nji/makebelieve with SGLang:
Install from pip and serve the model:
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "2nji/makebelieve" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "2nji/makebelieve",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "2nji/makebelieve" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "2nji/makebelieve",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

- Docker Model Runner
How to use 2nji/makebelieve with Docker Model Runner:
```shell
docker model run hf.co/2nji/makebelieve
```
Fine-tuning Llama 2 on our celebrity news dataset, located here
Disclaimer: this is still a work in progress, as we need to preprocess our celebrity news dataset to match Llama 2's prompt format as described here
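Llama 2's chat models expect each training example wrapped in their instruction template (`[INST]`, `<<SYS>>` tags). A minimal preprocessing sketch of that format; the field values below are hypothetical placeholders, not actual rows from our dataset:

```python
def to_llama2_prompt(system: str, user: str, answer: str) -> str:
    """Wrap one example in Llama 2's chat template:
    <s>[INST] <<SYS>> system <</SYS>> user [/INST] answer </s>
    """
    return (
        f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
        f"{user} [/INST] {answer} </s>"
    )

# Hypothetical example row from the celebrity news dataset:
example = to_llama2_prompt(
    system="You are a celebrity news assistant.",
    user="What happened at the award show?",
    answer="The ceremony featured several surprise wins.",
)
```

Mapping this function over the dataset would produce one formatted string per example, ready for causal-LM fine-tuning.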
Reserve a GPU on Grid5000 (g5k)
Log into your Grid5000 account using ssh and run the following command in the terminal:

```shell
oarsub -p "cluster='graffiti'" -l gpu=1 -I -q production
```

Wait until a GPU is available and assigned to you. If you need more information about g5k, you can refer to the documentation here.
Create a virtual environment
- Installing virtualenv

```shell
pip install virtualenv
```

- Creating the environment

```shell
virtualenv venv
```

- Activating the environment

```shell
source venv/bin/activate
```

Install the requirements file

```shell
pip install -r requirements.txt
```
Run the script to fine-tune Llama-2-7b-chat-hf and push it to the Hugging Face model repository:

```shell
python makebelieve.py
```