Instructions to use chincyk/PyCodeGen with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use chincyk/PyCodeGen with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="chincyk/PyCodeGen")# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("chincyk/PyCodeGen") model = AutoModelForCausalLM.from_pretrained("chincyk/PyCodeGen") - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use chincyk/PyCodeGen with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "chincyk/PyCodeGen" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "chincyk/PyCodeGen", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/chincyk/PyCodeGen
- SGLang
How to use chincyk/PyCodeGen with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "chincyk/PyCodeGen" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "chincyk/PyCodeGen", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "chincyk/PyCodeGen" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "chincyk/PyCodeGen", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use chincyk/PyCodeGen with Docker Model Runner:
docker model run hf.co/chincyk/PyCodeGen
| library_name: transformers | |
| tags: | |
| - code | |
| license: mit | |
| datasets: | |
| - iamtarun/python_code_instructions_18k_alpaca | |
| pipeline_tag: text-generation | |
| language: | |
| - en | |
| # PyCodeGen 350M | |
| <!-- Provide a quick summary of what the model is/does. --> | |
| This model is finetuned version of [codegen-350M-mono](https://huggingface.co/Salesforce/codegen-350M-mono) by Salesforce trained on python code [dataset](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca) using QLORA method. | |
| ## Pretrained model description | |
| [codegen-350M-mono](https://huggingface.co/Salesforce/codegen-350M-mono) | |
| Codegen-350M-mono comes from the family of autoregressive models for program synthesis developed by Salesforce. | |
| This model was first trained on ThePile dataset which is 825.18 GiB English text corpus. | |
| It was then adapted to generate code by training on a set of GitQuery with source codes. | |
| Finally model has been adapted to the Python language by training on the BigPython dataset. | |
| ## Training Data | |
| [python_code_instructions_18k_alpaca](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca) | |
| The dataset contains problem descriptions and code in python language. | |
| This dataset is taken from sahil2801/code_instructions_120k, which adds a prompt column in alpaca style. | |
| ## Intended uses | |
| The model can be used to generate python code that solves task with optionally given input data. | |
| ## Example of usage | |
| ```py | |
| from transformers import AutoModelForCausalLM, AutoTokenizer | |
| model = AutoModelForCausalLM.from_pretrained('chincyk/PyCodeGen') | |
| tokenizer = AutoTokenizer.from_pretrained('chincyk/PyCodeGen') | |
| instruction = "Write a python class that represents a calculator, then use it to add two numbers." | |
| input = "a = 5, b = 2" | |
| prompt = f""" | |
| ### Instruction: | |
| Use the Task below and the Input given to write the Response, which is a programming code that can solve the Task. | |
| ### Task: | |
| {instruction} | |
| ### Input: | |
| {input} | |
| ### Response: | |
| """ | |
| input_ids = tokenizer(prompt, truncation=True, return_tensors="pt")['input_ids'] | |
| output = model.generate(input_ids=input_ids, max_length=200) | |
| print(tokenizer.decode(output[0], skip_special_tokens=True)) | |
| ``` | |
| ## Training parameters | |
| BitsAndBytes: | |
| - load_in_4bit: True, | |
| - bnb_4bit_quant_type: nf4, | |
| - bnb_4bit_use_double_quant: True, | |
| - bnb_4bit_compute_dtype: torch.bfloat16 | |
| LoraConfig: | |
| - r: 32, | |
| - lora_alpha: 16, | |
| - target_modules: all-linear, | |
| - lora_dropout: 0.1, | |
| - bias: none, | |
| - task_type: CASUAL_LM | |
| Finetuning: | |
| - num_epochs: 15 | |
| - train_batch_size: 4 | |
| - eval_batch_size: 8 | |
| - gradient_accumulation_steps: 8 | |
| - learning_rate: 3e-4 | |
| - weight_decay: 0.01 | |
| - lr_scheduler_name: cosine | |
| - num_warmup_steps: 190 |