- km
- ta
---

# SEA-LION-v1-7B-IT-GPTQ

SEA-LION is a collection of Large Language Models (LLMs) which have been pretrained and instruct-tuned for the Southeast Asia (SEA) region.
The sizes of the models range from 3 billion to 7 billion parameters.

SEA-LION-v1-7B-IT is a multilingual model which has been fine-tuned with **thousands of English and Indonesian instruction-completion pairs** alongside a smaller pool of instruction-completion pairs from other ASEAN languages.
These instructions have been carefully curated and rewritten to ensure the model was trained on truly open, commercially permissive and high-quality datasets.

SEA-LION-v1-7B-IT-GPTQ is the quantized version of SEA-LION-v1-7B-IT, produced with a [modified version](https://github.com/caviato/AutoGPTQ) of the [AutoGPTQ](https://github.com/AutoGPTQ/AutoGPTQ) library using Wikipedia texts as the calibration data.

SEA-LION stands for _Southeast Asian Languages In One Network_.
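To put the quantization in context, here is a back-of-envelope estimate of the weight-storage savings for a 7-billion-parameter model (illustrative only; it ignores the per-group scale and zero-point overhead of the GPTQ format, as well as activation memory):

```python
params = 7_000_000_000  # 7B parameters

fp16_gb = params * 2 / 1e9    # 2 bytes per FP16 weight
int4_gb = params * 0.5 / 1e9  # 4 bits = half a byte per weight

print(f"FP16 weights: ~{fp16_gb:.1f} GB")   # ~14.0 GB
print(f"4-bit weights: ~{int4_gb:.1f} GB")  # ~3.5 GB
```

This is why the quantized model fits on a single consumer-class GPU while the FP16 weights generally do not.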
## Model Details

### Base model
SEA-LION-v1-7B-IT-GPTQ is quantized from [SEA-LION-v1-7B-IT](https://huggingface.co/aisingapore/SEA-LION-v1-7B-IT).

### Benchmark Performance

| Model                                          | ARC   | HellaSwag | MMLU  | TruthfulQA | Average |
|------------------------------------------------|:-----:|:---------:|:-----:|:----------:|:-------:|
| SEA-LION-v1-7B-IT (FP16)                       | 40.78 | 68.20     | 27.12 | 36.29      | 43.10   |
| SEA-LION-v1-7B-IT-GPTQ (4-bit, 128 group size) | 39.93 | 67.32     | 27.11 | 36.32      | 42.67   |
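Assuming the Average column is the unweighted mean of the four task scores (which matches the reported figures), the table can be sanity-checked directly:

```python
def avg(scores):
    # unweighted mean, rounded to two decimal places as in the table
    return round(sum(scores) / len(scores), 2)

# ARC, HellaSwag, MMLU, TruthfulQA
fp16_avg = avg([40.78, 68.20, 27.12, 36.29])
gptq_avg = avg([39.93, 67.32, 27.11, 36.32])

print(fp16_avg, gptq_avg)  # 43.1 42.67
```

The quantized model loses less than half a point on average relative to the FP16 baseline.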

### Usage
For the full installation, training and inference guide, please refer to the [GitHub repository](https://github.com/caviato/sealion-gptq).

For SEA-LION-v1-7B-IT-GPTQ to work, please install the [modified version of the AutoGPTQ library](https://github.com/caviato/AutoGPTQ). Installation instructions can be found [here](https://github.com/caviato/AutoGPTQ#install-from-source).

SEA-LION can be run using the 🤗 Transformers library:
```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
import torch

tokenizer = AutoTokenizer.from_pretrained(
    "aisingapore/SEA-LION-v1-7B-IT-GPTQ",
    trust_remote_code=True
)

quantize_config = BaseQuantizeConfig(
    bits=4,          # matches the 4-bit quantization reported above
    group_size=128,  # matches the 128 group size reported above
)

model = AutoGPTQForCausalLM.from_quantized(  # will be loaded to GPU
    "aisingapore/SEA-LION-v1-7B-IT-GPTQ",
    device="cuda:0",
    quantize_config=quantize_config,
    torch_dtype=torch.float16,
)
```
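The "4-bit, 128 group size" setting refers to group-wise weight quantization: weights are split into groups of 128, and each group is mapped to 4-bit integers with its own scale and offset. The plain-Python sketch below is illustrative only — it shows the storage scheme, not AutoGPTQ's error-minimizing rounding procedure:

```python
def quantize_groupwise(weights, bits=4, group_size=128):
    """Quantize a flat list of float weights, group by group, to unsigned
    integers, keeping one (scale, offset) pair per group."""
    qmax = (1 << bits) - 1  # 15 for 4-bit
    groups = []
    for start in range(0, len(weights), group_size):
        g = weights[start:start + group_size]
        lo, hi = min(g), max(g)
        scale = (hi - lo) / qmax if hi > lo else 1.0
        q = [round((w - lo) / scale) for w in g]  # each in 0..qmax
        groups.append((scale, lo, q))
    return groups

def dequantize_groupwise(groups):
    """Reconstruct approximate float weights from the quantized groups."""
    out = []
    for scale, lo, q in groups:
        out.extend(qi * scale + lo for qi in q)
    return out
```

Each reconstructed weight differs from the original by at most half a quantization step (scale / 2), which is why the benchmark scores above degrade only slightly.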

## Technical Specifications

### Fine-Tuning Details
SEA-LION-v1-7B-IT was fine-tuned on 8x A100-40GB GPUs using parameter-efficient fine-tuning in the form of LoRA.

## Data
SEA-LION-v1-7B-IT was trained on a wide range of instructions that were manually and stringently verified by our team. A large portion of the effort was dedicated to ensuring that each instruction-completion pair the model sees is of high quality; any errors were corrected and rewritten by native speakers, or else dropped from our mix.

In addition, special care was taken to ensure that the datasets used had commercially permissive licenses, through verification with the original data source.