Ryan CHan

	---
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	- f1
	model-index:
	- name: dit_base_binary_task
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# dit_base_binary_task

	This model is a fine-tuned version of [microsoft/dit-base](https://huggingface.co/microsoft/dit-base) on the davanstrien/leicester_loaded_annotations_binary dataset.
	At the end of training (step 250, the last row of the training results table), it achieves the following results on the evaluation set:
	- Loss: 0.0513
	- Accuracy: 0.9873
	- F1: 0.9600

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 64
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 50
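As a rough illustration of how these settings interact, the sketch below recomputes the effective batch size and the learning rate at a few optimizer steps. It assumes a single training device and the usual linear-warmup-then-linear-decay shape; the helper `linear_warmup_decay_lr` is a hypothetical name written for this example, not part of the Trainer API, and the total of 250 steps is taken from the last row of the training results table.

```python
import math


def linear_warmup_decay_lr(step, total_steps, warmup_ratio, base_lr):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero.

    Hypothetical helper for illustration only; it mirrors the common
    linear schedule shape rather than quoting any library's code.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values copied from the hyperparameter list.
train_batch_size = 16
gradient_accumulation_steps = 4
learning_rate = 2e-05
warmup_ratio = 0.1

# Assumption: one device, so the effective batch is batch size times accumulation.
effective_batch_size = train_batch_size * gradient_accumulation_steps

# Assumption: 250 optimizer steps in total, the last step logged in the results table.
total_steps = 250
warmup_steps = math.ceil(total_steps * warmup_ratio)

for step in (0, warmup_steps, 100, total_steps):
    lr = linear_warmup_decay_lr(step, total_steps, warmup_ratio, learning_rate)
    print(f"step {step:3d}: lr = {lr:.2e}")
```

Under these assumptions the effective batch size matches the reported `total_train_batch_size` of 64, and the learning rate peaks at 2e-05 after 25 warmup steps before decaying to zero at step 250.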

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
	| No log | 0.87 | 5 | 0.6816 | 0.5 | 0.2476 |
	| 0.7387 | 1.87 | 10 | 0.5142 | 0.8354 | 0.0 |
	| 0.7387 | 2.87 | 15 | 0.4690 | 0.8354 | 0.0 |
	| 0.4219 | 3.87 | 20 | 0.5460 | 0.8354 | 0.0 |
	| 0.4219 | 4.87 | 25 | 0.4703 | 0.8354 | 0.0 |
	| 0.3734 | 5.87 | 30 | 0.4371 | 0.8354 | 0.0 |
	| 0.3734 | 6.87 | 35 | 0.4147 | 0.8354 | 0.0 |
	| 0.3261 | 7.87 | 40 | 0.4272 | 0.8354 | 0.0 |
	| 0.3261 | 8.87 | 45 | 0.4038 | 0.8354 | 0.0 |
	| 0.3078 | 9.87 | 50 | 0.3418 | 0.8354 | 0.0 |
	| 0.3078 | 10.87 | 55 | 0.3042 | 0.8354 | 0.0 |
	| 0.2501 | 11.87 | 60 | 0.2799 | 0.8354 | 0.0 |
	| 0.2501 | 12.87 | 65 | 0.1419 | 0.9367 | 0.7619 |
	| 0.1987 | 13.87 | 70 | 0.1224 | 0.9494 | 0.8182 |
	| 0.1987 | 14.87 | 75 | 0.0749 | 0.9747 | 0.9167 |
	| 0.1391 | 15.87 | 80 | 0.0539 | 0.9810 | 0.9412 |
	| 0.1391 | 16.87 | 85 | 0.0830 | 0.9873 | 0.9600 |
	| 0.1085 | 17.87 | 90 | 0.0443 | 0.9873 | 0.9600 |
	| 0.1085 | 18.87 | 95 | 0.0258 | 0.9937 | 0.9804 |
	| 0.1039 | 19.87 | 100 | 0.1025 | 0.9684 | 0.8936 |
	| 0.1039 | 20.87 | 105 | 0.1597 | 0.9684 | 0.8936 |
	| 0.1217 | 21.87 | 110 | 0.0278 | 0.9937 | 0.9811 |
	| 0.1217 | 22.87 | 115 | 0.0458 | 0.9873 | 0.9600 |
	| 0.0609 | 23.87 | 120 | 0.0478 | 0.9937 | 0.9804 |
	| 0.0609 | 24.87 | 125 | 0.0671 | 0.9747 | 0.9231 |
	| 0.1031 | 25.87 | 130 | 0.0751 | 0.9873 | 0.9600 |
	| 0.1031 | 26.87 | 135 | 0.1963 | 0.9557 | 0.8444 |
	| 0.0601 | 27.87 | 140 | 0.0870 | 0.9747 | 0.9167 |
	| 0.0601 | 28.87 | 145 | 0.0890 | 0.9747 | 0.9167 |
	| 0.0799 | 29.87 | 150 | 0.1017 | 0.9747 | 0.9167 |
	| 0.0799 | 30.87 | 155 | 0.0041 | 1.0 | 1.0 |
	| 0.0441 | 31.87 | 160 | 0.0332 | 0.9873 | 0.9615 |
	| 0.0441 | 32.87 | 165 | 0.0839 | 0.9747 | 0.9167 |
	| 0.0757 | 33.87 | 170 | 0.0722 | 0.9873 | 0.9600 |
	| 0.0757 | 34.87 | 175 | 0.0168 | 0.9937 | 0.9804 |
	| 0.0555 | 35.87 | 180 | 0.0443 | 0.9937 | 0.9804 |
	| 0.0555 | 36.87 | 185 | 0.0227 | 0.9873 | 0.9615 |
	| 0.0336 | 37.87 | 190 | 0.0128 | 0.9937 | 0.9804 |
	| 0.0336 | 38.87 | 195 | 0.0169 | 0.9937 | 0.9811 |
	| 0.0405 | 39.87 | 200 | 0.0193 | 0.9937 | 0.9804 |
	| 0.0405 | 40.87 | 205 | 0.1216 | 0.9810 | 0.9388 |
	| 0.0578 | 41.87 | 210 | 0.0307 | 0.9937 | 0.9804 |
	| 0.0578 | 42.87 | 215 | 0.0539 | 0.9873 | 0.9600 |
	| 0.0338 | 43.87 | 220 | 0.0573 | 0.9937 | 0.9804 |
	| 0.0338 | 44.87 | 225 | 0.0086 | 1.0 | 1.0 |
	| 0.0417 | 45.87 | 230 | 0.0491 | 0.9873 | 0.9600 |
	| 0.0417 | 46.87 | 235 | 0.0089 | 1.0 | 1.0 |
	| 0.0538 | 47.87 | 240 | 0.0846 | 0.9810 | 0.9388 |
	| 0.0538 | 48.87 | 245 | 0.0452 | 0.9810 | 0.9388 |
	| 0.0364 | 49.87 | 250 | 0.0513 | 0.9873 | 0.9600 |
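The rows with accuracy 0.8354 and F1 0.0 can look contradictory. One plausible reading is that the model was predicting only the majority (negative) class at those steps, so it had no true positives. The sketch below shows how the two metrics diverge in that case. This card does not state the evaluation set size or class balance; the counts used here (158 examples, 26 of them positive) are hypothetical, chosen only because they reproduce the reported figures, and `accuracy_and_f1` is a helper written for this example.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Accuracy and positive-class F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    denom = 2 * tp + fp + fn
    # With no true positives, F1 is 0 (also used when the denominator is 0).
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical counts: every example predicted negative, so no true positives.
acc_all_negative, f1_all_negative = accuracy_and_f1(tp=0, fp=0, fn=26, tn=132)
print(round(acc_all_negative, 4), round(f1_all_negative, 4))

# Hypothetical counts: two positives missed, everything else correct.
acc_two_missed, f1_two_missed = accuracy_and_f1(tp=24, fp=0, fn=2, tn=132)
print(round(acc_two_missed, 4), round(f1_two_missed, 4))
```

With these assumed counts, the first case gives accuracy 0.8354 with F1 0.0, and the second gives accuracy 0.9873 with F1 0.96, in line with the final row of the table. On an imbalanced set like this, F1 is the more informative of the two metrics.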


	### Framework versions

	- Transformers 4.25.1
	- PyTorch 1.12.1
	- Datasets 2.7.1
	- Tokenizers 0.13.1
