Paper: Minimizing FLOPs to Learn Efficient Sparse Representations (arXiv:2004.05665)
This is a Sparse Encoder model trained on the json dataset using the sentence-transformers library. It maps sentences & paragraphs to a 50368-dimensional sparse vector space and can be used for semantic search and sparse retrieval.
SparseEncoder(
  (0): MLMTransformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'ModernBertForEmbeddingsFusedMeanpool'})
  (1): SpladePooling({'pooling_strategy': 'mean', 'activation_function': 'log1p_relu', 'word_embedding_dimension': 50368})
)
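To make the pooling settings above more concrete, here is a minimal NumPy sketch of a `log1p_relu` activation combined with `mean` pooling. It is an illustration, not this model's implementation: the function name `splade_pool_mean` and the toy inputs are hypothetical, and the order of operations (activation per token, then a masked mean) is an assumption, since the card does not say how the fused mean-pool architecture arranges these steps.

```python
import numpy as np

def splade_pool_mean(logits: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Hypothetical helper: log1p(relu(x)) per token, then a masked mean over tokens.

    logits: (seq_len, vocab_size) masked-language-model logits for one text.
    attention_mask: (seq_len,) with 1 for real tokens and 0 for padding.
    """
    # log1p_relu: negative logits become 0, positive ones are log-dampened.
    activated = np.log1p(np.maximum(logits, 0.0))
    mask = attention_mask[:, None].astype(activated.dtype)
    # Average only over real tokens; guard against an all-padding input.
    return (activated * mask).sum(axis=0) / max(mask.sum(), 1.0)

# Toy example with a 3-term vocabulary; the last row is padding and is ignored.
toy_logits = np.array([[2.0, -1.0, 0.0],
                       [0.0, -3.0, 0.0],
                       [5.0, 5.0, 5.0]])
toy_mask = np.array([1, 1, 0])
vector = splade_pool_mean(toy_logits, toy_mask)
print(vector)
```

Only vocabulary entries with a positive logit on at least one real token end up non-zero, which is where the sparsity of the output vector comes from.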
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SparseEncoder
# Download from the 🤗 Hub
model = SparseEncoder("sparse_encoder_model_id")
# Run inference
queries = [
"This review is for the newcomer to DSLR\u0027s and not the advanced user. So we don\u0027t need to hear the experts bashing on how someone doesn\u0027t know what they are doing and shoudn\u0027t get a camera like this. I am speaking for the person who needs a camera to capture their kids in the school play, soccer game etc. and can\u0027t capture the moment with a point and shoot. You can\u0027t expect awesome pictures if you do not learn how to use all the settings, features, and purpose of an SLR camera. You might even get worse pictures than your point and shoot. Sure an SLR has an auto feature where it will figure what is the best settings to use for a basic picture, but you also end up with a basic picture. But even all said I would still recommend an SLR and a good zoom lens that you can afford. It is worth spending the money because you can never recapture that moment of your child or whatever your trying to capture. How many pictures have you taken with a point and shoot that were horrible because they were too dark or blurry? Sure they are fine if the lighting is good and you are within 10 feet or so away. Any further than that well, why bother. The larger zoom type cameras are better but are still limited when lighting is an issue. The pocket type point and shoots usually have a lag time between shots or flash charging up again. So forget about getting off several shots in succesion. Hard to capture a picture if the object is in motion also. The larger zoom types have mimimal lag time and some can even fire off bursts of 3 to 5 in a second with one push of the button and better at capturing something in motion. But they still can\u0027t beat an DSLR. I have gotten shots of my children with an DSLR with low lighting and no flash from many rows of seats away and they were great. The image quality was even good enough where I can crop it and still look good. People using their camera phones crack me up, but that\u0027s another story. 
I am not going to get into all the technicalities of this camera. You can read the other reviews for that. I just want to mention that if you are one of those people who need to step up to an SLR because the point and shoot couldn\u0027t do the job but don\u0027t want to become the serious photographer then this inforomation might be what you are looking for. I find the XSI pictures are soft and not crisp. Some photographers prefer this. The new Nikon DSLR\u0027s take crisper pictures, and they are less expensive. The main con of the XSI is the focusing points. If you use the camera in auto mode then it will focus on the nearest or most identifiable focus points it can focus on. So it might focus on an object in front of the actual thing your trying to capture that is behind it, or on a purse or print on a shirt for example instead of the persons face or what not. If your trying to get a group picture with the guest of honor in the middle of it, you might not have that person in full focus but instead the preople around because the cameras focus is focusing on the most vivid or closest points. You can set the camera in one of the manual settings and pick one focus point of your choice, like center, top, left, right etc. but you can\u0027t pick all on. In auto it will focus on more than one but like I mentioned before, not excactly on what you want. you can press the shutter button half way and it will try and refocus on different points or you can get closer to your subject and press the shutter button half way and hold it so it keeps that focus point and then you step back and snap the picture. Well you don\u0027t have the luxury of time on your hands for those kid or spontaneous shots. So this is where I recommend the newer Nikon DSLR\u0027s. Their focus points I find tend to focus on more points and usually on the center also where my subject tends to be. The XSI was here and there. 
If the area your focusing on is busy, the XSI will pick a few focus points here and there were it thinks are the best. But with the Nikon, I found it to focus on more points which is nice because the whole image will be more sharper instead of certain areas of it when cropping or zooming the image on a computer. Unless you are going for the \"bokeh\" effect in which the subject is in focus and the area around it is blurred. A camera of this size isn\u0027t always convienient to carry around to all occasions because of it\u0027s size, accessories, worries, or whatever the reason. If you already have a pocket point and shoot or zoom camera, don\u0027t get rid of them. I keep them for when I go on vacation because I don\u0027t want to carry a big DSLR around on vacation. I use the pocket one for when we go out and I don\u0027t want to carry anything bulky and for those you never know shots. I use the zoom one for the amusement parks. It\u0027s small enough where it is still manageable where I can toss in my backpack. I use that one for the amusement parks because I can get pictures of the kids on rides. The zoom and speed is adequate enough where I can get pictures of the kids in the distance or when their zooming down the roller coaster. One day I might get rid of my pocket one and get a waterproof one, that way I have all my bases covered. Ha ha. But when you learn how to use a DSLR\u0027s features coupled with a decent lens, man the pictures are like night and day. If you still find that a DSLR is just too much over kill for you then you might want to check out the Nikon S8100. It is a little higher end point and shoot that is a little larger than your average point and shoot but it takes amazing pictures at a distance and in low lighting and has HD video capture with stereo. My nephew has the older S8000 model of this and I was blown away at the pictures and videos he took at my childs play. And we were sitting way back. 
They made improvements on the S8100 where it counted so I can\u0027t imganine how good it would be now, wow. I would recommend you try out different cameras and brands in the store and decide which one suits you. Look for optical zoom, optical image stabilization, and good low light capability.",
]
documents = [
"Mail Boss makes several different models that look identical so pay close attention of the model names. I looked at the Architectural Oasis Jr., Architectural Oasis trilock and the Mail Boss Mail Manager. A person on youtube demonstrated how easily he was able to manipulate his arm into the mail chute and fish out a package and them demonstrated easily and quickly he jammed a large screwdriver into the edge of the lower door and popped it open shattering the lock. The trilock is the reinforced version of the Jr. It has an upgraded locking system where the Jr. only had a single larch whereas the trilock has a latch and a metal rod on each side of the door that creates three locking points. The Jr.'s door was a simple flat sheet of metal that didn't go all the way to the surrounding edges so it is vulnerable to prying open. The trilock's door sides go all the way to the edge and is flanged for strength and the top edge is tapered to fit under the top door to prevent a screwdriver or something to get behind it. I mention the above to give insight as to why I chose the Mail Manager. The Jr. was way too easy to pry open so it dropped out of consideration automatically. The trilock seemed just as tamper proof as the Manager as far as prying open is. The ease of fishing mail out of the Jr. and the trillock with your arm is questionable as I was not able to get my arm in like the person on youtube. The conditions have to be perfect for someone to be able to fish out the mail. The person has to have skinny arms and the mailbox has to be low enough for the person to contort their arm and then reach down. With the Manager the door is smaller thus creating a smaller opening. The Jr, and trilock's doors are larger but there is about a 2 inch plate that hangs on the inside of the box that prevents one from reaching straight into the box and you have to contort around it. The Jr. 
and trilock had rubber gaskets around the doors which is nice if you live in an area with heavy rain and wind. The Manager had some mention that their mail gets wet but that is easily addressable by getting thin adhesive gasket strips. The trilock and the Manager in my opinion were about equally secure in tamper resistance and build quality. The Manager and Jr. were about the same price and the trilock was about $20.00 more. The Manager had a little less capacity. The main thing that led me to get the Manager over the trilock was the ability to accept larger packages. It looks like the trilock should be able to accept larger packages because it has a bigger door and the Manager's door is smaller but that is not the case because of the door opening and obstruction I mentioned earlier. All three boxes were only able to accept a package that was about 8 1/2 x 4 3/4 x 3. With the Jr. and trilock the package had to be placed a certain way because of the obstruction on the inside. If you take off the mail clip on the inside of the doors then you can increase the package size to about 10x4 3/4x3. If your getting a locking mailbox then you are worried about mail theft so why would you place outgoing mail clipped to the unlocked upper door anyways. Back to the package capacity. The Jr. and trillock were maxed out in package size unless it is a flat envelope package that you can shove past the overhang with the door closed part way giving you the angle. The manager on the other hand does not have this overhang other than a slight lip so you basically have a clear pass through where you can roughly get a 10 x 15 x 1 1/4 package in. The other notable pluses for the Manager is that it comes with 4 bolts and a specialized drill bit for installation whereas the other two you had to supply your own hardware. And lastly the Jr. and trilock came with two large keys where the Manager came with three small keys. 
There are bigger ones in the Mail Boss line but they run around a $100.00 more. One negative about the Mail Manager is that it is shallower and slightly narrower than the Jr. And trillock so when then mailman delivers large junk mail like advertisements they can get hung up on the inside lower lip of the door because they are too wide for the box and the box isn't deep enough for gravity to have it fall in completely because the portion of mail on the inside is touching the bottom of the box along with the other end propped up on the door lip and the sides of the junk mail pressing on the sides. If your mailman is good, then all he has to do is give a light nudge and the mail will fall in. The Jr. and trilock offer a slightly wider opening and has a deeper collection area so gravity can do it's work.",
'This is a really great book and I highly recommend it, both as a compelling autobiography and as a way to learn about Islam and Islamism. One thing I learned from this book is the tremendous variation within Islam. The distinction that was most important to me was the difference between Religious (or Spiritual) Islam and Political Islam. The values and social ideals of Religious Islam are virtually identical to Christiaity and Buddhism (the two world religion with which I am most familiar). The emphasis on care for the poor, the needy, and less fortunate is indistinguishable among these religions. Further, Islamic mysticism describes Allah in the same sacred language used by mystics of all stripes. This blends well with our religious heritage and adds new inspiring stories and language that will benefit all with an open mind. On the other side is Political Islam, it\'s hatred of the West, it\'s lack of sound Islamic scholarship and spirituality, it\'s bigotry and shameful treatment of women, it\'s dedication to an Islamic State, and it\'s willingness to kill in order to attain it. This form of Islam, and I hesitate even calling it Islam given it\'s shallow scholarship and deviance from fundamental Islamic values and teachings, breeds extremism and terrorism. This form of Islam poses a concrete threat to America and other Western countries. Within our country, as well as countries throughout the world, this twisted form of Islam is taught, primarily, by missionaries from Saudia Arabia. They have combined a rigid, severe, and literalist version of Islam that dates to the mid 1700s with the political goal of transforming the world into a single Islamic state. They are True Believers of the worst kind and the foundation on which Islamic terrorists stand. Note: After reading The Islamist and writing the review above, I read Islam by <PERSON>, an excellent overview of Islam. 
One point that he makes is that Islam is not just a religion, but a way of life, and that Islam permeates all aspects of the Muslim,s life, including politics. Because this applies to all forms of Islam, the distinction made in The Islamist, and repeated in my review, between "Religious Islam" and "Political Islam" is not correct, or at least not precise. Other labels also have there shortcomings, so I will let the wording used by the author stand. Nonetheless, The Islamist remains and excellent, powerful, and intensely personal work.',
"I recently bought a new embroidery machine as an upgrade my old entry level machine. I thought I'd upgrade my thread as well. The reviews for this thread seemed very positive. I recently learned that embroidery thread has a shelf life and it was appealing that the thread was protected so well by the plastic dome and the plug. I purchased several colors and intended to use this thread exclusively in my new machine. Unfortunately during my first project with the new machine, this thread was breaking at a discouraging rate. I thought something was wrong with the machine. I actually brought it back to the dealer a couple of times but when they tested it, they had no thread breakage. Of course, they were not using Hemingworth thread. Nobody thought it was the thread so I would bring the machine home and experience breakage again. In desperation, I brought the machine back to the dealer and took along my thread, hoop, stabilizer, and one of the designs that had given me grief with a lot of thread breakage. I spent two hours with the dealer's technician. The technician examined the machine and could find nothing wrong with it. We then sewed my design with my Hemingworth red thread. The thread broke again and again. We ended up doing a lot of testing with various weight bobbin thread, tension settings, threads, needles, etc.. The conclusion was that it was the Hemingworth thread causing the issues. The technician who patiently sat with me thought the thread felt heavy and sticky. He said it was probably old thread from the way it behaved. What I couldn't argue with was that every time we rethreaded with another brand of thread and sewed the same design under all the same conditions, there was no upper thread breakage. I'm not happy that I purchased several Hemingworth spools that are now of no use to me but I'm thankful that I didn't purchase any of the large sets I was considering.",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 50368] [3, 50368]
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[29.5178, 19.0865, 22.2767]])
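The similarity scores can be turned into a ranking by sorting each row in descending order. The sketch below copies the printed scores into a NumPy array so that it runs without downloading the model; the variable names `ranking` and `best_doc` are illustrative.

```python
import numpy as np

# Scores copied from the printed tensor above: one query row, three documents.
similarities = np.array([[29.5178, 19.0865, 22.2767]])

# Sort document indices from highest to lowest score for each query.
ranking = np.argsort(-similarities, axis=1)
best_doc = int(ranking[0, 0])

print(ranking.tolist())
print(best_doc)
```

For these scores the first document ranks highest, followed by the third and then the second. With the tensor returned by `model.similarity`, the same ordering should be obtainable with `similarities.argsort(descending=True)`.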
Dataset: sparse-ir-eval
Evaluated with SparseInformationRetrievalEvaluator with these parameters:
{
"max_active_dims": 4096
}
| Metric | Value |
|---|---|
| dot_accuracy@1 | 0.0494 |
| dot_accuracy@8 | 0.1442 |
| dot_accuracy@50 | 0.3137 |
| dot_accuracy@100 | 0.4109 |
| dot_precision@1 | 0.0494 |
| dot_precision@8 | 0.018 |
| dot_precision@50 | 0.0063 |
| dot_precision@100 | 0.0041 |
| dot_recall@1 | 0.0494 |
| dot_recall@8 | 0.1442 |
| dot_recall@50 | 0.3137 |
| dot_recall@100 | 0.4109 |
| dot_ndcg@10 | 0.0972 |
| dot_mrr@10 | 0.0783 |
| dot_map@100 | 0.0868 |
| query_active_dims | 4096.0 |
| query_sparsity_ratio | 0.9187 |
| corpus_active_dims | 4096.0 |
| corpus_sparsity_ratio | 0.9187 |
| avg_flops | 2975.3545 |
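Several values in this table can be cross-checked with simple arithmetic. The sketch below recomputes the sparsity ratio from `max_active_dims` and the 50368-dimensional output space, and checks the precision values against the recall values. The precision check assumes exactly one relevant document per query, which the identical accuracy@k and recall@k columns suggest but the card does not state.

```python
vocab_size = 50368        # embedding dimension stated in the model description
max_active_dims = 4096    # evaluator parameter

# Fraction of dimensions that are zero when exactly max_active_dims are active.
sparsity_ratio = 1 - max_active_dims / vocab_size

# Assumption: one relevant document per query, so precision@k = recall@k / k.
recall_at_k = {1: 0.0494, 8: 0.1442, 50: 0.3137, 100: 0.4109}
precision_at_k = {k: r / k for k, r in recall_at_k.items()}

print(round(sparsity_ratio, 4))
print({k: round(p, 4) for k, p in precision_at_k.items()})
```

Both active-dimension rows equal 4096.0, so every query and document embedding was apparently truncated to the evaluator cap, and the reported sparsity reflects that cap rather than the model's untruncated output.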
Columns: query and positive
| | query | positive |
|---|---|---|
| type | string | string |
| details | | |

| query | positive |
|---|---|
| This knife, and theJ.A. Henckels Twin Four Star 3-Inch High Carbon Stainless-Steel Paring Knifeare the two best knives in the Henckels four star collection. There is something "just so" about them. They are just right, with all the various design parameters coming together to create a whole that is greater than the sum of the parts. This serrated utility knife works well in a great variety of applications. The five inch serrated blade is nicely thin (but still thick enough for good strength and rigidity) and shallow (i.e. not broad). I find it very useful for cutting pie or cake or brownies, as well as (of course) bread and tomatoes and many other vegetables. I've had this knife now for SIXTEEN YEARS, and it is still going strong, and still one of my favorites. However . . . you must SHARPEN this knife eventually. Like any other knife, it will go dull. NEVER HONE THIS KNIFE OR ANY OTHER SERRATED KNIFE! A sharpening steel is too large a diameter to be used on a serrated knife.... | When I moved into my first (and current) house from my apartment, the previous owner had a Whirlpool (Ecodyne) WHER25 reverse osmosis system installed under the kitchen sink. I liked the water the system produced, but the flow control was misfunctioning, causing an annoying dripping sound that was almost constant. The installer (previous owner, not a plumber) had NOT made the common mistake of trimming out the flow control--which was the first thing I suspected. No, the problem, rather, was deformation of the thin rubber membranes (there are two) inside the head of the unit. I flipped them over (they are reversible) and this fixed the problem for a month or so, but it returned. I priced out new membranes/gaskets and flow control insert, with shipping, and decided that I should just start fresh with a whole new unit, since it was on a special sale locally and it would come with all new filters ($80 worth). I replaced just the head and all was well for a while. Then the tank stopp... |
| When I moved into my first (and current) house from my apartment, the previous owner had a Whirlpool (Ecodyne) WHER25 reverse osmosis system installed under the kitchen sink. I liked the water the system produced, but the flow control was misfunctioning, causing an annoying dripping sound that was almost constant. The installer (previous owner, not a plumber) had NOT made the common mistake of trimming out the flow control--which was the first thing I suspected. No, the problem, rather, was deformation of the thin rubber membranes (there are two) inside the head of the unit. I flipped them over (they are reversible) and this fixed the problem for a month or so, but it returned. I priced out new membranes/gaskets and flow control insert, with shipping, and decided that I should just start fresh with a whole new unit, since it was on a special sale locally and it would come with all new filters ($80 worth). I replaced just the head and all was well for a while. Then the tank stopp... | This knife, and theJ.A. Henckels Twin Four Star 3-Inch High Carbon Stainless-Steel Paring Knifeare the two best knives in the Henckels four star collection. There is something "just so" about them. They are just right, with all the various design parameters coming together to create a whole that is greater than the sum of the parts. This serrated utility knife works well in a great variety of applications. The five inch serrated blade is nicely thin (but still thick enough for good strength and rigidity) and shallow (i.e. not broad). I find it very useful for cutting pie or cake or brownies, as well as (of course) bread and tomatoes and many other vegetables. I've had this knife now for SIXTEEN YEARS, and it is still going strong, and still one of my favorites. However . . . you must SHARPEN this knife eventually. Like any other knife, it will go dull. NEVER HONE THIS KNIFE OR ANY OTHER SERRATED KNIFE! A sharpening steel is too large a diameter to be used on a serrated knife.... |
| The Good: Sawstop customer service is the best I have dealt with in years. When set up correctly it cuts sheet good like a dream. Only a panel saw would seem better. The adjustable stops are stout and easy to use. Great for repeat cuts. Sliding mechanism is very smooth The Bad: No postive stops - in my experience this borders on being a huge problem for two reasons. First, it is not easy to get the fence square to the blade if you want to be very accurate. On the best of days it takes me 5 minutes to get it close enough to make a 48" cut square. Without positive stops I have to square the sliding table fence every time it is bumped or removed. And, I remove it regularly as the sliding table fence sits close enough to the blade that almost all cuts over 48" using the regular saw fence demand the sliding table fence be removed or swung out of the way (if the cut is less than 48" the sliding table can be moved back with fence in place forming a little pocket to work within). The fence th... | The Bad (yes there are a lot of bad things even with 5 stars): One of the worst written non fiction books I have ever owned. I really don't care if one of the authors clients liked a sauce so well that she would eat it over kitty litter. I don't care to read 100s of testaments to how good the recipes are (they are pretty good). I just want to get on with the book. Prove the recipes are not good. Don't spend all those pages trying to convince me. It even backfired. I was sure they were going to be terrible are reading all the testaments. Get ready to cook. A lot. Get ready to do a lot of dishes. Have to plan ahead. Have to make lunches the night before often. The flax seed breakfast takes some work and time. Can't just whip it up. If you run out without having already prepared more you will find yourself without a breakfast. Terribly organized. It is not sequential. You will read something and then find out later you were not suppose to do it when you did unless you read the entire b... |
Loss: model.SpladeMixedTopKLoss.SpladeMixedTopKLoss with these parameters:
{
"loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score', gather_across_devices=False)",
"document_regularizer_weight": 0.005,
"query_regularizer_weight": 0.005,
"document_regularizer_threshold": 256,
"query_regularizer_threshold": 256
}
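The regularizer weights above scale a sparsity penalty that is added to the ranking loss. `SpladeMixedTopKLoss` is a custom class whose internals are not shown in this card, so the sketch below only illustrates the FLOPS regularizer from the referenced paper (arXiv:2004.05665) and one plausible way of combining it with a ranking loss. The function names, the combination formula, and the omission of the two threshold parameters are all assumptions.

```python
import numpy as np

def flops_regularizer(embeddings: np.ndarray) -> float:
    """FLOPS penalty: sum over dimensions of the squared mean absolute activation.

    embeddings: (batch_size, vocab_size) sparse vectors for one side (queries or documents).
    """
    mean_abs = np.abs(embeddings).mean(axis=0)
    return float((mean_abs ** 2).sum())

def total_loss(ranking_loss: float,
               query_emb: np.ndarray,
               doc_emb: np.ndarray,
               query_weight: float = 0.005,
               document_weight: float = 0.005) -> float:
    """Hypothetical combination: ranking loss plus weighted FLOPS penalties."""
    return (ranking_loss
            + query_weight * flops_regularizer(query_emb)
            + document_weight * flops_regularizer(doc_emb))

# Toy batch of two 2-dimensional embeddings: mean absolute activations are [2, 1].
batch = np.array([[1.0, 0.0],
                  [3.0, 2.0]])
penalty = flops_regularizer(batch)
combined = total_loss(1.0, batch, batch)
print(penalty, combined)
```

Because the penalty squares the per-dimension mean, dimensions that fire across many texts in a batch are penalized more than dimensions that fire rarely, which pushes activations to spread evenly over the vocabulary.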
Non-default hyperparameters:
eval_strategy: steps
per_device_train_batch_size: 64
learning_rate: 0.0001
weight_decay: 0.0001
num_train_epochs: 1
lr_scheduler_type: cosine
warmup_ratio: 0.15
save_only_model: True
bf16: True
dataloader_num_workers: 8
gradient_checkpointing: True

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 64
per_device_eval_batch_size: 8
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 0.0001
weight_decay: 0.0001
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 1
max_steps: -1
lr_scheduler_type: cosine
lr_scheduler_kwargs: {}
warmup_ratio: 0.15
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: True
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
bf16: True
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 8
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
project: huggingface
trackio_space_id: trackio
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: True
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: no
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: True
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: proportional
router_mapping: {}
learning_rate_mapping: {}

| Epoch | Step | Training Loss | sparse-ir-eval_dot_ndcg@10 |
|---|---|---|---|
| 0.0032 | 20 | 848.0818 | - |
| 0.0063 | 40 | 536.294 | - |
| 0.0095 | 60 | 243.6122 | - |
| 0.0126 | 80 | 118.8254 | - |
| 0.0158 | 100 | 61.1316 | - |
| 0.0190 | 120 | 35.0436 | - |
| 0.0221 | 140 | 22.4751 | - |
| 0.0253 | 160 | 16.6537 | - |
| 0.0285 | 180 | 15.5919 | - |
| 0.0316 | 200 | 10.4659 | - |
| 0.0348 | 220 | 9.746 | - |
| 0.0379 | 240 | 9.4233 | - |
| 0.0411 | 260 | 7.5791 | - |
| 0.0443 | 280 | 6.8552 | - |
| 0.0474 | 300 | 5.694 | - |
| 0.0506 | 320 | 5.189 | - |
| 0.0537 | 340 | 4.726 | - |
| 0.0569 | 360 | 3.9468 | - |
| 0.0601 | 380 | 3.5309 | - |
| 0.0632 | 400 | 3.2516 | - |
| 0.0664 | 420 | 3.1281 | - |
| 0.0696 | 440 | 3.3396 | - |
| 0.0727 | 460 | 2.9721 | - |
| 0.0759 | 480 | 2.6706 | - |
| 0.0790 | 500 | 2.6023 | - |
| 0.0822 | 520 | 2.5536 | - |
| 0.0854 | 540 | 2.3589 | - |
| 0.0885 | 560 | 2.4359 | - |
| 0.0917 | 580 | 2.4409 | - |
| 0.0948 | 600 | 2.278 | - |
| 0.0980 | 620 | 3.2585 | - |
| 0.1012 | 640 | 2.217 | - |
| 0.1043 | 660 | 2.1853 | - |
| 0.1075 | 680 | 2.111 | - |
| 0.1107 | 700 | 2.1326 | - |
| 0.1138 | 720 | 2.0903 | - |
| 0.1170 | 740 | 2.066 | - |
| 0.1201 | 760 | 1.9859 | - |
| 0.1233 | 780 | 1.9894 | - |
| 0.1265 | 800 | 1.9844 | - |
| 0.1296 | 820 | 1.9759 | - |
| 0.1328 | 840 | 1.9294 | - |
| 0.1359 | 860 | 2.0085 | - |
| 0.1391 | 880 | 1.9481 | - |
| 0.1423 | 900 | 1.9639 | - |
| 0.1454 | 920 | 1.8506 | - |
| 0.1486 | 940 | 1.8199 | - |
| 0.1518 | 960 | 1.7994 | - |
| 0.1549 | 980 | 1.8468 | - |
| 0.1581 | 1000 | 1.7654 | - |
| 0.1612 | 1020 | 1.7669 | - |
| 0.1644 | 1040 | 1.7928 | - |
| 0.1676 | 1060 | 1.73 | - |
| 0.1707 | 1080 | 1.6531 | - |
| 0.1739 | 1100 | 1.622 | - |
| 0.1770 | 1120 | 1.663 | - |
| 0.1802 | 1140 | 1.655 | - |
| 0.1834 | 1160 | 1.6236 | - |
| 0.1865 | 1180 | 1.6167 | - |
| 0.1897 | 1200 | 1.595 | - |
| 0.1929 | 1220 | 1.5427 | - |
| 0.1960 | 1240 | 1.6241 | - |
| 0.1992 | 1260 | 1.4656 | - |
| 0.2023 | 1280 | 1.5652 | - |
| 0.2055 | 1300 | 1.543 | - |
| 0.2087 | 1320 | 1.5816 | - |
| 0.2118 | 1340 | 1.6763 | - |
| 0.2150 | 1360 | 1.5836 | - |
| 0.2181 | 1380 | 1.5415 | - |
| 0.2213 | 1400 | 1.5112 | - |
| 0.2245 | 1420 | 1.5745 | - |
| 0.2276 | 1440 | 1.4835 | - |
| 0.2308 | 1460 | 1.4984 | - |
| 0.2340 | 1480 | 1.4683 | - |
| 0.2371 | 1500 | 1.3864 | - |
| 0.2403 | 1520 | 1.5377 | - |
| 0.2434 | 1540 | 1.5046 | - |
| 0.2466 | 1560 | 1.4879 | - |
| 0.2498 | 1580 | 1.3949 | - |
| 0.2529 | 1600 | 1.3844 | - |
| 0.2561 | 1620 | 1.3775 | - |
| 0.2592 | 1640 | 1.4741 | - |
| 0.2624 | 1660 | 1.5249 | - |
| 0.2656 | 1680 | 1.4609 | - |
| 0.2687 | 1700 | 1.384 | - |
| 0.2719 | 1720 | 1.4199 | - |
| 0.2751 | 1740 | 1.2902 | - |
| 0.2782 | 1760 | 1.3614 | - |
| 0.2814 | 1780 | 1.3159 | - |
| 0.2845 | 1800 | 1.4043 | - |
| 0.2877 | 1820 | 1.3711 | - |
| 0.2909 | 1840 | 1.3714 | - |
| 0.2940 | 1860 | 1.4167 | - |
| 0.2972 | 1880 | 1.3242 | - |
| 0.3003 | 1900 | 1.3728 | - |
| 0.3035 | 1920 | 1.3546 | - |
| 0.3067 | 1940 | 1.3307 | - |
| 0.3098 | 1960 | 1.2928 | - |
| 0.3130 | 1980 | 1.3671 | - |
| 0.3162 | 2000 | 1.2323 | 0.0466 |
| 0.3193 | 2020 | 1.3482 | - |
| 0.3225 | 2040 | 1.3388 | - |
| 0.3256 | 2060 | 1.3731 | - |
| 0.3288 | 2080 | 1.2506 | - |
| 0.3320 | 2100 | 1.3351 | - |
| 0.3351 | 2120 | 1.3408 | - |
| 0.3383 | 2140 | 1.2679 | - |
| 0.3414 | 2160 | 1.289 | - |
| 0.3446 | 2180 | 1.1917 | - |
| 0.3478 | 2200 | 1.2471 | - |
| 0.3509 | 2220 | 1.2609 | - |
| 0.3541 | 2240 | 1.2271 | - |
| 0.3573 | 2260 | 1.2712 | - |
| 0.3604 | 2280 | 1.2016 | - |
| 0.3636 | 2300 | 1.1784 | - |
| 0.3667 | 2320 | 1.2039 | - |
| 0.3699 | 2340 | 1.2637 | - |
| 0.3731 | 2360 | 1.2108 | - |
| 0.3762 | 2380 | 1.2256 | - |
| 0.3794 | 2400 | 1.228 | - |
| 0.3825 | 2420 | 1.2128 | - |
| 0.3857 | 2440 | 1.2234 | - |
| 0.3889 | 2460 | 1.2604 | - |
| 0.3920 | 2480 | 1.2839 | - |
| 0.3952 | 2500 | 1.247 | - |
| 0.3984 | 2520 | 1.1972 | - |
| 0.4015 | 2540 | 1.1735 | - |
| 0.4047 | 2560 | 1.1969 | - |
| 0.4078 | 2580 | 1.1012 | - |
| 0.4110 | 2600 | 1.186 | - |
| 0.4142 | 2620 | 1.2217 | - |
| 0.4173 | 2640 | 1.1483 | - |
| 0.4205 | 2660 | 1.1347 | - |
| 0.4236 | 2680 | 1.1717 | - |
| 0.4268 | 2700 | 1.1639 | - |
| 0.4300 | 2720 | 1.1368 | - |
| 0.4331 | 2740 | 1.0291 | - |
| 0.4363 | 2760 | 1.1819 | - |
| 0.4395 | 2780 | 1.1746 | - |
| 0.4426 | 2800 | 1.0858 | - |
| 0.4458 | 2820 | 1.1154 | - |
| 0.4489 | 2840 | 1.1061 | - |
| 0.4521 | 2860 | 1.1019 | - |
| 0.4553 | 2880 | 1.1331 | - |
| 0.4584 | 2900 | 1.1157 | - |
| 0.4616 | 2920 | 0.998 | - |
| 0.4647 | 2940 | 1.1248 | - |
| 0.4679 | 2960 | 1.1063 | - |
| 0.4711 | 2980 | 1.1335 | - |
| 0.4742 | 3000 | 1.0592 | - |
| 0.4774 | 3020 | 1.0198 | - |
| 0.4806 | 3040 | 1.0803 | - |
| 0.4837 | 3060 | 1.0682 | - |
| 0.4869 | 3080 | 1.0574 | - |
| 0.4900 | 3100 | 1.0531 | - |
| 0.4932 | 3120 | 1.1137 | - |
| 0.4964 | 3140 | 1.082 | - |
| 0.4995 | 3160 | 1.0133 | - |
| 0.5027 | 3180 | 1.1048 | - |
| 0.5058 | 3200 | 1.1139 | - |
| 0.5090 | 3220 | 1.0924 | - |
| 0.5122 | 3240 | 1.0344 | - |
| 0.5153 | 3260 | 1.0427 | - |
| 0.5185 | 3280 | 1.0867 | - |
| 0.5217 | 3300 | 1.0218 | - |
| 0.5248 | 3320 | 1.084 | - |
| 0.5280 | 3340 | 1.0198 | - |
| 0.5311 | 3360 | 1.0805 | - |
| 0.5343 | 3380 | 0.959 | - |
| 0.5375 | 3400 | 1.0043 | - |
| 0.5406 | 3420 | 1.0425 | - |
| 0.5438 | 3440 | 1.0322 | - |
| 0.5469 | 3460 | 1.0394 | - |
| 0.5501 | 3480 | 1.0332 | - |
| 0.5533 | 3500 | 1.0251 | - |
| 0.5564 | 3520 | 1.0058 | - |
| 0.5596 | 3540 | 1.0643 | - |
| 0.5628 | 3560 | 1.0501 | - |
| 0.5659 | 3580 | 0.9583 | - |
| 0.5691 | 3600 | 1.0118 | - |
| 0.5722 | 3620 | 0.9915 | - |
| 0.5754 | 3640 | 0.9232 | - |
| 0.5786 | 3660 | 0.9832 | - |
| 0.5817 | 3680 | 0.9539 | - |
| 0.5849 | 3700 | 0.9561 | - |
| 0.5880 | 3720 | 0.9345 | - |
| 0.5912 | 3740 | 0.9471 | - |
| 0.5944 | 3760 | 0.9831 | - |
| 0.5975 | 3780 | 1.0021 | - |
| 0.6007 | 3800 | 1.0178 | - |
| 0.6039 | 3820 | 0.9706 | - |
| 0.6070 | 3840 | 0.9825 | - |
| 0.6102 | 3860 | 1.016 | - |
| 0.6133 | 3880 | 0.8974 | - |
| 0.6165 | 3900 | 0.8818 | - |
| 0.6197 | 3920 | 0.913 | - |
| 0.6228 | 3940 | 0.9275 | - |
| 0.6260 | 3960 | 0.9632 | - |
| 0.6291 | 3980 | 0.9866 | - |
| 0.6323 | 4000 | 0.9693 | 0.0940 |
| 0.6355 | 4020 | 0.9294 | - |
| 0.6386 | 4040 | 0.9444 | - |
| 0.6418 | 4060 | 0.9298 | - |
| 0.6450 | 4080 | 0.8788 | - |
| 0.6481 | 4100 | 0.9033 | - |
| 0.6513 | 4120 | 0.9328 | - |
| 0.6544 | 4140 | 0.9284 | - |
| 0.6576 | 4160 | 0.9507 | - |
| 0.6608 | 4180 | 0.9688 | - |
| 0.6639 | 4200 | 0.9124 | - |
| 0.6671 | 4220 | 0.8412 | - |
| 0.6702 | 4240 | 0.9065 | - |
| 0.6734 | 4260 | 0.8408 | - |
| 0.6766 | 4280 | 0.9456 | - |
| 0.6797 | 4300 | 0.8641 | - |
| 0.6829 | 4320 | 0.894 | - |
| 0.6861 | 4340 | 0.8876 | - |
| 0.6892 | 4360 | 0.935 | - |
| 0.6924 | 4380 | 0.8805 | - |
| 0.6955 | 4400 | 0.8278 | - |
| 0.6987 | 4420 | 0.8243 | - |
| 0.7019 | 4440 | 0.8886 | - |
| 0.7050 | 4460 | 0.8486 | - |
| 0.7082 | 4480 | 0.8496 | - |
| 0.7113 | 4500 | 0.8937 | - |
| 0.7145 | 4520 | 0.931 | - |
| 0.7177 | 4540 | 0.8276 | - |
| 0.7208 | 4560 | 0.8825 | - |
| 0.7240 | 4580 | 0.8827 | - |
| 0.7272 | 4600 | 0.8466 | - |
| 0.7303 | 4620 | 0.9349 | - |
| 0.7335 | 4640 | 0.8376 | - |
| 0.7366 | 4660 | 0.8292 | - |
| 0.7398 | 4680 | 0.8591 | - |
| 0.7430 | 4700 | 0.7949 | - |
| 0.7461 | 4720 | 0.9626 | - |
| 0.7493 | 4740 | 0.8082 | - |
| 0.7525 | 4760 | 0.7986 | - |
| 0.7556 | 4780 | 0.8056 | - |
| 0.7588 | 4800 | 0.857 | - |
| 0.7619 | 4820 | 0.8355 | - |
| 0.7651 | 4840 | 0.8903 | - |
| 0.7683 | 4860 | 0.8429 | - |
| 0.7714 | 4880 | 0.796 | - |
| 0.7746 | 4900 | 0.7887 | - |
| 0.7777 | 4920 | 0.887 | - |
| 0.7809 | 4940 | 0.8574 | - |
| 0.7841 | 4960 | 0.879 | - |
| 0.7872 | 4980 | 0.8083 | - |
| 0.7904 | 5000 | 0.797 | - |
| 0.7936 | 5020 | 0.7833 | - |
| 0.7967 | 5040 | 0.8312 | - |
| 0.7999 | 5060 | 0.8098 | - |
| 0.8030 | 5080 | 0.7976 | - |
| 0.8062 | 5100 | 0.7444 | - |
| 0.8094 | 5120 | 0.8059 | - |
| 0.8125 | 5140 | 0.8265 | - |
| 0.8157 | 5160 | 0.8062 | - |
| 0.8188 | 5180 | 0.7823 | - |
| 0.8220 | 5200 | 0.8586 | - |
| 0.8252 | 5220 | 0.8439 | - |
| 0.8283 | 5240 | 0.9102 | - |
| 0.8315 | 5260 | 0.8299 | - |
| 0.8347 | 5280 | 0.848 | - |
| 0.8378 | 5300 | 0.7742 | - |
| 0.8410 | 5320 | 0.8484 | - |
| 0.8441 | 5340 | 0.769 | - |
| 0.8473 | 5360 | 0.8132 | - |
| 0.8505 | 5380 | 0.821 | - |
| 0.8536 | 5400 | 0.8055 | - |
| 0.8568 | 5420 | 0.8342 | - |
| 0.8599 | 5440 | 0.8397 | - |
| 0.8631 | 5460 | 0.8156 | - |
| 0.8663 | 5480 | 0.729 | - |
| 0.8694 | 5500 | 0.8196 | - |
| 0.8726 | 5520 | 0.828 | - |
| 0.8758 | 5540 | 0.7892 | - |
| 0.8789 | 5560 | 0.7577 | - |
| 0.8821 | 5580 | 0.8399 | - |
| 0.8852 | 5600 | 0.7002 | - |
| 0.8884 | 5620 | 0.8063 | - |
| 0.8916 | 5640 | 0.8414 | - |
| 0.8947 | 5660 | 0.812 | - |
| 0.8979 | 5680 | 0.8235 | - |
| 0.9010 | 5700 | 0.7945 | - |
| 0.9042 | 5720 | 0.8551 | - |
| 0.9074 | 5740 | 0.861 | - |
| 0.9105 | 5760 | 0.7446 | - |
| 0.9137 | 5780 | 0.8001 | - |
| 0.9169 | 5800 | 0.7581 | - |
| 0.9200 | 5820 | 0.8134 | - |
| 0.9232 | 5840 | 0.8014 | - |
| 0.9263 | 5860 | 0.8245 | - |
| 0.9295 | 5880 | 0.78 | - |
| 0.9327 | 5900 | 0.7105 | - |
| 0.9358 | 5920 | 0.8057 | - |
| 0.9390 | 5940 | 0.8372 | - |
| 0.9421 | 5960 | 0.7693 | - |
| 0.9453 | 5980 | 0.7894 | - |
| 0.9485 | 6000 | 0.847 | 0.0972 |
| 0.9516 | 6020 | 0.8242 | - |
| 0.9548 | 6040 | 0.7428 | - |
| 0.9580 | 6060 | 0.8008 | - |
| 0.9611 | 6080 | 0.7523 | - |
| 0.9643 | 6100 | 0.7182 | - |
| 0.9674 | 6120 | 0.8748 | - |
| 0.9706 | 6140 | 0.8329 | - |
| 0.9738 | 6160 | 0.8176 | - |
| 0.9769 | 6180 | 0.7237 | - |
| 0.9801 | 6200 | 0.7097 | - |
| 0.9832 | 6220 | 0.7186 | - |
| 0.9864 | 6240 | 0.8387 | - |
| 0.9896 | 6260 | 0.7929 | - |
| 0.9927 | 6280 | 0.7785 | - |
| 0.9959 | 6300 | 0.8018 | - |
| 0.9991 | 6320 | 0.7283 | - |
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@article{paria2020minimizing,
title={Minimizing {FLOPs} to learn efficient sparse representations},
author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
journal={arXiv preprint arXiv:2004.05665},
year={2020}
}