Sparse Encoder

This is a Sparse Encoder model trained on the json dataset with the sentence-transformers library. It maps sentences and paragraphs to a 50368-dimensional sparse vector space and can be used for semantic search and sparse retrieval.

Model Details

Model Description

  • Model Type: Sparse Encoder
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 50368 dimensions
  • Similarity Function: Dot Product
  • Training Dataset:
    • json

Full Model Architecture

SparseEncoder(
  (0): MLMTransformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'ModernBertForEmbeddingsFusedMeanpool'})
  (1): SpladePooling({'pooling_strategy': 'mean', 'activation_function': 'log1p_relu', 'word_embedding_dimension': 50368})
)
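
The pooling step listed above can be illustrated with a small pure-Python sketch. It assumes that the `log1p_relu` activation is applied to each token's vocabulary logits first and that the results are then averaged over non-padding tokens; the fused `ModernBertForEmbeddingsFusedMeanpool` module may order or implement these steps differently. The helper names and the toy logits are hypothetical.

```python
import math

def log1p_relu(x: float) -> float:
    # log(1 + max(x, 0)): negative logits map to exactly 0, which is what makes the output sparse
    return math.log1p(max(x, 0.0))

def splade_mean_pool(token_logits, attention_mask):
    """Average the activated vocabulary logits over real (non-padding) tokens.

    token_logits: one list of vocabulary logits per token.
    attention_mask: 1 for real tokens, 0 for padding.
    """
    vocab_size = len(token_logits[0])
    kept = [row for row, m in zip(token_logits, attention_mask) if m == 1]
    pooled = [0.0] * vocab_size
    for row in kept:
        for j, logit in enumerate(row):
            pooled[j] += log1p_relu(logit)
    return [value / len(kept) for value in pooled]

# Toy input: 3 tokens (the last one is padding) over a 4-entry vocabulary.
logits = [
    [2.0, -1.0, 0.0, 0.5],
    [0.0, -3.0, 0.0, 1.5],
    [9.0, 9.0, 9.0, 9.0],  # padding row, ignored because its mask is 0
]
mask = [1, 1, 0]
vec = splade_mean_pool(logits, mask)
print(vec)
```

Dimensions whose logits are never positive stay at zero, so only a subset of the 50368 vocabulary entries ends up active.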

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SparseEncoder

# Download from the 🤗 Hub
model = SparseEncoder("sparse_encoder_model_id")
# Run inference
queries = [
    "This review is for the newcomer to DSLR\u0027s and not the advanced user. So we don\u0027t need to hear the experts bashing on how someone doesn\u0027t know what they are doing and shoudn\u0027t get a camera like this. I am speaking for the person who needs a camera to capture their kids in the school play, soccer game etc. and can\u0027t capture the moment with a point and shoot. You can\u0027t expect awesome pictures if you do not learn how to use all the settings, features, and purpose of an SLR camera. You might even get worse pictures than your point and shoot. Sure an SLR has an auto feature where it will figure what is the best settings to use for a basic picture, but you also end up with a basic picture.  But even all said I would still recommend an SLR and a good zoom lens that you can afford. It is worth spending the money because you can never recapture that moment of your child or whatever your trying to capture.  How many pictures have you taken with a point and shoot that were horrible because they were too dark or blurry? Sure they are fine if the lighting is good and you are within 10 feet or so away. Any further than that well, why bother. The larger zoom type cameras are better but are still limited when lighting is an issue. The pocket type point and shoots usually have a lag time between shots or flash charging up again. So forget about getting off several shots in succesion. Hard to capture a picture if the object is in motion also. The larger zoom types have mimimal lag time and some can even fire off bursts of 3 to 5 in a second with one push of the button and better at capturing something in motion. But they still can\u0027t beat an DSLR. I have gotten shots of my children with an DSLR with low lighting and no flash from many rows of seats away and they were great. The image quality was even good enough where I can crop it and still look good. People using their camera phones crack me up, but that\u0027s another story.  
I am not going to get into all the technicalities of this camera. You can read the other reviews for that. I just want to mention that if you are one of those people who need to step up to an SLR because the point and shoot couldn\u0027t do the job but don\u0027t want to become the serious photographer then this inforomation might be what you are looking for.  I find the XSI pictures are soft and not crisp. Some photographers prefer this. The new Nikon DSLR\u0027s take crisper pictures, and they are less expensive.  The main con of the XSI is the focusing points. If you use the camera in auto mode then it will focus on the nearest or most identifiable focus points it can focus on. So it might focus on an object in front of the actual thing your trying to capture that is behind it, or on a purse or print on a shirt for example instead of the persons face or what not. If your trying to get a group picture with the guest of honor in the middle of it, you might not have that person in full focus but instead the preople around because the cameras focus is focusing on the most vivid or closest points. You can set the camera in one of the manual settings and pick one focus point of your choice, like center, top, left, right etc. but you can\u0027t pick all on. In auto it will focus on more than one but like I mentioned before, not excactly on what you want. you can press the shutter button half way and it will try and refocus on different points or you can get closer to your subject and press the shutter button half way and hold it so it keeps that focus point and then you step back and snap the picture. Well you don\u0027t have the luxury of time on your hands for those kid or spontaneous shots.  So this is where I recommend the newer Nikon DSLR\u0027s. Their focus points I find tend to focus on more points and usually on the center also where my subject tends to be. The XSI was here and there. 
If the area your focusing on is busy, the XSI will pick a few focus points here and there were it thinks are the best. But with the Nikon, I found it to focus on more points which is nice because the whole image will be more sharper instead of certain areas of it when cropping or zooming the image on a computer. Unless you are going for the \"bokeh\" effect in which the subject is in focus and the area around it is blurred.  A camera of this size isn\u0027t always convienient to carry around to all occasions because of it\u0027s size, accessories, worries, or whatever the reason. If you already have a pocket point and shoot or zoom camera, don\u0027t get rid of them. I keep them for when I go on vacation because I don\u0027t want to carry a big DSLR around on vacation. I use the pocket one for when we go out and I don\u0027t want to carry anything bulky and for those you never know shots. I use the zoom one for the amusement parks. It\u0027s small enough where it is still manageable where I can toss in my backpack. I use that one for the amusement parks because I can get pictures of the kids on rides. The zoom and speed is adequate enough where I can get pictures of the kids in the distance or when their zooming down the roller coaster. One day I might get rid of my pocket one and get a waterproof one, that way I have all my bases covered. Ha ha.  But when you learn how to use a DSLR\u0027s features coupled with a decent lens, man the pictures are like night and day.  If you still find that a DSLR is just too much over kill for you then you might want to check out the Nikon S8100. It is a little higher end point and shoot that is a little larger than your average point and shoot but it takes amazing pictures at a distance and in low lighting and has HD video capture with stereo. My nephew has the older S8000 model of this and I was blown away at the pictures and videos he took at my childs play. And we were sitting way back. 
They made improvements on the S8100 where it counted so I can\u0027t imganine how good it would be now, wow.  I would recommend you try out different cameras and brands in the store and decide which one suits you. Look for optical zoom, optical image stabilization, and good low light capability.",
]
documents = [
    "Mail Boss makes several different models that look identical so pay close attention of the model names. I looked at the Architectural Oasis Jr., Architectural Oasis trilock and the Mail Boss Mail Manager. A person on youtube  demonstrated how easily he was able to manipulate his arm into the mail chute and fish out a package and them demonstrated easily and quickly he jammed a large screwdriver into the edge of the lower door and popped it open shattering the lock. The trilock is the reinforced version of the Jr. It has an upgraded locking system where the Jr. only had a single larch whereas the trilock has a latch and a metal rod on each side of the door that creates three locking points. The Jr.'s door was a simple flat sheet of metal that didn't go all the way to the surrounding edges so it is vulnerable to prying open. The trilock's door sides go all the way to the edge and is flanged for strength and the top edge is tapered to fit under the top door to prevent a screwdriver or something to get behind it.  I mention the above to give insight as to why I chose the Mail Manager. The Jr. was way too easy to pry open so it dropped out of consideration automatically. The trilock seemed just as tamper proof as the Manager as far as prying open is. The ease of fishing mail out of the Jr. and the trillock with your arm is questionable as I was not able to get my arm in like the person on youtube.  The conditions have to be perfect for someone to be able to fish out the mail. The person has to have skinny arms and the mailbox has to be low enough for the person to contort their arm and then reach down. With the Manager the door is smaller thus creating a smaller opening. The Jr, and trilock's doors are larger but there is about a 2 inch plate that hangs on the inside of the box that prevents one from reaching straight into the box and you have to contort around it.  The Jr. 
and trilock had rubber gaskets around the doors which is nice if you live in an area with heavy rain and wind. The Manager had some mention that their mail gets wet but that is easily addressable by getting thin adhesive gasket strips. The trilock and the Manager in my opinion were about equally secure in tamper resistance and build quality. The Manager and Jr. were about the same price and the trilock was about $20.00 more. The Manager had a little less capacity. The main thing that led me to get the Manager over the trilock was the ability to accept larger packages. It looks like the trilock should be able to accept larger packages because it has a bigger door and the Manager's door is smaller but that is not the case because of the door opening and obstruction I mentioned earlier. All three boxes were only able to accept a package that was about 8 1/2 x 4 3/4 x 3. With the Jr. and trilock the package had to be placed a certain way because of the obstruction on the inside. If you take off the mail clip on the inside of the doors then you can increase the package size to about 10x4 3/4x3. If your getting a locking mailbox then you are worried about mail theft so why would you place outgoing mail clipped to the unlocked upper door anyways. Back to the package capacity. The Jr. and trillock were maxed out in package size unless it is a flat envelope package that you can shove past the overhang with the door closed part way giving you the angle. The manager on the other hand does not have this overhang other than a slight lip so you basically have a clear pass through where you can roughly get a 10 x 15 x 1 1/4 package in. The other notable pluses for the Manager is that it comes with 4 bolts and a specialized drill bit for installation whereas the other two you had to supply your own hardware. And lastly the Jr. and trilock came with two large keys where the Manager came with three small keys. 
There are bigger ones in the Mail Boss line but they run around a $100.00 more. One negative about the Mail Manager is that it is shallower and slightly narrower than the Jr. And trillock so when then mailman delivers large junk mail like advertisements they can get hung up on the inside lower lip of the door because they are too wide for the box and the box isn't deep enough for gravity to have it fall in completely because the portion of mail on the inside is touching the bottom of the box along with the other end propped up on the door lip and the sides of the junk mail pressing on the sides. If your mailman is good, then all he has to do is give a light nudge and the mail will fall in. The Jr. and trilock offer a slightly wider opening and has a deeper collection area so gravity can do it's work.",
    'This is a really great book and I highly recommend it, both as a compelling autobiography and as a way to learn about Islam and Islamism.  One thing I learned from this book is the tremendous variation within Islam. The distinction that was most important to me was the difference between Religious (or Spiritual) Islam and Political Islam.  The values and social ideals of Religious Islam are virtually identical to Christiaity and Buddhism (the two world religion with which I am most familiar).  The emphasis on care for the poor, the needy, and less fortunate is indistinguishable among these religions.  Further, Islamic mysticism describes Allah in the same sacred language used by mystics of all stripes.  This blends well with our religious heritage and adds new inspiring stories and language that will benefit all with an open mind.  On the other side is Political Islam, it\'s hatred of the West, it\'s lack of sound Islamic scholarship and spirituality, it\'s bigotry and shameful treatment of women, it\'s dedication to an Islamic State, and it\'s willingness to kill in order to attain it.  This form of Islam, and I hesitate even calling it Islam given it\'s shallow scholarship and deviance from fundamental Islamic values and teachings, breeds extremism and terrorism.  This form of Islam poses a concrete threat to America and other Western countries.  Within our country, as well as countries throughout the world, this twisted form of Islam is taught, primarily, by missionaries from Saudia Arabia.  They have combined a rigid, severe, and literalist version of Islam that dates to the mid 1700s with the political goal of transforming the world into a single Islamic state.  They are True Believers of the worst kind and the foundation on which Islamic terrorists stand.  Note:  After reading The Islamist and writing the review above, I read Islam by <PERSON>, an excellent overview of Islam.  
One point that he makes is that Islam is not just a religion, but a way of life, and that Islam permeates all aspects of the Muslim,s life, including politics.  Because this applies to all forms of Islam, the distinction made in The Islamist, and repeated in my review, between "Religious Islam" and "Political Islam" is not correct, or at least not precise.  Other labels also have there shortcomings, so I will let the wording used by the author stand.  Nonetheless, The Islamist remains and excellent, powerful, and intensely personal work.',
    "I recently bought a new embroidery machine as an upgrade my old entry level machine.  I thought I'd upgrade my thread as well.  The reviews for this thread seemed very positive.  I recently learned that embroidery thread has a shelf life and it was appealing that the thread was protected so well by the plastic dome and the plug.  I purchased several colors and intended to use this thread exclusively in my new machine.  Unfortunately during my first project with the new machine, this thread was breaking at a discouraging rate.  I thought something was wrong with the machine.  I actually brought it back to the dealer a couple of times but when they tested it, they had no thread breakage.  Of course, they were not using Hemingworth thread.  Nobody thought it was the thread so I would bring the machine home and experience breakage again.  In desperation, I brought the machine back to the dealer and took along my thread, hoop, stabilizer, and one of the designs that had given me grief with a lot of thread breakage.  I spent two hours with the dealer's technician.  The technician examined the machine and could find nothing wrong with it.  We then sewed my design with my Hemingworth red thread.  The thread broke again and again.  We ended up doing a lot of testing with various weight bobbin thread, tension settings, threads, needles, etc..  The conclusion was that it was the Hemingworth thread causing the issues.  The technician who patiently sat with me thought the thread felt heavy and sticky.  He said it was probably old thread from the way it behaved.  What I couldn't argue with was that every time we rethreaded with another brand of thread and sewed the same design under all the same conditions, there was no upper thread breakage.  I'm not happy that I purchased several Hemingworth spools that are now of no use to me but I'm thankful that I didn't purchase any of the large sets I was considering.",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 50368] [3, 50368]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[29.5178, 19.0865, 22.2767]])
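
Because the similarity function is a dot product over sparse vectors, only dimensions that are active in both the query and the document contribute to the score. The following is a minimal sketch of that idea using plain dictionaries; the token ids and weights are made up for illustration and `sparse_dot` is a hypothetical helper, not part of the library.

```python
def sparse_dot(a: dict[int, float], b: dict[int, float]) -> float:
    """Dot product of two sparse vectors stored as {dimension: weight}."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(weight * b[dim] for dim, weight in a.items() if dim in b)

# Hypothetical sparse embeddings (dimension index -> weight).
query = {3: 1.2, 17: 0.5, 42: 2.0}
docs = [
    {3: 0.7, 42: 1.0},
    {5: 3.0},
    {17: 2.0, 42: 0.25},
]

scores = [sparse_dot(query, d) for d in docs]
# Rank documents from most to least similar.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(scores, ranking)
```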

Evaluation

Metrics

Sparse Information Retrieval

Metric                  Value
dot_accuracy@1          0.0494
dot_accuracy@8          0.1442
dot_accuracy@50         0.3137
dot_accuracy@100        0.4109
dot_precision@1         0.0494
dot_precision@8         0.018
dot_precision@50        0.0063
dot_precision@100       0.0041
dot_recall@1            0.0494
dot_recall@8            0.1442
dot_recall@50           0.3137
dot_recall@100          0.4109
dot_ndcg@10             0.0972
dot_mrr@10              0.0783
dot_map@100             0.0868
query_active_dims       4096.0
query_sparsity_ratio    0.9187
corpus_active_dims      4096.0
corpus_sparsity_ratio   0.9187
avg_flops               2975.3545
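
Several of the reported values are related by simple arithmetic, which the sketch below reproduces. It assumes each query has exactly one relevant document; this is an assumption, though it is consistent with accuracy@k and recall@k being identical in the table. The variable names are illustrative.

```python
vocab_size = 50368    # output dimensionality of the model
active_dims = 4096.0  # reported query_active_dims / corpus_active_dims

# Sparsity ratio: the fraction of dimensions that are zero.
sparsity_ratio = 1.0 - active_dims / vocab_size

# With a single relevant document per query (assumed), precision@k = recall@k / k.
recall_at = {1: 0.0494, 8: 0.1442, 50: 0.3137, 100: 0.4109}
precision_at = {k: recall / k for k, recall in recall_at.items()}

print(round(sparsity_ratio, 4))
print({k: round(p, 4) for k, p in precision_at.items()})
```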

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 404,856 training samples
  • Columns: query and positive
  • Approximate statistics based on the first 1000 samples:
    • query: string; min 402 tokens, mean 496.56 tokens, max 512 tokens
    • positive: string; min 402 tokens, mean 496.56 tokens, max 512 tokens
  • Samples:
    query positive
    This knife, and theJ.A. Henckels Twin Four Star 3-Inch High Carbon Stainless-Steel Paring Knifeare the two best knives in the Henckels four star collection. There is something "just so" about them. They are just right, with all the various design parameters coming together to create a whole that is greater than the sum of the parts. This serrated utility knife works well in a great variety of applications. The five inch serrated blade is nicely thin (but still thick enough for good strength and rigidity) and shallow (i.e. not broad). I find it very useful for cutting pie or cake or brownies, as well as (of course) bread and tomatoes and many other vegetables. I've had this knife now for SIXTEEN YEARS, and it is still going strong, and still one of my favorites. However . . . you must SHARPEN this knife eventually. Like any other knife, it will go dull. NEVER HONE THIS KNIFE OR ANY OTHER SERRATED KNIFE! A sharpening steel is too large a diameter to be used on a serrated knife.... When I moved into my first (and current) house from my apartment, the previous owner had a Whirlpool (Ecodyne) WHER25 reverse osmosis system installed under the kitchen sink. I liked the water the system produced, but the flow control was misfunctioning, causing an annoying dripping sound that was almost constant. The installer (previous owner, not a plumber) had NOT made the common mistake of trimming out the flow control--which was the first thing I suspected. No, the problem, rather, was deformation of the thin rubber membranes (there are two) inside the head of the unit. I flipped them over (they are reversible) and this fixed the problem for a month or so, but it returned. I priced out new membranes/gaskets and flow control insert, with shipping, and decided that I should just start fresh with a whole new unit, since it was on a special sale locally and it would come with all new filters ($80 worth). I replaced just the head and all was well for a while. Then the tank stopp...
    When I moved into my first (and current) house from my apartment, the previous owner had a Whirlpool (Ecodyne) WHER25 reverse osmosis system installed under the kitchen sink. I liked the water the system produced, but the flow control was misfunctioning, causing an annoying dripping sound that was almost constant. The installer (previous owner, not a plumber) had NOT made the common mistake of trimming out the flow control--which was the first thing I suspected. No, the problem, rather, was deformation of the thin rubber membranes (there are two) inside the head of the unit. I flipped them over (they are reversible) and this fixed the problem for a month or so, but it returned. I priced out new membranes/gaskets and flow control insert, with shipping, and decided that I should just start fresh with a whole new unit, since it was on a special sale locally and it would come with all new filters ($80 worth). I replaced just the head and all was well for a while. Then the tank stopp... This knife, and theJ.A. Henckels Twin Four Star 3-Inch High Carbon Stainless-Steel Paring Knifeare the two best knives in the Henckels four star collection. There is something "just so" about them. They are just right, with all the various design parameters coming together to create a whole that is greater than the sum of the parts. This serrated utility knife works well in a great variety of applications. The five inch serrated blade is nicely thin (but still thick enough for good strength and rigidity) and shallow (i.e. not broad). I find it very useful for cutting pie or cake or brownies, as well as (of course) bread and tomatoes and many other vegetables. I've had this knife now for SIXTEEN YEARS, and it is still going strong, and still one of my favorites. However . . . you must SHARPEN this knife eventually. Like any other knife, it will go dull. NEVER HONE THIS KNIFE OR ANY OTHER SERRATED KNIFE! A sharpening steel is too large a diameter to be used on a serrated knife....
    The Good: Sawstop customer service is the best I have dealt with in years. When set up correctly it cuts sheet good like a dream. Only a panel saw would seem better. The adjustable stops are stout and easy to use. Great for repeat cuts. Sliding mechanism is very smooth The Bad: No postive stops - in my experience this borders on being a huge problem for two reasons. First, it is not easy to get the fence square to the blade if you want to be very accurate. On the best of days it takes me 5 minutes to get it close enough to make a 48" cut square. Without positive stops I have to square the sliding table fence every time it is bumped or removed. And, I remove it regularly as the sliding table fence sits close enough to the blade that almost all cuts over 48" using the regular saw fence demand the sliding table fence be removed or swung out of the way (if the cut is less than 48" the sliding table can be moved back with fence in place forming a little pocket to work within). The fence th... The Bad (yes there are a lot of bad things even with 5 stars): One of the worst written non fiction books I have ever owned. I really don't care if one of the authors clients liked a sauce so well that she would eat it over kitty litter. I don't care to read 100s of testaments to how good the recipes are (they are pretty good). I just want to get on with the book. Prove the recipes are not good. Don't spend all those pages trying to convince me. It even backfired. I was sure they were going to be terrible are reading all the testaments. Get ready to cook. A lot. Get ready to do a lot of dishes. Have to plan ahead. Have to make lunches the night before often. The flax seed breakfast takes some work and time. Can't just whip it up. If you run out without having already prepared more you will find yourself without a breakfast. Terribly organized. It is not sequential. 
You will read something and then find out later you were not suppose to do it when you did unless you read the entire b...
  • Loss: model.SpladeMixedTopKLoss.SpladeMixedTopKLoss with these parameters:
    {
        "loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score', gather_across_devices=False)",
        "document_regularizer_weight": 0.005,
        "query_regularizer_weight": 0.005,
        "document_regularizer_threshold": 256,
        "query_regularizer_threshold": 256
    }
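
SpladeMixedTopKLoss is a custom loss whose implementation is not shown here, so the following is only a rough sketch of the general SPLADE-style objective its parameters suggest: an in-batch-negatives ranking loss over dot-product scores, plus FLOPS regularizers (Paria et al., cited below) on queries and documents weighted by 0.005 each. The role of the two threshold parameters (256) is not described in this card and is left out. All function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def in_batch_ranking_loss(queries, docs, scale=1.0):
    """Cross-entropy where docs[i] is the positive for queries[i] and all other docs are negatives."""
    total = 0.0
    for i, q in enumerate(queries):
        scores = [scale * dot(q, d) for d in docs]
        top = max(scores)
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))  # stable log-sum-exp
        total += log_z - scores[i]
    return total / len(queries)

def flops(batch):
    """FLOPS regularizer: sum over dimensions of the squared mean absolute activation in the batch."""
    n = len(batch)
    dims = len(batch[0])
    return sum((sum(abs(vec[j]) for vec in batch) / n) ** 2 for j in range(dims))

def splade_style_loss(queries, docs, query_weight=0.005, document_weight=0.005):
    # Ranking term plus weighted sparsity regularizers (threshold handling omitted).
    return (in_batch_ranking_loss(queries, docs)
            + query_weight * flops(queries)
            + document_weight * flops(docs))

# Toy dense stand-ins for sparse embeddings: 2 query/positive pairs, 3 dimensions.
queries = [[1.0, 0.0, 2.0], [0.0, 3.0, 0.0]]
docs = [[2.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
loss = splade_style_loss(queries, docs)
print(loss)
```

The regularizer penalizes dimensions that are active on average across the batch, which pushes the model toward sparser, more evenly distributed activations.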
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • learning_rate: 0.0001
  • weight_decay: 0.0001
  • num_train_epochs: 1
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.15
  • save_only_model: True
  • bf16: True
  • dataloader_num_workers: 8
  • gradient_checkpointing: True
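
To make the schedule settings concrete, here is a small sketch of a cosine learning-rate schedule with linear warmup, using the values above (learning rate 0.0001, warmup ratio 0.15, batch size 64, 404,856 samples, 1 epoch). It assumes a single device and no gradient accumulation, which matches the logged step count (step 6320 at epoch 0.9991), and it assumes the usual shape of linear warmup followed by a half-cosine decay to zero. `cosine_lr` is a hypothetical helper, not a library function.

```python
import math

def cosine_lr(step, total_steps, base_lr=1e-4, warmup_ratio=0.15):
    """Learning rate at a given optimizer step: linear warmup, then half-cosine decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# One epoch over 404,856 samples at batch size 64 (single device assumed).
total_steps = math.ceil(404_856 / 64)
warmup_steps = math.ceil(total_steps * 0.15)

for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, cosine_lr(step, total_steps))
```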

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 0.0001
  • weight_decay: 0.0001
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.15
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: True
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 8
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: True
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss sparse-ir-eval_dot_ndcg@10
0.0032 20 848.0818 -
0.0063 40 536.294 -
0.0095 60 243.6122 -
0.0126 80 118.8254 -
0.0158 100 61.1316 -
0.0190 120 35.0436 -
0.0221 140 22.4751 -
0.0253 160 16.6537 -
0.0285 180 15.5919 -
0.0316 200 10.4659 -
0.0348 220 9.746 -
0.0379 240 9.4233 -
0.0411 260 7.5791 -
0.0443 280 6.8552 -
0.0474 300 5.694 -
0.0506 320 5.189 -
0.0537 340 4.726 -
0.0569 360 3.9468 -
0.0601 380 3.5309 -
0.0632 400 3.2516 -
0.0664 420 3.1281 -
0.0696 440 3.3396 -
0.0727 460 2.9721 -
0.0759 480 2.6706 -
0.0790 500 2.6023 -
0.0822 520 2.5536 -
0.0854 540 2.3589 -
0.0885 560 2.4359 -
0.0917 580 2.4409 -
0.0948 600 2.278 -
0.0980 620 3.2585 -
0.1012 640 2.217 -
0.1043 660 2.1853 -
0.1075 680 2.111 -
0.1107 700 2.1326 -
0.1138 720 2.0903 -
0.1170 740 2.066 -
0.1201 760 1.9859 -
0.1233 780 1.9894 -
0.1265 800 1.9844 -
0.1296 820 1.9759 -
0.1328 840 1.9294 -
0.1359 860 2.0085 -
0.1391 880 1.9481 -
0.1423 900 1.9639 -
0.1454 920 1.8506 -
0.1486 940 1.8199 -
0.1518 960 1.7994 -
0.1549 980 1.8468 -
0.1581 1000 1.7654 -
0.1612 1020 1.7669 -
0.1644 1040 1.7928 -
0.1676 1060 1.73 -
0.1707 1080 1.6531 -
0.1739 1100 1.622 -
0.1770 1120 1.663 -
0.1802 1140 1.655 -
0.1834 1160 1.6236 -
0.1865 1180 1.6167 -
0.1897 1200 1.595 -
0.1929 1220 1.5427 -
0.1960 1240 1.6241 -
0.1992 1260 1.4656 -
0.2023 1280 1.5652 -
0.2055 1300 1.543 -
0.2087 1320 1.5816 -
0.2118 1340 1.6763 -
0.2150 1360 1.5836 -
0.2181 1380 1.5415 -
0.2213 1400 1.5112 -
0.2245 1420 1.5745 -
0.2276 1440 1.4835 -
0.2308 1460 1.4984 -
0.2340 1480 1.4683 -
0.2371 1500 1.3864 -
0.2403 1520 1.5377 -
0.2434 1540 1.5046 -
0.2466 1560 1.4879 -
0.2498 1580 1.3949 -
0.2529 1600 1.3844 -
0.2561 1620 1.3775 -
0.2592 1640 1.4741 -
0.2624 1660 1.5249 -
0.2656 1680 1.4609 -
0.2687 1700 1.384 -
0.2719 1720 1.4199 -
0.2751 1740 1.2902 -
0.2782 1760 1.3614 -
0.2814 1780 1.3159 -
0.2845 1800 1.4043 -
0.2877 1820 1.3711 -
0.2909 1840 1.3714 -
0.2940 1860 1.4167 -
0.2972 1880 1.3242 -
0.3003 1900 1.3728 -
0.3035 1920 1.3546 -
0.3067 1940 1.3307 -
0.3098 1960 1.2928 -
0.3130 1980 1.3671 -
0.3162 2000 1.2323 0.0466
0.3193 2020 1.3482 -
0.3225 2040 1.3388 -
0.3256 2060 1.3731 -
0.3288 2080 1.2506 -
0.3320 2100 1.3351 -
0.3351 2120 1.3408 -
0.3383 2140 1.2679 -
0.3414 2160 1.289 -
0.3446 2180 1.1917 -
0.3478 2200 1.2471 -
0.3509 2220 1.2609 -
0.3541 2240 1.2271 -
0.3573 2260 1.2712 -
0.3604 2280 1.2016 -
0.3636 2300 1.1784 -
0.3667 2320 1.2039 -
0.3699 2340 1.2637 -
0.3731 2360 1.2108 -
0.3762 2380 1.2256 -
0.3794 2400 1.228 -
0.3825 2420 1.2128 -
0.3857 2440 1.2234 -
0.3889 2460 1.2604 -
0.3920 2480 1.2839 -
0.3952 2500 1.247 -
0.3984 2520 1.1972 -
0.4015 2540 1.1735 -
0.4047 2560 1.1969 -
0.4078 2580 1.1012 -
0.4110 2600 1.186 -
0.4142 2620 1.2217 -
0.4173 2640 1.1483 -
0.4205 2660 1.1347 -
0.4236 2680 1.1717 -
0.4268 2700 1.1639 -
0.4300 2720 1.1368 -
0.4331 2740 1.0291 -
0.4363 2760 1.1819 -
0.4395 2780 1.1746 -
0.4426 2800 1.0858 -
0.4458 2820 1.1154 -
0.4489 2840 1.1061 -
0.4521 2860 1.1019 -
0.4553 2880 1.1331 -
0.4584 2900 1.1157 -
0.4616 2920 0.998 -
0.4647 2940 1.1248 -
0.4679 2960 1.1063 -
0.4711 2980 1.1335 -
0.4742 3000 1.0592 -
0.4774 3020 1.0198 -
0.4806 3040 1.0803 -
0.4837 3060 1.0682 -
0.4869 3080 1.0574 -
0.4900 3100 1.0531 -
0.4932 3120 1.1137 -
0.4964 3140 1.082 -
0.4995 3160 1.0133 -
0.5027 3180 1.1048 -
0.5058 3200 1.1139 -
0.5090 3220 1.0924 -
0.5122 3240 1.0344 -
0.5153 3260 1.0427 -
0.5185 3280 1.0867 -
0.5217 3300 1.0218 -
0.5248 3320 1.084 -
0.5280 3340 1.0198 -
0.5311 3360 1.0805 -
0.5343 3380 0.959 -
0.5375 3400 1.0043 -
0.5406 3420 1.0425 -
0.5438 3440 1.0322 -
0.5469 3460 1.0394 -
0.5501 3480 1.0332 -
0.5533 3500 1.0251 -
0.5564 3520 1.0058 -
0.5596 3540 1.0643 -
0.5628 3560 1.0501 -
0.5659 3580 0.9583 -
0.5691 3600 1.0118 -
0.5722 3620 0.9915 -
0.5754 3640 0.9232 -
0.5786 3660 0.9832 -
0.5817 3680 0.9539 -
0.5849 3700 0.9561 -
0.5880 3720 0.9345 -
0.5912 3740 0.9471 -
0.5944 3760 0.9831 -
0.5975 3780 1.0021 -
0.6007 3800 1.0178 -
0.6039 3820 0.9706 -
0.6070 3840 0.9825 -
0.6102 3860 1.016 -
0.6133 3880 0.8974 -
0.6165 3900 0.8818 -
0.6197 3920 0.913 -
0.6228 3940 0.9275 -
0.6260 3960 0.9632 -
0.6291 3980 0.9866 -
0.6323 4000 0.9693 0.0940
0.6355 4020 0.9294 -
0.6386 4040 0.9444 -
0.6418 4060 0.9298 -
0.6450 4080 0.8788 -
0.6481 4100 0.9033 -
0.6513 4120 0.9328 -
0.6544 4140 0.9284 -
0.6576 4160 0.9507 -
0.6608 4180 0.9688 -
0.6639 4200 0.9124 -
0.6671 4220 0.8412 -
0.6702 4240 0.9065 -
0.6734 4260 0.8408 -
0.6766 4280 0.9456 -
0.6797 4300 0.8641 -
0.6829 4320 0.894 -
0.6861 4340 0.8876 -
0.6892 4360 0.935 -
0.6924 4380 0.8805 -
0.6955 4400 0.8278 -
0.6987 4420 0.8243 -
0.7019 4440 0.8886 -
0.7050 4460 0.8486 -
0.7082 4480 0.8496 -
0.7113 4500 0.8937 -
0.7145 4520 0.931 -
0.7177 4540 0.8276 -
0.7208 4560 0.8825 -
0.7240 4580 0.8827 -
0.7272 4600 0.8466 -
0.7303 4620 0.9349 -
0.7335 4640 0.8376 -
0.7366 4660 0.8292 -
0.7398 4680 0.8591 -
0.7430 4700 0.7949 -
0.7461 4720 0.9626 -
0.7493 4740 0.8082 -
0.7525 4760 0.7986 -
0.7556 4780 0.8056 -
0.7588 4800 0.857 -
0.7619 4820 0.8355 -
0.7651 4840 0.8903 -
0.7683 4860 0.8429 -
0.7714 4880 0.796 -
0.7746 4900 0.7887 -
0.7777 4920 0.887 -
0.7809 4940 0.8574 -
0.7841 4960 0.879 -
0.7872 4980 0.8083 -
0.7904 5000 0.797 -
0.7936 5020 0.7833 -
0.7967 5040 0.8312 -
0.7999 5060 0.8098 -
0.8030 5080 0.7976 -
0.8062 5100 0.7444 -
0.8094 5120 0.8059 -
0.8125 5140 0.8265 -
0.8157 5160 0.8062 -
0.8188 5180 0.7823 -
0.8220 5200 0.8586 -
0.8252 5220 0.8439 -
0.8283 5240 0.9102 -
0.8315 5260 0.8299 -
0.8347 5280 0.848 -
0.8378 5300 0.7742 -
0.8410 5320 0.8484 -
0.8441 5340 0.769 -
0.8473 5360 0.8132 -
0.8505 5380 0.821 -
0.8536 5400 0.8055 -
0.8568 5420 0.8342 -
0.8599 5440 0.8397 -
0.8631 5460 0.8156 -
0.8663 5480 0.729 -
0.8694 5500 0.8196 -
0.8726 5520 0.828 -
0.8758 5540 0.7892 -
0.8789 5560 0.7577 -
0.8821 5580 0.8399 -
0.8852 5600 0.7002 -
0.8884 5620 0.8063 -
0.8916 5640 0.8414 -
0.8947 5660 0.812 -
0.8979 5680 0.8235 -
0.9010 5700 0.7945 -
0.9042 5720 0.8551 -
0.9074 5740 0.861 -
0.9105 5760 0.7446 -
0.9137 5780 0.8001 -
0.9169 5800 0.7581 -
0.9200 5820 0.8134 -
0.9232 5840 0.8014 -
0.9263 5860 0.8245 -
0.9295 5880 0.78 -
0.9327 5900 0.7105 -
0.9358 5920 0.8057 -
0.9390 5940 0.8372 -
0.9421 5960 0.7693 -
0.9453 5980 0.7894 -
0.9485 6000 0.847 0.0972
0.9516 6020 0.8242 -
0.9548 6040 0.7428 -
0.9580 6060 0.8008 -
0.9611 6080 0.7523 -
0.9643 6100 0.7182 -
0.9674 6120 0.8748 -
0.9706 6140 0.8329 -
0.9738 6160 0.8176 -
0.9769 6180 0.7237 -
0.9801 6200 0.7097 -
0.9832 6220 0.7186 -
0.9864 6240 0.8387 -
0.9896 6260 0.7929 -
0.9927 6280 0.7785 -
0.9959 6300 0.8018 -
0.9991 6320 0.7283 -

Framework Versions

  • Python: 3.12.7
  • Sentence Transformers: 5.1.1
  • Transformers: 4.57.3
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 3.5.0
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

FlopsLoss

@article{paria2020minimizing,
    title={Minimizing flops to learn efficient sparse representations},
    author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
    journal={arXiv preprint arXiv:2004.05665},
    year={2020}
}

Model repository: UBC-SLIME/splade-base-mean (0.1B params, Safetensors, tensor type F32)