LRAT-E5

LRAT-E5 is a dense retriever obtained by fine-tuning intfloat/multilingual-e5-large-instruct with LRAT (Learning to Retrieve from Agent Trajectories), a training framework that adapts retrieval models to the behavior of search agents.

The central idea of LRAT is simple: if search is increasingly consumed by agents rather than humans, retrieval models should be trained from agent interaction data rather than only from human-centric supervision. This checkpoint is therefore optimized using query-document supervision mined from deep research trajectories.

What This Model Is For

This checkpoint is intended for:

  • dense retrieval in agentic search and deep research systems
  • retrieval for long-horizon information-seeking tasks
  • evidence retrieval for multi-step reasoning pipelines
  • replacing a general-purpose E5 retriever with one aligned to search-agent behavior

It is less suitable for:

  • generic sentence embedding leaderboards without task-specific evaluation
  • non-retrieval classification workflows
  • fully unrelated domains without adaptation

Model Details

  • Base model: intfloat/multilingual-e5-large-instruct
  • Training framework: LRAT
  • Retriever type: dense bi-encoder
  • Objective: weighted contrastive learning with trajectory-derived supervision
  • Intended use: retrieval for search agents operating in multi-step reasoning environments
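The exact weighting scheme is not spelled out in this card, but the listed objective can be sketched as group-wise InfoNCE with a per-pair weight, using the reported group size (10) and temperature (0.02). This is an illustrative sketch under those assumptions, not the paper's actual implementation; the function name and weight semantics are hypothetical.

```python
import torch
import torch.nn.functional as F

def weighted_contrastive_loss(queries, docs, weights, temperature=0.02):
    """Weighted InfoNCE over candidate groups.

    queries: (B, H) query embeddings.
    docs:    (B*G, H) candidate embeddings, G per query; the first
             candidate in each group is the positive.
    weights: (B,) per-pair weights, e.g. derived from trajectories.
    """
    q = F.normalize(queries, dim=-1)
    d = F.normalize(docs, dim=-1)
    scores = q @ d.T / temperature                # (B, B*G) similarity logits
    group = d.shape[0] // q.shape[0]
    targets = torch.arange(q.shape[0]) * group    # index of each positive
    per_query = F.cross_entropy(scores, targets, reduction="none")
    return (weights * per_query).sum() / weights.sum()
```

With group size 10, each query is scored against its own 10 candidates plus every other query's candidates as in-batch negatives, which is the standard dense bi-encoder setup.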

Training Data

The trajectory-derived supervision used for LRAT is constructed from:

  • 10K seed queries from InfoSeekQA
  • Tongyi-DeepResearch-30B-A3B as the trajectory generation agent
  • Wiki-25-Dump as the retrieval corpus
  • multiple retrievers during collection, including BM25 and Qwen3-based retrievers

The paper reports:

  • 26,482 valid trajectories
  • 91,713 training pairs

For more details, see the released LRAT-Train dataset linked below.

Training Configuration

The paper reports the following fine-tuning setup:

  • epochs: 2
  • batch size: 32
  • learning rate: 1e-6
  • maximum sequence length: 512
  • group size: 10
  • temperature: 0.02

The training is implemented with a FlagEmbedding-based dense retriever recipe.
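Mapped onto FlagEmbedding's encoder-only fine-tuning recipe, the hyperparameters above might translate into a launch command like the following. This is an illustrative sketch, not the project's actual script: the data path and output directory are placeholders, and you should check the flag names against the FlagEmbedding version you use.

```shell
torchrun --nproc_per_node 1 \
  -m FlagEmbedding.finetune.embedder.encoder_only.base \
  --model_name_or_path intfloat/multilingual-e5-large-instruct \
  --train_data ./lrat_train.jsonl \
  --output_dir ./lrat-e5 \
  --num_train_epochs 2 \
  --per_device_train_batch_size 32 \
  --learning_rate 1e-6 \
  --train_group_size 10 \
  --temperature 0.02 \
  --query_max_len 512 \
  --passage_max_len 512
```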

Evaluation Summary

LRAT-E5 is evaluated inside six different search agents on:

  • InfoSeek-Eval
  • BrowseComp-Plus

The gains are consistent across both smaller task-optimized agents and larger generalist agentic foundation models.

Representative Results for This Backbone

| Agent backbone | InfoSeek-Eval SR (%) | BrowseComp-Plus SR (%) | BrowseComp-Plus Recall (%) |
|---|---|---|---|
| AgentCPM-Explore (base) | 47.3 | 15.9 | 26.5 |
| AgentCPM-Explore (+ LRAT) | 49.7 | 15.9 | 32.1 |
| WebExplore (base) | 60.0 | 25.4 | 50.4 |
| WebExplore (+ LRAT) | 63.3 | 29.0 | 56.1 |
| Tongyi-DeepResearch (base) | 56.7 | 20.7 | 54.8 |
| Tongyi-DeepResearch (+ LRAT) | 68.0 | 23.9 | 61.8 |
| GPT-OSS (120B, base) | 41.7 | 10.8 | 50.1 |
| GPT-OSS (120B, + LRAT) | 50.7 | 13.1 | 56.0 |
| MiniMax-M2.1 (base) | 64.0 | 46.4 | 64.9 |
| MiniMax-M2.1 (+ LRAT) | 75.0 | 48.7 | 69.7 |
| GLM-4.7 (base) | 73.7 | 46.4 | 68.7 |
| GLM-4.7 (+ LRAT) | 81.7 | 50.6 | 76.3 |

SR = success rate.

Across these agents, LRAT improves:

  • InfoSeek-Eval success rate by roughly +5.1% to +21.6% relative
  • BrowseComp-Plus success rate by roughly +0.0% to +21.3% relative
  • BrowseComp-Plus evidence recall by roughly +7.4% to +21.1% relative

Usage

This model should be used with the same preprocessing and embedding-extraction recipe as the upstream E5 backbone; see the LRAT codebase for reference scripts.

```python
from transformers import AutoTokenizer, AutoModel

model_id = "Yuqi-Zhou/LRAT-multilingual-e5-large"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

# Apply the same pooling / normalization setup used for
# multilingual-e5-large-instruct in your retrieval pipeline.
```

If you use E5-style query instructions or prefixes in your system, keep them consistent with your evaluation and indexing setup.
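The E5 family pools by averaging the last hidden states over non-padding tokens, then L2-normalizes so that dot products are cosine similarities. The sketch below shows that pooling step on a dummy tensor standing in for `model(**batch).last_hidden_state`; shapes and the stand-in data are illustrative.

```python
import torch
import torch.nn.functional as F

def average_pool(last_hidden_state, attention_mask):
    # Zero out padding positions, then average over the real tokens.
    masked = last_hidden_state.masked_fill(~attention_mask.bool().unsqueeze(-1), 0.0)
    return masked.sum(dim=1) / attention_mask.sum(dim=1, keepdim=True)

# In practice: out = model(**batch)
#              emb = average_pool(out.last_hidden_state, batch["attention_mask"])
hidden = torch.randn(2, 5, 1024)   # (batch, seq_len, hidden) stand-in for model output
mask = torch.tensor([[1, 1, 1, 0, 0],
                     [1, 1, 1, 1, 1]])
emb = F.normalize(average_pool(hidden, mask), dim=-1)
scores = emb @ emb.T               # cosine similarities after normalization
```

Because embeddings are unit-normalized, ranking candidates reduces to a single matrix multiply between query and document embeddings.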

License and Release Notes

Please confirm the final release license against:

  • the upstream base model license
  • the released training dataset license
  • any additional source corpus obligations

This card can be updated later with a final project-level license statement if needed.

Citation

If you use this checkpoint, please cite the LRAT paper.

@misc{lrat2026,
  title        = {Learning to Retrieve from Agent Trajectories},
  author       = {TODO},
  year         = {2026},
  howpublished = {Manuscript in preparation},
  note         = {arXiv link to be added}
}

Links

  • Paper: TODO
  • Project page: TODO
  • Code: https://github.com/Yuqi-Zhou/LRAT
  • Model: https://huggingface.co/Yuqi-Zhou/LRAT-multilingual-e5-large
  • Dataset: https://huggingface.co/datasets/Yuqi-Zhou/LRAT-Train
  • Companion Qwen3 checkpoint: https://huggingface.co/Yuqi-Zhou/LRAT-Qwen3-Embedding-0.6B