Figure 1: DCRE significantly improves retrieval accuracy compared with general-purpose embedding models by incorporating distribution-aware representations.
Figure 2: Overview of the DCRE framework. DCRE jointly models semantic intent and data distribution constraints to enable reliable statistical function retrieval.
Figure 3: Pipeline for constructing evaluation tasks.
Figure 4: An example of integrating DARE into an agent.
| Model | Params | NDCG@10 | MRR@10 | Recall@10 | Recall@1 |
|---|---|---|---|---|---|
| Snowflake/arctic-embed-l | 335M | 0.7932 | 0.7510 | 0.9235 | 0.6549 |
| intfloat/e5-large-v2 | 335M | 0.7513 | 0.7086 | 0.8838 | 0.6152 |
| jina-embeddings-v2-base-en | 137M | 0.7429 | 0.6965 | 0.8873 | 0.5969 |
| BAAI/bge-m3 | 568M | 0.7308 | 0.6843 | 0.8758 | 0.5847 |
| mxbai-embed-large-v1 | 335M | 0.7068 | 0.6565 | 0.8639 | 0.5508 |
| UAE-Large-V1 | 335M | 0.7066 | 0.6556 | 0.8658 | 0.5479 |
| gte-large-en-v1.5 | 435M | 0.6639 | 0.6122 | 0.8257 | 0.5040 |
| all-mpnet-base-v2 | 110M | 0.6606 | 0.6057 | 0.8330 | 0.4937 |
| Base Model (MiniLM) | 23M | 0.6127 | 0.5553 | 0.7936 | 0.4412 |
| DCRE (Ours) | 23M | 0.9347 | 0.9176 | 0.9863 | 0.8739 |
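
The four metric columns above can be illustrated with a minimal sketch. It assumes each query has exactly one relevant function, which the source does not state; under that assumption the ideal DCG is 1, so NDCG@k reduces to 1/log2(rank + 1). The function names and the toy identifiers below are hypothetical and are not taken from the paper's evaluation code.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def rank_in_top_k(gold: str, ranked: Sequence[str], k: int) -> Optional[int]:
    """Return the 1-based rank of `gold` within the top-k results, or None."""
    for position, candidate in enumerate(ranked[:k], start=1):
        if candidate == gold:
            return position
    return None


def evaluate(runs: List[Tuple[str, Sequence[str]]], k: int = 10) -> Dict[str, float]:
    """Average NDCG@k, MRR@k, Recall@k and Recall@1 over all queries.

    Assumption: exactly one relevant item per query, so IDCG = 1.
    """
    if not runs:
        raise ValueError("need at least one query")
    ndcg = mrr = recall_k = recall_1 = 0.0
    for gold, ranked in runs:
        rank = rank_in_top_k(gold, ranked, k)
        if rank is None:
            continue  # a miss contributes 0 to every metric
        ndcg += 1.0 / math.log2(rank + 1)
        mrr += 1.0 / rank
        recall_k += 1.0
        recall_1 += 1.0 if rank == 1 else 0.0
    n = len(runs)
    return {
        f"NDCG@{k}": ndcg / n,
        f"MRR@{k}": mrr / n,
        f"Recall@{k}": recall_k / n,
        "Recall@1": recall_1 / n,
    }


# Hypothetical toy data: (gold function, ranked retrieval results).
toy_runs = [
    ("t.test", ["t.test", "wilcox.test"]),       # hit at rank 1
    ("wilcox.test", ["t.test", "wilcox.test"]),  # hit at rank 2
    ("aov", ["lm", "glm"]),                      # miss
]
metrics = evaluate(toy_runs)
print(metrics)
```

Because MRR and NDCG both reward placing the correct function near the top, a model can have high Recall@10 but noticeably lower Recall@1, which is the pattern the baseline rows show.
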
Figure 3: Efficiency comparison measured in queries per second (QPS). Despite its small parameter count, DCRE achieves both superior accuracy and fast retrieval.
| Model | RCodingAgent (w/o DARE) | RCodingAgent with DARE (absolute gain, percentage points) |
|---|---|---|
| claude-haiku-4.5 | 6.25% | 56.25% (50.00%) |
| deepseek-v3.2 | 18.75% | 56.25% (37.50%) |
| gpt-5.2 | 25.00% | 62.50% (37.50%) |
| grok-4.1-fast | 18.75% | 75.00% (56.25%) |
| mimo-v2-flash | 12.50% | 62.50% (50.00%) |
| minimax-m2.1 | 12.50% | 68.75% (56.25%) |
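
The parenthesized values in the table match the absolute difference, in percentage points, between the two columns. The short check below reproduces them from the reported numbers. The dictionary layout and the function name are illustrative choices, not part of the paper.

```python
from typing import Dict, Tuple

# (without DARE, with DARE), in percent, copied from the table above.
results: Dict[str, Tuple[float, float]] = {
    "claude-haiku-4.5": (6.25, 56.25),
    "deepseek-v3.2": (18.75, 56.25),
    "gpt-5.2": (25.00, 62.50),
    "grok-4.1-fast": (18.75, 75.00),
    "mimo-v2-flash": (12.50, 62.50),
    "minimax-m2.1": (12.50, 68.75),
}


def absolute_gains(table: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Percentage-point gain from adding DARE, per model."""
    return {model: with_dare - without for model, (without, with_dare) in table.items()}


gains = absolute_gains(results)
for model, gain in gains.items():
    print(f"{model}: +{gain:.2f} points")
```
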