Transformer models have demonstrated exceptional performance across a wide range of applications. However, the dot-product attention that forms their foundation does not scale well to long-context data, because its time requirement grows quadratically with context length. In this work, we propose Radar, a training-free approach that accelerates inference by dynamically searching for the most important context tokens. For any pre-trained Transformer, Radar reduces decoding time complexity without retraining or heuristically evicting tokens. Moreover, we provide theoretical justification for our approach, showing that Radar reliably identifies the most important tokens with high probability. We conduct extensive comparisons with previous methods on a wide range of tasks. The results demonstrate that Radar achieves state-of-the-art performance across different architectures with reduced time complexity, offering a practical solution for efficient long-context processing with Transformers.
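To make the general idea concrete, below is a minimal NumPy sketch of a single decoding step of attention that only attends to a small set of high-scoring context tokens. This is an illustration of the principle of selecting important tokens, not Radar's actual algorithm: the function name `topk_decode_attention`, the choice of `k`, and the exhaustive scoring of every cached key are assumptions for this sketch, and the abstract does not specify how Radar locates the important tokens without scoring them all.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def topk_decode_attention(q, K, V, k=64):
    """One decoding step of single-head attention over only the k
    context tokens with the highest dot-product scores.

    q: (d,) query for the current token
    K: (n, d) cached keys
    V: (n, d) cached values

    Note: this naive version still scores every key, so it is O(n);
    the point of a method like Radar is to find the important tokens
    more cheaply, which is omitted here.
    """
    scores = K @ q / np.sqrt(K.shape[-1])     # (n,) attention logits
    k = min(k, scores.shape[0])
    idx = np.argpartition(scores, -k)[-k:]    # indices of the top-k tokens
    w = softmax(scores[idx])                  # renormalize over the subset
    return w @ V[idx]                         # (d,) attention output

# Usage: attend to the 64 highest-scoring of 4096 cached tokens.
rng = np.random.default_rng(0)
n, d = 4096, 128
q = rng.normal(size=d)
K = rng.normal(size=(n, d))
V = rng.normal(size=(n, d))
out = topk_decode_attention(q, K, V, k=64)
print(out.shape)  # (128,)
```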
BibTeX
@inproceedings{hao2025radar,
  title={Radar: Fast Long-Context Decoding for Any Transformer},
  author={Yongchang Hao and Mengyao Zhai and Hossein Hajimirsadeghi and Sepidehsadat Hosseini and Frederick Tung},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=ZTpWOwMrzQ}
}