AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval - Publication

Machine-based prediction of real-world events is garnering attention due to its potential for informed decision-making. Whereas traditional forecasting predominantly hinges on structured data such as time series, recent breakthroughs in language models enable predictions from unstructured text. In particular, Zou et al. (2022) introduce AutoCast, a benchmark that uses news articles to answer forecasting queries. Nevertheless, existing methods still trail human performance. The cornerstone of accurate forecasting, we argue, lies in identifying a concise yet rich subset of news snippets from a vast corpus. With this motivation, we introduce AutoCast++, a zero-shot ranking-based context retrieval system tailored to sift through expansive news document collections for event forecasting. Our approach first re-ranks articles by zero-shot question-passage relevance, homing in on semantically pertinent news. The selected articles are then summarized zero-shot to obtain succinct context. Both the relevance evaluation and the article summarization use a pre-trained language model and need no domain-specific training. Notably, recent articles can be at odds with earlier ones because of new facts or unanticipated incidents, leading to fluctuating temporal dynamics. To tackle this, our re-ranking mechanism gives preference to more recent articles, and we further regularize the multi-passage representation learning to align with human forecaster responses made on different dates. Empirical results show marked improvements across multiple metrics, improving performance on multiple-choice questions (MCQ) by 48% and on true/false (TF) questions by up to 8%.
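The recency-aware re-ranking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Article` class, the precomputed `relevance` scores (standing in for a zero-shot question-passage relevance estimate), the exponential decay form, and the 30-day half-life are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    title: str
    published: date
    relevance: float  # hypothetical zero-shot relevance score in [0, 1]


def recency_weight(published: date, question_date: date,
                   half_life_days: float = 30.0) -> float:
    """Exponential decay in article age (assumed form, not from the paper)."""
    age_days = max((question_date - published).days, 0)
    return 0.5 ** (age_days / half_life_days)


def rerank(articles, question_date, top_k=2):
    """Order articles by relevance times recency weight and keep the top_k."""
    scored = sorted(
        articles,
        key=lambda a: a.relevance * recency_weight(a.published, question_date),
        reverse=True,
    )
    return scored[:top_k]


articles = [
    Article("Old but on-topic", date(2022, 1, 1), 0.9),
    Article("Recent and on-topic", date(2022, 3, 25), 0.8),
    Article("Recent but off-topic", date(2022, 3, 30), 0.1),
]
top = rerank(articles, question_date=date(2022, 4, 1))
print([a.title for a in top])
```

In this toy example the recent, relevant article ranks first, while the older article still outranks the recent but irrelevant one. In a fuller pipeline the retained articles would then be passed to a summarizer to produce the concise context.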

Bibtex

@misc{yan2023autocast,
title={AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval},
author={Qi Yan and Raihan Seraj and Jiawei He and Lili Meng and Tristan Sylvain},
year={2023},
eprint={2310.01880},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
