A natural language database interface (NLDB) can democratize data-driven insights for non-technical users. However, existing Text-to-SQL semantic parsers cannot achieve high enough accuracy in the cross-database setting to allow good usability in practice. This work presents Turing, an NLDB system toward bridging this gap. The cross-domain semantic parser of Turing with our novel value prediction method achieves 75.1% execution accuracy, and 78.3% top-5 beam execution accuracy on the Spider validation set (Yu et al., 2018b). To benefit from the higher beam accuracy, we design an interactive system where the SQL hypotheses in the beam are explained step-by-step in natural language, with their differences highlighted. The user can then compare and judge the hypotheses to select which one reflects their intention, if any. The English explanations of SQL queries in Turing are produced by our high-precision natural language generation system based on synchronous grammars.
Bibtex
@inproceedings{xu-etal-2021-turing,
title = "{TURING}: an Accurate and Interpretable Multi-Hypothesis Cross-Domain Natural Language Database Interface",
author = "Xu, Peng and
Zi, Wenjie and
Shahidi, Hamidreza and
K{\'a}d{\'a}r, {\'A}kos and
Tang, Keyi and
Yang, Wei and
Ateeq, Jawad and
Barot, Harsh and
Alon, Meidan and
Cao, Yanshuai",
booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations",
month = aug,
year = "2021",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.acl-demo.36",
doi = "10.18653/v1/2021.acl-demo.36",
pages = "298--305",
abstract = "A natural language database interface (NLDB) can democratize data-driven insights for non-technical users. However, existing Text-to-SQL semantic parsers cannot achieve high enough accuracy in the cross-database setting to allow good usability in practice. This work presents TURING, an NLDB system toward bridging this gap. The cross-domain semantic parser of TURING with our novel value prediction method achieves 75.1{\%} execution accuracy, and 78.3{\%} top-5 beam execution accuracy on the Spider validation set (Yu et al., 2018b). To benefit from the higher beam accuracy, we design an interactive system where the SQL hypotheses in the beam are explained step-by-step in natural language, with their differences highlighted. The user can then compare and judge the hypotheses to select which one reflects their intention, if any. The English explanations of SQL queries in TURING are produced by our high-precision natural language generation system based on synchronous grammars.",
}