Affiliation:
1. The University of Chicago, Chicago, USA
2. Cornell University, Ithaca, USA
Abstract
Causal analysis is essential for gaining insights into complex real-world processes and making informed decisions. However, performing accurate causal analysis on observational data is generally infeasible, so domain experts begin their exploration by identifying correlations. The increased availability of data from open government websites, organizations, and scientific studies presents an opportunity to harness observational datasets to assist domain experts during this exploratory phase.
In this work, we introduce Nexus, a system designed to align large repositories of spatio-temporal datasets and identify correlations, facilitating the exploration of causal relationships. Nexus addresses the challenges of aligning tabular datasets across space and time, handling missing data, and identifying correlations deemed "interesting". Empirical evaluation on Chicago Open Data and United Nations datasets demonstrates the effectiveness of Nexus in exposing interesting correlations, many of which have undergone extensive scrutiny by social scientists.
Publisher
Association for Computing Machinery (ACM)