Title: Using Machine Learning to Recommend Correctness Checks for Geographic Map Data
Abstract: Developing an industry application that serves geographic map data to users across the world presents the significant challenge of checking the data using "data correctness checks." The size of the data that needs to be checked (the entire world) and the data churn rate (thousands per day) make executing the full set of candidate checks cost-prohibitive. Current techniques rely on hand-curated static subsets of checks to be run at different stages of the data production pipeline. These hard-coded subsets are uninformed of data changes and cause bug detection to be delayed to downstream quality assurance activities. To address these problems, we have developed new representations of map data changes and checks, formally defined "check safety," and built a recommender system that dynamically and automatically selects and ranks a relevant subset of checks using signals from the latest data changes. Empirical evaluation shows that it improves (1) efficiency by eliminating 65% of checks unrelated to changes, (2) coverage by recommending and ranking change-related checks from the full set of candidate checks, including those previously excluded by the manual process, and (3) overall visibility into the data editing process by quickly and automatically identifying the latest fault-prone parts of the data.
Publication Year: 2019
Publication Date: 2019-05-01
Language: en
Type: article
Indexed In: ['crossref']
Access and Citation
Cited By Count: 4