Title: Numerically stable parallel computation of (co-)variance
Abstract: With the advent of big data, we see an increasing interest in computing correlations in huge data sets with both many instances and many variables. Essential descriptive statistics such as the variance, standard deviation, covariance, and correlation can suffer from a numerical instability known as "catastrophic cancellation" that can lead to problems when naively computing these statistics with a popular textbook equation. Although this instability was discussed in the literature as early as 50 years ago, we found that even today, some high-profile tools still employ the unstable version.
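The cancellation described in the abstract can be illustrated with a minimal Python sketch. It assumes that the "popular textbook equation" is the sum-of-squares form, Var(X) = (sum(x^2) - (sum(x))^2 / n) / (n - 1), and compares it with Welford's single-pass update, a well-known stable alternative that is not necessarily the method proposed in the article. The function names and sample data are illustrative only.

```python
def naive_variance(xs):
    """Sample variance via the sum-of-squares form (assumed 'textbook' equation).

    Subtracts two nearly equal large numbers, so most significant digits cancel.
    """
    n = len(xs)
    total = 0.0
    total_sq = 0.0
    for x in xs:
        total += x
        total_sq += x * x
    return (total_sq - total * total / n) / (n - 1)


def welford_variance(xs):
    """Sample variance via Welford's online update (a stable alternative)."""
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for k, x in enumerate(xs, start=1):
        delta = x - mean
        mean += delta / k
        m2 += delta * (x - mean)
    return m2 / (len(xs) - 1)


# Illustrative data: small spread around a large offset.
# The exact sample variance of (4, 7, 13, 16) is 30, and shifting by 1e9
# does not change it mathematically.
data = [1e9 + d for d in (4.0, 7.0, 13.0, 16.0)]

print("naive:  ", naive_variance(data))
print("welford:", welford_variance(data))
```

With double-precision floats, the squares here are on the order of 1e18, where adjacent representable values are more than 100 apart, so the naive result cannot resolve a true sum of squared deviations of 90. The Welford update only ever works with small deviations from the running mean and recovers the variance for this data.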
Publication Year: 2018
Publication Date: 2018-07-09
Language: en
Type: article
Indexed In: ['crossref']
Access and Citation
Cited By Count: 35