Title: Estimating the real-world usage of mobile apps for mental health: development and application of two novel metrics
Abstract: Mobile apps for health and wellness (mHealth apps) have the potential to expand access to information and support, especially for people who are unable to access face-to-face care. The role of these apps has become especially salient during the ongoing COVID-19 crisis. According to recent estimates, there are over 325,000 mHealth apps, with 78,000 added in 2017 alone [1]. Estimating the effectiveness of these apps has become a topic of great interest [2,3], yet few studies have examined the extent to which mHealth apps reach real-world users. Some recent work suggests that mHealth app marketplaces may be highly skewed, with a small number of apps attracting a large share of users [4]. Rigorously testing this assumption requires an empirical characterization of asymmetry in mHealth markets. Existing approaches, while useful, often lack an intuitive or easily interpretable meaning [5]. This challenge is especially important in the context of mental health research, which requires communication among experts from a variety of disciplines (e.g., psychiatry, clinical psychology, digital health, economics, public policy).

We examined the dissemination of mHealth apps for a variety of mental health conditions, searching the Apple App Store and Google Play Store in March 2020. We applied the following search terms: “addiction”, “anxiety”, “depression”, “eating disorders”, “fitness”, “mood tracking”, “schizophrenia”, and “sleep”. Consistent with previous work, we screened apps within the top 50 search hits on either app store [4]. We included apps designed to offer treatment, support, or information. We collected monthly active user (MAU) data from Mobile Action, a mobile app market research firm, for a one-month period from March 14, 2020 to April 13, 2020. Total MAUs per category ranged from 264,763 (addiction) to 47,133,801 (fitness), with a median of 6,254,650.
We then developed two novel metrics to characterize the mHealth app marketplaces: the market share index-n (MSI-n) and the number needed to reach-p (NNR-p). The MSI-n refers to the percentage of MAUs in a category (e.g., depression apps) that is accounted for by the top n apps. For example, “MSI-3” refers to the percentage of MAUs that is accounted for by the three most popular apps. Higher MSI values indicate that the top apps are responsible for a greater proportion of active users. The NNR-p refers to the minimum number of apps that are needed to account for p percent of active users. For example, “NNR-90” refers to the number of apps that are required to account for 90% of MAUs in a category. Lower NNR values indicate that the top apps are responsible for a greater proportion of active users. For each of the above-mentioned categories, we calculated the MSI-3, MSI-10 and NNR-90. In six of the eight categories, the top three apps were responsible for more than 50% of MAUs. The MSI-3 values were 41.5% for fitness (indicating that the top three apps accounted for 41.5% of MAUs in the fitness category), 45.6% for addiction, 66.2% for depression, 66.4% for sleep, 75.7% for anxiety, 79.2% for mood tracking, 88.9% for eating disorders, and 98.1% for schizophrenia, with a median MSI-3 value of 71.1%. The median MSI-10 value was 91.4% (ranging from 67.6% for fitness to 99.97% for schizophrenia). The NNR-90 values were 2 for schizophrenia (indicating that the top two schizophrenia apps accounted for 90% of MAUs), 4 for eating disorders, 6 for mood tracking, 7 for anxiety, 11 for depression, 12 for addiction, 16 for sleep, and 25 for fitness. The median NNR-90 value was 9. Thus, app marketplaces for mental illnesses (e.g., schizophrenia, eating disorders) were more asymmetrical than those for overall health and wellness (e.g., fitness, sleep). These findings have important implications for the study and evaluation of mHealth apps.
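Both metrics can be computed from a category's per-app MAU counts by sorting apps in descending order of usage and taking cumulative shares. The sketch below illustrates the definitions above; the MAU values are hypothetical, not the study's data.

```python
def msi(maus, n):
    """Market share index-n: percentage of a category's MAUs
    accounted for by the top n apps."""
    total = sum(maus)
    top = sorted(maus, reverse=True)[:n]
    return 100.0 * sum(top) / total

def nnr(maus, p):
    """Number needed to reach-p: minimum number of apps needed
    to account for p percent of a category's MAUs."""
    total = sum(maus)
    cumulative = 0
    for k, m in enumerate(sorted(maus, reverse=True), start=1):
        cumulative += m
        if 100.0 * cumulative / total >= p:
            return k
    return len(maus)

# Hypothetical MAU counts for one app category
maus = [600_000, 200_000, 120_000, 40_000, 25_000, 10_000, 5_000]
print(msi(maus, 3))   # 92.0 -- top three apps hold 92% of MAUs
print(nnr(maus, 90))  # 3 -- three apps suffice to cover 90% of MAUs
```

A higher MSI (or lower NNR) signals a more asymmetrical marketplace, as in the schizophrenia category above (MSI-3 of 98.1%, NNR-90 of 2).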
To better characterize the content that real-world users encounter through these apps, we recommend that usage data be incorporated to adjust the findings of mHealth app reviews [6]. Additionally, there are over 45 app evaluation frameworks, and there has been enormous interest in developing tools that help consumers sift through crowded app marketplaces [7,8]. However, the reliability and validity of such tools have been criticized, as many of them yield different and sometimes conflicting conclusions [8]. Given the overwhelming volume of mHealth apps and app evaluation methods, it is not surprising that such issues arise: investigators commonly evaluate hundreds or thousands of apps, a labor-intensive process that can yield cursory or unreliable evaluations. Given the skewness of mHealth app marketplaces, consumers may benefit more from highly detailed and reliable evaluations of a much smaller number of apps: those that they are most likely to encounter and use [9]. The exact number of popular apps may vary by mHealth category. To account for this, investigators could apply the MSI-n and NNR-p metrics. For example, using the NNR-p metric, investigators can determine how many apps, in a given category, should be evaluated in order to account for those that reach a certain proportion of active users. We collected MAU data in March-April 2020. This allowed us to characterize patterns of use during the COVID-19 pandemic, a period in which mHealth apps are playing an essential public health role. Future research could examine whether these trends generalize to other time periods. Merging research on the efficacy of mHealth apps with research on usage will be necessary to accurately estimate the real-world impact of these apps, determine research priorities, and inform the public about benefits and risks.
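One simple way to incorporate usage data into review findings, as recommended above, is to weight each app's evaluation score by its share of active users, so that category-level conclusions reflect what typical users actually encounter. The sketch below assumes hypothetical MAU counts and quality scores; the weighting scheme is an illustration, not a method specified in the text.

```python
def usage_weighted_score(apps):
    """Average quality score weighted by each app's share of MAUs.

    `apps` is a list of (mau, score) pairs, where scores come from
    any app evaluation framework (hypothetical values here).
    """
    total_mau = sum(mau for mau, _ in apps)
    return sum(mau * score for mau, score in apps) / total_mau

# Hypothetical category: one dominant app plus several niche apps
apps = [(600_000, 4.0), (200_000, 3.0), (120_000, 2.0), (80_000, 5.0)]

unweighted = sum(score for _, score in apps) / len(apps)  # 3.5
weighted = usage_weighted_score(apps)                     # 3.64
```

In this toy example the two averages diverge because the dominant app's score pulls the usage-weighted result toward the experience of most real-world users, which is exactly the adjustment that skewed marketplaces make necessary.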
Such a body of research could meaningfully change the way we study and evaluate mHealth apps, advancing a key priority in the digital health field that is likely to affect millions of consumers in the years ahead.