Longitudinal Compliance Analysis of Android Applications with Privacy Policies. (arXiv:2106.10035v1 [cs.CR])

Contemporary mobile applications (apps) are designed to track, use, and share
users' data, often without their consent, which results in potential privacy and
transparency issues. To investigate whether mobile apps are transparent about
the information they collect about users and whether they comply with their
privacy policies, we performed a longitudinal analysis of different versions of 268
Android applications comprising 5,240 app releases or versions between 2008 and
2016. We detect inconsistencies between apps’ behaviors and their stated data
collection practices to reveal compliance issues. We utilize machine learning techniques
to classify the privacy policy text to identify the purported practices that
collect and/or share users’ personal information such as phone numbers and
email addresses. We then uncover the actual data leaks of an app through static
and dynamic analysis. Our results show a steady increase over time in the
overall number of apps’ data collection practices that are undisclosed in the
privacy policies. This is particularly troubling since the privacy policy is the
primary tool for describing the app’s privacy protection practices. We find
that newer versions of the apps are likely to be more non-compliant than their
preceding versions. The discrepancies between the purported and actual data
practices show that privacy policies are often inconsistent with the apps’
behaviors, thus defying the ‘notice and choice’ principle when users install