Integrate Automated Tools and Custom Options for Privacy Settings

Problem Summary

Users face significant challenges in managing their online privacy due to the complexity and opacity of privacy-related information. Privacy settings are often buried within websites and applications, requiring users to navigate multiple pages and menus. Even when located, adjusting these settings can be cumbersome and unintuitive, with users often unaware of all available options and their consequences. The varied layout and functionality of privacy settings across platforms further complicate the process, as users must learn new configurations for each website or application. Additionally, many services have default settings that favour data collection and sharing, leading to unintentional exposure of personal information. These issues result in a lack of awareness and control over personal data, leaving users vulnerable to privacy risks.

Rationale

The aim is to simplify the process for users to manage their privacy preferences within privacy policies and privacy settings pages.

Solution

The solution is to develop and integrate automated tools and customisable options that simplify privacy management and enhance user control. Such tools make data collection practices more transparent and help users make informed decisions, while automated detection and presentation of privacy options reduce the effort required to locate and adjust settings.

Kumar et al. [1] presented a method that automates the extraction and presentation of opt-out choices from privacy policies. The method was realised through Opt-Out Easy, a web browser extension available for Google Chrome and Firefox. Such automatic detection enables users to easily identify and exercise their opt-out choices, improving control over their privacy settings.
To develop Opt-Out Easy, the researchers collected a corpus of 236 privacy policies from the top 500 websites on the U.S. Alexa list in the fall of 2018. Using Selenium and Geckodriver, they captured the dynamically rendered content of these privacy policies. The Mercury Parser API stripped away irrelevant content, leaving the main text for analysis. This text was structured into a Document Object Model (DOM) tree using BeautifulSoup and the lxml parser, addressing issues with invalid HTML.
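
The link-collection step can be illustrated with a minimal sketch. It is not the authors' code: it swaps in Python's standard-library `html.parser` for the BeautifulSoup and lxml stack named above, and the `LinkCollector` class, the `extract_links` helper, and the sample policy fragment are all hypothetical. Like the extension itself, it only sees anchor tags, so the `onclick` control in the sample is missed.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, anchor text) pairs from <a> tags in a policy page."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse internal whitespace and line breaks in the anchor text.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def extract_links(html):
    """Hypothetical helper: return every anchor link found in the HTML."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical policy fragment with an unclosed <p> and a non-anchor control.
SAMPLE_POLICY = """
<div><p>You may <a href="https://example.com/optout">opt out of
interest-based ads</a>.<p>Read our <a href="/terms">Terms</a>
<span onclick="optOut()">Stop tracking</span></div>
"""

links = extract_links(SAMPLE_POLICY)
```

Each collected pair is a candidate that a later step would label as an opt-out mechanism or as an ordinary link.
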
A significant challenge was segmenting the policy text, which often lacked conventional punctuation. The researchers developed a recursive function to insert spaces at line breaks, enabling effective segmentation with NLTK’s sentence tokeniser. Hyperlinks within the policies were meticulously annotated to distinguish between opt-out mechanisms and other links. This process involved both automated filtering and manual inspection.
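
The segmentation idea can be sketched as follows. The paper's function is not reproduced here, so this is an assumption-laden illustration: the nested-tuple tree, the `BLOCK_TAGS` set, and the `flatten` and `segment` helpers are hypothetical, the sketch marks element boundaries with a line break, and a simple regular expression stands in for NLTK's sentence tokeniser.

```python
import re

# Assumed set of elements whose end should count as a segment boundary.
BLOCK_TAGS = {"p", "li", "br", "div", "h1", "h2", "h3"}


def flatten(node):
    """Recursively collect text, adding a line break after block-level nodes.

    A node is either a string or a (tag, children) tuple.
    """
    if isinstance(node, str):
        return node
    tag, children = node
    text = "".join(flatten(child) for child in children)
    return text + "\n" if tag in BLOCK_TAGS else text


def segment(text):
    """Split on sentence-final punctuation or on line breaks."""
    pieces = re.split(r"(?<=[.!?])\s+|\s*\n\s*", text)
    return [" ".join(p.split()) for p in pieces if p.strip()]


# Hypothetical policy fragment: list items without closing punctuation.
tree = ("div", [
    ("h2", ["Your choices"]),
    ("ul", [
        ("li", ["Opt out of email marketing"]),
        ("li", ["Disable analytics cookies. This applies per browser."]),
    ]),
])

segments = segment(flatten(tree))
```

Without the boundary markers, the heading and the first list item would fuse into a single unpunctuated run that a sentence tokeniser cannot split.
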
The team trained a logistic regression model to classify these hyperlinks, identifying patterns indicative of opt-out mechanisms. They also explored active learning to reduce manual labelling. This approach significantly advanced the methodology for detecting opt-out options in privacy policies.
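
To make the classification step concrete, here is a small pure-Python logistic regression over bag-of-words features of the anchor text. It is a sketch under stated assumptions, not the authors' model: the features used in the paper are not described in this section, the six labelled examples are invented for illustration, and the helper names (`featurize`, `train`, `predict_proba`) are hypothetical.

```python
import math


def featurize(text, vocab):
    """Binary bag-of-words vector for a link's anchor text."""
    tokens = set(text.lower().split())
    return [1.0 if word in tokens else 0.0 for word in vocab]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(X, y, epochs=500, lr=0.5):
    """Fit weights and bias with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            err = p - yi  # gradient of the log-loss w.r.t. the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict_proba(text, vocab, w, b):
    x = featurize(text, vocab)
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Hypothetical training data: (anchor text, 1 = opt-out link, 0 = other link).
LABELLED = [
    ("opt out of targeted advertising", 1),
    ("unsubscribe from marketing emails", 1),
    ("click here to opt out", 1),
    ("read our terms of service", 0),
    ("contact our support team", 0),
    ("learn more about cookies", 0),
]

vocab = sorted({tok for text, _ in LABELLED for tok in text.lower().split()})
X = [featurize(text, vocab) for text, _ in LABELLED]
y = [label for _, label in LABELLED]
w, b = train(X, y)

p_optout = predict_proba("opt out of emails", vocab, w, b)
p_other = predict_proba("our terms", vocab, w, b)
```

In practice a library implementation and a far larger annotated corpus would be used; the point here is only that the classifier learns which words in and around a link signal an opt-out mechanism.
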
The resulting dataset is publicly available at https://www.usableprivacy.org/data. Users can install the browser extension and test it on policies available at The Usable Privacy Explore Website or submit new policies for analysis.

Mohammadi, Pampus, and Heisel [2] introduced a Privacy Policy Options (PPO) pattern. The PPO intends to provide a user-friendly and intuitive representation of privacy policies, empowering end-users to specify their privacy preferences. The solution’s core is a structured table pattern that categorises data usage into specific areas: service and support, research and development, marketing, and security, along with categories for data sharing with third parties and data retention periods. The table is designed to clearly show the types of data collected and how it is processed, shared, and stored. Symbols indicate whether data collection is mandatory or optional, allowing users to understand and modify their privacy preferences directly within the table. This pattern is adaptable to mobile and web applications, aiming to make privacy policies more transparent and customisable for the end-user, enhancing their control over personal data usage.
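
One possible way to model the table behind the PPO pattern is sketched below. The class and field names (`PolicyCell`, `PolicyTable`, `symbol`) and the text symbols are hypothetical and are not taken from the paper; third-party sharing and retention periods are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class PolicyCell:
    data_type: str        # e.g. "email address"
    purpose: str          # e.g. "marketing"
    mandatory: bool       # True: required for the service, not user-editable
    allowed: bool = True  # the user's current preference


class PolicyTable:
    """Rows are data types, columns are purposes, cells hold preferences."""

    def __init__(self, cells):
        self._cells = {(c.data_type, c.purpose): c for c in cells}

    def set_preference(self, data_type, purpose, allowed):
        cell = self._cells[(data_type, purpose)]
        if cell.mandatory and not allowed:
            raise ValueError(f"{data_type} for {purpose} is mandatory")
        cell.allowed = allowed

    def symbol(self, data_type, purpose):
        """Assumed text symbols standing in for the pattern's icons."""
        cell = self._cells.get((data_type, purpose))
        if cell is None:
            return "-"  # data type not used for this purpose
        if cell.mandatory:
            return "M"
        return "on" if cell.allowed else "off"


table = PolicyTable([
    PolicyCell("email address", "service and support", mandatory=True),
    PolicyCell("email address", "marketing", mandatory=False),
    PolicyCell("location", "research and development", mandatory=False),
])

# The user declines optional marketing use of their email address.
table.set_preference("email address", "marketing", allowed=False)
```

Rejecting changes to mandatory cells mirrors the distinction the pattern draws between mandatory and optional data collection.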

Khandelwal et al. [3] proposed PriSEC, a machine learning-based tool to automate the discovery, presentation, and enforcement of web privacy controls. PriSEC simplifies privacy management by providing a centralised interface for users to search, modify, and enforce privacy settings across various websites with minimal intervention.
PriSEC's backend employs a Domain Crawler to systematically navigate websites and identify potential privacy control pages using keywords like "privacy" and "settings." The crawler simulates user interactions to reveal both visible and hidden privacy control elements. A machine learning classifier then distinguishes genuine privacy control pages from irrelevant ones based on textual and visual features.
Once control pages are identified, the Recipe Generator takes over. This dynamic component interacts with each page, uncovering all interactive elements and organising them into a structured dependency graph. This graph serves as a roadmap, mapping out the sequences of actions required to navigate privacy settings. The generator further classifies these elements into distinct UI types and creates detailed control recipes that encapsulate the necessary steps to adjust privacy settings automatically.
The Client Application operates as a browser extension, providing a user-friendly interface for interacting with privacy settings identified by PriSEC's backend. Users can view a list of privacy options and issue natural language queries to quickly find specific settings. The Enforcer module then automatically applies the selected privacy settings by executing the necessary JavaScript code, requiring no further user interaction.
Unfortunately, no prototype is publicly available for testing, and the dataset of privacy control pages mentioned by the authors is not yet available at https://github.com/wi-pi/prisec_data (as of May 2024).
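
The way a dependency graph of page elements can be turned into a control recipe is sketched below. This is a simplified illustration, not PriSEC's implementation: it assumes each element is revealed by at most one other element (a tree rather than a general graph), and the `Element` fields, the `build_recipe` helper, the action names, and the CSS selectors are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Element:
    selector: str                      # CSS selector identifying the element
    ui_type: str                       # e.g. "button", "link", "toggle"
    revealed_by: Optional[str] = None  # selector to activate first, if any


def build_recipe(elements, target):
    """Walk back from the target setting to the page root, then reverse."""
    by_selector = {e.selector: e for e in elements}
    chain, seen = [], set()
    current = by_selector[target]
    while current is not None:
        if current.selector in seen:
            raise ValueError("cyclic dependency between elements")
        seen.add(current.selector)
        chain.append(current)
        parent = current.revealed_by
        current = by_selector.get(parent) if parent else None
    chain.reverse()
    return [
        {"action": "set" if e.ui_type == "toggle" else "click",
         "selector": e.selector}
        for e in chain
    ]


# Hypothetical settings page: a menu reveals a tab, which reveals a toggle.
elements = [
    Element("#account-menu", "button"),
    Element("#privacy-tab", "link", revealed_by="#account-menu"),
    Element("#ad-personalisation", "toggle", revealed_by="#privacy-tab"),
]

recipe = build_recipe(elements, "#ad-personalisation")
```

A stored recipe of this kind is tied to the page's current markup, which is why changes to a website's HTML can make it obsolete.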

Platforms: personal computers

Example

Opt-Out Easy allows users to request that a privacy policy be analysed [1].

Opt-Out Easy results for a website with a privacy policy already analysed (in the example, thewaltdisneycompany.com, available from the Usable Privacy Explore website) [1].

Opt-Out Easy help page [1].

Overview of an example realisation of the PPO pattern for the web layout [2].

A typical workflow of enforcement in PriSEC [3].

Use cases

  • Facilitating the practical exercise of opt-out choices and personalisation of privacy preferences.
  • Systematic analysis of opt-out practices and advocacy for better opt-out standards.

Pros

  • The Opt-Out Easy extension increases user awareness of available opt-out choices, contributing to better privacy management. It helps users identify and exercise opt-out options more quickly, enhancing user empowerment regarding privacy decisions. The extension also highlights the potential for crowd-sourced efforts to pressure entities to improve opt-out services' availability and response times, advocating for better standards. Additionally, it suggests that systematically collecting opt-out process statistics could inform policymakers and encourage the implementation of minimum standards for opt-out service availability and responsiveness [1].
  • PriSEC significantly reduced users' time to adjust privacy settings, with participants completing tasks 3.75 times faster than with manual methods. It simplifies privacy management to a few clicks, avoiding extensive navigation, and achieves a higher System Usability Scale (SUS) score of 72 compared to 63 for the manual method, indicating a better user interface and overall user experience [3].
  • The PPO pattern was designed and tested for both mobile and web applications, aiming to improve the practical implementation of privacy policies for service providers and end-users. Transparent data handling and the ability for users to intervene in the processing of their data are expected to increase the trustworthiness of the service and to help service providers demonstrate GDPR compliance [2].
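
For readers unfamiliar with the System Usability Scale figures cited for PriSEC in the list above, the standard scoring rule can be sketched as follows: ten items are rated from 1 to 5, odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating, and the sum is multiplied by 2.5. The `sus_score` function name and the example responses are hypothetical and are not data from the study.

```python
def sus_score(responses):
    """Score one participant's ten SUS ratings (each 1 to 5) on a 0-100 scale."""
    if len(responses) != 10:
        raise ValueError("SUS uses exactly ten items")
    total = 0
    for item, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("ratings must be between 1 and 5")
        # Odd items are positively worded, even items negatively worded.
        total += (rating - 1) if item % 2 == 1 else (5 - rating)
    return total * 2.5


neutral = sus_score([3] * 10)                     # every item rated neutral
best = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])  # most favourable answers
```

A study-level figure such as those reported in [3] is typically the mean of these per-participant scores.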

Cons

  • There is no legal obligation for service providers to adopt a standardised format for privacy policies, potentially limiting the widespread adoption of the pattern. Service providers might also restrict the modifiability of the pattern by greying out options, reducing user control over data processing. Additionally, allowing users to disagree with certain data processing activities might restrict the use of the service, deterring both users and providers from fully embracing the pattern. Since service providers often rely on data usage for revenue, there may be resistance to implementing features that allow users to limit data processing [2].
  • PriSEC may struggle to manage privacy settings that require interactions with multiple selectors before enforcement and can fail to generate control recipes for websites using non-standard HTML implementations. Its backend may also face difficulties logging into websites without standard third-party social logins, necessitating manual intervention. Additionally, the diverse implementations of web technologies and changes in website HTML can render stored control recipes obsolete, leading to enforcement failures and impacting the system's long-term effectiveness [3].
  • The Opt-Out Easy extension focused exclusively on opt-out links using anchor tags, overlooking those implemented via non-anchor tags with JavaScript event handlers. Additionally, the study only considered policies written in English and was limited to top-ranked U.S. websites, missing diversity in privacy policies from non-U.S. and lower-ranked sites [1].

Privacy Choices

This guideline discusses solutions related to the privacy choices design space [4], as they focus on providing diverse and customisable privacy options for users.

  • Binary choices > Opt-in/out
    Opt-Out Easy fits within the binary opt-in/out choice type because it focuses on enabling users to easily opt out of data collection practices, which inherently involves making binary decisions.

  • On-demand
    This guideline discusses solutions that can be applied on-demand when users actively seek the tool's input to make informed decisions.

  • Just in time
    This guideline discusses solutions that aid users in identifying and exercising opt-out choices or denying privacy policies precisely at the moment such choices are relevant—when users are visiting websites and potentially subject to data collection practices.

  • Visual
    This guideline presents and discusses solutions designed to be delivered visually in the form of text, images, and icons, ensuring clear and understandable communication of privacy choices to users.

  • Feedback
    The solutions examined in this guideline are designed to provide users with immediate feedback on privacy-related actions, such as opting out of data collection practices or understanding privacy policies. These tools offer timely notifications about the actions taken, enhancing user awareness and decision-making. However, it is important to note that while these tools provide feedback, they do not guarantee enforcement, as they act as intermediaries between the user and the service's privacy mechanisms. This distinction highlights the tools' role in supporting user empowerment and informed choices rather than directly controlling data practices.

  • Presentation
    Privacy choices always involve presenting clear and easily understandable information to users about potential data practices, available options, and how to communicate privacy decisions. This often includes multiple components and integrates with related privacy notices, requiring careful consideration of design dimensions such as timing, channel, and modality [4].
    This guideline presents solutions that align with the definition of the presentation of privacy choices, aiming for clarity, understanding of data practices, and effective communication with users regarding their privacy decisions.

  • Primary
    Discussed solutions exemplify the use of the primary channel to facilitate direct and immediate interaction with users regarding their privacy choices embedded within the context of their current online activities.

Control

This guideline discusses solutions that provide users with tools and mechanisms to manage and control their privacy settings. This includes obtaining informed consent, opting out of data collection or processing, and allowing users to influence how service providers handle their personal data. The supporting papers emphasise enhancing user empowerment, simplifying the process of managing privacy settings, and making privacy controls more user-friendly.

Other related privacy attributes:

Security can be considered indirectly related, as the ability to opt out of certain data uses can help secure personal data against unwanted or malicious access or processing.

Detecting and highlighting privacy control mechanisms can improve transparency by clarifying how users can prevent their data from being used for certain purposes.

The discussed solutions allow users to control and limit the sharing of their personal data with third parties, helping to prevent unauthorised data sharing.


References

[1] Vinayshekhar Bannihatti Kumar, Roger Iyengar, Namita Nisal, Yuanyuan Feng, Hana Habib, Peter Story, Sushain Cherivirala, Margaret Hagan, Lorrie Cranor, Shomir Wilson, Florian Schaub, and Norman Sadeh. Finding a Choice in a Haystack: Automatic Extraction of Opt-Out Statements from Privacy Policy Text. In Proceedings of The Web Conference 2020 (WWW '20). Association for Computing Machinery, New York, NY, USA, 2020, 1943–1954. https://doi.org/10.1145/3366423.3380262

[2] Nazila Gol Mohammadi, Julia Pampus, and Maritta Heisel. Pattern-based incorporation of privacy preferences into privacy policies: negotiating the conflicting needs of service providers and end-users. In Proceedings of the 24th European Conference on Pattern Languages of Programs (EuroPLoP '19). Association for Computing Machinery, New York, NY, USA, 2019, Article 5, 1–12. https://doi.org/10.1145/3361149.3361154

[3] Rishabh Khandelwal, Thomas Linden, Hamza Harkous, and Kassem Fawaz. PriSEC: A Privacy Settings Enforcement Controller. In 30th USENIX Security Symposium (USENIX Security 21), 2021, 465–482. https://www.usenix.org/conference/usenixsecurity21/presentation/khandelwal

[4] Yuanyuan Feng, Yaxing Yao, and Norman Sadeh (2021). A Design Space for Privacy Choices: Towards Meaningful Privacy Control in the Internet of Things. In CHI Conference on Human Factors in Computing Systems (CHI ’21), May 8–13, 2021, Yokohama, Japan. ACM, New York, NY, USA, 16 pages. https://doi.org/10.1145/3411764.3445148

[5] Susanne Barth, Dan Ionita, and Pieter Hartel (2022). Understanding Online Privacy — A Systematic Review of Privacy Visualizations and Privacy by Design Guidelines. ACM Comput. Surv. 55, 3, Article 63 (February 2022), 37 pages. https://doi.org/10.1145/3502288