Implement Integrated Personal Data Storage to Allow Users to Store and Manage Their Personal Data

Problem Summary

In the digital age, users generate vast amounts of personal data across various platforms, leading to scattered and fragmented data storage managed by multiple service providers. This fragmentation poses significant challenges in ensuring data privacy, security, and user control. Users often lack the tools and understanding needed to manage their privacy preferences effectively, resulting in unintended data exposure and insufficient protection of personal information.

Rationale

Secure, user-centric personal data vaults that support advanced privacy management features and compliance with privacy regulations give users control over their personal data.

Solution

Implement personal data stores, or Personal Data Vaults (PDVs), that give users comprehensive control over their personal data. These vaults should be equipped with advanced privacy management tools, automated decision-making capabilities, robust security measures, and compliance features to meet regulatory requirements.

Mun et al. [1] propose the PDVLoc framework, a Personal Data Vault (PDV) designed to control the sharing of location data from mobile devices. The framework allows users to store their location data in a centralised vault and manage data sharing with third parties through fine-grained Access Control Lists (ACLs). It includes tools like a rule recommender to help users make informed privacy decisions and a trace audit feature for evaluating privacy policy adherence over time. The aim is to ensure that users have control over their location data, provide secure storage and customisable sharing settings, and help maintain transparency and accountability through auditing.
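The combination of fine-grained access rules and an audit trail described above can be sketched as follows. This is a minimal illustration, not PDVLoc's actual API: the names `LocationVault`, `AclRule`, and `LocationPoint` are hypothetical, and a simple bounding box is assumed as the form of spatial rule.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass(frozen=True)
class LocationPoint:
    lat: float
    lon: float
    timestamp: datetime


@dataclass(frozen=True)
class AclRule:
    """Hypothetical rule: one requester may read points inside a bounding box."""
    requester: str
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def permits(self, requester: str, point: LocationPoint) -> bool:
        return (
            requester == self.requester
            and self.min_lat <= point.lat <= self.max_lat
            and self.min_lon <= point.lon <= self.max_lon
        )


@dataclass
class LocationVault:
    points: List[LocationPoint] = field(default_factory=list)
    rules: List[AclRule] = field(default_factory=list)
    # Each audit entry records (time of request, requester, number of points released).
    audit_log: List[Tuple[datetime, str, int]] = field(default_factory=list)

    def store(self, point: LocationPoint) -> None:
        self.points.append(point)

    def share(self, requester: str) -> List[LocationPoint]:
        """Release only points that some rule permits; anything else is denied."""
        released = [
            p for p in self.points
            if any(rule.permits(requester, p) for rule in self.rules)
        ]
        self.audit_log.append((datetime.now(timezone.utc), requester, len(released)))
        return released


vault = LocationVault()
vault.store(LocationPoint(34.07, -118.44, datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)))
vault.store(LocationPoint(40.71, -74.00, datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)))
vault.rules.append(AclRule("fitness-app", 34.0, 34.2, -118.6, -118.3))

shared = vault.share("fitness-app")   # only the point inside the box
blocked = vault.share("ad-network")   # no rule for this requester, so nothing is released
```

Every call to `share` is logged, including requests that release nothing, so the user can later review who asked for data and how much was disclosed.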

Singh, Carminati and Ferrari [2] present the Privacy-Aware Personal Data Storage (P-PDS) system, which uses semi-supervised and active learning to automate privacy decisions for user data stored in a centralised repository. The system reduces user burden by learning privacy preferences from both labelled and unlabeled data and adjusts privacy decisions based on user-specific preferences through personalised and history-based active learning. This approach enhances accuracy and efficiency in managing privacy settings. The aim is to provide an intelligent and user-friendly way to manage privacy settings, automating decisions while aligning with individual user preferences and reducing manual configuration efforts.
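To illustrate how active learning can reduce the number of questions put to the user, the sketch below uses uncertainty sampling with a nearest-neighbour rule. It is not the P-PDS algorithm and omits the semi-supervised part. The two features (data sensitivity and requester trust, both in the range 0 to 1) and the function names are assumptions made for this example.

```python
import math


def nearest_distance(x, examples):
    """Distance from x to the closest example in the list."""
    return min(math.dist(x, e) for e in examples)


def predict(x, labelled):
    """Return (allow?, margin). A small margin means the prediction is uncertain.

    `labelled` is a list of (features, decision) pairs and must contain
    at least one allowed and one denied example.
    """
    allowed = [f for f, decision in labelled if decision]
    denied = [f for f, decision in labelled if not decision]
    d_allow = nearest_distance(x, allowed)
    d_deny = nearest_distance(x, denied)
    return d_allow < d_deny, abs(d_allow - d_deny)


def most_uncertain(unlabelled, labelled):
    """Pick the access request whose predicted decision has the smallest margin."""
    return min(unlabelled, key=lambda x: predict(x, labelled)[1])


# Features: (data sensitivity, requester trust). Decisions: True = allow.
labelled = [((0.1, 0.9), True), ((0.9, 0.1), False)]
unlabelled = [(0.2, 0.8), (0.5, 0.5), (0.8, 0.3)]

query = most_uncertain(unlabelled, labelled)  # the request to show the user
labelled.append((query, False))               # suppose the user denies it
decision, _ = predict((0.6, 0.5), labelled)   # nearby requests now follow that answer
```

Only the most ambiguous request is put to the user; the clear-cut ones are decided automatically, which is the burden reduction the approach aims for.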

Sanchez, Torre and Knijnenburg [3] propose a solution for managing privacy preferences in IoT environments using semantic web technology. They introduce the Privacy Preference for IoT (PPIoT) ontology, which integrates privacy preferences, the W3C Semantic Sensor Network Ontology, Fair Information Practices (FIP) principles, and GDPR compliance. The Personal Data Manager (PDM) component mediates and manages user privacy preferences, allowing users to control data disclosure to third parties. The solution includes an Interactive Privacy Preference Model (PPM) to present privacy policies transparently and gather affirmative user consent. The aim is to empower users to manage their privacy settings effectively in IoT environments, ensuring that data disclosure is controlled and compliant with privacy regulations. The source code for the PDM is available on GitHub.
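The mediation role of a component like the PDM can be sketched in a few lines. This illustration uses plain Python dictionaries rather than the PPIoT ontology. The `Decision` enum, the `mediate` function, and the `(data type, purpose)` keys are hypothetical names chosen for the example and are not taken from the published PDM source code.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ASK = "ask"


def mediate(preferences, requests, ask_user):
    """Decide each third-party request against the stored preferences.

    preferences: dict mapping (data_type, purpose) to a Decision
    requests:    iterable of (data_type, purpose) pairs from the third party
    ask_user:    callback returning True when the user gives affirmative consent
    """
    outcome = {}
    for request in requests:
        decision = preferences.get(request, Decision.ASK)
        if decision is Decision.ASK:
            # No stored preference: obtain explicit consent and remember the answer.
            decision = Decision.ALLOW if ask_user(request) else Decision.DENY
            preferences[request] = decision
        outcome[request] = decision
    return outcome


preferences = {
    ("location", "navigation"): Decision.ALLOW,
    ("location", "advertising"): Decision.DENY,
}
requests = [
    ("location", "navigation"),
    ("location", "advertising"),
    ("heart_rate", "research"),
]
# A real interface would prompt the user; this stand-in always declines.
result = mediate(preferences, requests, ask_user=lambda request: False)
```

Requests without a stored preference are never granted silently: they fall through to the consent callback, and the answer is saved so the user is not asked again.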

While specific details vary, the practical implementation of these systems often points towards cloud-based storage due to the need for scalability, accessibility, and integration with diverse devices and platforms. Local storage could be used in certain contexts, but cloud storage is more likely given the complexity and volume of data involved in these systems.

Platforms: personal computers, mobile devices, smart devices

Example

PDVLoc web interface [1].

Right: The PDVLoc web user interface displays information inferred from collected user data, provides data-sharing feedback, and allows configuration of spatial boundaries. Left: The trace audit web user interface for reviewing shared data [1].

PPM-based interactive UI [3].

The Interactive Privacy Preference Model (PPM) interface for Privacy Policy Settings, which conforms to the PPIoT ontology [3].

The Personal Data Manager (PDM) confirmation request and recommendation [3].

Use cases

  • Supporting users in retaining ownership of their data.
  • Empowering users to manage their personal data effectively.

Pros

  • User studies indicate high satisfaction with, and acceptance of, privacy management frameworks that incorporate policy management tools such as rule recommenders and trace audits, which enhance user decision-making [1].
  • The integration of semantic web technology in privacy frameworks like PPIoT enables interoperability and scalability across diverse IoT devices and platforms [3].
  • Combining automation with user consent lets users keep control over their privacy settings while benefiting from automated processes, enhancing user experience and regulatory compliance [3].

Cons

  • Tools like the PDM can only support negotiation when the third-party policy statement is encoded in PPIoT [3].

Privacy Choices

Privacy choices give people control over certain aspects of data practices. Considering the design space for privacy choices [4], this guideline can be applied in the following dimensions:

  • Privacy rights-based choices
    This guideline supports privacy rights-based choices, advocating for users to manage their data effectively.
  • Contextualised
    It aligns with the concept of providing choices based on specific contexts, such as time, location, or purpose of data use. For example, Sanchez, Torre and Knijnenburg [3] focus on providing detailed, context-specific privacy settings using semantic web technologies and an interactive model that allows for dynamic adjustment of privacy preferences based on context.
  • Multiple choices
    This guideline supports providing users with several options for data access and usage, such as setting different levels of data sharing or specifying conditions under which data can be accessed.
  • Binary choices
    The guideline allows for basic opt-in/opt-out mechanisms for users to consent to or deny data collection and processing.

  • Personalised
    The solution discussed in Sanchez, Torre and Knijnenburg [3] involves using semantic technologies and interactive models to ensure that privacy choices are personalised and presented at relevant times based on user-defined contexts.
  • On-demand
    Users can access and change their privacy settings at any time.
  • Context-aware
    The guideline supports providing privacy options based on the user's current context, such as location or activity.
  • At Setup
    Users can configure their privacy settings when first interacting with the system.
  • Just in time
    Privacy choices can be presented when specific data practices are about to occur, allowing users to make informed decisions just in time.

  • Machine-readable
    Solutions like the one discussed in Singh, Carminati and Ferrari [2] focus on automating privacy management through machine-readable privacy settings and user preferences, leveraging machine learning and privacy agents.
  • Visual
    The guideline includes using visual methods (text, icons, images) to communicate privacy choices clearly.

  • Presentation
    Every privacy choice has a presentation: the system gives users clear, easily understandable information about potential data practices, the available options, and how to communicate their privacy decisions. A presentation often combines multiple components, integrates with related privacy notices, and requires careful consideration of design dimensions such as timing, channel, and modality [4]. The solutions presented in this guideline display privacy options through understandable, user-friendly interfaces.

  • Primary
    This guideline can be applied to the same platform or device the user is interacting with.
  • Secondary
    This guideline can be applied to secondary channels if the primary channel does not have or has a limited user interface.

Control

The guideline focuses on empowering users to store and manage their personal data, emphasising user consent, control over data sharing, and the ability to configure privacy settings, which aligns closely with the attribute of Control [5].

Other related privacy attributes:

This guideline can support Pseudonymisation by allowing the replacement of personally identifiable markers with artificial identifiers. This ensures that data can only be traced back to individual users with the help of additional information, providing an added layer of privacy protection.
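One common way to derive artificial identifiers is a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library. It is an illustrative assumption rather than a technique prescribed by the cited works, and the function name `pseudonymise` is hypothetical. The secret key plays the role of the "additional information": without it, a pseudonym cannot be linked back to the original identifier by recomputation.

```python
import hashlib
import hmac
import secrets


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a stable artificial one derived from a secret key."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# The key must be stored separately from the pseudonymised data.
vault_key = secrets.token_bytes(32)

record = {"user": "alice@example.com", "heart_rate": 72}
shared_record = {**record, "user": pseudonymise(record["user"], vault_key)}
```

The same identifier always maps to the same pseudonym under one key, so records belonging to one user remain linkable for analysis without exposing who that user is.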

This guideline can address the Correctness attribute by enabling users to edit and update their personal data.

The guideline involves setting up systems that allow users to manage how long their data is stored. Users can configure data retention periods according to their preferences, ensuring that data is not kept longer than necessary.
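A user-configured retention period can be enforced with a periodic purge, sketched below. The `StoredRecord` class and `purge_expired` function are hypothetical names for this example, and a real vault would also need to delete the data from backups and from any replicas.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass(frozen=True)
class StoredRecord:
    key: str
    stored_at: datetime


def purge_expired(
    records: List[StoredRecord],
    retention: timedelta,
    now: Optional[datetime] = None,
) -> List[StoredRecord]:
    """Return only the records that are still within the retention period."""
    current = now if now is not None else datetime.now(timezone.utc)
    return [r for r in records if current - r.stored_at <= retention]


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
records = [
    StoredRecord("location-trace", datetime(2024, 5, 25, tzinfo=timezone.utc)),
    StoredRecord("old-trace", datetime(2024, 1, 1, tzinfo=timezone.utc)),
]
# The user chose to keep data for 30 days.
kept = purge_expired(records, retention=timedelta(days=30), now=now)
```

Passing `now` explicitly keeps the function deterministic for testing; in production it defaults to the current UTC time.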

Implementing Personal Data Vaults (PDVs) includes robust security measures to protect personal data from unauthorised access and breaches. These measures involve encryption, secure data storage, and controlled access mechanisms.


References

[1] Min Y. Mun, Donnie H. Kim, Katie Shilton, Deborah Estrin, Mark Hansen, and Ramesh Govindan (2014). PDVLoc: A Personal Data Vault for Controlled Location Data Sharing. ACM Trans. Sen. Netw. 10, 4, Article 58 (June 2014), 29 pages. https://doi.org/10.1145/2523820

[2] Bikash Chandra Singh, Barbara Carminati, and Elena Ferrari (2019). Privacy-Aware Personal Data Storage (P-PDS): Learning How to Protect User Privacy from External Applications. IEEE Transactions on Dependable and Secure Computing, 18(2), 889-903. https://doi.org/10.1109/TDSC.2019.2903802

[3] Odnan Ref Sanchez, Ilaria Torre, Bart P. Knijnenburg (2020). Semantic-based privacy settings negotiation and management. Future Generation Computer Systems, 111, 879-898. https://doi.org/10.1016/j.future.2019.10.024

[4] Yuanyuan Feng, Yaxing Yao, and Norman Sadeh (2021). A Design Space for Privacy Choices: Towards Meaningful Privacy Control in the Internet of Things. In CHI Conference on Human Factors in Computing Systems (CHI ’21), May 8–13, 2021, Yokohama, Japan. ACM, New York, NY, USA, 16 pages. https://doi.org/10.1145/3411764.3445148

[5] Susanne Barth, Dan Ionita, and Pieter Hartel (2022). Understanding Online Privacy — A Systematic Review of Privacy Visualizations and Privacy by Design Guidelines. ACM Comput. Surv. 55, 3, Article 63 (February 2022), 37 pages. https://doi.org/10.1145/3502288