Big data applications promise to help with many urgent societal problems, such as health care, traffic coordination, and energy management. The basic premise of these applications is "the more data the better". In theory, any owner of a smartphone or smartwatch could be a continuous source of valuable data and contribute to many useful big data applications. However, such data can reveal a lot of sensitive information, such as the current location or the heart rate of the device owner. Protection of personal data is important in our society, as manifested, for example, in the EU General Data Protection Regulation (GDPR). Yet privacy protection and useful big data applications are hard to reconcile. Implementing proper privacy protection requires skills that are typically outside the focus of data analysts and big data developers. As a result, many individuals who doubt that their data will be properly protected tend to share none of it. There are, however, many good privacy solutions that lie between the extremes of the "all or nothing" principle. For example, instead of continuously publishing the current location of individuals, one might aggregate this data and publish only how many individuals are in a certain area of the city. In this way, personal data is not revealed, yet useful information remains available for applications such as traffic coordination.
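The aggregation example can be made concrete with a minimal sketch. It is an illustration only, not the project's implementation: the function names `cell_of` and `aggregate_counts`, the square grid, and the cell size of 0.01 degrees are all assumptions chosen for the example.

```python
import math


def cell_of(lat, lon, cell_deg=0.01):
    """Map a coordinate to the index of a square grid cell (hypothetical grid)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def aggregate_counts(locations, cell_deg=0.01):
    """Count distinct individuals per grid cell.

    locations: iterable of (user_id, lat, lon) tuples.
    Returns a dict mapping cell index -> number of distinct users.
    Neither user ids nor exact coordinates appear in the result.
    """
    users_per_cell = {}
    for user_id, lat, lon in locations:
        users_per_cell.setdefault(cell_of(lat, lon, cell_deg), set()).add(user_id)
    return {cell: len(users) for cell, users in users_per_cell.items()}


if __name__ == "__main__":
    readings = [
        ("a", 52.5201, 13.4049),
        ("b", 52.5255, 13.4012),
        ("a", 52.5203, 13.4051),  # second reading of the same individual
        ("c", 52.5312, 13.4120),
    ]
    print(aggregate_counts(readings))
```

Only the per-area counts leave the function, which is the kind of output a traffic coordination application could consume. Note that a cell containing a single individual may still be revealing, so a real deployment would likely need additional safeguards.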
It is the goal of the Parrot project to provide tools for real-time data analysis applications that leverage this "middle ground". Data analysts should only be required to specify their data needs and end-users can select the privacy requirements for their data as well as the applications and end-users they want to share their data with. The project results are expected to enable the (semi-)automatic integration of appropriate privacy protection into real-time data stream applications. Thus, individuals can safely provide data which in turn improves the results of big data applications.
The first project period was strongly affected by the COVID-19 pandemic and by difficulties in recruiting personnel. Technical progress was correspondingly reduced and focused on two challenges: "zero-interaction pairing (ZIP)" solutions and the overall architecture.
ZIP solutions can establish secure channels between smart devices without human interaction, which could be very useful for the project, but existing solutions are too slow and too vulnerable to attacks. We developed a new ZIP solution and demonstrated that it is substantially faster and more robust against attacks.
Two cornerstones underpin the objectives of the project: privacy quantification and integrated privacy support in real-time data analysis with Distributed Complex Event Processing (DCEP). Privacy quantification will make it possible to specify precisely the threat level and, in particular, the level of privacy protection that the different privacy-protecting mechanisms can achieve. Configuring a DCEP instance requires solving two multidimensional optimization problems: (1) selecting privacy-protecting mechanisms that fulfill both the privacy requirements of data subjects and the data quality requirements of IoT applications, and (2) placing traditional CEP operators and privacy-protecting mechanisms. The resulting DCEP instance lets application developers focus on application logic and data quality requirements, while the DCEP instance enforces the data subjects' privacy requirements. The new concept of Event Proximity will also make it possible to implement context-aware privacy protection: in normal situations privacy is protected, but if anomalies or situations close to hazards are detected, privacy protection might be reduced or suspended. Another means of supporting privacy is the move from the classical cloud approach to fog computing or even to a fully distributed approach with DCEP.
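As a rough illustration of the first optimization problem, the sketch below picks one mechanism that satisfies both a privacy requirement and a data quality requirement. Everything in it is assumed for the example: the `Mechanism` type, the numeric scores in [0, 1], the mechanism names, and the tie-breaking rule are hypothetical and do not represent the project's privacy quantification or its actual optimization method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mechanism:
    name: str
    privacy_level: float  # hypothetical score in [0, 1]; higher = stronger protection
    data_quality: float   # hypothetical score in [0, 1]; higher = more useful output


def select_mechanism(mechanisms, min_privacy, min_quality):
    """Return the feasible mechanism with the best data quality, or None.

    A mechanism is feasible if it meets the data subject's privacy
    requirement and the application's data quality requirement.
    Ties in quality are broken in favor of stronger privacy.
    """
    feasible = [
        m for m in mechanisms
        if m.privacy_level >= min_privacy and m.data_quality >= min_quality
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda m: (m.data_quality, m.privacy_level))


if __name__ == "__main__":
    candidates = [
        Mechanism("raw_location", 0.0, 1.0),
        Mechanism("coarse_grid", 0.6, 0.7),
        Mechanism("area_counts", 0.9, 0.5),
    ]
    print(select_mechanism(candidates, min_privacy=0.5, min_quality=0.4))
```

A `None` result signals that no single mechanism reconciles the two requirements, in which case either the data subject or the application would have to relax a constraint. The placement problem (2) is not covered by this sketch.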
The application domain for privacy protection is human-centered IoT (including participatory sensing and mHealth), which is especially prone to privacy issues. The project will implement and deploy three IoT applications with different privacy concerns in order to gain experience, evaluate the project results, and promote the project. The project combines theoretical work on privacy quantification, knowledge representation, and optimization problems with systems work on designing and implementing prototypes. These prototypes will be used for systematic qualitative and quantitative evaluations, demonstrations, deployments, and field tests that transfer conceptual research results into deployable solutions.