Privacy-Preserving Collection and Analysis of User Data – Provably Secure and Practical

  • Name:

    Privacy-Preserving Collection and Analysis of User Data – Provably Secure and Practical

  • Venue:

    252 / BBB

  • Date:

    2026-03-31

  • Speaker:

    Markus Raiber

  • Time:

    15:45

  • A large number of applications collect, store and analyze data about end users. Examples
    include customer loyalty systems such as Payback, incentive systems offered by health
    insurers, behavior-dependent car insurance tariffs, pay-as-you-go public transport services,
    smart metering, and many more besides. Currently, most of these systems simply collect
    raw user data, resulting in vast datasets of personal information. However, this has
    disadvantages for both users and companies. The collected data allows extensive profiles
    to be created that go far beyond the intended use. Large collections of data are also a
    lucrative target for cyber-attacks, which can harm affected users, for example through
    identity theft, and can harm the company involved through negative publicity and
    potential data protection penalties.
    Privacy-preserving technologies are a remedy for these problems, as they allow the desired
    analytics to be evaluated securely on relevant user data without the need to collect or
    store sensitive user data in the clear. However, this comes with new challenges, as
    privacy-preserving technologies incur higher computation and communication costs. In
    this thesis, we propose and evaluate two generic solutions based on these technologies. Both
    solutions are formally modelled in the Universal Composability framework, which allows
    them to be used in any context while maintaining strong security guarantees through
    simulation-based security. Furthermore, both solutions come with a practical prototype
    implementation and evaluation, showcasing the potential for practical deployment as well
    as the current limitations. In both solutions, we ensure that users remain anonymous
    when data is collected, while guaranteeing the authenticity of the collected data.
    Our first solution, called PUBA, is based on personal logbooks stored on each user’s device.
    These logbooks are authenticated and can only be updated by the system operator while
    maintaining the confidentiality of their content. Users can then participate in privacy-
    preserving analytics computation, where it is ensured that their logbook is up-to-date
    and authentic. To accommodate constrained user devices, such as smartphones, users can
    outsource more complex analytics computations to a (potentially malicious) proxy that is
    not colluding with the system operator. Performance evaluations of our prototype show
    that PUBA performs adequately for logbooks storing the last 10 to 30 transactions.
    In our second solution, called POBA, the logbooks are stored on operator-controlled servers
    instead. We model a setting in which multiple operators collaborate to run the system
    without fully trusting each other. Logbook contents are protected by secret-sharing them
    between all the operators involved. Additionally, advanced cryptographic tools, such
    as oblivious RAM, are employed to protect user identities and prevent the linking of
    multiple interactions to the same user. Since data is available without user interaction
    in this setting, operators have more flexibility when running analytics. Because every
    computation requires the agreement of all operators, the analysis results still satisfy the
    privacy requirements as long as some operators behave honestly. Performance evaluations
    of our prototype demonstrate its practicality in the three-party setting: with three
    operators, it can handle over two million logbook entries per day.
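
    To illustrate the idea of secret-sharing logbook contents between operators, the
    following is a minimal sketch of additive secret sharing among three parties. It is not
    the POBA protocol: the abstract does not state which sharing scheme is used, so the
    additive scheme, the field modulus PRIME, and the function names share, reconstruct
    and add_shared are assumptions made purely for illustration. The sketch also omits
    authentication, oblivious RAM, and protection against malicious operators.

```python
import secrets

# Assumed parameters for illustration only (not taken from the thesis):
# a prime field modulus and the number of collaborating operators.
PRIME = 2**61 - 1
NUM_OPERATORS = 3


def share(value, n=NUM_OPERATORS, prime=PRIME):
    """Split `value` into n additive shares that sum to `value` modulo `prime`."""
    if not 0 <= value < prime:
        raise ValueError("value must lie in [0, prime)")
    # The first n-1 shares are uniformly random field elements.
    shares = [secrets.randbelow(prime) for _ in range(n - 1)]
    # The last share is chosen so that all shares sum to the value.
    shares.append((value - sum(shares)) % prime)
    return shares


def reconstruct(shares, prime=PRIME):
    """Recombine all shares; any proper subset reveals nothing about the value."""
    return sum(shares) % prime


def add_shared(a, b, prime=PRIME):
    """Each operator adds its own two shares locally, without any communication."""
    return [(x + y) % prime for x, y in zip(a, b)]


if __name__ == "__main__":
    # Two hypothetical logbook entries, e.g. transaction amounts in cents.
    entry_a = share(1250)
    entry_b = share(830)
    total = add_shared(entry_a, entry_b)
    print(reconstruct(total))
```

    In this sketch each operator holds one share per entry, and a simple aggregate such as
    a sum can be computed on the shares alone. The value is only revealed if all three
    operators combine their shares, which mirrors the requirement that all operators
    agree to a computation.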