Leveraging Multi-Party Computation for Privacy-Preserving Machine Learning

  • Name:

    Leveraging Multi-Party Computation for Privacy-Preserving Machine Learning

  • Venue:

    252 / BBB

  • Date:

    2026-04-30

  • Speaker:

    Yufan Jiang

  • Time:

    15:00

  • Machine Learning (ML) is no longer merely a research topic. At the time of writing
    this thesis, ML has become a well-studied and mature field, enabling the development
    of real-world industrial applications based on trained models, such as face recognition,
    fraud detection, recommendation systems, and more. In parallel, Generative AI, with
    models such as ChatGPT proposed by OpenAI, has gained significant attention, rapidly
    transforming the way people approach and perform everyday tasks.

    To train an effective machine learning model, a well-structured and comprehensive dataset
    is an essential prerequisite. In real-world applications, large companies such as Google and
    Amazon often collect and own extensive training datasets from their end users, enabling
    them to train models independently. However, even for these large companies, datasets
    are typically collected within the scope of their own business domains and could therefore
    be further improved by integrating datasets from other sources. Yet strict privacy
    regulations such as the General Data Protection Regulation (GDPR) prohibit companies
    from directly sharing their data with others, and in some cases business considerations
    further discourage them from doing so. This raises the following research question:
    can different entities collaboratively train a machine learning model without exposing
    their private datasets?

    Privacy-preserving machine learning (PPML) provides cryptographic mechanisms that
    enable model training without revealing the training data. Within the domain of PPML,
    different entities can collaboratively train a machine learning model from scratch or
    aggregate their local training results in a federated learning (FL) scenario. To achieve this
    goal, we apply secure multi-party computation (MPC) techniques, which enable distrustful
    parties to jointly evaluate functions without revealing their private inputs.
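    The core MPC building block referred to here can be illustrated with additive secret
    sharing. The following is a minimal, hypothetical sketch (the modulus and two-party
    setting are illustrative choices, not taken from the thesis): each input is split into
    random shares that sum to the secret, and linear operations can then be performed
    locally on shares without any party learning the other's input.

    ```python
    import secrets

    P = 2**61 - 1  # illustrative prime modulus

    def share(x, n=2):
        """Split x into n additive shares that sum to x mod P."""
        shares = [secrets.randbelow(P) for _ in range(n - 1)]
        shares.append((x - sum(shares)) % P)
        return shares

    def reconstruct(shares):
        """Recover the secret by summing all shares mod P."""
        return sum(shares) % P

    # Two parties secret-share their private inputs.
    a_shares = share(123)
    b_shares = share(456)

    # Each party locally adds its share of a and b; neither sees a or b.
    sum_shares = [(a + b) % P for a, b in zip(a_shares, b_shares)]
    assert reconstruct(sum_shares) == 579
    ```

    Addition is free in this scheme because shares can be combined locally; multiplication
    requires interaction, which is where most of the protocol cost in MPC-based training lies.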

    In this thesis, we investigate how privacy-preserving machine learning can be achieved
    using MPC protocols. Specifically, we propose and experiment with MPC protocols that
    are more efficient than existing ones. Our first contribution is a four-party
    secret-sharing scheme called X-sharing, along with a set of four-party protocols built
    upon this scheme. We explore how four-party neural network training can be accelerated
    using this new sharing method and compare its performance against existing approaches. Our
    second contribution is the development of new protocols for two-party training of gradient
    boosting decision trees (GBDT). We analyze the underlying modular protocols required
    for private GBDT and propose efficient two-party protocols to improve training efficiency.
    Our third contribution is a maliciously secure aggregation protocol designed for federated
    learning, which provides protection against poisoning attacks. The aggregation protocol
    is designed for a two-server setting, where clients efficiently share their gradient updates
    with the servers, supporting them in generating message authentication codes (MACs).
    We prove the security of the proposed protocols within the universally composable (UC)
    framework.
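    The two-server aggregation setting can be sketched as follows. This is a simplified,
    hypothetical illustration of the additive-sharing idea only (the modulus, integer-encoded
    gradients, and omission of MAC generation and malicious-security checks are assumptions
    for brevity): each client splits its update into two shares, one per server, so that
    combining the servers' per-share sums reveals only the aggregate of all updates.

    ```python
    import secrets

    P = 2**61 - 1  # illustrative prime modulus

    def share_vector(v):
        """Split an integer-encoded gradient vector into two additive shares."""
        r = [secrets.randbelow(P) for _ in v]
        return r, [(x - ri) % P for x, ri in zip(v, r)]

    # Three clients with integer-encoded gradient updates.
    grads = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    server0, server1 = [0, 0, 0], [0, 0, 0]

    for g in grads:
        s0, s1 = share_vector(g)
        # Each server accumulates only its own shares and learns nothing
        # about any individual client's update.
        server0 = [(a + b) % P for a, b in zip(server0, s0)]
        server1 = [(a + b) % P for a, b in zip(server1, s1)]

    # Combining the two servers' aggregates reveals only the sum of all updates.
    agg = [(a + b) % P for a, b in zip(server0, server1)]
    assert agg == [12, 15, 18]
    ```

    A maliciously secure protocol additionally attaches MACs to the shares so the servers
    can verify that no party deviated from the protocol; that machinery is omitted here.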