ReLOI
Project: Representation learning in oncologic imaging
Collaborating Departments: Klinikum rechts der Isar (TUM); Department of Computing (Imperial)
With the surge in data-dependent learning problems, the scientific community needs to adapt the way learning settings are designed so that data is used responsibly. Achieving this requires a large degree of collaboration between the parties that hold large-scale, well-curated datasets (i.e. the data owners) and those with the knowledge and the resources to draw novel insights from that data (i.e. the model owners). Before these parties can collaborate, however, a notion of trust needs to be established. Trust here has several fundamental components: the data owners trust that their data is handled safely and responsibly, and the model owners trust that the provided data allows them to draw useful insights. To address the privacy- and safety-related challenges, a number of collaborative machine learning (CML) mechanisms have previously been proposed, the most notable of which is the federated learning (FL) framework. However, on its own (i.e. without the addition of privacy-enhancing technologies, PETs), FL can be vulnerable to adversarial interference, resulting in a violation of the data owners' privacy or of the integrity of the entire protocol, harming the collaboration for all parties. At the same time, while a number of data valuation strategies for CML exist, they become significantly more difficult to apply when some of the parties are adversarial.
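To make the FL setting concrete, the following is a minimal, illustrative sketch of the federated averaging step at the heart of FL: each data owner trains locally and shares only model updates, never raw records, with the aggregator. The toy model, client data and hyperparameters are assumptions made purely for illustration and are not the configurations used in this project.

```python
# Minimal sketch of federated averaging (FedAvg) with two hypothetical clients.
import numpy as np

def local_update(weights, features, labels, lr=0.1, epochs=5):
    """One client's local training: plain logistic-regression gradient steps."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-features @ w))          # sigmoid predictions
        grad = features.T @ (preds - labels) / len(labels)   # logistic-loss gradient
        w -= lr * grad
    return w

def federated_round(global_weights, clients):
    """Server aggregates client updates, weighted by local dataset size."""
    updates, sizes = [], []
    for features, labels in clients:
        updates.append(local_update(global_weights, features, labels))
        sizes.append(len(labels))
    sizes = np.array(sizes, dtype=float)
    return np.average(updates, axis=0, weights=sizes / sizes.sum())

# Two hypothetical data owners with synthetic data; neither shares raw records.
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(64, 10)), rng.integers(0, 2, 64).astype(float))
           for _ in range(2)]
w = np.zeros(10)
for _ in range(10):
    w = federated_round(w, clients)
```

Note that only the update vectors leave each institution; the adversarial-interference risks discussed above arise precisely because these updates can still leak information or be manipulated.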
This gives rise to a fundamental problem that prevents the widespread use of CML: while data owners are able to commit their data to such protocols, they have very little motivation to do so. The problem stems from two separate issues. First, data governance regulations (such as the GDPR) and the associated privacy concerns make it difficult to share data at all. Second, even in settings that do provide participants with provable guarantees of privacy, there is typically very little incentive (financial or otherwise) for them to share their private data. Moreover, in contexts with complex data where the price of a mistake is very high (e.g. medical machine learning), it is often not feasible for the model owner to accept such data in private form, as it needs to be examined to certify that the contributions actually benefit the model owner.
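For readers unfamiliar with what "provable guarantees of privacy" look like in practice, the sketch below shows the core step of differentially private training: clipping each example's gradient to bound its influence and adding calibrated Gaussian noise before aggregation. The clip norm, noise multiplier and toy gradients are placeholder assumptions, not settings used in our work.

```python
# Illustrative sketch of per-example gradient clipping plus Gaussian noise,
# the mechanism underlying differentially private model training (DP-SGD style).
import numpy as np

def dp_aggregate(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip each example's gradient, then add noise scaled to the clip bound."""
    rng = rng if rng is not None else np.random.default_rng()
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    summed = np.sum(clipped, axis=0)
    noise = rng.normal(scale=noise_multiplier * clip_norm, size=summed.shape)
    return (summed + noise) / len(per_example_grads)

# Hypothetical per-example gradients from one training batch.
rng = np.random.default_rng(1)
grads = [rng.normal(size=10) for _ in range(32)]
private_grad = dp_aggregate(grads, rng=rng)
```

The tension described above is visible even in this toy example: the noise that protects individual contributions also makes it harder for the model owner to judge how much any single contribution is worth.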
One solution to this problem is the development of trustworthy artificial intelligence (TAI) systems. A CML setting needs to satisfy a number of requirements in order to be considered truly trustworthy; these include (among others) mechanisms for private, robust and interpretable model training. The overall goal of my PhD work is to provide the scientific community with mechanisms that satisfy the definition of TAI while being easy to use, computationally affordable and robust against adversarial influence. By the end of the project we aim to provide the means to solve two problems:
- Data owners having the incentive to share their data
- Model owners being able to benefit from that data
This project is split into multiple stages:
- Identifying the need to rely on trust-enhancing technologies
- Providing the means for the community to use these technologies
- Providing the incentive for the community to use these technologies
So far, stages one and two have been addressed through a number of scientific contributions. Stage one is covered by our work on adversarial interference, which visually illustrates the risks associated with collaborative learning. Stage two is covered by our work on privacy-preserving and robustness-enhancing technologies, which can be integrated into certain clinical workflows with little effort. Stage three is currently ongoing and focuses on data valuation and on evaluating the complexity of individual training data points.
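As a rough illustration of the stage-three direction, the sketch below computes per-example gradient norms as one possible gradient-derived signal of how strongly a point drives (or disrupts) training, in the spirit of our work on gradient-derived metrics for data selection and valuation [2]. The toy model, the specific metric and the flagging rule are illustrative assumptions rather than the methods of the ongoing work.

```python
# Sketch of a gradient-derived data valuation signal: per-example gradient norms.
import numpy as np

def per_example_grad_norms(weights, features, labels):
    """Per-example gradient norms for a toy logistic-regression model."""
    preds = 1.0 / (1.0 + np.exp(-features @ weights))
    residuals = (preds - labels)[:, None]   # shape (n, 1)
    grads = residuals * features            # per-example gradients, shape (n, d)
    return np.linalg.norm(grads, axis=1)

rng = np.random.default_rng(2)
X, y = rng.normal(size=(128, 10)), rng.integers(0, 2, 128).astype(float)
norms = per_example_grad_norms(np.zeros(10), X, y)

# Points with unusually large norms could be flagged for review or down-weighted.
suspect_indices = np.argsort(norms)[-5:]
```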
[1] Usynin, D., Rueckert, D. and Kaissis, G., 2023. Beyond gradients: Exploiting adversarial priors in model inversion attacks. ACM Transactions on Privacy and Security, 26(3), pp.1-30.
[2] Usynin, D., Rueckert, D. and Kaissis, G., 2023. Leveraging gradient-derived metrics for data selection and valuation in differentially private training. arXiv preprint arXiv:2305.02942.
[3] Mueller, T.T., Paetzold, J.C., Prabhakar, C., Usynin, D., Rueckert, D. and Kaissis, G., 2022. Differentially Private Graph Neural Networks for Whole-Graph Classification. IEEE Transactions on Pattern Analysis and Machine Intelligence.
[4] Chobola, T., Usynin, D. and Kaissis, G., 2022. Membership inference attacks against semantic segmentation models. arXiv preprint arXiv:2212.01082.
[5] Mueller, T.T., Kolek, S., Jungmann, F., Ziller, A., Usynin, D., Knolle, M., Rueckert, D. and Kaissis, G., 2022. How Do Input Attributes Impact the Privacy Loss in Differential Privacy? arXiv preprint arXiv:2211.10173.
[6] Usynin, D., Klause, H., Paetzold, J.C., Rueckert, D. and Kaissis, G., 2022, September. Can collaborative learning be private, robust and scalable? In International Workshop on Distributed, Collaborative, and Federated Learning (pp. 37-46). Cham: Springer Nature Switzerland.
[7] Mueller, T.T., Usynin, D., Paetzold, J.C., Rueckert, D. and Kaissis, G., 2022. SoK: Differential privacy on graph-structured data. arXiv preprint arXiv:2203.09205.
[8] Mueller, T.T., Paetzold, J.C., Prabhakar, C., Usynin, D., Rueckert, D. and Kaissis, G., 2022. Differentially private graph classification with GNNs. arXiv preprint arXiv:2202.02575.
[9] Usynin, D., Rueckert, D., Passerat-Palmbach, J. and Kaissis, G., 2022. Zen and the art of model adaptation: Low-utility-cost attack mitigations in collaborative machine learning. Proc. Priv. Enhancing Technol., 2022(1), pp.274-290.
[10] Usynin, D., Ziller, A., Rueckert, D., Passerat-Palmbach, J. and Kaissis, G., 2021. Distributed Machine Learning and the Semblance of Trust. arXiv preprint arXiv:2112.11040.
[11] Ziller, A., Usynin, D., Knolle, M., Hammernik, K., Rueckert, D. and Kaissis, G., 2021. Complex-valued deep learning with differential privacy. arXiv preprint arXiv:2110.03478.
[12] Mueller, T.T., Ziller, A., Usynin, D., Knolle, M., Jungmann, F., Rueckert, D. and Kaissis, G., 2021. Partial sensitivity analysis in differential privacy. arXiv preprint arXiv:2109.10582.
[13] Usynin, D., Ziller, A., Knolle, M., Trask, A., Prakash, K., Rueckert, D. and Kaissis, G., 2021. An automatic differentiation system for the age of differential privacy. arXiv preprint arXiv:2109.10573.
[14] Kaissis, G., Knolle, M., Jungmann, F., Ziller, A., Usynin, D. and Rueckert, D., 2021. A unified interpretation of the Gaussian mechanism for differential privacy through the sensitivity index. arXiv preprint arXiv:2109.10528.
[15] Usynin, D., Ziller, A., Makowski, M., Braren, R., Rueckert, D., Glocker, B., Kaissis, G. and Passerat-Palmbach, J., 2021. Adversarial interference and its mitigations in privacy-preserving collaborative machine learning. Nature Machine Intelligence, 3(9), pp.749-758.
[16] Ziller, A., Usynin, D., Knolle, M., Prakash, K., Trask, A., Braren, R., Makowski, M., Rueckert, D. and Kaissis, G., 2021. Sensitivity analysis in differentially private machine learning using hybrid automatic differentiation. arXiv preprint arXiv:2107.04265.
[17] Knolle, M., Ziller, A., Usynin, D., Braren, R., Makowski, M.R., Rueckert, D. and Kaissis, G., 2021. Differentially private training of neural networks with Langevin dynamics for calibrated predictive uncertainty. arXiv preprint arXiv:2107.04296.
[18] Ziller, A., Usynin, D., Remerscheid, N., Knolle, M., Makowski, M., Braren, R., Rueckert, D. and Kaissis, G., 2021. Differentially private federated deep learning for multi-site medical image segmentation. arXiv preprint arXiv:2107.02586.
[19] Ziller, A., Usynin, D., Braren, R., Makowski, M., Rueckert, D. and Kaissis, G., 2021. Medical imaging deep learning with differential privacy. Scientific Reports, 11(1), p.13524.
[20] Kaissis, G., Ziller, A., Passerat-Palmbach, J., Ryffel, T., Usynin, D., Trask, A., Lima Jr, I., Mancuso, J., Jungmann, F., Steinborn, M.M. and Saleh, A., 2021. End-to-end privacy preserving deep learning on multi-institutional medical imaging. Nature Machine Intelligence, 3(6), pp.473-484.
Team
Principal Investigator (Imperial)
Professor Dr. Daniel Rückert
Faculty of Engineering - Department of Computing | Imperial
Principal Investigator (TUM)
Prof. Dr. Marcus R. Makowski
Director of the Institute for Diagnostic and Interventional Radiology
Doctoral Candidate (Imperial)
tba.