Short Bio
Balazs Pejo (CV) was born in 1989 in Budapest, Hungary. He received a BSc degree in Mathematics from the Budapest University of Technology and Economics (BME, Hungary) in 2012 and two MSc degrees in Computer Science in the Security and Privacy program of EIT Digital from the University of Trento (UNITN, Italy) and Eotvos Lorand University (ELTE, Hungary) in 2014. He earned a PhD degree in Informatics from the University of Luxembourg (UNILU, Luxembourg) in 2019. Currently, he is a member of the Laboratory of Cryptography and Systems Security (CrySyS Lab).
List of Courses
Research interests
- Federated Learning
- Contribution Evaluation
- Inference Attacks
- Differential Privacy
- Robust Learning
Student Project Proposals
- Advanced Quality Inference
In federated learning we often want to know which participants contribute most — but typical methods are expensive or leak data. This project explores recent ideas that privately estimate a client’s value with minimal computation. You’ll implement and test such methods, compare them to traditional approaches, and evaluate whether they still work when robust aggregation (defenses against malicious clients) is used. Good for students interested in privacy, efficiency, and experiments. (starter paper)
- Personalized Shapley Values
Contribution scores depend on the test data used for evaluation — but each participant may have a different local distribution. How can we merge these different viewpoints into a single, fair score everyone accepts? This project asks you to design and evaluate ways to combine per-client evaluations into a consensus contribution score, balancing fairness, privacy, and incentive compatibility. Ideal for students who like fairness, game theory ideas, and careful evaluation.
- Post-Training Contribution Evaluation
When data points are removed from a trained model (machine unlearning), their influence drops — but can we measure that to infer contributions after training? This topic explores using input sensitivity and other post-hoc signals to approximate Shapley values or to build new attacks/diagnostics. It’s a mix of theory (why signals correlate with contribution) and practice (implementing measurements on trained models). (related paper)
- Non-accuracy-based Membership Inference
Classic membership inference decides if a sample was used for training by looking at accuracy/loss. This project asks: can we detect membership using other properties, such as how much a sample affects fairness or robustness of the model? You will design alternative inference strategies, test them empirically, and evaluate when they work or fail. This is a creative topic for students who enjoy thinking beyond standard metrics.
- Privacy Audit of Industry ML
Many companies deploy machine learning systems in sensitive domains (e.g., healthcare) and must comply with regulations like GDPR. In this project you will act like a privacy auditor: develop realistic privacy attacks against an ML pipeline used by a company (EGroup), measure what personal information can be recovered, and then design practical fixes to close the leaks. This is a hands-on topic involving threat modeling, attack implementation, and engineering defenses — great for students who like applied security and working with real systems.
- Testing-Data Inference
Research has focused on telling whether a sample was in the training set. But what about the test set? This exploratory project asks whether it’s even possible to infer if a data point was used as test data, and if so, how. You’ll survey why this is hard, propose candidate signals, and run experiments to validate them. This is an open, high-risk/high-reward topic suited to curious students who like foundational questions.
- Improvement Prediction
Before starting a federated project, partners would like to estimate in advance how much the joint training will improve their model. This project investigates prediction methods that estimate collaboration benefit from available data and uses those predictions to choose privacy/security settings optimally. You’ll evaluate trade-offs between prediction accuracy and privacy assumptions. Recommended for students interested in practical FL. (background reading)
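Several of the proposals above revolve around contribution scores and Shapley values. For orientation, here is a minimal sketch of the exact Shapley computation for a handful of participants. The names `shapley_values` and `toy_utility`, and the quality numbers, are hypothetical and chosen only for illustration; in a real project the utility would be something like the test accuracy of a model trained by a given coalition, and the exact computation would be replaced by an approximation because it needs one evaluation per coalition.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley values for a small set of players.

    `utility` maps a frozenset of players to a number (assumed here to be
    something like the accuracy achieved by that coalition).
    """
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of coalitions of size k that do not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


def toy_utility(coalition):
    """Hypothetical additive utility, used only to exercise the function."""
    quality = {"A": 0.5, "B": 0.3, "C": 0.0}
    return sum(quality[c] for c in coalition)


if __name__ == "__main__":
    scores = shapley_values(["A", "B", "C"], toy_utility)
    for name, score in scores.items():
        print(name, round(score, 3))
```

With an additive utility like this toy one, each participant's score equals its individual quality, which makes the sketch easy to sanity-check by hand. The loop visits every coalition, so the cost grows exponentially with the number of participants.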
Selection of Theses and Dissertations
- 2025
- Flora Ulrich (BSc, BME): Robustness Scores for Federated Learning Clients (Manuscript)
- 2024
- Ádám Horváth (BSc, BME): Freerider Detection via Property Inference (Manuscript)
- 2023
- Frank Marcell (BSc, BME): Altruism in Fuzzy Message Detection (Manuscript)
- 2022
- Nikolett Kapui (BSc, BME): SQLi Detection Using Machine Learning (Manuscript)
- 2021
- András Tótth (BSc, BME): Distributed Approximation of the Shapley Value (Manuscript)
- 2020
- Mathias Parisot (BSc, VU-AMS): Property Inference Attacks on Convolutional Neural Networks (Manuscript)
Program Committees
- [2024-]: Association for the Advancement of Artificial Intelligence (AAAI)
- [2023-]: Artificial Intelligence and Statistics (AISTATS)
- [2023-2024]: Conference on Computer and Communications Security (CCS)
- [2021-2025]: Emerging Security Information, Systems and Technologies (SECURWARE)
- [2020-2022; 2024]: Privacy Enhancing Technologies Symposium (PETS)
- [2020-2022]: Workshop on Privacy in Natural Language Processing (PrivateNLP)
List of Publications
- To Appear
- 2025
- 2024
- Francesco Regazzoni; Gergely Acs; Albert Zoltan Aszalos; Christos Avgerinos; Nikolaos Bakalos; Josep Ll. Berral; Joppe W. Bos; Marco Brohet; Andrés G. Castillo Sanz; Gareth T. Davies; Stefanos Florescu; Pierre-Elisée Flory; Alberto Gutierrez-Torre; Evangelos Haleplidis; Alice Héliou; Sotirios Ioannidis; Alexander Islam El-Kady; Katarzyna Kapusta; Konstantina Karagianni; Pieter Kruizinga; Kyrian Maat; Zoltán Ádám Mann; Kalliopi Mastoraki; SeoJeong Moon; Maja Nisevic; Balázs Pejó; Kostas Papagiannopoulos; Vassilis Paliuras; Paolo Palmieri; Francesca Palumbo; Juan Carlos Perez Baun; Peter Pollner; Eduard Porta-Pardo; Luca Pulina; Muhammad Ali Siddiqi; Daniela Spajic; Christos Strydis; Georgios Tasopoulos; Vincent Thouvenot; Christos Tselios; Apostolos P. Fournaris: "SECURED for Health: Scaling Up Privacy to Enable the Integration of the European Health Data Space", Design, Automation & Test in Europe Conference & Exhibition (DATE)
- 2023
- Wouter Heyndrickx; Lewis Mervin; Tobias Morawietz; Noé Sturm; Lukas Friedrich; Adam Zalewski; Anastasia Pentina; Lina Humbeck; Martijn Oldenhof; Ritsuya Niwayama; Peter Schmidtke; Nikolas Fechner; Jaak Simm; Ádám Arany; Nicolas Drizard; Rama Jabal; Arina Afanasyeva; Regis Loeb; Shlok Verma; Simon Harnqvist; Matthew Holmes; Balázs Pejó; Maria Telenczuk; Nicholas Holway; Arne Dieckmann; Nicola Rieke; Friederike Zumsande; Djork-Arné Clevert; Michael Krug; Christopher Luscombe; Darren Green; Peter Ertl; Péter Antal; David Marcus; Nicolas Do Huu; Hideyoshi Fuji; Stephen Pickett; Gergely Ács; Eric Boniface; Bernd Beck; Yax Sun; Arnaud Gohier; Friedrich Rippmann; Ola Engkvist; Andreas H. Göller; Yves Moreau; Mathieu N. Galtier; Ansgar Schuffenhauer; Hugo Ceulemans: "MELLODDY: cross pharma federated learning at unprecedented scale unlocks benefits in QSAR without compromising proprietary information", Journal of Chemical Information and Modeling (JCIM)
- Bowen Liu; Balázs Pejó; Qiang Tang: "Privacy-preserving Federated Singular Value Decomposition", MDPI Journal of Applied Sciences (AppSci)
- Balázs Pejó; Nikolett Kapui: "SQLi Detection with ML: a data-source perspective", 20th International Conference on Security and Cryptography (SECRYPT)
- Balázs Pejó; Gergely Biczók: "Quality Inference in Federated Learning with Secure Aggregation", IEEE Transactions on Big Data (IEEE TBD)
- Martijn Oldenhof; Gergely Ács; Balázs Pejó; Ansgar Schuffenhauer; Nicholas Holway; Noé Sturm; Arne Dieckmann; Oliver Fortmeier; Eric Boniface; Clément Mayer; Arnaud Gohier; Peter Schmidtke; Ritsuya Niwayama; Dieter Kopecky; Lewis Mervin; Prakash Chandra Rathi; Lukas Friedrich; András Formanek; Péter Antal; Jordon Rahaman; Adam Zalewski; Wouter Heyndrickx; Ezron Oluoch; Manuel Stößel; Michal Vančo; David Endico; Fabien Gelus; Thaïs de Boisfossé; Adrien Darbier; Ashley Nicollet; Matthieu Blottière; Maria Telenczuk; Van Tien Nguyen; Thibaud Martinez; Camille Boillet; Kelvin Moutet; Alexandre Picosson; Aurélien Gasser; Inal Djafar; Antoine Simon; Ádám Arany; Jaak Simm; Yves Moreau; Ola Engkvist; Hugo Ceulemans; Camille Marini; Mathieu Galtier: "Industry-Scale Orchestrated Federated Learning for Drug Discovery", 35th Annual Conference on Innovative Applications of Artificial Intelligence (IAAI)
- 2022
- 2021
- 2020
- 2019
- 2017
- 2016