Hi, I'm Karim! I am an incoming CS PhD student at UC Berkeley. My work is partially supported by the Cooperative AI PhD Fellowship.
I am currently a research intern at the Center for Human-Compatible AI, working with Micah Carroll, Michael Dennis and Anand Siththaranjan. I am concurrently finishing my MSc in AI at the University of Amsterdam.
In the summer of 2024, I was a research intern at the Krueger AI Safety Lab, University of Cambridge, working on goal misgeneralization and specification misgeneralization in RL, supervised by Michael Dennis, Usman Anwar and David Krueger.
Before my MSc, I graduated with a BSc in Mathematics and Computer Science at Bocconi University. I was fortunate to be advised by Prof. Marek Eliáš, working on learning theory and the theory of algorithms. During my BSc, I spent the Spring '23 semester at Georgia Tech, supported by a full-ride scholarship.

I am broadly interested in Reinforcement Learning, Cooperative AI, (Algorithmic) Game Theory, and AI Alignment. Currently, I am working on topics at the intersection of inverse reinforcement learning and assistance games.
Please reach out if you are interested in my research, want to have a quick chat, or anything else! You can e-mail me at karimabdel at berkeley dot edu.
Reinforcement Learning Conference (RLC) 2025
Karim Abdel Sadek*, Matthew Farrugia-Roberts*, Usman Anwar, Hannah Erlebach, Christian Schroeder de Witt, David Krueger, Michael Dennis (* equal contribution)
Goal misgeneralization can occur when the policy generalizes capably with respect to a 'proxy goal' whose optimal behavior correlates with the intended goal on the training distribution, but not out of distribution. We observe that if some training signal towards the intended reward function exists, it can be amplified by regret-based prioritization. We formally show that approximately optimal policies on maximal-regret levels avoid the harmful effects of goal misgeneralization, which may arise without this prioritization. Empirically, we find that current regret-based Unsupervised Environment Design (UED) methods can mitigate the effects of goal misgeneralization.
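To give a flavor of regret-based prioritization in UED, here is a minimal sketch of a level buffer that samples training levels in proportion to an estimate of the agent's regret on each level. All names and details are illustrative assumptions for this sketch, not taken from the paper.

```python
import random

# Minimal sketch: prioritize training levels by estimated regret, so that
# high-regret levels (where the proxy goal and the intended goal diverge)
# are sampled more often. Names are illustrative, not from the paper.

class RegretLevelBuffer:
    def __init__(self, levels):
        # Latest regret estimate for each training level.
        self.regret = {level: 0.0 for level in levels}

    def update(self, level, regret_estimate):
        # Store an estimate of regret, e.g. the gap between an estimate of
        # optimal return on this level and the agent's achieved return.
        self.regret[level] = regret_estimate

    def sample(self, temperature=1.0):
        # Sample levels with probability increasing in their regret.
        levels = list(self.regret)
        weights = [max(self.regret[l], 0.0) ** (1.0 / temperature) + 1e-6
                   for l in levels]
        return random.choices(levels, weights=weights, k=1)[0]
```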
International Conference on Learning Representations (ICLR) 2024
Karim Abdel Sadek, Marek Eliáš
ML-augmented algorithms utilize predictions to achieve performance beyond their worst-case bounds. We design parsimonious algorithms for caching and MTS with action predictions. Our algorithm for caching is 1-consistent, robust, and its smoothness deteriorates with a decreasing number of available predictions. We propose an algorithm for general MTS whose consistency and smoothness both scale linearly with the decreasing number of predictions. Without any restriction on the number of available predictions, both algorithms match the earlier guarantees achieved by Antoniadis et al. [2023].
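As a rough illustration of the general setting of caching with a limited budget of action predictions, here is a sketch that follows an eviction prediction when one is available and otherwise falls back to a robust default (LRU here). This only shows the pattern of parsimonious prediction use under assumed interfaces; it is not the algorithm from the paper.

```python
from collections import OrderedDict

def serve_requests(requests, cache_size, predictions):
    """Illustrative sketch only. `predictions` is a (possibly sparse) dict
    mapping request index -> page suggested for eviction."""
    cache = OrderedDict()  # cached pages kept in LRU order
    misses = 0
    for t, page in enumerate(requests):
        if page in cache:
            cache.move_to_end(page)        # hit: refresh recency
            continue
        misses += 1
        if len(cache) >= cache_size:
            predicted = predictions.get(t)
            if predicted in cache:
                cache.pop(predicted)       # follow the prediction when available
            else:
                cache.popitem(last=False)  # otherwise evict LRU as a robust fallback
        cache[page] = page
    return misses
```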