Karim Abdel Sadek

Hi, I'm Karim! I am an incoming CS PhD student at UC Berkeley. My work is partially supported by the Cooperative AI PhD Fellowship.

I am currently a research intern at the Center for Human-Compatible AI, working with Micah Carroll, Michael Dennis and Anand Siththaranjan. I am concurrently finishing my MSc in AI at the University of Amsterdam.

In the summer of 2024, I was a research intern at the Krueger AI Safety Lab, University of Cambridge, working on goal misgeneralization and specification misgeneralization in RL, supervised by Michael Dennis, Usman Anwar, and David Krueger.

Before my MSc, I graduated with a BSc in Mathematics and Computer Science from Bocconi University. I was fortunate to be advised by Prof. Marek Eliáš, working on learning theory and the theory of algorithms. During my BSc, I spent the Spring '23 semester at Georgia Tech, supported by a full-ride scholarship.

I am broadly interested in Reinforcement Learning, Cooperative AI, (Algorithmic) Game Theory, and AI Alignment. Currently, I am working on topics at the intersection of inverse reinforcement learning and assistance games.

Please reach out if you are interested in my research, want to have a quick chat, or anything else! You can e-mail me at karimabdel at berkeley dot edu.

Curriculum Vitae

News
2025
May 26: Our paper, ‘Mitigating Goal Misgeneralization via Minimax Regret’, has been accepted at RLC 2025! See you in Edmonton.

2024
Jul 01: I started working on goal misgeneralization in RL at the Krueger AI Safety Lab, University of Cambridge, supported by the ERA Fellowship.
Jan 15: My very first paper, ‘Algorithms for Caching and MTS with reduced number of predictions’, has been accepted at ICLR with scores 8, 8, 8, 8!

2023
Sep 01: I started my MSc in Artificial Intelligence at the University of Amsterdam.
Selected Publications
Mitigating Goal Misgeneralization via Minimax Regret

Reinforcement Learning Conference (RLC) 2025

Karim Abdel Sadek*, Matthew Farrugia-Roberts*, Usman Anwar, Hannah Erlebach, Christian Schroeder de Witt, David Krueger, Michael Dennis (* equal contribution)

Goal misgeneralization can occur when a policy generalizes capably with respect to a 'proxy goal' whose optimal behavior correlates with the intended goal on the training distribution, but not out of distribution. We observe that if some training signal towards the intended reward function exists, it can be amplified by regret-based prioritization. We formally show that approximately optimal policies on maximal-regret levels avoid the harmful effects of goal misgeneralization that can arise without this prioritization. Empirically, we find that current regret-based Unsupervised Environment Design (UED) methods can mitigate the effects of goal misgeneralization.
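
For readers unfamiliar with the objective, here is a minimal sketch of the minimax-regret formulation standard in the UED literature (the notation is illustrative, not taken from the paper): with θ ranging over levels and V^θ(π) denoting the value of policy π on level θ,

  % illustrative notation: θ indexes levels, V^θ(π) is the value of π on level θ
  \mathrm{Regret}^{\theta}(\pi) = \max_{\pi'} V^{\theta}(\pi') - V^{\theta}(\pi),
  \qquad \pi^{\star} \in \arg\min_{\pi} \max_{\theta} \mathrm{Regret}^{\theta}(\pi).

Prioritizing maximal-regret levels concentrates training on the levels where the proxy and intended goals come apart, which is where a weak signal towards the intended reward can be amplified.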

Algorithms for Caching and MTS with reduced number of predictions

International Conference on Learning Representations (ICLR) 2024

Karim Abdel Sadek, Marek Eliáš

ML-augmented algorithms utilize predictions to achieve performance beyond their worst-case bounds. We design parsimonious algorithms for caching and MTS with action predictions. Our algorithm for caching is 1-consistent, robust, and its smoothness degrades gracefully as the number of available predictions decreases. We propose an algorithm for general MTS whose consistency and smoothness both degrade linearly as the number of predictions decreases. Without restriction on the number of available predictions, both algorithms match the earlier guarantees achieved by Antoniadis et al. [2023].
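
For context, a brief sketch of the standard performance measures from the learning-augmented algorithms literature (generic definitions; the paper parameterizes by the number of available predictions rather than by prediction error): writing OPT for the offline optimum and η for the prediction error,

  % generic definitions; the paper measures smoothness against the number of predictions, not η
  \text{consistency: } \mathrm{cost}(\mathrm{ALG}) \le \alpha \cdot \mathrm{cost}(\mathrm{OPT}) \text{ when } \eta = 0
  \text{robustness: } \mathrm{cost}(\mathrm{ALG}) \le \beta \cdot \mathrm{cost}(\mathrm{OPT}) \text{ for every } \eta
  \text{smoothness: } \mathrm{cost}(\mathrm{ALG}) \le f(\eta) \cdot \mathrm{cost}(\mathrm{OPT}), \text{ with } f \text{ growing slowly in } \eta

In this vocabulary, '1-consistent' means the caching algorithm matches the offline optimum when predictions are perfect.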

All publications