publications

2026

  1. Self-Distillation Enables Continual Learning
    ICLR 2026 Workshop on Lifelong Agents (LLA)
    Idan Shenfeld, Mehul Damani, Jonas Hübotter, and Pulkit Agrawal
  2. Position: It's Time to Optimize for Self-Consistency
    Itamar Pres, Belinda Z. Li, Laura Ruis, Zifan Carl Guo, Keya Hu, Mehul Damani, Isha Puri, Ekdeep Singh Lubana, and Jacob Andreas
  3. Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty
    ICLR, 2026
    Mehul Damani, Isha Puri, Stewart Slocum, Idan Shenfeld, Leshem Choshen, Yoon Kim, and Jacob Andreas

2025

  1. The Surprising Effectiveness of Test-Time Training for Few-Shot Learning
    ICML, 2025
    Ekin Akyürek, Mehul Damani, Adam Zweiger, Linlu Qiu, Han Guo, Jyo Pari, Yoon Kim, and Jacob Andreas
  2. Learning How Hard to Think: Input-Adaptive Allocation of LM Computation
    ICLR, 2025
    Mehul Damani, Idan Shenfeld, Andi Peng, Andreea Bobu, and Jacob Andreas

2024

  1. Formal contracts mitigate social dilemmas in multi-agent reinforcement learning
    AAMAS, 2024
    Andreas Haupt, Phillip Christoffersen, Mehul Damani, and Dylan Hadfield-Menell

2023

  1. Mitigating Generative Agent Social Dilemmas
    NeurIPS 2023 Foundation Models for Decision Making Workshop
    Julian Yocum, Phillip Christoffersen, Mehul Damani, Justin Svegliato, Dylan Hadfield-Menell, and Stuart Russell
  2. Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
    Transactions on Machine Learning Research
    Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Tong Wang, Samuel Marks, Charbel-Raphael Segerie, Micah Carroll, Andi Peng, Phillip Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Biyik, Anca Dragan, David Krueger, Dorsa Sadigh, and Dylan Hadfield-Menell
  3. SocialLight: Distributed Cooperation Learning towards Network-Wide Traffic Signal Control
    Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems
    Harsh Goel, Yifeng Zhang, Mehul Damani, and Guillaume Sartoretti

2022

  1. Distributed Reinforcement Learning for Robot Teams: a Review
    Current Robotics Reports
    Yutong Wang, Mehul Damani, Pamela Wang, Yuhong Cao, and Guillaume Sartoretti
  2. Multi-Agent Traffic Signal Control via Distributed RL with Spatial and Temporal Feature Extraction
    International Workshop on Agent-Based Modelling of Urban Systems (ABMUS)
    Yifeng Zhang, Mehul Damani, and Guillaume Sartoretti

2021

  1. Flatland Competition 2020: MAPF and MARL for Efficient Train Coordination on a Grid World
    Proceedings of the NeurIPS 2020 Competition and Demonstration Track
    Florian Laurent, Manuel Schneider, Christian Scheller, Jeremy Watson, Jiaoyang Li, Zhe Chen, Yi Zheng, Shao-Hung Chan, Konstantin Makhnev, Oleg Svidchenko, Vladimir Egorov, Dmitry Ivanov, Aleksei Shpilman, Evgenija Spirovska, Oliver Tanevski, Aleksandar Nikov, Ramon Grunder, David Galevski, Jakov Mitrovski, Guillaume Sartoretti, Zhiyao Luo, Mehul Damani, Nilabha Bhattacharya, Shivam Agarwal, Adrian Egli, Erik Nygren, and Sharada Mohanty
  2. PRIMAL2: Pathfinding Via Reinforcement and Imitation Multi-Agent Learning - Lifelong
    IEEE Robotics and Automation Letters
    Mehul Damani, Zhiyao Luo, Emerson Wenzel, and Guillaume Sartoretti