Mehul Damani

Hello! I am a third-year Ph.D. student at MIT, where I am advised by Jacob Andreas.

My research interests lie at the intersection of reinforcement learning (RL) and large language models (LLMs). I believe that RL and LLMs can synergistically improve each other: I am excited by the potential of RL to improve reasoning, math, coding, and other capabilities in LLMs, and I am equally interested in harnessing the common-sense knowledge of LLMs to bootstrap RL. Recently, I have also been thinking about the paradigm of inference-time compute and how optimally selecting inference-time techniques can significantly improve the efficiency of LLMs.
Finally, having worked on multi-agent RL in the past, I am also interested in studying cooperation in multi-agent settings, with a particular focus on understanding how LLM agents can be integrated into and benefit from multi-agent frameworks.

Previously, I worked with Lerrel Pinto at NYU on developing automatic curriculum learning methods for RL agents. Before that, I was a part of the MARMot Lab at NUS, where I worked with Guillaume Sartoretti on applying multi-agent reinforcement learning to traffic signal control and multi-agent pathfinding.

I’m always excited to explore new research directions and am open to collaborations and to advising students. If you are interested in my research or simply want to chat, don’t hesitate to get in touch!

Selected Publications

  1. pre-print
    The Surprising Effectiveness of Test-Time Training for Abstract Reasoning
    Akyürek, Ekin, Damani, Mehul, Qiu, Linlu, Guo, Han, Kim, Yoon, and Andreas, Jacob
    2024
  2. pre-print
    Learning How Hard to Think: Input-Adaptive Allocation of LM Computation
    Damani, Mehul, Shenfeld, Idan, Peng, Andi, Bobu, Andreea, and Andreas, Jacob
    arXiv preprint arXiv:2410.04707 2024
  3. AAMAS
    Formal contracts mitigate social dilemmas in multi-agent reinforcement learning
    Haupt, Andreas, Christoffersen, Phillip, Damani, Mehul, and Hadfield-Menell, Dylan
    Autonomous Agents and Multi-Agent Systems 2024
  4. TMLR
    Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Casper, Stephen, Davies, Xander, Shi, Claudia, Gilbert, Thomas Krendl, Scheurer, Jérémy, Rando, Javier, Freedman, Rachel, Korbak, Tomasz, Lindner, David, Freire, Pedro, Wang, Tony Tong, Marks, Samuel, Segerie, Charbel-Raphael, Carroll, Micah, Peng, Andi, Christoffersen, Phillip, Damani, Mehul, Slocum, Stewart, Anwar, Usman, Siththaranjan, Anand, Nadeau, Max, Michaud, Eric J, Pfau, Jacob, Krasheninnikov, Dmitrii, Chen, Xin, Langosco, Lauro, Hase, Peter, Biyik, Erdem, Dragan, Anca, Krueger, David, Sadigh, Dorsa, and Hadfield-Menell, Dylan
    Transactions on Machine Learning Research 2023
  5. AAMAS
    SocialLight: Distributed Cooperation Learning towards Network-Wide Traffic Signal Control
    Goel, Harsh, Zhang, Yifeng, Damani, Mehul, and Sartoretti, Guillaume
In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems 2023
  6. IEEE-RAL, ICRA
    PRIMAL2: Pathfinding Via Reinforcement and Imitation Multi-Agent Learning - Lifelong
    Damani, Mehul, Luo, Zhiyao, Wenzel, Emerson, and Sartoretti, Guillaume
    IEEE Robotics and Automation Letters 2021