Suriya Gunasekar

Principal Research Manager

Microsoft Research, Redmond

suriyag@microsoft.com

I am a Principal Research Manager at Microsoft Research, where I am part of the Physics of AGI group that pioneered the Phi family of language models. I currently work on improving language model capabilities through scale, data curation, and creative uses of synthetic data generation, while maintaining broader interests in evaluation and alignment of AI systems for targeted usecases. Prior to MSR, I was a Research Assistant Professor at Toyota Technological Institute at Chicago. I received my Ph.D. from The University of Texas at Austin.

Highlight: The Phi family

(2025) Phi-4-reasoning Technical Report.

Marah Abdin, Sahaj Agarwal, Ahmed Awadallah, Vidhisha Balachandran, Harkirat Behl, Lingjiao Chen, Gustavo de Rosa, Suriya Gunasekar, Mojan Javaheripi, Neel Joshi, Piero Kauffmann, Yahs Lara, Caio C. T. Mendes, Arindam Mitra, Besmira Nushi, Dimitris Papailiopoulos, Olli Saarikivi, Shital Shah, Vaishnavi Shrivastava, Vibhav Vineet, Yue Wu, Safoora Yousefi, Guoqing Zheng arXiv preprint.

(2024) Phi-4 Technical Report.

Marah Abdin, Jyoti Aneja, Harkirat Behl, Sébastien Bubeck, Ronen Eldan, Suriya Gunasekar, Michael Harrison, Russell J. Hewett, Mojan Javaheripi, Piero Kauffmann, James R. Lee, Yin Tat Lee, Yuanzhi Li, Weishung Liu, Caio C. T. Mendes, Anh Nguyen, Eric Price, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Xin Wang, Rachel Ward, Yue Wu, Dingli Yu, Cyril Zhang, Yi Zhang. arXiv preprint.

(2024) Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone.

GenAI Team at Microsoft. arXiv preprint.

(2023) Phi-2: The surprising power of small language models.

Marah Abdin, Jyoti Aneja, Sébastien Bubeck, Caio César Teodoro Mendes, Weizhu Chen, Allie Del Giorno, Ronen Eldan, Sivakanth Gopi, Suriya Gunasekar, Mojan Javaheripi, Piero Kauffmann, Yin Tat Lee, Yuanzhi Li, Anh Nguyen, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Michael Santacroce, Harkirat Singh Behl, Adam Taumann Kalai, Xin Wang, Rachel Ward, Philipp Witte, Cyril Zhang, Yi Zhang.

(2023) Textbooks Are All You Need II: phi-1.5 technical report.

Yuanzhi Li, Sébastien Bubeck, Ronen Eldan, Allie Del Giorno, Suriya Gunasekar, Yin Tat Lee. arXiv preprint.

(2023) Textbooks Are All You Need.

Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Harkirat Singh Behl, Xin Wang, Sébastien Bubeck, Ronen Eldan, Adam Tauman Kalai, Yin Tat Lee, Yuanzhi Li. arXiv preprint.

Mentorship

I have had the priviledge of working with some awesome interns in our group.

All Publications

(2025) Phi-4-reasoning Technical Report.

(2024) Phi-4 Technical Report.

(2024) Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone.

GenAI Team at Microsoft. arXiv preprint.

(2023) KITAB: Evaluating LLMs on Constraint Satisfaction for Information Retrieval.

Marah I Abdin, Suriya Gunasekar, Varun Chandrasekaran, Jerry Li, Mert Yuksekgonul, Rahee Ghosh Peshawaria, Ranjita Naik, Besmira Nushi. International Conference on Learning Representations (ICLR, 2024).

(2023) Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models.

Mert Yuksekgonul, Varun Chandrasekaran, Erik Jones, Suriya Gunasekar, Ranjita Naik, Hamid Palangi, Ece Kamar, Besmira Nushi. International Conference on Learning Representations (ICLR, 2024).

(2023) Textbooks Are All You Need II: phi-1.5 technical report.

Yuanzhi Li, Sébastien Bubeck, Ronen Eldan, Allie Del Giorno, Suriya Gunasekar, Yin Tat Lee. arXiv preprint.

(2023) Textbooks Are All You Need.

Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Harkirat Singh Behl, Xin Wang, Sébastien Bubeck, Ronen Kalai, Adam Tauman Eldan, Yin Tat Lee, Yuanzhi Li. arXiv preprint.

(2023) (S) GD over Diagonal Linear Networks: Implicit Regularisation, Large Stepsizes and Edge of Stability.

Mathieu Even, Scott Pesme, Suriya Gunasekar, Nicolas Flammarion. Advances in Neural Information Processing Systems (NeurIPS).

(2022) How to Fine-Tune Vision Models with SGD.

Ananya Kumar, Ruoqi Shen, Sébastien Bubeck, Suriya Gunasekar. International Conference on Learning Representations (ICLR, 2024).

(2022) Unveiling Transformers with LEGO: a synthetic reasoning task.

Yi Zhang, Arturs Backurs, Sébastien Bubeck, Ronen Eldan, Suriya Gunasekar, Tal Wagner. arXiv preprint.

(2022) Generalization to translation shifts: a study in architectures and augmentations.

Suriya Gunasekar. arXiv preprint.

(2022) Neural-Sim: Learning to Generate Training Data with NeRF.

Yunhao Ge, Harkirat Behl, Jiashu Xu, Suriya Gunasekar, Neel Joshi, Yale Song, Xin Wang, Laurent Itti, Vibhav Vineet. European Conference on Computer Vision (ECCV).

(2022) Data Augmentation as Feature Manipulation.

Ruoqi Shen, Sébastien Bubeck, Suriya Gunasekar. International Conference on Machine Learning (ICML).

(2022) Inductive bias of multi-channel linear convolutional networks with bounded weight norm.

Meena Jagadeesan, Ilya Razenshteyn, Suriya Gunasekar. Conference on Learning Theory (COLT).

(2021) Methods and Analysis of The First Competition in Predicting Generalization of Deep Learning.

Yiding Jiang, Parth Natekar, Manik Sharma, Sumukh K Aithal, Dhruva Kashyap, Natarajan Subramanyam, Carlos Lassance, Daniel M Roy, Gintare Karolina Dziugaite, Suriya Gunasekar, others. NeurIPS 2020 Competition and Demonstration Track.

(2021) Mirrorless mirror descent: A natural derivation of mirror descent.

Suriya Gunasekar, Blake Woodworth, Nathan Srebro. International Conference on Artificial Intelligence and Statistics (AISTATS).

(2020) Implicit bias in deep linear classification: Initialization scale vs training accuracy.

Edward Moroshko, Blake E Woodworth, Suriya Gunasekar, Jason D Lee, Nati Srebro, Daniel Soudry. Neural Information Processing Systems (NeurIPS).

(2020) Implicit regularization and convergence for weight normalization.

Xiaoxia Wu, Edgar Dobriban, Tongzheng Ren, Shanshan Wu, Zhiyuan Li, Suriya Gunasekar, Rachel Ward, Qiang Liu. Neural Information Processing Systems (NeurIPS).

(2020) Kernel and Rich Regimes in Overparametrized Models.

Blake Woodworth, Suriya Gunasekar, Jason D Lee, Edward Moroshko, Pedro Savarese, Itay Golan, Daniel Soudry, Nathan Srebro. Conference on Learning Theory (COLT).

(2019) Theory of deep learning.

Raman Arora, Sanjeev Arora, Joan Bruna, Nadav Cohen, Simon Du, Rong Ge, Suriya Gunasekar, Chi Jin, Jason Lee, Tengyu Ma, others. Princeton Univ. Princeton, NJ.

(2019) Convergence of gradient descent on separable data.

Mor Shpigel Nacson, Jason Lee, Suriya Gunasekar, Pedro HP Savarese, Nathan Srebro, Daniel Soudry. International Conference on Artificial Intelligence and Statistics (AISTATS).

(2019) Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models.

Mor Shpigel Nacson, Suriya Gunasekar, Jason D Lee, Nathan Srebro, Daniel Soudry. International Conference on Machine Learning (ICML).

(2018) Implicit bias of gradient descent on linear convolutional networks.

Suriya Gunasekar, Jason D Lee, Daniel Soudry, Nati Srebro. Neural Information Processing Systems (NeurIPS).

(2018) On preserving non-discrimination when combining expert advice.

Avrim Blum, Suriya Gunasekar, Thodoris Lykouris, Nati Srebro. Neural Information Processing Systems (NeurIPS).

(2018) Characterizing Implicit Bias in Terms of Optimization Geometry.

Suriya Gunasekar, Jason Lee, Daniel Soudry, Nathan Srebro. International Conference on Machine Learning (ICML).

(2018) The Implicit Bias of Gradient Descent on Separable Data.

Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro. Journal of Machine Learning Research (JMLR).

(2017) Implicit regularization in matrix factorization.

Suriya Gunasekar, Blake E Woodworth, Srinadh Bhojanapalli, Behnam Neyshabur, Nati Srebro. Neural Information Processing Systems (NeurIPS).

(2017) Learning Non-Discriminatory Predictors.

Blake Woodworth, Suriya Gunasekar, Mesrob I Ohannessian, Nathan Srebro. Conference on Learning Theory (COLT).

(2016) Preference Completion from Partial Rankings.

Suriya Gunasekar, Oluwasanmi O Koyejo, Joydeep Ghosh. Neural Information Processing Systems (NeurIPS).

(2016) Mining structured matrices in high dimensions.

Suriya Gunasekar.

(2016) Identifiable phenotyping using constrained non-negative matrix factorization.

Shalmali Joshi, Shalmali Joshi, Suriya Gunasekar, David Sontag, Ghosh Joydeep. Machine Learning for Healthcare Conference (MLHC).

(2016) Phenotyping using Structured Collective Matrix Factorization of Multi--source EHR Data.

Suriya Gunasekar, Joyce C Ho, Joydeep Ghosh, Stephanie Kreml, Abel N Kho, Joshua C Denny, Bradley A Malin, Jimeng Sun. arXiv preprint.

(2015) Unified view of matrix completion under general structural constraints.

Suriya Gunasekar, Arindam Banerjee, Joydeep Ghosh. Neural Information Processing Systems (NeurIPS).

(2015) Consistent collective matrix completion under joint low rank structure.

Suriya Gunasekar, Makoto Yamada, Dawei Yin, Yi Chang. Artificial Intelligence and Statistics (AISTATS).

(2014) Face detection on distorted images augmented by perceptual quality-aware features.

Suriya Gunasekar, Joydeep Ghosh, Alan C Bovik. IEEE transactions on information forensics and security.

(2014) Exponential family matrix completion under structural constraints.

Suriya Gunasekar, Pradeep Ravikumar, Joydeep Ghosh. International Conference on Machine Learning (ICML).

(2013) Noisy matrix completion using alternating minimization.

Suriya Gunasekar, Ayan Acharya, Neeraj Gaur, Joydeep Ghosh. Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD).

(2012) Review quality aware collaborative filtering.

Sindhu Raghavan, Suriya Gunasekar, Joydeep Ghosh. ACM conference on Recommender systems (RecSys).

Suriya Gunasekar

Principal Research Manager

Microsoft Research, Redmond

suriyag@microsoft.com

Highlight: The Phi family

Mentorship

Sadhika Malladi

Mathieu Even

Ananya Kumar

Ruoqi Shen

Meena Jagadeesan

Omar Montasser

All Publications