Publications

KITAB: Evaluating LLMs on Constraint Satisfaction for Information Retrieval. arXiv preprint, 2023.
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models. arXiv preprint, 2023.
Textbooks Are All You Need II: phi-1.5 technical report. arXiv preprint, 2023.
Textbooks Are All You Need. arXiv preprint, 2023.
(S)GD over Diagonal Linear Networks: Implicit Regularisation, Large Stepsizes and Edge of Stability. Advances in Neural Information Processing Systems (NeurIPS), 2023.
How to Fine-Tune Vision Models with SGD. arXiv preprint, 2022.
Unveiling Transformers with LEGO: a synthetic reasoning task. arXiv preprint, 2022.
Neural-Sim: Learning to Generate Training Data with NeRF. European Conference on Computer Vision (ECCV), 2022.
Inductive bias of multi-channel linear convolutional networks with bounded weight norm. Conference on Learning Theory (COLT), 2022.
Data Augmentation as Feature Manipulation. International Conference on Machine Learning (ICML), 2022.
Methods and Analysis of The First Competition in Predicting Generalization of Deep Learning. NeurIPS 2020 Competition and Demonstration Track, 2021.
Mirrorless mirror descent: A natural derivation of mirror descent. International Conference on Artificial Intelligence and Statistics (AISTATS), 2021.
Implicit regularization and convergence for weight normalization. Neural Information Processing Systems (NeurIPS), 2020.
Implicit bias in deep linear classification: Initialization scale vs training accuracy. Neural Information Processing Systems (NeurIPS), 2020.
Kernel and Rich Regimes in Overparametrized Models. Conference on Learning Theory (COLT), 2020.
Theory of deep learning. Princeton Univ. Princeton, NJ, 2019.
Convergence of gradient descent on separable data. International Conference on Artificial Intelligence and Statistics (AISTATS), 2019.
Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models. International Conference on Machine Learning (ICML), 2019.
On preserving non-discrimination when combining expert advice. Neural Information Processing Systems (NeurIPS), 2018.
Implicit bias of gradient descent on linear convolutional networks. Neural Information Processing Systems (NeurIPS), 2018.
Characterizing Implicit Bias in Terms of Optimization Geometry. International Conference on Machine Learning (ICML), 2018.
The Implicit Bias of Gradient Descent on Separable Data. Journal of Machine Learning Research (JMLR), 2018.
Implicit regularization in matrix factorization. Neural Information Processing Systems (NeurIPS), 2017.
Learning Non-Discriminatory Predictors. Conference on Learning Theory (COLT), 2017.
Preference Completion from Partial Rankings. Neural Information Processing Systems (NeurIPS), 2016.
Identifiable phenotyping using constrained non-negative matrix factorization. Machine Learning for Healthcare Conference (MLHC), 2016.
Phenotyping using Structured Collective Matrix Factorization of Multi-source EHR Data. arXiv preprint, 2016.
Unified view of matrix completion under general structural constraints. Neural Information Processing Systems (NeurIPS), 2015.
Consistent collective matrix completion under joint low rank structure. International Conference on Artificial Intelligence and Statistics (AISTATS), 2015.
Face detection on distorted images augmented by perceptual quality-aware features. IEEE Transactions on Information Forensics and Security, 2014.
Exponential family matrix completion under structural constraints. International Conference on Machine Learning (ICML), 2014.
Noisy matrix completion using alternating minimization. Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD), 2013.
Review quality aware collaborative filtering. ACM Conference on Recommender Systems (RecSys), 2012.