Lei Chen 陈磊

Ph.D. Candidate, Computer Science
Advisor: Prof. Joan Bruna
Courant Institute of Mathematical Sciences
Phone: +1 917 575 4837
Email: lc3909 [AT] nyu [DOT] edu

github | vita | google scholar

Research

My research interests include

  • Understanding Transformer-based LLMs

  • Deep Learning Theory

  • Representation Learning on Graphs

Education

  • Ph.D. student, Computer Science, New York University, Sep. 2020 - present

  • M.S., Computer Science, New York University, Aug. 2018 - May 2020

  • B.Eng., Civil Engineering, Tsinghua University, Aug. 2012 - July 2016

Recent Manuscripts

  • How Truncating Weights Improves Reasoning in Language Models
    Lei Chen, Joan Bruna, Alberto Bietti
    Preprint arxiv

Abstract:
Recent work found that selectively removing certain components from weight matrices in pretrained models can improve reasoning capabilities. We investigate this phenomenon further by carefully studying how certain global associations tend to be stored in specific weight components or Transformer blocks, in particular feed-forward layers. Such associations may hurt predictions in reasoning tasks, and removing the corresponding components may then improve performance. We analyze how this arises during training, both empirically and theoretically, on a two-layer Transformer trained on a basic reasoning task with noise, on a toy associative memory model, and on the Pythia family of pretrained models tested on simple reasoning tasks.

Publications

(* indicates joint authorship)

  • Beyond the Edge of Stability via Two-step Gradient Updates
    Lei Chen, Joan Bruna
    ICML 2023 arxiv poster

  • On Graph Neural Networks versus Graph-Augmented MLPs
    Lei Chen*, Zhengdao Chen*, Joan Bruna
    ICLR 2021 code arxiv

  • Learning the Relevant Substructures for Tasks on Graph Data
    Lei Chen, Zhengdao Chen, Joan Bruna
    ICASSP 2021

  • Can Graph Neural Networks Count Substructures?
    Zhengdao Chen, Lei Chen, Soledad Villar, Joan Bruna
    NeurIPS 2020 code arxiv

  • Attributed Random Walk as Matrix Factorization
    Lei Chen, Shunwang Gong, Joan Bruna, Michael Bronstein
    Graph Representation Learning Workshop NeurIPS 2019 code paper

  • SpiralNet++: A Fast and Highly Efficient Mesh Convolution Operator
    Shunwang Gong, Lei Chen, Michael Bronstein, Stefanos Zafeiriou
    Geometry Meets Deep Learning Workshop ICCV 2019 code paper

  • On the Equivalence between Graph Isomorphism Testing and Function Approximation with GNNs
    Zhengdao Chen, Soledad Villar, Lei Chen, Joan Bruna
    NeurIPS 2019 code arxiv

Professional Service

  • Journal reviewer: JMLR, TMLR, IEEE TPAMI

  • Conference reviewer: NeurIPS 2021-2023, ICLR 2022-2023, ICML 2022-2024