About Me

Welcome to Chen Liang (Chinese: 梁辰)'s homepage! I am a third-year student in the Machine Learning Ph.D. program at the Georgia Institute of Technology (Georgia Tech). I am very fortunate to work with Prof. Tuo Zhao in the FLASH (Foundations of LeArning Systems for alcHemy) research group. I received my M.S. degree in Computational Science & Engineering from Georgia Tech and my B.S. degree in Electrical Engineering from the University of Southern California (USC), where my undergraduate advisor was Prof. C.-C. Jay Kuo.

I am generally interested in machine learning for natural language processing. My research mainly focuses on developing methodologies and algorithms to improve the parameter efficiency and generalization of large-scale language models. My interests also include transfer learning and representation learning (e.g., multi-domain and multi-task learning).

Education

Ph.D. in Machine Learning, Georgia Tech, School of Industrial & Systems Engineering, 2023 (expected)

M.S. in Computational Science & Engineering, Georgia Tech, School of Computational Science & Engineering, 2020

B.S. in Electrical Engineering, USC, Department of Electrical & Computer Engineering, 2018

Publications

HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers
The 11th International Conference on Learning Representations (ICLR), 2023
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance
The 39th International Conference on Machine Learning (ICML), 2022
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation
The 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Self-Training with Differentiable Teacher
The 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL Findings), 2022
CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing
The 60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models
The 10th International Conference on Learning Representations (ICLR), 2022
Adversarial Training as Stackelberg Game: An Unrolled Optimization Approach
The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
ARCH: Efficient Adversarial Regularized Training with Caching
The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP Findings), 2021
Token-wise Curriculum Learning for Neural Machine Translation
The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP Findings), 2021
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization
The 59th Annual Meeting of the Association for Computational Linguistics (ACL), 2021
BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2020
Multi-Domain Neural Machine Translation with Word-Level Adaptive Layer-wise Domain Mixing
The 58th Annual Meeting of the Association for Computational Linguistics (ACL), 2020
A Fully Convolutional Tri-branch Network (FCTN) for Domain Adaptation
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018

Experience

Research Intern, Google Research, May 2022 – August 2022
Applied Scientist Intern, Amazon Search, September 2021 – December 2021
Research Intern, Microsoft Azure AI, May 2021 – July 2021
Software Development Intern, Amazon, May 2019 – July 2019
Deep Learning Software Intern, NVIDIA, May 2018 – August 2018

Teaching & Services

Teaching Assistant, ISyE 3770 Statistics & Applications, Georgia Tech, Summer 2020
Teaching Assistant, CSE 6140 Algorithms, Georgia Tech, Fall 2019
Course Producer, EE 364 Introduction to Probability & Statistics for EECS, USC, Fall 2017
Reviewer: NeurIPS, ICLR, ICML, ACL, EMNLP, NAACL, COLING, EACL, WACV