Publications

LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models
Module-wise Adaptive Distillation for Multimodality Foundation Models
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
Less is More: Task-aware Layer-wise Distillation for Language Model Compression
HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models
Adversarial Regularization as Stackelberg Game: An Unrolled Optimization Approach
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization