Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment

Authors: Lingling Xu, Haoran Xie, Si-Zhao Joe Qin, Xiaohui Tao, Fu Lee Wang

What

This paper presents a comprehensive review and assessment of Parameter-Efficient Fine-Tuning (PEFT) methods for Pretrained Language Models (PLMs), focusing on how well they reduce trainable parameters and memory usage while maintaining performance comparable to full fine-tuning.

Why

Adapting large language models (LLMs) with billions of parameters to specific downstream tasks is difficult, especially when computational resources are limited. The paper addresses this challenge by providing a systematic overview of PEFT methods and evaluating their performance across different tasks and models.

How

The authors categorize PEFT methods into five groups: additive fine-tuning, partial fine-tuning, reparameterized fine-tuning, hybrid fine-tuning, and unified fine-tuning. They then run experiments with eleven representative PEFT methods on three types of PLMs (RoBERTa, T5, and LLaMA) across natural language understanding (NLU), machine translation (MT), and natural language generation (NLG) tasks, evaluating both performance and memory usage.
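To make one of these categories concrete, the following is a minimal sketch of reparameterized fine-tuning in the style of LoRA, written in plain Python rather than in the frameworks used for the paper's experiments. The helper names (`matvec`, `lora_forward`), the toy dimensions, and the `alpha / r` scaling value are illustrative assumptions and are not taken from the paper's experimental setup.

```python
import random

def matvec(M, x):
    """Multiply a matrix (stored as a list of rows) by a vector."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha, r):
    """Compute W x + (alpha / r) * B (A x).

    W (d x k) is the frozen pretrained weight. Only the low-rank factors
    A (r x k) and B (d x r) would be updated during fine-tuning.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

random.seed(0)
d, k, r = 4, 6, 2  # hypothetical toy dimensions, chosen for illustration
W = [[random.gauss(0, 1) for _ in range(k)] for _ in range(d)]
A = [[random.gauss(0, 0.01) for _ in range(k)] for _ in range(r)]
B = [[0.0] * r for _ in range(d)]  # zero init: the low-rank update starts at zero
x = [1.0] * k

y_frozen = matvec(W, x)
y_lora = lora_forward(W, A, B, x, alpha=16, r=r)
```

Because `B` is initialized to zeros, `y_lora` equals `y_frozen` before any training step, so the adapted model starts from the pretrained model's behavior. Training would then change only `A` and `B`, leaving `W` untouched.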

Result

Key findings include: (1) Most PEFT methods achieve comparable or better performance than full fine-tuning on the GLUE benchmark while significantly reducing the number of trainable parameters. (2) ProPELT adapter achieves the best average performance while training only 1.5% as many parameters as full fine-tuning. (3) QLoRA significantly reduces GPU memory consumption, enabling fine-tuning of LLaMA with limited resources. (4) The memory savings from PEFT methods grow as model size increases.
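The arithmetic behind findings (1) and (3) can be sketched with a back-of-the-envelope calculation. The example below is only an illustration under stated assumptions: the function names are hypothetical, the layer shape (768 x 768) and rank (8) are chosen for illustration, and the memory estimate counts weight storage only, ignoring activations, gradients, optimizer state, and quantization overhead. None of the resulting numbers are results reported in the paper.

```python
def lora_trainable_fraction(d, k, r):
    """Fraction of parameters trained when a d x k weight matrix is
    adapted through two low-rank factors of shapes d x r and r x k."""
    full_params = d * k
    lora_params = r * (d + k)
    return lora_params / full_params

def weight_memory_gb(num_params, bits_per_param):
    """Rough weight-only storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

# Assumed example: one 768 x 768 projection matrix adapted with rank 8.
frac = lora_trainable_fraction(d=768, k=768, r=8)

# Assumed example: a 7-billion-parameter model stored at 16-bit vs 4-bit.
mem_16bit = weight_memory_gb(7e9, 16)
mem_4bit = weight_memory_gb(7e9, 4)

print(f"trainable fraction for this matrix: {frac:.2%}")
print(f"weights at 16-bit: {mem_16bit} GB, at 4-bit: {mem_4bit} GB")
```

For this single matrix, the low-rank factors hold 1/48 (about 2.08%) as many parameters as the full weight. Storing the assumed 7-billion-parameter model's weights at 4 bits instead of 16 bits cuts weight storage from 14 GB to 3.5 GB, which shows why quantizing the frozen weights lowers the memory needed for fine-tuning.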

Limitations and Future Work

The paper highlights several limitations and future directions, including: (1) Exploring lightweight hybrid PEFT methods that combine multiple PEFT methods for better performance with minimal parameter increase. (2) Developing more LoRA-derived PEFT methods, focusing on pruning and weight quantization to optimize storage and computation. (3) Expanding the PEFT library by integrating additional PEFT methods for wider application. (4) Conducting further theoretical studies to understand the underlying mechanisms of PEFT methods. (5) Exploring the application of PEFT methods in computer vision and multimodal learning.

Abstract

With the continuous growth in the number of parameters of transformer-based pretrained language models (PLMs), particularly the emergence of large language models (LLMs) with billions of parameters, many natural language processing (NLP) tasks have demonstrated remarkable success. However, the enormous size and computational demands of these models pose significant challenges for adapting them to specific downstream tasks, especially in environments with limited computational resources. Parameter Efficient Fine-Tuning (PEFT) offers an effective solution by reducing the number of fine-tuning parameters and memory usage while achieving comparable performance to full fine-tuning. The demands for fine-tuning PLMs, especially LLMs, have led to a surge in the development of PEFT methods, as depicted in Fig. 1. In this paper, we present a comprehensive and systematic review of PEFT methods for PLMs. We summarize these PEFT methods, discuss their applications, and outline future directions. Furthermore, we conduct experiments using several representative PEFT methods to better understand their effectiveness in parameter efficiency and memory efficiency. By offering insights into the latest advancements and practical applications, this survey serves as an invaluable resource for researchers and practitioners seeking to navigate the challenges and opportunities presented by PEFT in the context of PLMs.