The Expressive Power of Low-Rank Adaptation
Authors: Yuchen Zeng, Kangwook Lee
What
This paper provides the first theoretical analysis of the expressive power of Low-Rank Adaptation (LoRA) for adapting pre-trained fully connected neural networks (FNNs) and Transformer networks (TFNs). It identifies the LoRA-rank required for an adapted frozen model to exactly match a target model, and quantifies the approximation error when the LoRA-rank falls below that threshold.
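To fix terminology, the sketch below shows how a LoRA adapter modifies a single frozen linear layer; the dimensions, names, and random weights are illustrative and are not the paper's notation.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 8, 2                       # layer width and LoRA-rank (illustrative values)
W0 = rng.standard_normal((d, d))  # frozen pre-trained weight matrix

# LoRA keeps W0 frozen and learns only a low-rank update B @ A, with B: d x r and A: r x d.
B = rng.standard_normal((d, r))
A = rng.standard_normal((r, d))

W_adapted = W0 + B @ A            # effective weight of the adapted layer

# The update can never exceed rank r, which is exactly the constraint whose
# expressive power the paper analyzes.
assert np.linalg.matrix_rank(W_adapted - W0) <= r
```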
Why
This paper is important because it provides the first known theoretical results on the expressive power of LoRA, a widely used and successful fine-tuning method. The findings contribute to understanding why LoRA is effective and offer insights for hyperparameter tuning and algorithm development.
How
The authors take a theoretical approach, starting with linear model approximation as a simplified setting and extending the results to FNNs with ReLU activations and TFNs with softmax attention. They identify the required LoRA-rank by proving, under certain assumptions, the existence of low-rank adapters with which the adapted model exactly matches or approximates the target model. The theoretical findings are validated by experiments on both synthetic and real datasets.
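As a single-layer analogue of that existence argument (my own sketch, not the paper's multi-layer construction), the best rank-r update of a frozen weight toward a target weight can be read off from the truncated SVD of their difference; the match becomes exact once r reaches the rank of that difference.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 6
W_frozen = rng.standard_normal((d, d))   # pre-trained (frozen) weight
W_target = rng.standard_normal((d, d))   # target model's weight

def best_rank_r_adapter(W_frozen, W_target, r):
    """Rank-r update minimizing ||(W_frozen + B @ A) - W_target||, obtained from
    the truncated SVD of the weight difference (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(W_target - W_frozen)
    B = U[:, :r] * s[:r]          # d x r
    A = Vt[:r, :]                 # r x d
    return B, A

for r in (2, d):                  # insufficient rank vs. full rank
    B, A = best_rank_r_adapter(W_frozen, W_target, r)
    err = np.linalg.norm(W_frozen + B @ A - W_target, 2)
    print(f"rank {r}: spectral-norm error {err:.3f}")

# With r = d the difference is matched exactly (error ~ 0); with r < d the error
# equals the (r+1)-th singular value of the difference, mirroring the kind of
# error quantification the paper gives for insufficient ranks.
```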
Result
Key findings include: (1) LoRA can adapt any FNN to exactly represent any smaller target FNN if the LoRA-rank is at least (width of the frozen model) × (depth of the target model) / (depth of the frozen model). (2) For TFNs, any model can be adapted to a target model of the same size with a rank equal to half the embedding size. (3) In both the linear and FNN settings, the total number of trainable parameters needed for exact representation is constant regardless of how the LoRA-rank is allocated across layers. (4) LoRA can adapt randomly generated models to match the target model with fewer parameters than final-layer tuning; the parameter bookkeeping behind (3) and (4) is sketched below.
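A minimal bookkeeping sketch for findings (3) and (4), assuming square d × d weight matrices and counting only trainable parameters (the function names and numbers are mine, not the paper's): a rank-r adapter on a d × d weight contributes 2dr parameters, so a fixed total rank budget gives the same count however it is split across layers, whereas final-layer tuning costs d² parameters.

```python
def lora_params(width, ranks):
    # A rank-r adapter on a width x width weight adds B (width x r) and A (r x width),
    # i.e. 2 * width * r trainable parameters per adapted layer.
    return sum(2 * width * r for r in ranks)

def final_layer_params(width):
    # Tuning only the final width x width weight matrix.
    return width * width

d = 64
print(lora_params(d, [4, 4, 4, 4]))  # 2048 -- uniform rank allocation
print(lora_params(d, [8, 8, 0, 0]))  # 2048 -- same total rank budget, same count
print(final_layer_params(d))         # 4096 -- final-layer tuning costs more here
```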
Limitations & Future Work
Limitations include the potential suboptimality of the constructed LoRA adapters, the lack of approximation-error quantification for TFNs when the rank is lower than required, and the simplified TFN architecture considered. Future work includes quantifying the approximation error for TFNs with insufficient rank, refining LoRA adapter update algorithms, and studying LoRA's expressive power under more general TFN architectures.
Abstract
Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning method that leverages low-rank adaptation of weight matrices, has emerged as a prevalent technique for fine-tuning pre-trained models such as large language models and diffusion models. Despite its huge success in practice, the theoretical underpinnings of LoRA have largely remained unexplored. This paper takes the first step to bridge this gap by theoretically analyzing the expressive power of LoRA. We prove that, for fully connected neural networks, LoRA can adapt any model to accurately represent any smaller target model if the LoRA-rank is at least (width of the frozen model) × (depth of the target model) / (depth of the frozen model). We also quantify the approximation error when the LoRA-rank is lower than this threshold. For Transformer networks, we show any model can be adapted to a target model of the same size with rank-(embedding size / 2) LoRA adapters.
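Written out symbolically with a worked numeric instance (the symbols w_f, D_f, D_f̄, d_e are my shorthand, not the paper's notation), the two sufficiency conditions from the abstract read:

```latex
% Rank conditions restated from the abstract; notation is my own shorthand.
% w_f: width of the frozen FNN f, D_f: its depth, D_{\bar f}: depth of the target \bar{f},
% d_e: embedding size of the Transformer.
\[
  r_{\mathrm{FNN}} \;\ge\; w_f \cdot \frac{D_{\bar f}}{D_f},
  \qquad
  r_{\mathrm{TFN}} \;=\; \frac{d_e}{2}.
\]
% Example: a frozen FNN with w_f = 16 and D_f = 8 can exactly represent a target of
% depth D_{\bar f} = 2 once the LoRA-rank reaches 16 * 2 / 8 = 4.
```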