<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>http://stablediffusionwiki.com/index.php?action=history&amp;feed=atom&amp;title=LoRA</id>
	<title>LoRA - Revision history</title>
	<link rel="self" type="application/atom+xml" href="http://stablediffusionwiki.com/index.php?action=history&amp;feed=atom&amp;title=LoRA"/>
	<link rel="alternate" type="text/html" href="http://stablediffusionwiki.com/index.php?title=LoRA&amp;action=history"/>
	<updated>2026-05-15T13:32:48Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.39.4</generator>
	<entry>
		<id>http://stablediffusionwiki.com/index.php?title=LoRA&amp;diff=169&amp;oldid=prev</id>
		<title>StableTiger3 at 18:23, 22 August 2023</title>
		<link rel="alternate" type="text/html" href="http://stablediffusionwiki.com/index.php?title=LoRA&amp;diff=169&amp;oldid=prev"/>
		<updated>2023-08-22T18:23:19Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 18:23, 22 August 2023&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot;&gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Overview ==&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Overview ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Natural language processing includes large-scale pretraining and adaptation to specific tasks or domains. Full fine-tuning becomes impractical with large models.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Natural language processing includes large-scale pretraining and adaptation to specific tasks or domains. Full fine-tuning becomes impractical with large models&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;. Low-Rank Adaptation (LoRA) offers a solution. In natural language processing, many applications depend on taking a single large pre-trained language model and adapting it to different downstream tasks. Typically this is done through full fine-tuning, which updates all the parameters of the original model. That becomes a problem with very large models such as GPT-3: every fine-tuned copy carries the same vast number of parameters, making storage and deployment challenging.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Some prior approaches address this by updating only a small subset of parameters or by adding external modules for each new task. These methods make adaptation more storage-efficient, but they can add inference latency or reduce model quality, and they often underperform full fine-tuning, so there is a trade-off between efficiency and quality.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;The Low-Rank Adaptation (LoRA) approach tackles this differently. Motivated by the observation that the weight changes learned during adaptation have a low &amp;quot;intrinsic rank,&amp;quot; LoRA keeps the pre-trained weights frozen and trains only small rank-decomposition matrices injected into the network. This makes LoRA efficient in both storage and computation, even with extremely large models such as GPT-3, and offers a practical balance between model quality and operational efficiency&lt;/ins&gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Definitions ==&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Definitions ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>StableTiger3</name></author>
	</entry>
	<entry>
		<id>http://stablediffusionwiki.com/index.php?title=LoRA&amp;diff=168&amp;oldid=prev</id>
		<title>StableTiger3: Created page with &quot;== Overview == Natural language processing includes large-scale pretraining and adaptation to specific tasks or domains. Full fine-tuning becomes impractical with large models.  == Definitions == === Fine-Tuning === Refers to slight adjustments to a pre-trained model for specific tasks. Less feasible for larger models due to high cost and parameter count.  === Low-Rank Adaptation (LoRA) === LoRA retains pre-trained weights and incorporates trainable rank decomposition ma...&quot;</title>
		<link rel="alternate" type="text/html" href="http://stablediffusionwiki.com/index.php?title=LoRA&amp;diff=168&amp;oldid=prev"/>
		<updated>2023-08-22T18:11:34Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;== Overview == Natural language processing includes large-scale pretraining and adaptation to specific tasks or domains. Full fine-tuning becomes impractical with large models.  == Definitions == === Fine-Tuning === Refers to slight adjustments to a pre-trained model for specific tasks. Less feasible for larger models due to high cost and parameter count.  === Low-Rank Adaptation (LoRA) === LoRA retains pre-trained weights and incorporates trainable rank decomposition ma...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;== Overview ==&lt;br /&gt;
Natural language processing includes large-scale pretraining and adaptation to specific tasks or domains. Full fine-tuning becomes impractical with large models.&lt;br /&gt;
&lt;br /&gt;
== Definitions ==&lt;br /&gt;
=== Fine-Tuning ===&lt;br /&gt;
Refers to slight adjustments to a pre-trained model for specific tasks. Less feasible for larger models due to high cost and parameter count.&lt;br /&gt;
&lt;br /&gt;
=== Low-Rank Adaptation (LoRA) ===&lt;br /&gt;
LoRA retains pre-trained weights and incorporates trainable rank decomposition matrices in the Transformer architecture, drastically reducing trainable parameters and GPU memory cost.&lt;br /&gt;
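The rank-decomposition idea can be sketched in PyTorch. This is an illustrative toy, not code from the wiki or from the official LoRA release; the class name, dimensions, and the r and alpha hyperparameters are made up for the example:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # y = x W0^T + (alpha / r) * x A^T B^T, with W0 frozen; only A and B train.
    def __init__(self, in_features, out_features, r=4, alpha=8):
        super().__init__()
        # Frozen pre-trained weight W0 (random here, standing in for a real checkpoint).
        self.weight = nn.Parameter(torch.randn(out_features, in_features),
                                   requires_grad=False)
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)  # rank-r factor
        self.B = nn.Parameter(torch.zeros(out_features, r))        # zero init
        self.scaling = alpha / r

    def forward(self, x):
        return x @ self.weight.T + self.scaling * (x @ self.A.T @ self.B.T)

layer = LoRALinear(768, 768, r=4)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
# trainable is 2 * 4 * 768 = 6144, versus 768 * 768 = 589824 for full fine-tuning
```

Initializing B to zero makes the low-rank update start at zero, so training begins exactly from the pre-trained model's behavior.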
&lt;br /&gt;
=== Trainable Parameters ===&lt;br /&gt;
These are the aspects of the model that are adjusted during training. LoRA reduces the number of trainable parameters by up to 10,000 times compared with full fine-tuning, greatly enhancing efficiency.&lt;br /&gt;
&lt;br /&gt;
=== Model Quality ===&lt;br /&gt;
The accuracy or performance of a model. LoRA performs on par with, or better than, traditional fine-tuning, even with far fewer trainable parameters.&lt;br /&gt;
&lt;br /&gt;
=== Rank-Decomposition Matrices ===&lt;br /&gt;
Used in LoRA to reduce the number of trainable parameters without compromising quality or adding inference latency.&lt;br /&gt;
&lt;br /&gt;
=== Inference Latency ===&lt;br /&gt;
Time taken for the model to respond. LoRA does not increase this latency.&lt;br /&gt;
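One way to see why no latency is added: the trained low-rank factors can be merged into the frozen weight once, before deployment, so inference uses a single matrix multiply. An illustrative sketch with made-up dimensions, assuming the standard W0 + scaling * B A parameterization:

```python
import torch

d, r = 768, 4
W0 = torch.randn(d, d)        # frozen pre-trained weight
A = torch.randn(r, d) * 0.01  # trained rank-r factors
B = torch.randn(d, r) * 0.01
scaling = 8 / r

# One-time merge before deployment: no extra matmuls remain at inference.
W_merged = W0 + scaling * (B @ A)

x = torch.randn(1, d)
y_unmerged = x @ W0.T + scaling * (x @ A.T @ B.T)  # training-time form
y_merged = x @ W_merged.T                          # deployed form, same output
```

Because the merge is exact (up to floating-point rounding), the deployed model is a plain dense layer with the original shape.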
&lt;br /&gt;
== Benefits of LoRA ==&lt;br /&gt;
* Significant reduction in trainable parameters and GPU memory requirements.&lt;br /&gt;
* Comparable or superior performance to fine-tuning on models like RoBERTa, DeBERTa, GPT-2, and GPT-3.&lt;br /&gt;
* Higher training throughput and no additional inference latency.&lt;br /&gt;
&lt;br /&gt;
== Empirical Investigation ==&lt;br /&gt;
An empirical study of rank-deficiency in language-model adaptation sheds light on why LoRA is effective.&lt;br /&gt;
&lt;br /&gt;
== Availability ==&lt;br /&gt;
Package released for integration with PyTorch, including implementations and checkpoints for RoBERTa, DeBERTa, and GPT-2.&lt;/div&gt;</summary>
		<author><name>StableTiger3</name></author>
	</entry>
</feed>