LoRA

From Stable Diffusion Wiki
Revision as of 14:11, 22 August 2023 by StableTiger3

Overview

Natural language processing commonly relies on large-scale pretraining followed by adaptation to a specific task or domain. As models grow, full fine-tuning, which updates every parameter, becomes impractical.

Definitions

Fine-Tuning

Refers to adjusting the weights of a pre-trained model for a specific task. It becomes less feasible as models grow, because every parameter must be stored and updated, which is costly in compute and memory.

Low-Rank Adaptation (LoRA)

LoRA freezes the pre-trained weights and injects trainable rank-decomposition matrices into the Transformer architecture, drastically reducing the number of trainable parameters and the GPU memory required for adaptation.
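The idea can be sketched in a few lines: the frozen weight matrix W is left untouched, and a low-rank update B·A is added alongside it. The NumPy sketch below is illustrative only; the dimensions, rank, and alpha scaling value are arbitrary choices, not values prescribed by the article.

```python
import numpy as np

# Minimal illustrative sketch of a LoRA-adapted linear layer.
# W is the frozen pre-trained weight; only A and B would be trained.
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 4, 8    # hypothetical sizes; rank r << d

W = rng.standard_normal((d_out, d_in))      # frozen pre-trained weights
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init, so B @ A = 0 at start

def lora_forward(x):
    # h = W x + (alpha / r) * B A x  -- low-rank update added to the frozen path
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Because B starts at zero, the adapted layer initially matches the frozen layer.
assert np.allclose(lora_forward(x), W @ x)
```

Initializing B to zero means training starts from the pre-trained model's behavior exactly, and the update grows only as A and B are optimized.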

Trainable Parameters

These are the model weights that are updated during training. Compared with full fine-tuning of GPT-3 175B, LoRA reduces the number of trainable parameters by up to 10,000 times, greatly improving efficiency.
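The source of the reduction is simple arithmetic: full fine-tuning trains all d_out × d_in entries of a weight matrix, while LoRA trains only r × (d_in + d_out) entries for the two small factors. The numbers below are hypothetical, chosen to resemble a large Transformer projection:

```python
# Rough trainable-parameter comparison for a single weight matrix.
d_out, d_in, r = 4096, 4096, 8          # hypothetical projection size, rank 8

full_ft_params = d_out * d_in           # full fine-tuning trains every entry of W
lora_params = r * (d_in + d_out)        # LoRA trains only A (r x d_in) and B (d_out x r)

print(full_ft_params)                   # 16777216
print(lora_params)                      # 65536
print(full_ft_params // lora_params)    # 256
```

Here a single layer needs 256× fewer trainable parameters; applying this across many layers, and saving optimizer state only for A and B, is what drives the large memory savings.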

Model Quality

The accuracy or performance of a model on its target task. Despite training far fewer parameters, LoRA performs on par with or better than traditional fine-tuning.

Rank-Decomposition Matrices

LoRA represents the weight update as the product of two small matrices whose shared rank is much smaller than the original weight dimensions, reducing training cost without compromising quality or adding inference latency.

Inference Latency

The time a model takes to produce a prediction. LoRA does not increase this latency, because the low-rank update can be merged into the frozen weights before deployment.
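The merge step can be demonstrated directly: folding B·A into W yields a single matrix, so serving requires exactly one matmul per layer, just like the original model. The sketch below uses arbitrary sizes and a hypothetical alpha scaling value for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, alpha = 32, 4, 8                  # illustrative dimensions and scaling
W = rng.standard_normal((d, d))         # frozen pre-trained weights
A = rng.standard_normal((r, d))         # trained LoRA factors
B = rng.standard_normal((d, r))

# For deployment, fold the low-rank update into the frozen weights.
W_merged = W + (alpha / r) * (B @ A)

x = rng.standard_normal(d)
# The merged layer is a single matmul, yet computes the same output
# as the two-path adapted layer, so no extra latency is incurred.
assert np.allclose(W_merged @ x, W @ x + (alpha / r) * (B @ (A @ x)))
```

This is also why LoRA adapters are cheap to swap: subtracting one merged update and adding another recovers a different task-specific model from the same frozen base.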

Benefits of LoRA

  • Significant reduction in trainable parameters and GPU memory requirements.
  • Comparable or superior performance to fine-tuning on models like RoBERTa, DeBERTa, GPT-2, and GPT-3.
  • Higher training throughput and no additional inference latency.

Empirical Investigation

An empirical study of rank-deficiency in language model adaptation gives insight into why LoRA's low-rank updates are effective.

Availability

A package has been released for integrating LoRA with PyTorch models, including implementations and model checkpoints for RoBERTa, DeBERTa, and GPT-2.