Large Language Models (LLMs) such as GPT, T5, and BERT are highly effective at natural language processing tasks. They are pre-trained on large amounts of data and then fine-tuned for specific downstream tasks to achieve better performance. However, as these models grow larger, full fine-tuning becomes impractical due to its computational and storage costs.

Parameter-Efficient Fine-Tuning (PEFT) addresses this problem by fine-tuning only a small number of model parameters while freezing most of the pre-trained LLM's weights. This reduces compute and storage costs and helps prevent the model from forgetting previously learned information. PEFT approaches also tend to perform better than full fine-tuning in low-data regimes and can be applied to various modalities. In addition, they improve portability: training produces small checkpoints that can be attached to the pre-trained LLM without replacing the entire model, so one base model can serve multiple tasks.

Hugging Face has launched the PEFT library, which fine-tunes a small number of (extra) model parameters while freezing most of the pre-trained language model's parameters.
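As a minimal sketch of how this looks in practice, the snippet below wraps a pre-trained seq2seq model with LoRA adapters using the PEFT library and reports how few parameters are actually trainable; the model name and hyperparameters are illustrative choices, not a prescribed setup.

```python
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

# Load a pre-trained base model (model name is illustrative).
base_model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/mt0-large")

# Configure small, trainable LoRA adapters for a seq2seq task.
peft_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)

# Wrap the base model: only the LoRA parameters will be trained,
# the original weights stay frozen.
model = get_peft_model(base_model, peft_config)
model.print_trainable_parameters()  # prints the trainable share (a small fraction of all params)

# After training, the checkpoint contains only the adapter weights, so it stays small.
model.save_pretrained("mt0-large-lora-adapter")
```

Such an adapter checkpoint can later be loaded back onto the same frozen base model (e.g. with `PeftModel.from_pretrained`), which is what makes PEFT checkpoints portable across tasks.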

Parameter-Efficient Fine-Tuning using 🤗 PEFT

The Hugging Face team gives the following three examples of using Parameter-Efficient Fine-Tuning (PEFT) to fine-tune large models on consumer hardware.

  • In the first example, a 3-billion-parameter model was fine-tuned with PEFT LoRA on GPUs with limited memory, such as an Nvidia GeForce RTX 2080 Ti or RTX 3080, using Hugging Face Accelerate's DeepSpeed integration and an example fine-tuning script.
  • The second example took it up a notch by enabling INT8 tuning of a 6.7-billion-parameter model in Google Colab using PEFT LoRA and bitsandbytes (see the sketch after this list).
  • Finally, the third example involved Stable Diffusion Dreambooth training using PEFT on consumer hardware with 11GB of GPU RAM. The Space demo of this training should run seamlessly on a T4 instance (16GB GPU).
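As a rough sketch of what the INT8 LoRA setup in the second example involves: the model name matches the example, but the quantization config, target modules, and hyperparameters below are illustrative assumptions rather than the exact script, and `prepare_model_for_kbit_training` assumes a recent version of the peft library.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

model_name = "facebook/opt-6.7b"

# Load the base model in 8-bit precision via bitsandbytes to fit in limited GPU memory.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

# Prepare the quantized model for training (casts norms, enables input gradients).
model = prepare_model_for_kbit_training(model)

# Attach small trainable LoRA adapters; the 8-bit base weights stay frozen.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in OPT
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# A standard training loop or Trainer can now fine-tune just the adapter weights.
```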

These examples demonstrate that PEFT can fine-tune large models for various applications even with limited hardware resources.

