LoRA Fine-Tuning
Fine-tune open models efficiently with LoRA and QLoRA. Generates PEFT configs, dataset formatting, training scripts using transformers + bitsandbytes, eval harnesses, and merge-and-export steps for serving.
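Full fine-tuning updates every weight and is expensive. LoRA trains small low-rank adapter matrices on top of frozen base weights, and QLoRA also loads that frozen base in 4-bit, so you get most of the gains for a fraction of the GPU cost. This skill writes PEFT configs for Llama, Mistral, and Qwen, formats your dataset, configures TRL trainers, runs evals, and merges adapters back into a deployable model.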
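To make "a fraction of the GPU cost" concrete, here is a rough back-of-the-envelope sketch of how many trainable parameters LoRA adds to one linear layer compared with training that layer in full. The 4096 x 4096 layer size and rank 16 are illustrative assumptions, not values taken from this skill.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in linear layer.

    LoRA learns two small matrices, A (r x d_in) and B (d_out x r),
    while the original weight stays frozen.
    """
    return r * (d_in + d_out)


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters updated when the same layer is trained in full."""
    return d_in * d_out


# Assumed example values: a square 4096-wide projection and rank 16.
d, r = 4096, 16
added = lora_param_count(d, d, r)
full = full_param_count(d, d)
print(f"LoRA: {added:,} trainable vs full: {full:,} ({added / full:.2%})")
# prints "LoRA: 131,072 trainable vs full: 16,777,216 (0.78%)"
```

Fewer trainable parameters also means less optimizer state and smaller gradients, which is where much of the memory saving comes from.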
When to use
Use when prompting alone isn't getting the quality you need in a specialized domain (code in a custom DSL, medical/legal text, brand-voice copywriting) and you have a few hundred to a few thousand examples.
Examples
QLoRA on a single GPU
Fine-tune a 7B model on a consumer GPU
Set up a QLoRA fine-tune of Mistral-7B on my 2K instruction examples using a single 24GB GPU with bitsandbytes 4-bit quantization
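A request like this might produce something along the lines of the sketch below: a helper that turns instruction records into training strings, and a loader that quantizes the base model to 4-bit with bitsandbytes and attaches LoRA adapters via PEFT. The dataset keys ("instruction", "response"), the prompt template, the model id "mistralai/Mistral-7B-v0.1", and the hyperparameters (rank 16, alpha 32, dropout 0.05, attention projections as targets) are all assumptions for illustration; the loader needs a CUDA GPU with torch, transformers, peft, and bitsandbytes installed.

```python
def format_example(example: dict) -> str:
    """Render one instruction record as a single training string.

    The "instruction"/"response" keys and this template are hypothetical;
    adapt both to your dataset and to the base model's chat template.
    """
    return (
        f"### Instruction:\n{example['instruction'].strip()}\n\n"
        f"### Response:\n{example['response'].strip()}"
    )


def load_qlora_model(model_id: str = "mistralai/Mistral-7B-v0.1"):
    """Load a 4-bit quantized base model and wrap it with LoRA adapters."""
    # Heavy imports live inside the function so the formatting helper
    # above can be used and tested without a GPU stack installed.
    import torch
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # 4-bit NF4 quantization with double quantization; compute in bfloat16
    # (assumes a GPU generation that supports bf16).
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=bnb_config, device_map="auto"
    )
    # Prepares the quantized model for stable adapter training.
    model = prepare_model_for_kbit_training(model)

    # Assumed starting hyperparameters; tune rank and targets per task.
    lora_config = LoraConfig(
        r=16,
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )
    return get_peft_model(model, lora_config)
```

Calling `print_trainable_parameters()` on the returned model is a quick way to confirm that only the adapter weights are trainable before handing it to a trainer.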
Merge and export for vLLM
Bake the adapter into the base model for serving
Merge my trained LoRA adapter back into the base weights and export a directory that vLLM can serve directly
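The merge step could look roughly like the sketch below: load the base model in half precision (not 4-bit), apply the adapter, fold it into the base weights with PEFT's `merge_and_unload()`, and save the weights and tokenizer into one directory. The function names, arguments, and the completeness check are hypothetical helpers for illustration, and the check only looks for a `config.json` and at least one `.safetensors` file.

```python
from pathlib import Path


def merge_and_export(base_model_id: str, adapter_dir: str, output_dir: str) -> None:
    """Merge a trained LoRA adapter into its base model and save the result."""
    # Imported lazily so the directory check below runs without these installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base in bf16 rather than 4-bit so the merged weights are
    # ordinary dense tensors.
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id, torch_dtype=torch.bfloat16
    )
    model = PeftModel.from_pretrained(base, adapter_dir)

    # Fold the adapter deltas into the base weights and drop the PEFT wrappers.
    merged = model.merge_and_unload()
    merged.save_pretrained(output_dir, safe_serialization=True)

    # Serving needs the tokenizer files alongside the weights.
    AutoTokenizer.from_pretrained(base_model_id).save_pretrained(output_dir)


def export_looks_complete(output_dir: str) -> bool:
    """Cheap sanity check: a config plus at least one safetensors weight file."""
    out = Path(output_dir)
    has_config = (out / "config.json").is_file()
    has_weights = any(out.glob("*.safetensors"))
    return has_config and has_weights
```

Once the check passes, the output directory can be given to vLLM as the model path in place of a hub model id.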