In this tutorial, we walk through the workflow of fine-tuning Mistral 7B with QLoRA using <a target="_blank" href="https://github.com/axolotl-ai-cloud/axolotl">Axolotl</a>, showing how to manage limited GPU resources while adapting the model to new tasks. We will install Axolotl, create a small example dataset, configure the LoRA-specific hyperparameters, run the fine-tuning process, and test the performance of the resulting model.
Step 1: Prepare the environment and install Axolotl
# 1. Check GPU availability
!nvidia-smi
# 2. Install git-lfs (for handling large model files)
!sudo apt-get -y install git-lfs
!git lfs install
# 3. Clone Axolotl and install from source
!git clone https://github.com/OpenAccess-ai-Collective/axolotl.git
%cd axolotl
!pip install -e .
# (Optional) If you need a specific PyTorch version, install it BEFORE Axolotl:
# !pip install torch==2.0.1+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
# Return to /content directory
%cd /content
First, we check which GPU is available and how much memory it has. We then install Git LFS so that large model files (such as Mistral 7B) are handled correctly. After that, we clone the Axolotl repository from GitHub and install it in "editable" mode, which lets us call its commands from anywhere. An optional step lets you install a specific PyTorch version if necessary. Finally, we navigate back to the /content directory to keep later files and paths organized.
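If you prefer to confirm this from Python rather than nvidia-smi, a minimal sketch like the following (assuming the Colab runtime already ships with PyTorch) reports the detected GPU and its total memory:

import torch

# Quick sanity check from Python (assumes PyTorch is preinstalled in the runtime).
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU detected: {props.name} with {props.total_memory / 1024**3:.1f} GB of memory")
else:
    print("No GPU detected; QLoRA fine-tuning of a 7B model is impractical on CPU.")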
Step 2: Create a small sample dataset and a QLoRA configuration for Mistral 7B
import os
# Create a small JSONL dataset
os.makedirs("data", exist_ok=True)
with open("data/sample_instructions.jsonl", "w") as f:
f.write('{"instruction": "Explain quantum computing in simple terms.", "input": "", "output": "Quantum computing uses qubits..."}\n')
f.write('{"instruction": "What is the capital of France?", "input": "", "output": "The capital of France is Paris."}\n')
# Write a QLoRA config for Mistral 7B
config_text = """\
base_model: mistralai/mistral-7b-v0.1
tokenizer: mistralai/mistral-7b-v0.1
# We'll use QLoRA to minimize memory usage
train_type: qlora
bits: 4
double_quant: true
quant_type: nf4
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
target_modules:
  - q_proj
  - k_proj
  - v_proj
data:
  datasets:
    - path: /content/data/sample_instructions.jsonl
  val_set_size: 0
  max_seq_length: 512
  cutoff_len: 512
training_arguments:
  output_dir: /content/mistral-7b-qlora-output
  num_train_epochs: 1
  per_device_train_batch_size: 1
  gradient_accumulation_steps: 4
  learning_rate: 0.0002
  fp16: true
  logging_steps: 10
  save_strategy: "epoch"
  evaluation_strategy: "no"
wandb:
  enabled: false
"""
with open("qlora_mistral_7b.yml", "w") as f:
f.write(config_text)
print("Dataset and QLoRA config created.")
Here, we build a minimal JSONL dataset with two instruction-response pairs, giving us a toy example to train on. We then write a YAML configuration that points to the Mistral 7B base model, sets the QLoRA parameters for memory-efficient fine-tuning, and defines training hyperparameters such as batch size, learning rate, and sequence length. We also specify LoRA-specific settings such as dropout and rank, and finally save this configuration as qlora_mistral_7b.yml.
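Before launching training, it can save time to confirm that the files we just wrote are well-formed. The optional check below is a small sketch that assumes only the paths used above and PyYAML (preinstalled in Colab); it parses every JSONL line and the YAML config:

import json
import yaml  # PyYAML, preinstalled in Colab

# Confirm each dataset line is valid JSON and carries the expected fields.
with open("data/sample_instructions.jsonl") as f:
    for i, line in enumerate(f, start=1):
        record = json.loads(line)
        assert {"instruction", "input", "output"} <= record.keys(), f"line {i} is missing fields"

# Confirm the QLoRA config parses as YAML and points at the right base model.
with open("qlora_mistral_7b.yml") as f:
    config = yaml.safe_load(f)
print("Dataset and config look well-formed. Base model:", config["base_model"])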
Step 3: Fine-tune with Axolotl
# This will download Mistral 7B (~13 GB) and start fine-tuning with QLoRA.
# If you encounter OOM (Out Of Memory) errors, reduce max_seq_length or LoRA rank.
# Launch QLoRA fine-tuning with Axolotl's training entry point.
# (Depending on the installed Axolotl version, the exact CLI invocation may differ slightly.)
!accelerate launch -m axolotl.cli.train /content/qlora_mistral_7b.yml
Here, Axolotl automatically downloads the Mistral 7B weights (a large file) and then starts a QLoRA-based fine-tuning run. The model is quantized to 4-bit precision, which helps reduce GPU memory usage. You will see training logs showing progress, including the training loss, step by step.
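For context on what those quantization settings mean, the bits: 4, quant_type: nf4, and double_quant: true entries in our config correspond roughly to the following bitsandbytes setup in plain Transformers. This is an illustrative sketch, not what Axolotl runs verbatim:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Roughly equivalent bitsandbytes settings for the quantization keys in our YAML config.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # bits: 4
    bnb_4bit_quant_type="nf4",             # quant_type: nf4
    bnb_4bit_use_double_quant=True,        # double_quant: true
    bnb_4bit_compute_dtype=torch.float16,  # matches fp16: true
)

# Loading the base model this way keeps its weights in 4-bit NF4 on the GPU
# (this downloads the ~13 GB checkpoint, just like the Axolotl run above).
quantized_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/mistral-7b-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)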
Step 4: Test the fine-tuned model
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the base Mistral 7B model
base_model_path = "mistralai/mistral-7b-v0.1"  # First set up access with your Hugging Face account, then run this part
output_dir = "/content/mistral-7b-qlora-output"
print("\nLoading base model and tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(
    base_model_path,
    trust_remote_code=True
)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True
)
print("\nLoading QLoRA adapter...")
model = PeftModel.from_pretrained(
    base_model,
    output_dir,
    device_map="auto",
    torch_dtype=torch.float16
)
model.eval()
# Example prompt
prompt = "What are the main differences between classical and quantum computing?"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
print("\nGenerating response...")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("\n=== Model Output ===")
print(response)
Finally, we reload the Mistral 7B base model and apply the newly trained LoRA weights on top of it. We craft a prompt about the differences between classical and quantum computing, tokenize it, and generate a response with the fine-tuned model. This confirms that our QLoRA training took effect and that we can successfully run inference with the updated model.
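If you want a single self-contained checkpoint instead of a base model plus a separate adapter, PEFT can merge the LoRA weights into the base weights. A minimal sketch, reusing the model and tokenizer objects from the step above (the output path is just an example):

# Merge the LoRA adapter into the base weights so inference no longer needs peft.
merged_model = model.merge_and_unload()

merged_dir = "/content/mistral-7b-qlora-merged"  # example output path
merged_model.save_pretrained(merged_dir)
tokenizer.save_pretrained(merged_dir)
print(f"Merged model saved to {merged_dir}")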
In conclusion, the steps above have shown how to prepare the environment, create a small dataset, configure LoRA-specific hyperparameters, and run a QLoRA fine-tuning session on Mistral 7B with Axolotl. This approach demonstrates a parameter-efficient training process suited to resource-constrained environments. You can now expand the dataset, modify hyperparameters, or experiment with other open-source LLMs to further refine and optimize your fine-tuning pipeline.
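As one way to iterate, you can load the YAML config we wrote earlier, adjust a few hyperparameters, and save a new variant before re-running training. The sketch below assumes the key names used in our config above:

import yaml

# Load the existing config, tweak a few hyperparameters, and write a new variant.
with open("qlora_mistral_7b.yml") as f:
    config = yaml.safe_load(f)

config["lora_r"] = 16                                 # higher-rank LoRA adapters
config["training_arguments"]["num_train_epochs"] = 3  # train for more epochs
config["training_arguments"]["learning_rate"] = 1e-4  # gentler learning rate

with open("qlora_mistral_7b_v2.yml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)
print("Wrote qlora_mistral_7b_v2.yml")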
Download the Colab notebook here. All credit for this research goes to the researchers of this project.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of an artificial intelligence media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a broad audience. The platform has more than 2 million monthly views, illustrating its popularity among readers.