Finetuning
- Supervised Fine-Tuning
Train the base model with LoRA adapters:
```bash
python src/training/supervised_finetuning.py
```
Default configuration (sketched in code below):
- Base model: DeepSeek-R1-Distill-Qwen-7B
- LoRA rank: 16, alpha: 32
- Batch size: 16
- Learning rate: 1e-4
- Epochs: 5
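
For reference, here is a minimal sketch of what these defaults correspond to in code, assuming a standard Hugging Face stack (transformers, peft, trl). The hub id, dataset path, and output directory are illustrative placeholders, not taken from the actual script:

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Hub id assumed for the base model listed above.
BASE_MODEL = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"

# LoRA adapter settings matching the listed defaults: rank 16, alpha 32.
peft_config = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")

# Trainer settings matching the listed defaults: batch size 16, lr 1e-4, 5 epochs.
training_args = SFTConfig(
    output_dir="outputs/sft",            # placeholder output path
    per_device_train_batch_size=16,
    learning_rate=1e-4,
    num_train_epochs=5,
)

# Placeholder dataset; the real script defines its own data loading.
train_dataset = load_dataset("json", data_files="data/sft_train.jsonl", split="train")

trainer = SFTTrainer(
    model=BASE_MODEL,                    # SFTTrainer accepts a model id and loads it
    args=training_args,
    train_dataset=train_dataset,
    peft_config=peft_config,
)
trainer.train()
```

With `peft_config` passed to `SFTTrainer`, only the LoRA adapter weights are trained; the 7B base model stays frozen, which keeps memory use and checkpoint size small.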