Fine-tuning. It's in the GitHub README.

  1. Supervised Fine-Tuning
    Train the base model with LoRA adapters:

python src/training/supervised_finetuning.py

Default configuration:

Base model: DeepSeek-R1-Distill-Qwen-7B
LoRA rank: 16, alpha: 32
Batch size: 16
Learning rate: 1e-4
Epochs: 5
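
To get a feel for what those defaults imply, here is a rough sketch, not the repo's actual code. `FinetuneDefaults`, `lora_params`, and `steps_per_epoch` are hypothetical names, and the 4096 x 4096 layer shape and the 1000-example dataset are illustrative numbers, not values taken from the model or the README. It relies only on the standard LoRA conventions: the low-rank update is scaled by alpha / rank, and each adapted matrix gains rank * (d_in + d_out) trainable parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinetuneDefaults:
    """Hypothetical container for the defaults quoted from the README."""
    base_model: str = "DeepSeek-R1-Distill-Qwen-7B"
    lora_rank: int = 16
    lora_alpha: int = 32
    batch_size: int = 16
    learning_rate: float = 1e-4
    epochs: int = 5

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / rank.
        return self.lora_alpha / self.lora_rank


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # A has shape (rank, d_in) and B has shape (d_out, rank).
    return rank * (d_in + d_out)


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer steps per epoch, counting a final partial batch.

    Assumes no gradient accumulation (an assumption, not from the README).
    """
    return -(-num_examples // batch_size)  # ceiling division


if __name__ == "__main__":
    cfg = FinetuneDefaults()
    print("LoRA scaling:", cfg.lora_scaling)
    # Illustrative layer shape, not the real model's dimensions.
    print("Extra params per 4096x4096 layer:", lora_params(4096, 4096, cfg.lora_rank))
    # Illustrative dataset size.
    print("Steps per epoch for 1000 examples:", steps_per_epoch(1000, cfg.batch_size))
```

With rank 16 and alpha 32 the adapter update is scaled by 2, which is a common pairing. Whether the script exposes these values as CLI flags or a config file isn't stated in the quoted README, so check the repo before changing them.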
