I trained my own LLM and published it on HuggingFace

DEV Community
Akhilesh

This is the post where things got real: training an actual language model, watching the loss go down, and pushing it to HuggingFace with my name on it.

I couldn't afford to train from scratch. That takes thousands of GPU hours and costs thousands of dollars. Instead I used fine-tuning: take an existing pre-trained model and train it further on my medical data.

The model I chose: facebook/opt-1.3b. 1.3 billion parameters, open source, no access restrictions.

The technique: LoRA (Low-Rank Adaptation). Instead of updating all 1.3 billion parameters, LoRA freezes the base model, adds small trainable low-rank layers on top, and trains only those. You go from training 1.3 billion parameters to training about 4 million: comparable results, roughly 100x cheaper.

My laptop has no GPU, and training even a small LLM on CPU takes days. Google Colab gives you a free Tesla T4 GPU with 15GB of memory, up to about 30 hours per week. That's what I used.

Here are the key parts of the training script:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
from trl import SFTTrainer, SFTConfig

# Load the base model and its tokenizer
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")

# Add LoRA adapters: only these small layers get trained
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Train
trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    args=SFTConfig(num_train_epochs=3, learning_rate=2e-4),
)
trainer.train()
```

Training took 1.5 hours on the free T4 GPU. Here's what the loss looked like:

```
Step  100: loss 1.163
Step  500: loss 0.994
Step 1000: loss 0.967
Step 1700: loss 0.944  ← training complete
```

Loss going down means the model is learning. Both training and validation loss decreased together, which means the model generalized rather than just memorizing.

Publishing the model takes two lines:

```python
model.push_to_hub("Yakhilesh/medmind-opt-medical")
tokenizer.push_to_hub("Yakhilesh/medmind-opt-medical")
```

That's it. My model is now publicly available on the HuggingFace Hub under Yakhilesh/medmind-opt-medical. Anyone can download and use it.
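The core idea behind LoRA fits in a few lines of numpy. This is a sketch of the math, not the peft implementation: the shapes correspond to a single adapted projection matrix in OPT-1.3B (hidden size 2048) with the r=8, lora_alpha=16 settings from the config above.

```python
import numpy as np

d = 2048      # hidden size of OPT-1.3B
r = 8         # LoRA rank (the r=8 in LoraConfig)
alpha = 16    # scaling factor (lora_alpha)

rng = np.random.default_rng(0)
W = rng.standard_normal((d, d))          # frozen pre-trained weight
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, starts at zero

x = rng.standard_normal(d)

# LoRA forward pass: frozen path plus a scaled low-rank update
y = W @ x + (alpha / r) * (B @ (A @ x))

# Because B starts at zero, the adapted model initially behaves
# exactly like the base model; training only moves A and B.
assert np.allclose(y, W @ x)

trainable = A.size + B.size  # 2 * d * r = 32768 per adapted matrix
frozen = W.size              # d * d = 4194304 per adapted matrix
print(trainable, frozen)
```

Two thin factors replace a full-rank update, which is where the huge reduction in trainable parameters comes from: 32,768 trainable values next to over 4 million frozen ones, per matrix.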
The adapter weights are only 12.6MB. They're small because LoRA saves only the adapters, not the entire base model. Fine-tuning is more about data quality than model architecture: my 1.3B model, trained for 1.5 hours, learned genuine medical patterns, and the loss numbers back that up.
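A quick back-of-envelope check on that 12.6MB figure. Assuming the adapter tensors are stored in fp32 (4 bytes per parameter, the exact number depends on how the checkpoint was serialized):

```python
BYTES_PER_PARAM = 4  # assumption: fp32 storage

adapter_mb = 12.6
implied_params = adapter_mb * 1e6 / BYTES_PER_PARAM  # ~3.2 million parameters

base_params = 1.3e9
base_gb = base_params * BYTES_PER_PARAM / 1e9        # ~5.2 GB in fp32

ratio = base_params * BYTES_PER_PARAM / (adapter_mb * 1e6)
print(f"adapter holds ~{implied_params / 1e6:.1f}M params")
print(f"full base model would be ~{base_gb:.1f} GB")
print(f"adapter is ~{ratio:.0f}x smaller")
```

A 12.6MB file implies roughly 3 million stored parameters, consistent with the few million trainable parameters LoRA leaves you, and around 400x smaller than shipping the full fp32 base model.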