I trained my own LLM and published it on HuggingFace

DEV Community
Akhilesh

This is the post where things got real: training an actual language model, watching the loss go down, and pushing it to HuggingFace with my name on it.

I couldn't afford to train from scratch. That takes thousands of GPU hours and costs thousands of dollars. Instead I used fine-tuning: take an existing pre-trained model and train it further on my medical data.

The model I chose: facebook/opt-1.3b. 1.3 billion parameters, open source, no access restrictions.

The technique: LoRA (Low-Rank Adaptation). Instead of updating all 1.3 billion parameters, LoRA freezes the base model, adds small trainable low-rank layers on top, and trains only those. You go from training 1.3 billion parameters to training about 4 million: comparable results, roughly 100x cheaper.

My laptop has no GPU, and training even a small LLM on CPU takes days. Google Colab gives you a free Tesla T4 GPU with 15GB of memory, up to about 30 hours per week. That's what I used.

Here are the key parts of the training script:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
from trl import SFTTrainer, SFTConfig

# Load the base model and its tokenizer
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")

# Add LoRA adapters: only these small layers get trained
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Train
trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    args=SFTConfig(num_train_epochs=3, learning_rate=2e-4),
)
trainer.train()
```

Training took 1.5 hours on the free T4 GPU. Here's what the loss looked like:

```
Step  100: loss 1.163
Step  500: loss 0.994
Step 1000: loss 0.967
Step 1700: loss 0.944  ← training complete
```

Loss going down means the model is learning. Both training and validation loss decreased together, which means the model generalized rather than just memorizing.

Publishing the model takes two lines:

```python
model.push_to_hub("Yakhilesh/medmind-opt-medical")
tokenizer.push_to_hub("Yakhilesh/medmind-opt-medical")
```

That's it. My model is now publicly available on the HuggingFace Hub under Yakhilesh/medmind-opt-medical. Anyone can download and use it.
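The core idea behind LoRA fits in a few lines of numpy. This is a sketch of the math, not the peft implementation: the shapes correspond to a single adapted projection matrix in OPT-1.3B (hidden size 2048) with the r=8, lora_alpha=16 settings from the config above.

```python
import numpy as np

d = 2048      # hidden size of OPT-1.3B
r = 8         # LoRA rank (the r=8 in LoraConfig)
alpha = 16    # scaling factor (lora_alpha)

rng = np.random.default_rng(0)
W = rng.standard_normal((d, d))          # frozen pre-trained weight
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, starts at zero

x = rng.standard_normal(d)

# LoRA forward pass: frozen path plus a scaled low-rank update
y = W @ x + (alpha / r) * (B @ (A @ x))

# Because B starts at zero, the adapted model initially behaves
# exactly like the base model; training only moves A and B.
assert np.allclose(y, W @ x)

trainable = A.size + B.size  # 2 * d * r = 32768 per adapted matrix
frozen = W.size              # d * d = 4194304 per adapted matrix
print(trainable, frozen)
```

Two thin factors replace a full-rank update, which is where the huge reduction in trainable parameters comes from: 32,768 trainable values next to over 4 million frozen ones, per matrix.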
The adapter weights are only 12.6MB. They're small because LoRA saves only the adapters, not the entire base model. Fine-tuning is more about data quality than model architecture: my 1.3B model, trained for 1.5 hours, learned genuine medical patterns, and the loss numbers back that up.
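A quick back-of-envelope check on that 12.6MB figure. Assuming the adapter tensors are stored in fp32 (4 bytes per parameter, the exact number depends on how the checkpoint was serialized):

```python
BYTES_PER_PARAM = 4  # assumption: fp32 storage

adapter_mb = 12.6
implied_params = adapter_mb * 1e6 / BYTES_PER_PARAM  # ~3.2 million parameters

base_params = 1.3e9
base_gb = base_params * BYTES_PER_PARAM / 1e9        # ~5.2 GB in fp32

ratio = base_params * BYTES_PER_PARAM / (adapter_mb * 1e6)
print(f"adapter holds ~{implied_params / 1e6:.1f}M params")
print(f"full base model would be ~{base_gb:.1f} GB")
print(f"adapter is ~{ratio:.0f}x smaller")
```

A 12.6MB file implies roughly 3 million stored parameters, consistent with the few million trainable parameters LoRA leaves you, and around 400x smaller than shipping the full fp32 base model.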