A technique where a model trained on one task is reused as the starting point for a different but related task — dramatically reducing the data and compute required to achieve strong performance.
In Depth
Transfer Learning is based on a simple but powerful insight: knowledge learned from one task can be useful for another. Instead of training a model from scratch for every new problem — which requires massive datasets and compute — transfer learning starts with a model that has already learned useful representations from a large, general dataset and adapts it to a new, often smaller, task-specific dataset. This is analogous to how a person who has learned to play piano can learn guitar faster than someone with no musical training at all.
In practice, transfer learning typically involves two phases. During pre-training, a large model learns general features from a broad dataset — for example, an image model learns to recognize edges, textures, and shapes from millions of photographs, or a language model learns grammar and world knowledge from internet-scale text. During fine-tuning, the pre-trained model is adapted to a specific task using a smaller, curated dataset — such as classifying skin lesions from dermatology images or answering customer support questions in a particular domain.
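To make the fine-tuning phase concrete, here is a minimal PyTorch sketch (assuming torchvision is installed). It loads a ResNet-18 pre-trained on ImageNet, freezes the backbone so its general features are preserved, and replaces the classification head for a hypothetical 5-class downstream task; the `fine_tune_step` helper and the class count are illustrative, not part of any particular system.

```python
import torch
import torch.nn as nn
from torchvision import models

# Pre-training phase: already done for us. Load a ResNet-18 whose weights
# were learned on ImageNet, so it already encodes edges, textures, and shapes.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the backbone so its general-purpose features are kept as-is.
for param in model.parameters():
    param.requires_grad = False

# Replace the final classification layer with a new head sized for the
# downstream task (a hypothetical 5-class problem, e.g. lesion categories).
num_classes = 5
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Fine-tuning phase: only the new head's parameters are updated.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def fine_tune_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One optimization step on a batch from the small task-specific dataset."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Freezing the backbone is the cheapest variant of fine-tuning; when more labeled data is available, it is also common to unfreeze some or all of the pre-trained layers and train them with a lower learning rate.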
Transfer learning is one of the most important techniques in modern AI because it democratizes access to powerful models. Organizations that lack the resources to train a foundation model from scratch — which can cost tens of millions of dollars — can still achieve state-of-the-art results by fine-tuning an existing model on their own data. This paradigm underlies virtually all modern NLP (where fine-tuning BERT, GPT, or similar models is standard practice) and most of modern computer vision.
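In the NLP setting, the same pattern is commonly expressed with the Hugging Face transformers library. The sketch below (assuming transformers, its Trainer dependencies, and PyTorch are installed) starts from the public bert-base-uncased checkpoint and fine-tunes it on a tiny, made-up support-ticket dataset that stands in for an organization's own labeled data.

```python
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Reuse a publicly available pre-trained checkpoint instead of training
# a language model from scratch.
checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Tiny illustrative dataset standing in for a curated, domain-specific one.
texts = ["My order never arrived.", "Thanks, the issue is resolved."]
labels = [1, 0]
encodings = tokenizer(texts, truncation=True, padding=True)

class SupportTickets(torch.utils.data.Dataset):
    """Wraps the tokenized examples in the format the Trainer expects."""
    def __len__(self):
        return len(labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in encodings.items()}
        item["labels"] = torch.tensor(labels[idx])
        return item

# Fine-tune the full pre-trained model on the small labeled dataset.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-support-demo", num_train_epochs=1),
    train_dataset=SupportTickets(),
)
trainer.train()
```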
Transfer learning allows pre-trained models to be adapted to new tasks with minimal data, making state-of-the-art AI accessible without massive compute budgets.