The internal variables of a machine learning model that are learned during training — in neural networks, these are primarily the weights and biases that determine how the network transforms inputs into outputs.
In Depth
Parameters are the internal numbers that define a machine learning model's behavior: they are adjusted during training to minimize prediction error and remain fixed during inference. In a neural network, the parameters are the weights (which determine how strongly each connection contributes to the computation) and the biases (which shift activation values). A simple linear regression model might have just two parameters (slope and intercept), while a modern large language model such as GPT-4 is estimated to have hundreds of billions. The values of these parameters collectively encode everything the model has learned.
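To make this concrete, here is a minimal sketch in PyTorch (the framework choice is an assumption; the idea is framework-agnostic) that counts the learned parameters of a one-variable linear model and of a small two-layer network.

```python
import torch.nn as nn

# A linear regression model: one weight (slope) and one bias (intercept).
linear = nn.Linear(in_features=1, out_features=1)
print(sum(p.numel() for p in linear.parameters()))  # 2 parameters

# A small two-layer network: its parameters are the weights and biases of each layer.
mlp = nn.Sequential(
    nn.Linear(10, 32),  # 10*32 weights + 32 biases = 352
    nn.ReLU(),          # no parameters
    nn.Linear(32, 1),   # 32*1 weights + 1 bias  = 33
)
total = sum(p.numel() for p in mlp.parameters())
print(total)  # 385 learned parameters, all adjusted during training
```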
The number of parameters in a model, its 'parameter count', has become a rough proxy for model capability and a central quantity in the scaling laws that guide modern AI development. Language models are commonly described by their parameter count: 7B (7 billion), 70B, 405B, and so on. Research has shown that, given sufficient training data, model performance improves predictably as parameter count increases, a relationship codified in the scaling laws published by Kaplan et al. at OpenAI. However, parameter count alone does not determine quality; training data quality, architecture design, and training methodology all play critical roles.
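As an illustration of the power-law form these scaling laws take, the sketch below evaluates a loss curve of the form L(N) = (N_c / N)^alpha for several parameter counts N. The constants are illustrative stand-ins roughly in the spirit of Kaplan et al., not fitted values.

```python
# Illustrative parameter-count scaling law: loss falls as a power law in N.
N_C = 8.8e13      # assumed "critical" parameter count (illustrative constant)
ALPHA = 0.076     # assumed scaling exponent (illustrative constant)

def predicted_loss(num_parameters: float) -> float:
    """Predicted test loss for a model with the given parameter count."""
    return (N_C / num_parameters) ** ALPHA

for n in (7e9, 70e9, 405e9):  # 7B, 70B, 405B parameters
    print(f"{n:.0e} parameters -> predicted loss {predicted_loss(n):.3f}")
```

Under these assumed constants the predicted loss drops from roughly 2.0 at 7B parameters to roughly 1.5 at 405B, showing the smooth, predictable improvement the scaling laws describe.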
It is essential to distinguish parameters from hyperparameters. Parameters are learned automatically from data during training (the model adjusts them). Hyperparameters are set manually by the practitioner before training begins; examples include learning rate, batch size, number of layers, and dropout rate. Parameters are internal to the model; hyperparameters are external configuration choices that control how the model learns. The term 'parameter-efficient fine-tuning' covers techniques such as LoRA and QLoRA that adapt large models by training only a small fraction of their parameters.
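The contrast, and the idea behind parameter-efficient fine-tuning, can be sketched in a few lines of PyTorch: the hyperparameters below are configuration constants chosen up front, the base layer's parameters are frozen, and only a small LoRA-style low-rank adapter is trained. This is an illustrative sketch under assumed sizes, not the actual LoRA implementation.

```python
import torch
import torch.nn as nn

# Hyperparameters: configuration chosen before training (assumed values).
LEARNING_RATE = 1e-3   # hyperparameter
HIDDEN_SIZE = 64       # hyperparameter (architecture choice)
RANK = 4               # hyperparameter: rank of the low-rank adapter

base = nn.Linear(HIDDEN_SIZE, HIDDEN_SIZE)

# Freeze the base layer's parameters: they stay fixed during fine-tuning.
for p in base.parameters():
    p.requires_grad = False

# LoRA-style low-rank adapter: only these small matrices are trained.
lora_a = nn.Parameter(torch.randn(HIDDEN_SIZE, RANK) * 0.01)
lora_b = nn.Parameter(torch.zeros(RANK, HIDDEN_SIZE))

def forward(x: torch.Tensor) -> torch.Tensor:
    # Frozen base transformation plus the trainable low-rank update.
    return base(x) + x @ lora_a @ lora_b

# The optimizer only ever sees the adapter parameters.
optimizer = torch.optim.Adam([lora_a, lora_b], lr=LEARNING_RATE)

total = sum(p.numel() for p in base.parameters()) + lora_a.numel() + lora_b.numel()
trainable = lora_a.numel() + lora_b.numel()
print(f"trainable parameters: {trainable} / {total}")  # 512 of 4672
```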
Parameters are the learned internal variables (weights and biases) that define a model's behavior; their count has become a key indicator of model scale, with modern LLMs containing hundreds of billions of them.