Specialized processors that accelerate deep learning computations. GPUs (Graphics Processing Units) perform massively parallel matrix operations; TPUs (Tensor Processing Units) are Google's custom chips optimized specifically for neural network workloads.
In Depth
Deep learning's computational demands are immense — training a large language model can require on the order of 10^23 floating-point operations or more. CPUs, designed for sequential, general-purpose computing, are far too slow for this. GPUs, originally designed for rendering video game graphics, turned out to be ideal for deep learning because both tasks reduce to the same core operation: massively parallel matrix multiplication. A modern NVIDIA GPU contains thousands of cores that execute multiply-accumulate operations in parallel, making neural network training hundreds of times faster than on CPUs.
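To make that core operation concrete, here is a minimal sketch in JAX that times one large matrix multiplication on whatever backend is available (GPU, TPU, or CPU). The matrix size and the single-run timing are illustrative choices, not a rigorous benchmark.

```python
# Minimal sketch: time one large matrix multiplication with JAX.
# JAX dispatches the same jnp.dot call to a GPU or TPU if one is present,
# otherwise it falls back to the CPU backend. The 4096x4096 size is arbitrary.
import time
import jax
import jax.numpy as jnp

def benchmark_matmul(n=4096):
    ka, kb = jax.random.split(jax.random.PRNGKey(0))
    a = jax.random.normal(ka, (n, n), dtype=jnp.float32)
    b = jax.random.normal(kb, (n, n), dtype=jnp.float32)

    matmul = jax.jit(jnp.dot)
    matmul(a, b).block_until_ready()  # warm-up run triggers compilation

    start = time.perf_counter()
    matmul(a, b).block_until_ready()
    elapsed = time.perf_counter() - start

    flops = 2 * n**3  # multiply-accumulate count for an n x n matmul
    print(f"backend: {jax.default_backend()}")
    print(f"{n}x{n} matmul: {elapsed*1e3:.1f} ms, ~{flops/elapsed/1e12:.2f} TFLOP/s")

if __name__ == "__main__":
    benchmark_matmul()
```

Running the same script on a CPU-only machine and then on a GPU or TPU host is enough to see the gap in throughput that makes accelerators indispensable for training.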
NVIDIA dominates the AI GPU market with its CUDA software ecosystem and successive hardware generations: V100, A100, H100, and B200. TPUs (Tensor Processing Units), developed by Google, are custom-designed ASICs (application-specific integrated circuits) that trade the general-purpose flexibility of GPUs for maximum efficiency on tensor operations — they power Google's internal AI infrastructure and are available externally through Google Cloud. Other players include AMD (MI300X GPUs), Intel (Gaudi accelerators), and a growing ecosystem of AI chip startups.
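As a sketch of how frameworks target these different chips, the snippet below defines a single dense layer in JAX; the XLA compiler lowers the identical code to the CUDA backend on NVIDIA GPUs or to the TPU runtime on Google Cloud (ROCm builds exist for AMD GPUs as well). The layer sizes here are arbitrary illustrative values.

```python
# Sketch: the same model code runs on GPU, TPU, or CPU without modification.
# jax.devices() reports whatever accelerators the installed backend exposes.
import jax
import jax.numpy as jnp

print("available devices:", jax.devices())  # e.g. GPU, TPU, or CPU devices

@jax.jit
def dense_layer(x, w, b):
    # One fully connected layer: a matrix multiply plus bias and nonlinearity,
    # exactly the tensor workload these chips are built to accelerate.
    return jax.nn.relu(x @ w + b)

kx, kw = jax.random.split(jax.random.PRNGKey(0))
x = jax.random.normal(kx, (32, 512))          # batch of 32 inputs
w = jax.random.normal(kw, (512, 256)) * 0.01  # weight matrix
b = jnp.zeros(256)

out = dense_layer(x, w, b)
print(out.shape, "computed via the", jax.default_backend(), "backend")
```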
The availability — and cost — of AI accelerators has become a central bottleneck in AI development. Training GPT-4-class models requires clusters of thousands of high-end GPUs running for months, at a cost of tens to hundreds of millions of dollars. This has created an 'AI compute race' where access to hardware determines who can train frontier models. The trend toward ever-larger models has driven massive investment in AI data centers, new chip architectures, and energy-efficient AI hardware design.
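To see where those figures come from, here is a rough back-of-envelope estimate using the common approximation that training a dense transformer costs about 6 × parameters × tokens floating-point operations. Every specific number below (model size, token count, per-chip throughput, utilization, price per GPU-hour) is an assumed round value for illustration, not a reported statistic for any particular model.

```python
# Back-of-envelope sketch of frontier-scale training time and cost.
# All concrete numbers are illustrative assumptions.

params = 1e12             # hypothetical 1-trillion-parameter model
tokens = 1e13             # hypothetical 10 trillion training tokens
train_flops = 6 * params * tokens        # ~6e25 FLOPs total

gpu_peak_flops = 1e15     # roughly petaFLOP/s-class accelerator (order of magnitude)
utilization = 0.4         # fraction of peak throughput sustained in practice
n_gpus = 20_000           # assumed cluster size

cluster_flops = n_gpus * gpu_peak_flops * utilization
seconds = train_flops / cluster_flops
days = seconds / 86_400

cost_per_gpu_hour = 3.0   # assumed $/GPU-hour
cost = n_gpus * (seconds / 3600) * cost_per_gpu_hour

print(f"total training compute: {train_flops:.1e} FLOPs")
print(f"estimated wall-clock time: {days:.0f} days on {n_gpus} GPUs")
print(f"estimated compute cost: ${cost/1e6:.0f}M")
```

With these assumed inputs the estimate lands around three months of wall-clock time and a compute bill in the low hundreds of millions of dollars, which is why cluster access has become such a gating factor.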
GPUs and TPUs are the computational engines of deep learning — their massively parallel architectures make training modern AI models feasible, and access to compute has become a decisive factor in AI development.