Machine Learning | Beginner | Also: Categorization, Pattern Classification

Classification

Definition

A supervised learning task where the model learns from labeled examples to assign input data to one of a fixed set of predefined categories or classes, such as spam vs. not-spam, or identifying a handwritten digit as 0 through 9.

In Depth

Classification is one of the most fundamental tasks in machine learning. Given a set of input features — pixel values, text tokens, sensor readings — a classification model predicts which discrete category the input belongs to. When there are exactly two possible classes (e.g., fraudulent vs. legitimate transaction), it is called binary classification. When there are three or more classes (e.g., identifying animal species from photos), it is multiclass classification. Some problems are multi-label, where an input can belong to several classes simultaneously.
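The three problem types differ mainly in what the target looks like. The sketch below is illustrative only: the helper names `one_hot` and `multi_hot` and the `SPECIES` list are made up for this example, and real projects usually rely on a library's encoders instead.

```python
from typing import Iterable, List, Sequence


def one_hot(label: str, classes: Sequence[str]) -> List[int]:
    """Encode a multiclass target: exactly one position is set to 1."""
    if label not in classes:
        raise ValueError(f"unknown class: {label!r}")
    return [1 if c == label else 0 for c in classes]


def multi_hot(labels: Iterable[str], classes: Sequence[str]) -> List[int]:
    """Encode a multi-label target: any number of positions may be 1."""
    label_set = set(labels)
    unknown = label_set - set(classes)
    if unknown:
        raise ValueError(f"unknown classes: {sorted(unknown)}")
    return [1 if c in label_set else 0 for c in classes]


# Binary: a single 0/1 value is enough (1 = fraudulent, 0 = legitimate).
binary_target = 1

# Multiclass: each photo shows exactly one species (hypothetical class list).
SPECIES = ["cat", "dog", "bird"]
multiclass_target = one_hot("dog", SPECIES)

# Multi-label: a photo may show several species at once.
multilabel_target = multi_hot(["cat", "bird"], SPECIES)
```

In the multiclass encoding exactly one entry is 1, while the multi-label encoding allows any number of entries to be 1, including none.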

A wide range of algorithms can perform classification. Logistic Regression, despite its name, is a simple and interpretable classifier: it models class probabilities and thresholds them to produce a label. Decision Trees split data based on feature thresholds, and Random Forests combine the votes of many such trees. Support Vector Machines find the maximum-margin decision boundary between classes. Neural networks, from simple perceptrons to deep Convolutional Neural Networks, learn hierarchical feature representations and have achieved state-of-the-art results on complex classification tasks such as image recognition. The choice of algorithm depends on dataset size, feature complexity, and interpretability requirements.
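To make one of these algorithms concrete, here is a minimal sketch of binary logistic regression trained with batch gradient descent in plain Python. The toy data, the single feature, the learning rate, and the epoch count are all assumptions chosen for illustration; in practice a library implementation would normally be used instead.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Estimated probability that x belongs to class 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic_regression(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    lr: float = 0.5,
    epochs: int = 2000,
) -> Tuple[List[float], float]:
    """Fit weights and bias by minimizing the mean log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # gradient of log loss w.r.t. the logit
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float], threshold: float = 0.5) -> int:
    """Turn the probability into a discrete class label."""
    return 1 if predict_proba(w, b, x) >= threshold else 0


# Toy, linearly separable data: one made-up feature, label 1 for larger values.
X_train = [[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]]
y_train = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic_regression(X_train, y_train)
train_predictions = [predict(weights, bias, x) for x in X_train]
```

The learned weight and bias can be read directly, which is what makes this model interpretable: a positive weight means larger feature values push the prediction toward class 1.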

Classification models are evaluated using metrics like accuracy, precision, recall, F1-score, and the area under the ROC curve (AUC-ROC). Accuracy alone can be misleading for imbalanced datasets: a model that always predicts 'not fraud' achieves 99.9% accuracy if only 0.1% of transactions are fraudulent, yet it is completely useless because it never catches a single fraudulent transaction (its recall on the fraud class is zero). For this reason, practitioners pay close attention to per-class metrics, use stratified sampling so that training and evaluation splits preserve class proportions, and apply techniques like oversampling the minority class or cost-sensitive learning to handle class imbalance.
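The fraud example can be reproduced with a few lines of plain Python. The sketch below computes accuracy, precision, recall, and F1 from confusion counts for an always-'not fraud' baseline on 1,000 synthetic labels. The function names and the data are hypothetical, and the class-weight helper uses the n_samples / (n_classes * count) heuristic as one common starting point for cost-sensitive learning, not the only option.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def binary_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for the chosen positive class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Undefined ratios (zero denominators) are reported as 0.0 by convention here.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def balanced_class_weights(y: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Weight each class inversely to its frequency."""
    counts = Counter(y)
    return {c: len(y) / (len(counts) * n) for c, n in counts.items()}


# Synthetic labels: 1 fraudulent transaction (1) among 999 legitimate ones (0).
y_true = [1] + [0] * 999
y_always_legit = [0] * 1000  # baseline that never predicts fraud

baseline = binary_metrics(y_true, y_always_legit)
weights = balanced_class_weights(y_true)
```

Here `baseline["accuracy"]` is 0.999 while `baseline["recall"]` is 0.0, which is exactly the failure mode that accuracy hides. The computed weights make a mistake on the fraud class count far more heavily than a mistake on the legitimate class.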

Key Takeaway

Classification is the task of assigning data to predefined categories — the backbone of spam detection, medical diagnosis, image recognition, and countless other AI applications.

Real-World Applications

01 Email spam filtering: classifying incoming emails as spam or not-spam based on content, sender, and metadata features.
02 Medical diagnosis: classifying medical images or patient symptoms into disease categories to assist clinician decision-making.
03 Image recognition: classifying photographs into categories such as 'cat,' 'dog,' 'car,' or 'building' using deep convolutional neural networks.
04 Credit scoring: classifying loan applicants into risk categories to determine approval and interest rates.
05 Content moderation: classifying user-generated text and images as safe, offensive, or policy-violating across social media platforms.