Quick answer: CNNs, vision transformers, segmentation — the deep-learning side of CV.
Computer Vision (Deep Learning) covers training neural networks to understand and interpret visual data. Its main tools are Convolutional Neural Networks (CNNs), which detect local patterns in images; Vision Transformers, which capture global context; and specialized architectures for tasks such as semantic segmentation, where every pixel gets a class label. With this skill you can build systems that help diagnose diseases from medical scans, detect defects in manufacturing, recognize faces, analyze satellite imagery for agriculture, or support perception in autonomous vehicles. Unlike traditional image processing, which relies on hand-designed features, deep learning approaches learn features automatically from data, which makes them adaptable to real-world problems where manual feature engineering would be impractical.
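
To make two of the ideas above concrete, here is a minimal pure-Python sketch of (1) the sliding-window operation a convolutional layer computes and (2) the per-pixel labeling step of semantic segmentation. It is an illustration under stated assumptions, not a real model: the function names `conv2d` and `segment`, the toy image, the hand-picked edge kernel, and the class scores are all hypothetical. In a trained CNN the kernel weights and the class scores would be learned from data rather than written by hand.

```python
def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, the core operation of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def segment(scores):
    """Per-pixel argmax: scores[c][i][j] is the score of class c at pixel (i, j)."""
    n_classes = len(scores)
    h, w = len(scores[0]), len(scores[0][0])
    return [
        [max(range(n_classes), key=lambda c: scores[c][i][j]) for j in range(w)]
        for i in range(h)
    ]


# Toy 4x4 grayscale image: dark left half (0), bright right half (1).
image = [[0, 0, 1, 1] for _ in range(4)]

# Hand-picked vertical-edge kernel (assumption); a CNN would learn such weights.
edge_kernel = [[-1, 1], [-1, 1]]
feature_map = conv2d(image, edge_kernel)

# Hypothetical scores for two classes: 0 = "dark", 1 = "bright".
scores = [
    [[1 - p for p in row] for row in image],  # class 0 favors dark pixels
    [[p for p in row] for row in image],      # class 1 favors bright pixels
]
label_map = segment(scores)

for row in feature_map:
    print(row)
for row in label_map:
    print(row)
```

The feature map responds only in the column where the image switches from dark to bright, which is the sense in which a convolutional filter "detects a pattern". The label map has the same height and width as the input, with one class index per pixel.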