Terms for Neural Networks & Deep Learning

Understand the architecture terms that explain how AI systems learn, see, and read.

min

exercises

Upd. on

May 10, 2026

Start lesson quiz

AI Terminology for Designers & PMs

Course

The jargon engineers use in AI conversations, such as neural network, deep learning, transformer, and fine-tuning, can feel like a different language in product discussions. They come up in design briefs, vendor pitches, and roadmap discussions, and the expectation is that everyone in the room already knows what they mean. Most people nod along and look things up later.

Neural networks are the architectural foundation of nearly every AI feature in commercial products today. Understanding the vocabulary means knowing what a neural network actually is and how it differs from traditional software, what deep learning adds on top of that, and what terms like weights, layers, overfitting, and fine-tuning refer to in practice. These are not implementation details. They are the shared language of AI product work.

The major network types round out the vocabulary: CNNs for images, transformers for language, and transfer learning as the standard approach for building AI features without starting from scratch. Each of these terms shapes real product decisions, from scoping requirements to evaluating vendor claims to understanding why an AI feature behaves in ways no one fully expected. Knowing them changes the quality of every technical conversation a designer or product manager enters.

Neural network

A neural network is a machine learning model composed of interconnected layers of nodes (neurons) that process data by passing it from layer to layer until it produces an output. The name is loosely inspired by how biological neurons in the human brain pass signals to one another, though modern neural networks share more in common with applied mathematics than with neuroscience. The architecture is what allows AI systems to detect patterns in data too complex for rule-based programming to handle reliably.

In a product context, when engineers say they are using a neural network, they mean the system learns its behavior from training data rather than from instructions a programmer wrote. That single distinction has enormous implications. It means the system can handle inputs its creators never specifically anticipated, but it also means its decisions cannot be traced to any single written rule. For product managers and designers, neural network is not a technical term to decode but a vocabulary anchor. It signals a specific class of AI behavior where capability and unpredictability come as a pair.^[1]

Pro Tip! Neural networks do not follow instructions anyone wrote. They follow patterns found in data, which is why they can surprise even the people who built them.

Deep learning

Deep learning is a category of machine learning that uses neural networks with many stacked layers, rather than just one or two. The word deep refers to the number of these layers, known as hidden layers. Each additional layer allows the model to detect increasingly abstract patterns: an early layer might recognize edges in an image, a middle layer might detect shapes, and a later layer might recognize a face. This depth is what separates deep learning from earlier, shallower machine learning techniques.

Deep learning requires vast amounts of training data and significant computing resources to work well, but when those conditions are met, it dramatically outperforms older approaches on tasks like image recognition, speech transcription, and language generation. The practical consequence for product teams is that most AI features in modern consumer products, from search ranking to content moderation to voice interfaces, are built on deep learning. Knowing the term helps designers and PMs understand why AI features often require large datasets to launch, why they perform poorly with limited examples, and why they can be expensive to build from scratch.^[2]

Weights and bias in neural networks

Weights and bias are the numerical parameters that store everything a neural network has learned. Every connection between neurons in a network carries a weight, a number that determines how strongly that connection influences the final output. Bias is a small additional value attached to each neuron that gives the model flexibility to fit patterns that would otherwise be too rigid. Together, they are what gets adjusted during training and what remains fixed once training is done. The key thing to understand about weights and bias is that they represent everything a neural network has learned. There is no list of rules, no written logic, no code that says how the model should behave in any given situation. What exists instead are billions of tiny numerical adjustments. When engineers say a model has been trained, they mean these values have been optimized to minimize errors on the training data. For product managers and designers, this vocabulary matters because it grounds the abstract idea of AI learning in something concrete. It also explains why retraining a model is not a simple update but a process that shifts the numerical foundation on which the model's behavior rests.^[3]

Pro Tip! "Retraining the model" is not a small fix. It means recalculating every weight across the entire network — a computationally heavy process that teams often underestimate.

Layers in a neural network

In a neural network, layers are the distinct processing stages that data passes through on its way from input to output. Every neural network has at least 3 types:

The input layer, which receives raw data such as pixels, words, or numbers
The output layer, which produces the final result, such as a label, a probability, or a generated word
One or more hidden layers in between, where most of the pattern recognition takes place. Hidden layers are where the network builds abstract representations of the data.

In an image recognition model, a hidden layer might encode the concept of a curved edge without anyone defining what a curved edge is. The model learned it from examples. The term deep learning comes directly from this structure. A network with many hidden layers is a deep network. A shallow network has only one or two.

For designers and product managers, layers surface in conversations about model complexity. When engineers describe a model as deep or multi-layered, they are signaling a system that can handle complex, nuanced tasks but also requires significantly more data, compute, and time to train than a simpler alternative.^[4]

Training in neural networks

Training is the process by which a neural network develops its ability to make predictions. A newly initialized network has all its weights set to random values and produces unreliable outputs. Training exposes the network to a large dataset, checks how wrong its predictions are, and repeatedly adjusts the weights to reduce that error. The result, after thousands or millions of these adjustment cycles, is a model whose weights encode the patterns in the training data. The word training in AI borrows its logic from human learning but works differently. No teacher explains anything. The network learns only by seeing examples and receiving feedback on how accurate its guesses were. This is why training data quality matters so much: the patterns a model encodes are only as reliable as the examples it learned from. For product managers, training is the starting point for questions about AI feature quality. When did the model last train? On what data? A model trained on data from several years ago may behave unpredictably on today's inputs. Knowing this term helps teams hold more informed conversations about model maintenance and refresh cycles.^[5]

Overfitting and underfitting

Overfitting is a condition in which a neural network has learned the training data too precisely, capturing noise and specific details rather than general patterns. A model that overfits performs well on the data it was trained on and poorly on new, unseen data. It has memorized its training examples rather than learning from them.

The opposite condition is underfitting, where the model has not learned enough to make reliable predictions on either training data or new inputs. Both represent a failure of generalization, the ability to apply what was learned to situations the model has not seen before.

For product and design teams, overfitting and underfitting appear in post-launch diagnostics as well as pre-launch evaluation. When a model performs well in testing but disappoints in production, overfitting is a common explanation. When a feature seems to ignore relevant signals entirely, underfitting may be the cause. Knowing both terms gives designers and PMs a vocabulary for asking precise questions when an AI-powered feature behaves unexpectedly, rather than accepting vague answers from engineering about why the model is not working as promised.^[6]

Convolutional neural network (CNN)

A convolutional neural network, commonly referred to as CNN, is a type of neural network architecture designed for processing spatial data, particularly images. CNNs scan input data using filters that slide across it, detecting patterns like edges, textures, and shapes. Each layer builds on the previous one, moving from simple local features to complex global ones. This is why CNNs can identify a face in a photograph without being told what faces look like: the architecture is built to find structure in visual space. CNNs power most visual AI features in consumer products today. Photo tagging, medical image analysis, quality control systems that scan for manufacturing defects, and visual search tools all use CNN-based models. The term comes up in product conversations when teams are working with image or video input and discussing which model architecture to use. For a product manager scoping an AI feature that involves recognizing objects, reading text in images, or analyzing visual content, knowing that CNNs are the established architecture for spatial tasks helps ground the conversation and clarify what data and infrastructure the feature will need.^[7]

Transformer architecture

A transformer is a neural network architecture designed to process sequences of data, particularly text. Unlike earlier models that read inputs one element at a time and struggled to connect information across long distances, transformers process entire sequences at once using a mechanism called self-attention. Self-attention allows the model to weigh how relevant each word in a sequence is to every other word, capturing context and meaning across the full input.

The transformer architecture was introduced in 2017 and has since become the foundation of virtually every large language model in commercial use. Claude, ChatGPT, and Gemini are all built on transformer architectures. The architecture also underlies AI tools that process code, translate languages, summarize documents, and generate images from text. For product managers and designers, transformer is a term worth knowing because it appears in discussions about model selection, how models handle long inputs, and why context window size matters. That last concept, how much text a transformer can consider at once, directly shapes what an LLM-powered product feature can and cannot do.^[8]

Transfer learning and fine-tuning

Transfer learning is the practice of taking a neural network already trained on one large dataset and adapting it for a different, more specific task using a smaller dataset. The adaptation process is called fine-tuning. Rather than starting from scratch, a team continues training the pretrained model on their own data, allowing it to adjust its weights for the new context while retaining the broad knowledge it already holds. Training a deep neural network from zero requires enormous amounts of data and computing resources that most product teams do not have. A team building a model to classify customer feedback, for example, does not need to train a language model from scratch. They take an existing pretrained model and fine-tune it on their own labeled examples. The result is faster, cheaper, and typically more accurate than starting fresh with limited data. For product managers, transfer learning is a key term for scoping AI feature development. When an engineer says they are fine-tuning a model, they mean adapting an existing one rather than building from scratch, and that distinction changes how teams should estimate timelines, data requirements, and cost.^[9]

Pro Tip! "Fine-tuning" and "training from scratch" are not interchangeable. Mixing them up leads to inflated cost and timeline estimates that can stall an AI feature before it starts.