It has long been said that neural networks are capable of abstraction. As the input features pass through the layers of a network, they are transformed into increasingly abstract features. For example, a model processing images receives only low-level pixel input, but its lower layers can learn to construct features encoding the presence of edges, and later layers can even encode faces or objects. These claims have been supported by various works visualizing the features learned in convolutional neural networks. However, in what precise sense are these deep features “more abstract” than the shallow ones? In this article, I will provide an understanding of abstraction that not only answers this question but also explains how different components of a neural network contribute to abstraction. In the process, I will also reveal an interesting duality between abstraction and generalization, thus showing how crucial abstraction is, for both machines and us.
I think abstraction, in its essence, is
“the act of ignoring irrelevant details and focusing on the relevant parts.”
For example, when designing an algorithm, we make only a few abstract assumptions about the input and ignore its other details. More concretely, consider a sorting algorithm. The sorting function typically only assumes that the input is, say, an array of numbers, or even more abstractly, an array of objects with a defined comparison. What the numbers or objects represent, and what the comparison operator actually compares, is not the concern of the sorting algorithm.
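To make that abstraction boundary concrete, here is a minimal sketch in Python (my own illustration, not from any particular library): the sorting function asks only one thing of its elements, that they support `<`.

```python
def insertion_sort(items):
    """Sort a list in place, assuming only that its elements support `<`."""
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # Shift larger elements to the right; `<` is the only property we rely on.
        while j >= 0 and current < items[j]:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current
    return items
```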
Besides programming, abstraction is also common in mathematics. In abstract algebra, a mathematical structure counts as a group as long as it satisfies a few requirements. Whether the structure possesses other properties or operations is irrelevant. When proving a theorem, we make only the crucial assumptions about the structure under discussion, and whatever other properties it might have are unimportant. We do not even have to go to college-level math to spot abstraction, for even the most basic objects studied in math are products of abstraction. Take natural numbers, for example: the process by which we transform the sight of three apples placed on a table into the mathematical expression “3” involves intricate abstractions. Our cognitive system is able to throw away all the irrelevant details, such as the arrangement or ripeness of the apples, or the background of the scene, and focus on the “threeness” of the current experience.
There are also examples of abstraction in our daily life. In fact, it is likely present in every concept we use. Take the concept of “dog” for example. Although we may describe such a concept as concrete, it is nevertheless abstract in a complex way. Somehow our cognitive system is able to throw away irrelevant details like color and exact size, and focus on defining characteristics like the snout, ears, fur, tail, and barking to recognize something as a dog.
Whenever there is abstraction, there also seems to be generalization, and vice versa. The two concepts are so connected that they are sometimes used almost as synonyms. I think the interesting relation between them can be summarized as follows:
the more abstract the assumption, interface, or requirement, the more general and widely applicable the conclusion, procedure, or concept.
This pattern can be demonstrated more clearly by revisiting the examples mentioned before. Consider the first example of sorting algorithms. All the extra properties numbers may have are irrelevant; only the property of being ordered matters for our task. Therefore, we can further abstract numbers as “objects with comparison defined”. By adopting a more abstract assumption, the function can be applied not just to arrays of numbers but much more widely. Similarly, in mathematics, the generality of a theorem depends on the abstractness of its assumptions. A theorem proved for normed spaces is more widely applicable than a theorem proved only for Euclidean spaces, which are a specific instance of the more abstract normed space. Besides mathematical objects, our understanding of real-world objects also exhibits different levels of abstraction. A good example is the taxonomy used in biology. Dogs, as a concept, fall under the more general category of mammals, which in turn is a subset of the even more general concept of animals. As we move from the lowest level to the higher levels in the taxonomy, the categories are defined by increasingly abstract properties, which allows each concept to apply to more instances.
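Returning to code for a moment, this duality is easy to see in practice. The sketch below (again my own illustration) uses Python's built-in `sorted`, which makes the same abstract assumption, “elements support `<`”, and therefore applies unchanged to plain numbers and to a hypothetical `Dog` class alike.

```python
class Dog:
    """A toy class; the only property exposed to sorting is an ordering via __lt__."""
    def __init__(self, name, age):
        self.name, self.age = name, age

    def __lt__(self, other):
        return self.age < other.age

# The same abstract interface ("supports <") makes the same procedure
# applicable to very different kinds of objects.
print(sorted([3, 1, 2]))                                            # [1, 2, 3]
print([d.name for d in sorted([Dog("Rex", 5), Dog("Ada", 2)])])     # ['Ada', 'Rex']
```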
This connection between abstraction and generalization hints at the necessity of abstractions. As living beings, we must learn skills applicable to different situations. Making decisions at an abstract level allows us to easily handle many different situations that appear the same once the details are removed. In other words, the skill generalizes over different situations.
We have defined abstraction and seen its importance in different aspects of our lives. Now it is time for the main problem: how do neural networks implement abstraction?
First, we need to translate the definition of abstraction into mathematics. If a mathematical function implements the “removal of details”, what property should it possess? The answer is non-injectivity, meaning that there exist different inputs that are mapped to the same output. Intuitively, this is because the details that differentiate certain inputs are discarded, so those inputs are considered the same in the output space. Therefore, to find abstractions in neural networks, we just have to look for non-injective mappings.
Let us start by examining the simplest structure in neural networks, i.e., a single neuron in a linear layer. Suppose the input is a real vector x of dimension D. The output of the neuron is the dot product of its weight w with x, plus a bias b, followed by a non-linear activation function σ:

y = σ(w·x + b)
It is easy to see that the simplest way of throwing away irrelevant details is to multiply the irrelevant features with zero weight, such that changes in that feature do not affect the output. This, indeed, gives us a non-injective function, since input vectors that differ in only that feature will have the same output.
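To see this non-injectivity in action, here is a minimal numpy sketch (toy numbers of my own choosing): a neuron whose weight on the third feature is zero produces identical outputs for inputs that differ only in that feature.

```python
import numpy as np

def neuron(x, w, b):
    """A single neuron: sigmoid(w . x + b)."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

w = np.array([0.8, -1.2, 0.0])   # zero weight on the third (irrelevant) feature
b = 0.1

x1 = np.array([0.5, 2.0, -3.0])
x2 = np.array([0.5, 2.0, 42.0])  # differs only in the ignored feature

print(neuron(x1, w, b) == neuron(x2, w, b))  # True: the difference is abstracted away
```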
Of course, the features often do not come in a form where simply dropping an input feature gives us a useful abstraction. For example, simply dropping a fixed pixel from the input images is probably not useful. Thankfully, neural networks are capable of building useful features while simultaneously dropping other irrelevant details. Generally speaking, given any weight w, the input space can be decomposed into a one-dimensional subspace spanned by w and a (D−1)-dimensional subspace orthogonal to w. The consequence is that any change lying in that (D−1)-dimensional orthogonal subspace does not affect the output, and is thus “abstracted away”. For instance, a convolution filter detecting edges while ignoring uniform changes in color or lighting may count as this form of abstraction.
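The same point, as a small numpy sketch (dimensions and seed chosen arbitrarily): perturbing the input along any direction orthogonal to w leaves the neuron's pre-activation untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8
w = rng.normal(size=D)
x = rng.normal(size=D)

# Take a random perturbation and remove its component along w,
# so it lies in the (D-1)-dimensional subspace orthogonal to w.
delta = rng.normal(size=D)
delta -= (delta @ w) / (w @ w) * w

print(np.isclose(w @ x, w @ (x + delta)))  # True: orthogonal changes are ignored
```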
Besides dot products, the activation functions may also play a role in abstraction, since most of them are (or are close to) non-injective. Take ReLU for example: all negative input values are mapped to zero, which means the differences among them are ignored. As for soft activation functions like sigmoid or tanh, although technically injective, their saturation regions map different inputs to nearly identical values, achieving a similar effect.
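A quick numerical illustration (a sketch, not tied to any particular network):

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)

# ReLU is exactly non-injective: all negative inputs collapse to zero.
print(relu(-0.5) == relu(-7.3))      # True

# tanh is injective, but deep in its saturation region differences all but vanish.
print(np.tanh(5.0), np.tanh(9.0))    # ~0.99991 vs ~1.00000: nearly indistinguishable
```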
From the discussion above, we see that both the dot product and the activation function can play a role in the abstraction performed by a single neuron. Nevertheless, the information not captured by one neuron can still be captured by other neurons in the same layer. To see whether a piece of information is really ignored, we also have to look at the design of the whole layer. For a linear layer, there is a simple design that forces abstraction: lowering the dimension. The reason is similar to that of the dot product, which is equivalent to a projection onto a one-dimensional subspace. When a layer of N neurons receives M > N inputs from the previous layer, it involves a multiplication with an N×M weight matrix W:

h = σ(Wx + b)
The input components lying in the row space of W are preserved and carried into the new space, while components lying in the null space (which is at least (M−N)-dimensional) are all mapped to zero. In other words, any change to the input vector along the null space is considered irrelevant and thus abstracted away.
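As a small numpy sketch (with arbitrary example dimensions), adding any null-space vector of the weight matrix to the input leaves the layer's output unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 6, 3                          # 6 input features, 3 neurons
W = rng.normal(size=(N, M))
x = rng.normal(size=M)

# Compute an orthonormal basis of the null space of W via the SVD;
# its trailing rows span the (at least M-N dimensional) directions W discards.
_, _, Vt = np.linalg.svd(W)
v = Vt[np.linalg.matrix_rank(W)]     # one direction in the null space

print(np.allclose(W @ x, W @ (x + 5.0 * v)))  # True: the change is abstracted away
```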
I have only analyzed a few basic components used in modern deep learning. Nevertheless, with this characterization of abstraction, it should be easy to see that many other components used in deep learning also allow it to filter and abstract away irrelevant details.
With the explanation above, perhaps some of you are not yet fully convinced that this is a valid way to understand how neural networks work, since it is quite different from the usual narrative focusing on pattern matching, non-linear transformations, and function approximation. Nevertheless, I think the fact that neural networks throw away information is just the same story told from a different perspective. Pattern matching, feature building, and abstracting away irrelevant features all happen simultaneously in the network, and it is by combining these perspectives that we can understand why it generalizes well. Let me bring in some studies of neural networks based on information theory to strengthen my point.
First, let us translate the concept of abstraction into information-theoretic terms. We can think of the input to the network as a random variable x. Then, the network would sequentially process x with each layer to produce intermediate representations T₁, T₂,…, and finally the prediction Tₖ.
Abstraction, as I have defined, involves throwing away irrelevant information and preserving the relevant part. Throwing away details causes originally different samples of x to map to equal values in the intermediate feature space. Thus, this process corresponds to a lossy compression that decreases the entropy H(Tᵢ) or the mutual information I(x;Tᵢ). What about preserving relevant information? For this, we need to define a target task so that we can assess the relevance of different pieces of information. For simplicity, let us assume that we are training a classifier, where the ground truth is sampled from the random variable Y. Then, preserving relevant information would be equivalent to preserving I(Y;Tᵢ) throughout the layers, so that we can make a reliable prediction of Y at the last layer. In summary, if a neural network is performing abstraction, we should see a gradual decrease of I(x;Tᵢ), accompanied by an ideally fixed I(Y;Tᵢ), as we go to deeper layers of a classifier.
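Here is a tiny discrete sketch of that picture (a toy construction of my own, not taken from the cited papers): x is uniform over eight values, the label Y is the parity of x, and a non-injective “layer” T = x mod 2 keeps exactly the parity. I(x;T) drops from 3 bits to 1 bit, while I(Y;T) stays at its maximum of 1 bit.

```python
import numpy as np
from collections import Counter

def entropy(samples):
    """Shannon entropy (in bits) of the empirical distribution of the samples."""
    counts = np.array(list(Counter(samples).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mutual_information(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B), estimated from paired samples."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))

x = list(range(8))            # uniform input: H(x) = 3 bits
y = [xi % 2 for xi in x]      # the label: parity of x
t = [xi % 2 for xi in x]      # a non-injective "layer" keeping only the parity

print(mutual_information(x, x), mutual_information(x, y))   # 3.0 1.0  (at the input)
print(mutual_information(x, t), mutual_information(y, t))   # 1.0 1.0  (after abstraction)
```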
Interestingly, this is exactly what the information bottleneck principle (1) is about. The principle argues that the optimal representation T of x with respect to Y is one that minimizes I(x;T) while maintaining I(Y;T)=I(Y;x). Although there are disputes about some of the claims of the original paper, one finding is consistent across many studies: as the data move from the input layer to deeper layers, I(x;T) decreases while I(Y;T) is mostly preserved (1,2,3,4), a sign of abstraction. These studies also support my claim that the saturation of activation functions (2,3) and dimensionality reduction (3) indeed play a role in this phenomenon.
Reading through the literature, I found that the phenomenon I termed abstraction has appeared under different names, although all seem to describe the same phenomenon: invariant features (5), increasingly tight clustering (3), and neural collapse (6). Here I show how the simple idea of abstraction unifies all these concepts to provide an intuitive explanation.
As I mentioned before, the act of removing irrelevant information is implemented by a non-injective mapping, which ignores differences in certain parts of the input space. The consequence, of course, is outputs that are “invariant” to those irrelevant differences. When training a classifier, the relevant information is whatever distinguishes samples of different classes, not the features that distinguish samples of the same class. Therefore, as the network abstracts away irrelevant details, we see same-class samples cluster (collapse) together, while samples of different classes remain separated.
Besides unifying several observations from the literature, thinking of a neural network as abstracting away details at each layer also gives us clues about how its predictions generalize in the input space. Consider a simplified example where we have the input x, abstracted into an intermediate representation T, which is then used to produce the prediction P. Suppose that a group of inputs x₁,x₂,x₃,…∼x are all mapped to the same intermediate representation t. Because the prediction P depends only on T, the prediction for t necessarily applies to all the samples x₁,x₂,x₃,…. In other words, the directions of invariance created by abstraction are the directions along which the predictions generalize. This is analogous to the example of sorting algorithms mentioned earlier: by abstracting away details of the input, the algorithm naturally generalizes to a larger space of inputs. For a deep network with multiple layers, such abstraction may happen at every layer. As a consequence, the final prediction also generalizes across the input space in intricate ways.
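A sketch of this argument in code (with toy, hand-picked weights): two inputs that the first layer maps to the same intermediate representation necessarily receive the same prediction, regardless of what the later layers compute.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# A toy two-layer network with hand-picked weights.
W1 = np.array([[1.0, 1.0, 0.0],
               [0.0, 1.0, 0.0]])     # the third input feature is ignored entirely
W2 = np.array([[0.5, -1.0]])

def predict(x):
    t = relu(W1 @ x)                 # intermediate representation T
    return W2 @ t                    # prediction P depends on x only through T

x1 = np.array([1.0, 2.0, -3.0])
x2 = np.array([1.0, 2.0, 10.0])      # differs only along a direction the first layer discards

print(np.allclose(relu(W1 @ x1), relu(W1 @ x2)))   # True: same intermediate t
print(np.allclose(predict(x1), predict(x2)))       # True: the prediction generalizes across them
```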
Years ago, when I wrote my first article on abstraction, I saw it only as an elegant way for mathematics and programming to solve families of related problems. However, it turns out I was missing the bigger picture. Abstraction is in fact everywhere, inside each of us. It is a core element of cognition. Without abstraction, we would drown in low-level details, incapable of understanding anything. It is only by abstraction that we can reduce the incredibly detailed world into manageable pieces, and only by abstraction that we can learn anything general.
To see how crucial abstraction is, just try to come up with any word that does not involve any abstraction. I bet you cannot, for a concept involving no abstractions would be too specific to be useful. Even “concrete” concepts like apples, tables, or walking, all involve complex abstractions. Apples and tables both come in different shapes, sizes, and colors. They may appear as real objects or just pictures. Nevertheless, our brain can see through all these differences and arrive at the shared essences of things.
This necessity of abstraction resonates well with Douglas Hofstadter’s idea that analogy sits at the core of cognition (7). Indeed, I think they are essentially two sides of the same coin. Whenever we perform abstraction, there are low-level representations mapped to the same high-level representation. The information thrown away in this process is the irrelevant differences between these instances, while the information kept corresponds to their shared essence. If we group together the low-level representations that map to the same output, they form equivalence classes in the input space, or “bags of analogies”, as Hofstadter termed them. Discovering an analogy between two experiences can then be done by simply comparing their high-level representations.
Of course, our ability to perform these abstractions and use analogies has to be implemented computationally in the brain, and there is some good evidence that our brain performs abstractions through hierarchical processing, similar to artificial neural networks (8). As the sensory signals go deeper into the brain, different modalities are aggregated, details are ignored, and increasingly abstract and invariant features are produced.
In the literature, it is quite common to see claims that abstract features are constructed in the deep layers of a neural network. However, the exact meaning of “abstract” is often unclear. In this article, I gave a precise yet general definition of abstraction, unifying perspectives from information theory and the geometry of deep representations. With this characterization, we can see in detail how many common components of artificial neural networks contribute to a network’s ability to abstract. Commonly, we think of neural networks as detecting patterns in each layer. This, of course, is correct. Nevertheless, I propose shifting our attention to the pieces of information ignored in this process. By doing so, we can gain better insight into how a network produces increasingly abstract, and thus invariant, features in its deep layers, as well as how its predictions generalize in the input space.
With these explanations, I hope not only to bring clarity to the meaning of abstraction but, more importantly, to demonstrate its central role in cognition.