Categorization

At a cognitive level, we can understand the function of the neocortex in terms of the neuron detector model, where multiple layers of such detectors build up increasingly abstract categories that enable the organism to more systematically generalize to novel stimuli. For example, faces can look very different from one another in terms of their raw “pixel” inputs, but we can categorize these diverse inputs in many different ways, to treat some patterns as more similar than others: male vs. female, young vs. old, happy vs. sad, “my mother” vs. “someone else”, etc. See the faces simulation for a simple model demonstrating this principle.

Forming these categories is essential for enabling us to make the appropriate behavioral and cognitive responses (approach vs. avoid, whom to borrow money from, etc.). Imagine trying to relate all the raw inputs of a visual image of a face to appropriate behavioral responses without the benefit of such categories: the relationship (“mapping”) between pixels and responses is just too complex. These intermediate, abstract categories organize and simplify cognition, just like file folders organize and simplify documents on your computer. One can argue that much of intelligence amounts to developing and using these abstract categories in the right ways.

It is also critical to understand how the many individual neural detectors at each stage of processing work together, in the form of distributed representations, to capture the subtlety and complexity needed to encode complex conceptual categories. These distributed representations are also critical for enabling multiple different ways of categorizing an input to be active at the same time; for example, a given face can be simultaneously recognized as female, old, and happy. A great deal of the emergent intelligence of the human brain arises from multiple successive levels of cascading distributed representations, constituting the collective actions of billions of neurons working together in the cortex.

Categorization processes

Figure 1:

Schematic of a hierarchical sequence of categorical representations processing a face input stimulus. Representations are distributed at each level (multiple neural detectors active). At the lowest level, there are elementary feature detectors (oriented edges). Next, these are combined into junctions of lines, followed by more complex visual features. Individual faces are recognized at the next level (even here multiple face units are active in graded proportion to how similar people look). Finally, at the highest level are important functional “semantic” categories that serve as a good basis for actions that one might take — being able to develop such high level categories is critical for intelligent behavior.

Figure 1 provides an illustration of how multiple levels of neuron detectors in the visual system can transform a pixelized image into a high-level categorical representation. Philosophically, it is an interesting question as to where our mental categories come from — is there something objectively real underlying our mental categories, or are they merely illusions we impose upon reality? Does the notion of a “chair” really exist in the real world, or is it just something that our brains construct for us to enable us to get by (and rest our weary legs)? This issue has been contemplated since the dawn of philosophy, e.g., by Plato with his notion that we live in a cave perceiving only shadows on the wall of the true reality beyond the cave.

It seems plausible that there is something “objective” about chairs that enables us to categorize them as such (i.e., they are not purely a collective hallucination), but providing a rigorous, exact definition thereof seems to be a remarkably challenging endeavor (try it! don’t forget the cardboard box, or the lump of snow, or the miniature chair in a dollhouse, or the one in the museum that nobody ever sat on). It doesn’t seem like most of our concepts are likely to be true “natural kinds” that have a very precise basis in nature. Concepts like Newton’s laws of physics, which would seem to have a strong objective basis, are probably far outnumbered by everyday concepts like chairs that are not nearly so well defined (and our “naive” understanding of physics is often not actually correct either).

The messy ontological status of conceptual categories doesn’t bother us very much. Neurons are very capable detectors that can integrate many thousands of different input signals, and can thereby deal with complex and amorphous categories. Furthermore, we will see that learning can shape these category representations to pick up on things that are behaviorally relevant, without requiring any formality or rigor in defining what these things might be. In short, our mental categories develop because they are useful to us in some way or another, and the outside world produces enough reliable signals for our detectors to pick up on these things.

Importantly, a major driver for learning these categories is social and linguistic interaction, which enables very complex and obscure things to be learned and shared. Indeed, the strangest things can be learned through social interactions; for example you now know that the considerable extra space in a bag of chips is called the “snackmosphere”, courtesy of Rich Hall. Thus, our cultural milieu plays a critical role in shaping our mental representations, and is clearly a major force in what enables us to be as intelligent as we are (we do occasionally pick up some useful ideas along with things like “snackmosphere”).

Figure 2:

How synaptic weights act to project input patterns along specific dimensions or bases, in this case projecting the inputs along the dimensions of Emotion and Gender. In the left panel, the very high-dimensional face inputs (256 dimensions for a 16x16 image) are projected along two random weight vectors, allowing us to visualize this high-dimensional input space in a 2D plot. In the right panel, the specific synaptic weights trained for discriminating along the emotion vs. gender dimensions have transformed or rotated the input space into a much more systematic and well-organized, low-dimensional space. This is fundamentally what neurons do: organize and transform input patterns along relevant dimensions, and that is another way of stating that neurons detect stimuli along these dimensions.

Figure 2 provides a complementary view of the neuron and its weights, as projecting input patterns along a specific dimension in a high-dimensional space. Mathematically, the synaptic weights for one receiving neuron form a vector that is combined with the high-dimensional input vector of neural activity signals using a dot product: each weight is multiplied by its corresponding activation, and the products are added up.
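The dot product described here can be sketched in a few lines of plain Python. This is only an illustration: the four-element input, the weight values, and the names `activations`, `weights`, and `dot` are all assumptions made up for the example, not values from the faces simulation.

```python
# Hypothetical 4-element input pattern and one receiving neuron's weights.
# All numbers are invented for illustration.
activations = [0.9, 0.1, 0.8, 0.0]
weights = [1.0, -0.5, 0.5, 0.2]


def dot(w, x):
    """Multiply each weight by its matching activation and add up the total."""
    if len(w) != len(x):
        raise ValueError("weights and activations must have the same length")
    return sum(wi * xi for wi, xi in zip(w, x))


# A single number: how strongly this input lies along the weight vector.
net_input = dot(weights, activations)
print(net_input)
```

Inputs that line up with the large positive weights produce a large value, while inputs that activate the negatively weighted elements pull the value down.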

This is also known as the projection of the input space onto the weight vector dimension (strictly, the dot product equals the geometric projection scaled by the length of the weight vector). This projection operation organizes and systematizes the inputs along dimensions of behavioral importance, for example projecting a face input along dimensions of emotion and gender in the case shown in the figure, which you can explore in the faces simulation.
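As a rough sketch of this idea, the example below projects toy input patterns onto two weight vectors, giving each input a pair of coordinates. Everything here is an assumption for illustration: the 4-element "faces" stand in for the 256-pixel inputs, and the hand-set `emotion_weights` and `gender_weights` stand in for weights that the actual simulation would learn.

```python
# Toy 4-element "faces"; elements 0-1 carry emotion cues, 2-3 carry gender cues.
# All patterns and weights are invented for illustration.
faces = {
    "happy_female": [1.0, 0.0, 1.0, 0.0],
    "happy_male": [1.0, 0.0, 0.0, 1.0],
    "sad_female": [0.0, 1.0, 1.0, 0.0],
    "sad_male": [0.0, 1.0, 0.0, 1.0],
}

# One hypothetical weight vector per dimension of interest.
emotion_weights = [1.0, -1.0, 0.0, 0.0]  # positive = happy, negative = sad
gender_weights = [0.0, 0.0, 1.0, -1.0]   # positive = female, negative = male


def dot(w, x):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def project(face):
    """Map an input pattern to (emotion, gender) coordinates."""
    return (dot(emotion_weights, face), dot(gender_weights, face))


coords = {name: project(pattern) for name, pattern in faces.items()}
for name, (emotion, gender) in coords.items():
    print(f"{name:13s} emotion={emotion:+.1f} gender={gender:+.1f}")
```

Each input lands in a different quadrant of the two-dimensional space, which is the kind of systematic organization the right panel of Figure 2 depicts.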

In linear algebra terms, the neural weights rotate the input space onto a new basis set, where a basis set is a collection of different axes (like the X and Y axes) or dimensions that provides a different way of encoding the inputs. Furthermore, in these terms, learning is the process of finding a good basis set for encoding the inputs. This is a standard way of describing what abstract neural networks are doing over many successive “deep” layers, each of which applies a different “rotation”. The result at the “top” of such a network is typically a small number of informative dimensions, such as the object category.

The detector way of looking at the neuron is useful for understanding the roles of inhibition and the neural firing threshold as we saw in the previous chapter — it specifically differentiates between active firing for detected items, vs. not firing for everything else, and provides a more “discrete” view of what the neuron is doing. By contrast, the dimension projection framework provides a more continuous, mathematical view. Both are useful ways of understanding what is going on in the brain.
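The contrast between the two views can be sketched as follows. This is a deliberately simplified stand-in, not the neuron model from the previous chapter: the function names, the weights, and the threshold of 1.5 are all hypothetical.

```python
def net_input(weights, activations):
    """Continuous view: a graded value along the weight-vector dimension."""
    return sum(w * a for w, a in zip(weights, activations))


def detector(weights, activations, threshold):
    """Discrete view: fire (True) only if the net input exceeds the threshold."""
    return net_input(weights, activations) > threshold


# Hypothetical weights and threshold, chosen only for illustration.
weights = [1.0, 1.0, -1.0]
threshold = 1.5

patterns = {
    "match": [1.0, 1.0, 0.0],
    "partial": [1.0, 0.0, 0.0],
    "mismatch": [0.0, 0.0, 1.0],
}

results = {
    name: (net_input(weights, p), detector(weights, p, threshold))
    for name, p in patterns.items()
}
for name, (graded, fired) in results.items():
    print(f"{name:9s} projection={graded:+.1f} fires={fired}")
```

The projection value distinguishes all three patterns in a graded way, whereas the detector output lumps the partial match and the mismatch together as "not detected".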

One intuitive way of understanding the importance of having the right categories (and choosing them appropriately for the given situation) comes from insight problems. These problems are often designed so that our normal default way of categorizing the situation leads us in the wrong direction, and it is necessary to re-represent the problem in a new way to solve it (i.e., “thinking outside the box”).

For example, consider this “conundrum”: “Two men are dead in a cabin in the woods. What happened?” The rules of this game involve asking a sequence of true/false questions, with the goal of eventually realizing that you need to select a different way of categorizing the word “cabin” in order to solve the puzzle. Here is a list of some of these kinds of conundrums (external link).

For computer programmers, one of the most important lessons one learns is that choosing the correct representation is the most important step in solving a given problem. As a simple example, using the notion of a “heap” enables a particularly elegant solution to the sorting problem. Binary trees are also a widely used form of representation that often greatly reduces the computation time required to solve various problems. In general, you simply want to find a representation that makes it easy to do the things you need to do. This is exactly what the brain does.
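To make the heap example concrete, here is a minimal sketch of heap-based sorting using Python's standard `heapq` module. The function name `heap_sort` is an assumption chosen for this illustration; once the data is represented as a heap, sorting reduces to repeatedly removing the smallest item.

```python
import heapq


def heap_sort(items):
    """Return a new sorted list by heapifying, then popping the minimum."""
    heap = list(items)   # copy so the caller's list is left untouched
    heapq.heapify(heap)  # rearrange into heap order in linear time
    # Each pop removes the current smallest item in logarithmic time.
    return [heapq.heappop(heap) for _ in range(len(heap))]


print(heap_sort([5, 1, 4, 2, 3]))  # [1, 2, 3, 4, 5]
```

All of the work is done by the representation: the heap keeps the smallest remaining item at the front, so the sorting logic itself is a single loop.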

One prevalent example of the brain’s propensity to develop categorical encodings of things is the stereotype. A stereotype is really just a mental category applied to a group of people. The fact that everyone seems to have them is strong evidence that this is fundamentally how the brain works. We cannot help but think in terms of abstract categories like this, and as we’ve argued above, categories in general are essential for allowing us to deal with the world in an intelligent manner.

But the obvious problems with stereotypical thinking indicate that these categories can also be problematic (for stereotypes specifically and categorical thinking more generally), and limit our ability to accurately represent the details of any given individual or situation. See the discussion of distributed representations for the benefits of having many different categorical representations active at the same time, which can potentially help mitigate these problems. The ability to entertain multiple such potential categories at the same time may be an individual difference variable associated with things like political and religious beliefs (Critcher et al., 2009; Nam et al., 2013). This stuff can get interesting!
