Researchers in artificial intelligence have long worked toward modeling human thought and cognition. Many of the overarching goals in machine learning center on developing autonomous systems that can act and think like humans. To imitate human learning, scientists must develop models of how humans represent the world, along with frameworks that define logic and thought. From these studies, two major paradigms in artificial intelligence have arisen: symbolic AI and connectionism.
The first framework for cognition is symbolic AI, the approach that assumes intelligence can be achieved by manipulating symbols through rules and logic that operate on those symbols. The second framework is connectionism, the approach that holds that intelligent thought can emerge from weighted combinations of the activations of simple neuron-like processing units.
Symbolic AI theory presumes that the world can be understood in terms of structured representations. It asserts that symbols that stand for things in the world are the core building blocks of cognition. Symbolic processing applies rules or operations to a set of symbols to encode understanding. A classic implementation of this idea is the expert system, which pairs a knowledge base, typically a large collection of if/then rules, with an inference engine. The knowledge base is built by human experts, who supply it with new information. The inference engine consults the knowledge base and selects which rules to apply to particular symbols. In this way, the inference engine draws conclusions by querying the knowledge base and applying the results to input from the user. Examples of symbolic AI include blocks world systems and semantic networks.
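As a rough illustration of how a knowledge base and an inference engine fit together, here is a minimal sketch in Python. The rules, the symbol names, and the `infer` function are all hypothetical and invented for this example; real expert systems are far larger and use more sophisticated matching. The sketch uses forward chaining, which is only one of several possible inference strategies.

```python
# Hypothetical toy knowledge base: each rule is (set of premises, conclusion).
# In a real expert system these if/then rules would be written by human experts.
RULES = [
    ({"has_feathers"}, "is_bird"),
    ({"is_bird", "cannot_fly", "swims"}, "is_penguin"),
    ({"is_bird", "can_fly"}, "can_migrate"),
]


def infer(facts, rules):
    """Forward chaining: keep firing rules until no new symbol is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all of its premises are already known.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# User input: the symbols observed about some animal.
derived = infer({"has_feathers", "cannot_fly", "swims"}, RULES)
print(sorted(derived))
```

Note that the engine only ever concludes what the rules spell out: an animal described with symbols that appear in no rule yields no new conclusions at all.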
Connectionist theory essentially states that intelligent decision-making can be carried out by an interconnected system of small, simple processing nodes. Each of the neuron-like processing units is connected to other units, and the strength of each connection is given by a weight, while the signal a unit passes along depends on its level of activation. As the interconnected system is exposed to more information (that is, as it learns), each processing unit becomes more or less strongly activated. This system of transformations, when trained with data, can learn detailed models of the data-generating distribution, and can thus perform intelligent decision-making tasks such as regression or classification. Connectionist models have seven main properties: (1) a set of units, (2) activation states, (3) weight matrices, (4) an input function, (5) a transfer function, (6) a learning rule, and (7) a model environment.[1] The units, considered neurons, are simple processors that combine incoming signals, as dictated by the connectivity of the system. The combination of incoming signals sets the activation state of a particular neuron.
At every point in time, each neuron has a defined activation state, which is usually represented by a single numerical value. As the system is trained on more data, each neuron's activation is subject to change. The weight matrix encodes how much a particular neuron's activation value contributes, as an incoming signal, to the activation of another neuron. Most networks also include a bias term alongside the weights. At any given time, a receiving unit takes input from some set of sending units via its weight vector. The input function determines how the input signals are combined to set the receiving neuron's state. The most common input function is the dot product of the vector of incoming activations with the weight vector. Next, the transfer function applies a transformation to the combined incoming signals to compute the activation state of the neuron. The learning rule determines how the weights of the network should change in response to new data. Back-propagation is a common supervised learning rule. Lastly, the model environment is how the training data, usually input and output pairs, are encoded. The network must be able to interpret the model environment. The artificial neural network is the standard example of a connectionist model.
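The pieces described above can be sketched for a single unit in a few lines of Python. This is an illustrative assumption rather than a standard implementation: the function names, the choice of a sigmoid transfer function, the learning rate, and the logical OR training data are all made up for the example. The learning rule shown is the delta rule for one unit, a simpler relative of back-propagation, not back-propagation itself.

```python
import math


def net_input(weights, bias, activations):
    # Input function: dot product of weights and incoming activations, plus bias.
    return sum(w * a for w, a in zip(weights, activations)) + bias


def transfer(x):
    # Transfer function: the logistic sigmoid maps the net input into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))


def train(samples, epochs=2000, lr=0.5):
    # Learning rule (delta rule): nudge each weight in the direction that
    # reduces the squared error between the target and the unit's activation.
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            out = transfer(net_input(weights, bias, inputs))
            delta = (target - out) * out * (1.0 - out)
            weights = [w + lr * delta * a for w, a in zip(weights, inputs)]
            bias += lr * delta
    return weights, bias


# Model environment: input/output pairs encoding logical OR (assumed toy data).
OR_DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

weights, bias = train(OR_DATA)
predictions = [
    1 if transfer(net_input(weights, bias, x)) > 0.5 else 0 for x, _ in OR_DATA
]
print(predictions)
```

Nothing in the code states the rule "output 1 if either input is 1"; that behavior is stored only in the learned weights and bias, which is the key contrast with the symbolic approach.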
The advantage of symbolic AI is that it performs well when restricted to the specific problem space it was designed for. Symbolic AI is simple and solves toy problems well. However, its primary disadvantage is that it does not generalize well. An environment of fixed sets of symbols and rules is very contrived, so a system built for one task cannot easily generalize to other tasks. Symbolic AI systems are also brittle. If one assumption or rule doesn’t hold, it could break all the other rules, and the system could fail. It is not robust to changes. There is also debate over whether a symbolic AI system is truly “learning,” or just making decisions according to superficial rules that give high reward. John Searle's Chinese Room thought experiment argues that a symbol-manipulating machine could, instead of learning what Chinese characters mean, simply follow rules for which Chinese characters to output when asked particular questions by an evaluator.
The main advantage of connectionism is that its processing is parallel rather than serial. Because no single unit carries the whole computation, a connectionist system is robust to changes: if one neuron or computation is removed, the system still performs reasonably well thanks to all of the other neurons. This robustness is called graceful degradation. Additionally, the neuronal units can be abstract and do not need to represent a particular symbolic entity, which makes the network more generalizable to different problems. Connectionist architectures have been shown to perform well on complex tasks such as image recognition and other computer vision problems, as well as prediction. Because connectionist theory is grounded in a brain-like structure, this physiological basis gives it biological plausibility. One disadvantage is that connectionist networks require significantly more computational power to train. Another critique is that connectionist models may oversimplify the details of the underlying neural systems by making such general abstractions.
While both frameworks have their advantages and drawbacks, it is perhaps a combination of the two that will bring scientists closest to achieving true artificial human intelligence.
Credit: BecomingHuman By: Michelle Zhao