Since the beginning of artificial intelligence, researchers have sought to test the intelligence of machine systems by having them play games against humans. One of the hallmarks of human intelligence is often thought to be the ability to think creatively, consider various possibilities, and keep a long-term goal in mind while making short-term decisions. If computers can play difficult games as well as humans, then surely they can handle even more complicated tasks. From the early checkers-playing bots of the 1950s to today’s deep learning-powered bots that can beat even the best players at games like chess, Go, and Dota 2, the idea of machines that can find solutions to puzzles is as old as AI itself, if not older.
As such, it makes sense that one of the core patterns of AI that organizations develop is the goal-driven systems pattern. Like the other patterns of AI, this form of artificial intelligence is used to solve a common set of problems that would otherwise require human cognitive power. In this particular case, the challenge that machines address is the need to find the optimal solution to a problem. The problem might be finding a path through a maze or optimizing a supply chain. Regardless of the specific need, the power we’re looking for here is the ability to learn through trial and error and determine the best way to solve a problem, even if that way is not the most obvious.
Reinforcement learning and learning through trial-and-error
One of the most intriguing, but least used, forms of machine learning is reinforcement learning. As opposed to supervised learning approaches in which machines learn by being trained by humans with well-labeled data, or unsupervised learning approaches in which machines try to learn through discovery of clusters of information and other groupings, reinforcement learning attempts to learn through trial-and-error, using environmental feedback and general goals to iterate towards success.
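To make the trial-and-error loop concrete, here is a minimal sketch of tabular Q-learning, one of the simplest reinforcement learning algorithms, on an invented toy problem: an agent in a five-cell corridor that earns a reward only upon reaching the goal cell. Every name and parameter value here (learning rate, discount factor, exploration rate) is an illustrative assumption, not drawn from any system discussed in this chapter.

```python
import random

# Illustrative toy problem: a 1-D corridor of 5 cells; cell 4 is the goal.
N_STATES = 5
ACTIONS = [-1, +1]          # step left or step right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1  # learning rate, discount, exploration rate

# Q-table: the agent's learned estimate of each (state, action) pair's value.
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Environment feedback: move within the corridor; reward only at the goal."""
    nxt = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

random.seed(0)
for episode in range(500):
    s = 0
    while s != N_STATES - 1:
        # Trial and error: mostly exploit what's known, occasionally explore.
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: q[(s, act)])
        nxt, r = step(s, a)
        # Update the estimate toward reward plus discounted future value.
        best_next = max(q[(nxt, b)] for b in ACTIONS)
        q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])
        s = nxt

# The learned policy: the best action from each non-goal cell.
policy = [max(ACTIONS, key=lambda act: q[(s, act)]) for s in range(N_STATES - 1)]
print(policy)  # → [1, 1, 1, 1]: always step right, toward the goal
```

No human wrote a "go right" rule into this system; the policy emerges entirely from environmental reward and repeated iteration, which is the essence of the pattern.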
Without AI, organizations depend on humans to create programs and rules-based systems that guide software and hardware on how to operate. While programs and rules can be somewhat effective at managing money, employees, time, and other resources, they suffer from brittleness and rigidity. Such systems are only as strong as the rules a human creates, and the machine isn’t really learning at all. Rather, it’s the human intelligence encoded in the rules that makes the system work.
Goal-learning AI systems, on the other hand, are given very few rules and must learn how the system works on their own through iteration. In this way, they can optimize the entire system without depending on brittle, human-set rules. Goal-driven systems have proved their worth by showing an uncanny ability to find the “hidden rules” that solve challenging problems, so it isn’t surprising just how useful they are in areas where resource optimization is a must.
AI is particularly effective in scenario simulation and resource optimization. By applying this generalized approach to learning, AI-enabled systems can be set to optimize a particular goal or scenario and find many routes to getting there, some not obvious even to their creative human counterparts. So while the goal-driven systems pattern hasn’t seen as much implementation as the recognition, predictive analytics, or conversational patterns, its potential is just as enormous across a wide range of industries.
Reinforcement learning-based goal-driven systems are being used in the financial sector in applications such as robo-advising, which learns to identify savings and investment plans tailored to the specific needs of individuals. The pattern is also applied to traffic light control, finding the best way to time signals without causing disruptions, and in the supply chain and logistics industries, finding the best way to package and deliver goods. Further uses include training physical robots, creating the mechanisms and algorithms by which robots learn to run and jump.
Goal-driven systems are even being used in ecommerce and advertising, finding optimal prices for goods and automating bids on advertising space, and in the pharmaceutical industry, modeling protein folding and discovering new and innovative treatments for illnesses. These systems can select the best reagents and reaction parameters to achieve an intended product, making them an asset in the complex and delicate process of developing drugs and therapeutics.
Is the goal-driven systems pattern the key to Artificial General Intelligence (AGI)?
The idea of learning through trial and error is a potent one that can potentially be applied to almost any problem. Notably, DeepMind, the organization that built the machine that beat a human Go player, a feat once thought unachievable, believes that reinforcement learning-based goal-driven systems could be the key to the ultimate goal: a machine that can learn anything and accomplish any task. The concept of a “general intelligence” is one modeled on the human brain. Rather than being focused on a narrow, single learning task, as is the case with all real-world AI systems today, an artificial general intelligence (AGI) could learn any task and apply learning from one domain to another without requiring extensive retraining.
DeepMind, established in the United Kingdom and acquired by Google in 2014, aims to solve some of the hardest problems in machine intelligence by pushing the boundaries of what is possible with goal-driven systems and other patterns of AI. Starting with AlphaGo, which was purpose-built to learn to play the game of Go against a human opponent, the company rapidly branched out with AlphaZero, which could learn any game from scratch by playing against itself. What had previously taken AlphaGo months to learn, AlphaZero could do in a matter of days using reinforcement learning. Starting from scratch, with the sole goal of increasing its win rate, AlphaZero triumphed over AlphaGo in all 100 test games, and it achieved this simply by playing games against itself and learning through trial and error. It is by this method that general-learning systems can not only recognize patterns but work out optimal strategies for whatever inputs they are given. This predictably became the crowning achievement of DeepMind and a holy grail of the AI industry.
Naturally, as the tech industry often does with new technology, attention turned to possible real-world applications. AlphaZero was created with the best techniques available at the time, combining machine learning with insights from other domains such as neuroscience and behavioral psychology. These techniques are being channeled into the development of powerful general-purpose learning algorithms, and we may be only years away from a real breakthrough in AGI research.
The AI industry is at a bit of a crossroads with regard to machine learning research. The most widely used algorithms today solve important but relatively simple problems. While machines have proven their ability to recognize images, understand speech, find patterns, spot anomalies, and make predictions, they depend on training data and narrowly scoped learning tasks to achieve any level of accuracy. In these situations, machine learning is very data- and compute-hungry: a sufficiently complicated learning task can require petabytes or more of training data, hundreds of thousands of dollars of GPU-intensive computing, and months of training. Clearly, AGI is not achievable through brute-force approaches like these alone.
The goal-driven systems pattern, while today one of the least implemented of the seven patterns, might hold a key to learning that isn’t so data- and compute-intensive. Goal-driven systems are increasingly being applied to projects with real-life use cases, and that potential makes this one of the most interesting patterns to watch.