AI: Winning Games vs Hearts and Minds
March 10, 2021 | Joe Roushar
When Siri was introduced as part of the iPhone 4s in October 2011, she won the hearts and minds of a generation. Her star power made Artificial Intelligence (AI) part of everyday life, but now that we are getting to know Siri, Alexa and Google more intimately, their limitations can be exasperating. When will the promise of AI be fulfilled? When the device understands whatever you say, so it can help with the hard things you have to do at home and at work. We are rapidly approaching a day when systems will interact with you on your terms, rather than forcing you to dumb your language down to theirs. But that won’t happen until we graduate from natural language processing, a calculated approach to parsing words and phrases, to deep natural language understanding.
AI got a big boost in respect when, in 1997, Deep Blue defeated Chess Grand Master Garry Kasparov. Deep Blue won by marshalling massive numbers of computing cycles to compare the possible permutations of strategies and moves and infer the best next move. For many years afterward, Go programs could defeat only amateur players, but in 2016 Google DeepMind’s AlphaGo program defeated Lee Sedol. “While Deep Blue mainly relied on brute computational force to evaluate millions of positions, AlphaGo also relied on neural networks and reinforcement learning.” (Wikipedia) The rules of chess are a bit more convoluted than those of Go, as different pieces have different capabilities and movement patterns, but Go has far more possible moves, requiring vastly more computing power for “brute force” techniques.
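The gap between the two games comes down to arithmetic: a brute-force search visits roughly b^d positions, where b is the average number of legal moves and d is the search depth. A minimal sketch, using the commonly cited averages of about 35 legal moves per chess position and about 250 per Go position (illustrative figures, not exact values):

```python
# Rough game-tree sizes: average branching factor raised to search depth.
# ~35 moves per chess position and ~250 per Go position are commonly
# cited averages, used here purely for illustration.
CHESS_BRANCHING = 35
GO_BRANCHING = 250

def positions(branching: int, depth: int) -> int:
    """Leaf positions a naive brute-force search visits at a given depth."""
    return branching ** depth

for depth in (2, 4, 6):
    ratio = positions(GO_BRANCHING, depth) / positions(CHESS_BRANCHING, depth)
    print(f"depth {depth}: Go tree is ~{ratio:,.0f}x larger than chess")
```

By depth 6 the Go tree is already over a hundred thousand times larger, which is why Deep Blue’s approach could not simply be scaled up to Go.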
Adding neural networks and reinforcement learning was a great step forward. IBM, meanwhile, had introduced Watson, which, as a demonstration of its prowess, defeated leading human players of the trivia game Jeopardy in 2011.
Games mimic life in some ways, and provide a narrow framework for interaction. They are interesting diversions and microcosms of life because they force each player into a set of calculated decisions, usually with the intent of gaining advantage over opponents while remaining neatly in the friendly segment of the social interaction spectrum – seldom becoming anti-social. But does the ability to out-maneuver a human in a game indicate advanced AI? I think the answer is YES. But I also think that a pocket calculator, at a rudimentary level, represents intelligence in a machine.
With this steady march of computers outperforming humans at ever more complex tasks, why can’t Siri, Google and Alexa answer complex questions? Why can’t Watson, the Jeopardy champ, communicate better than other conversational bots across a wide spectrum of topics? Why did Google give up on its search appliance for business? The answers are more complex than they appear.
In this post I will discuss categories of communication and brain tasks in general, drawing on others’ ideas, including Bloom’s taxonomy, as well as my own, to show what is needed for deep natural language understanding. In this way I will attempt to answer these questions and establish a framework for judging the effectiveness of AI beyond trivia and the Turing test.
Who Understands Artificial Intelligence?
Higher Order Thinking Skills
Benjamin Bloom (1913-1999) was an educational psychologist who led the effort to develop a taxonomy that served as a framework for classifying learning objectives, i.e., what we expect students to learn as a result of instruction (HSC). The taxonomy was later revised and now includes the skills below (the numbers represent lower to higher order skills):
1. Remembering
2. Understanding
3. Applying
4. Analyzing
5. Evaluating
6. Creating
I love the way Benjamin Bloom classifies cognitive tasks people can perform as part of learning. And while my focus has been on interaction tasks, these learning tasks have important corollaries. Karen Tankersley of the Association for Supervision and Curriculum Development suggests that “most jobs in the 21st century will require employees to use the four highest levels of thinking—application, analysis, synthesis, and evaluation” (ASCD). Any of the tasks in the highest levels of thinking would be worthy goals – and reasonable tests for an AI. Machine learning is a large part of many modern AI initiatives, yet the only learning tasks most ML systems perform are in the lowest level task category of Remembering, with limited forays into Understanding and Applying.
For computer software, higher order thinking skills mean moving from calculated approaches to advanced fuzzy logic algorithms and targeted heuristics, especially algorithms and heuristics that can resolve ambiguity at every level of meaning. The processes in our brain are very fuzzy, often heuristic and massively parallel, incorporating all the knowledge we possess. Systems need to be that way too.
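To make “fuzzy” concrete, here is a minimal sketch of fuzzy logic in the style of Zadeh’s classic operators: each proposition gets a graded degree of truth in [0, 1] rather than a hard true/false, and the operators combine that graded evidence. The intent signals and their values are invented for illustration.

```python
# Minimal fuzzy-logic sketch: degrees of truth in [0, 1] instead of booleans.
def fuzzy_and(a: float, b: float) -> float:
    return min(a, b)      # Zadeh's conjunction (t-norm)

def fuzzy_or(a: float, b: float) -> float:
    return max(a, b)      # Zadeh's disjunction (s-norm)

def fuzzy_not(a: float) -> float:
    return 1.0 - a

# Hypothetical graded evidence about a spoken request.
sounds_like_weather_query = 0.7
mentions_a_place = 0.4

# The conclusion is itself graded, not a brittle yes/no.
confidence = fuzzy_and(sounds_like_weather_query,
                       fuzzy_or(mentions_a_place, fuzzy_not(0.9)))
print(f"confidence it is a weather request: {confidence:.2f}")
```

A system built this way can keep several partially true interpretations alive in parallel and let later evidence tip the balance, which is closer to how the brain appears to work than a single pass/fail parse.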
If machines or software could perform tasks in the top categories, this would clearly serve as a good test of the intelligence of ML algorithms, but no system today can do much of what is included in levels two and three, much less four through six. What about tests to appraise a system’s ability to interact intelligently? I will propose a taxonomy of interaction.
Taxonomy of Interaction
Human interaction spans a spectrum from social to anti-social. At the extreme anti-social end is nuclear warfare, or mutually assured destruction. At the extreme opposite end are philanthropy and other forms of selflessness. Game playing is a form of interaction in which a set of rules, derived from nature or elsewhere, is applied to a defined interaction with narrow boundaries, including a beginning and an end, often producing winners and losers. It is truly said that actions speak louder than words, so actions, in the context of interaction, are clearly more powerful than words. But for today, let’s stick with a subject with which I’m much more comfortable: communication in human language.
As I consider everyday communication and try to classify what we discuss in the course of interaction, I see four distinct categories of content:
- Needs (immediate and usually nearby geographically)
- Wants (less immediate and sometimes with broader geographic scope)
- Ideas (not bound by time or space but usually constrained by reality)
- Dreams (completely unbound and less pressing than wants or needs)
Naturally, many communications contain more than one, even all four categories of communication content. Thus, the system that can truly understand the content and intent of communication must interpret linguistic and logical phenomena at all levels. At the highest level of intelligence, this includes abstract reasoning, creativity and volition. Volition, of course, often involves decisions based on abstract reasoning and creativity.
If you see overlap with Maslow’s hierarchy, you’re not alone. Needs are the most primitive and are within the cognitive capabilities of all creatures. Wants and ideas probably exist in primitive form in sentient animals with larger brains. But abstract ideas and dreams that lack physical object references are likely the sole province of humans and above. Likewise, Maslow’s “Esteem” and “Self-Actualization” seem to be primarily characteristics associated with sapience found in self-aware creatures capable of wisdom.
Can the contents of all communications be wrapped into four categories: wants, needs, ideas and dreams? I suspect that anything we utter or write begins as an idea, but a single category gives us no differentiation. It may be that reducing interaction to these four categories is not helpful, perhaps even absurd. So I will begin by describing the distinctions between these categories, then discuss a more expansive taxonomy of intelligence. In the following sections I will attempt to justify my thoughts as embodied in this taxonomy and the next.
Wants and Needs
Needs are fundamental, and we have heard endless references to humans’ and other creatures’ survival instincts. The paleocortex is the most primitive part of our brain and is differentiated to concentrate on our most basic needs. It has been suggested that we can’t even begin to engage in higher order thinking and interaction unless our core needs are first met. When our needs are not met, we are in trouble and our interactions are likely to suffer, as they tend to focus on our needs, often to the exclusion of anything else. One difference between a want and a need is the possibility of abstinence without damage. I may want six more scoops of ice cream, but I don’t need them, and I can use will power, or volition, to stifle my desires. Higher powers such as restraint and moderation may be applied in cases of want and need, as can excessive indulgence and reckless abandon.
Let’s get back to chess for a second. Chess mimics social interactions in interesting ways as the pieces represent a social hierarchy of power. And even though a pawn can capture a queen or king, it’s much more difficult. So how does chess fit into the taxonomy of interaction? Alex Lane shared a useful image in a Medium post that I have borrowed liberally in producing the following illustration. I used shapes in which the number of corners reflects my perception of the complexity of the brain tasks in each category. Unlike Bloom’s model, this model places analysis and planning in the highest level of intelligence. Go and chess require planning – Jeopardy, not so much. Jeopardy requires much knowledge, and, by extension, the ability to learn facts. It also requires the understanding of how facts become answers to reproducible questions.
Since Watson successfully outperformed human competitors, does that make it:
- Artificial General Intelligence?
- Able to interact with human competency?
I think most experts would agree that neither applies to Watson, Deep Blue or AlphaGo. Of all the categories of knowledge and brain tasks we have described so far, which are missing? Which are needed to break through into the level of super-intelligent machines?
Ideas and Dreams
Ideas that are distinct from wants and needs involve more than sensory input – thus they are abstract. Dreams are what creativity, imagination and innovation are made of. As suggested above, ideas are constrained by reality; dreams are not. Sentience involves logical thinking about sensory experiences, but does not extend to the abstract reasoning of higher order thinking skills. Are there systems that can process complex ideas? Yes – but most operate in very narrowly constrained domains. Humans can easily think in multiple domains and associate information from different domains to innovate remarkable advancements. But this ability to cross contextual boundaries carries a huge penalty: ambiguity grows as more contexts are incorporated into a single idea. Humans are good at resolving ambiguity to interpret meaning. Machines tend to be incapable of resolving even the simplest ambiguities.
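One classic heuristic for the simplest kind of ambiguity, a word with multiple senses, is in the spirit of the Lesk algorithm: pick the sense whose defining words overlap most with the surrounding context. A minimal sketch, with a tiny invented sense inventory for illustration:

```python
# Toy word-sense disambiguation in the spirit of the Lesk algorithm:
# choose the sense whose signature words overlap most with the context.
# The sense inventory below is invented purely for illustration.
SENSES = {
    "bank": {
        "financial institution": {"money", "account", "loan", "deposit"},
        "river edge": {"river", "water", "shore", "fishing"},
    }
}

def disambiguate(word: str, context: str) -> str:
    """Return the sense sharing the most words with the context sentence."""
    context_words = set(context.lower().split())
    best_sense, best_overlap = "unknown", 0
    for sense, signature in SENSES[word].items():
        overlap = len(signature & context_words)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

print(disambiguate("bank", "she opened an account at the bank to deposit money"))
# financial institution
```

The hard part, of course, is that real ambiguity is rarely this local: resolving it reliably requires exactly the cross-domain knowledge and conceptual models this post argues most systems lack.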
- Why can’t Siri, Google and Alexa answer complex questions? Because they can’t understand questions that have any ambiguity in them. They were not designed to resolve ambiguity. For that, they need higher order thinking skills.
- Why can’t Watson, the Jeopardy champ, communicate better than other conversational bots across a wide spectrum of topics? Because it was designed to calculate the right question behind a bit of trivia – not communicate.
- Why did Google give up on its search appliance for business? Because its algorithms rely on hyperlinks to prioritize results. Cataloging trillions of web pages on a massive variety of topics is easier than differentiating a large number of documents, web pages and other content on a much narrower band of topics, unless you have a conceptual model of what is important to the specific business. The Google Search Appliance used no such model.
One key to evaluating an AI’s power is to test how well it works in multiple domains including business. I am not going to answer all the questions I posed in this post. So stay tuned for the next installment when we describe the solution to giving machines the ability to function well in multiple domains, resolve ambiguity and converse with humans at near-human competency.