Narrow Artificial Intelligence can seem overwhelming. AI is a complicated thing and is understandably filled with intimidating computational jargon. In this article I’m going to do something foolhardy, and that’s to tell you — the interaction designers — don’t panic. For our jobs, the basics of AI are probably simpler than you think. To make this case, I’m going to…
- Describe four major groups of AI use cases in simple terms (and a final set of use cases that aren’t tied to a core technology)
- Provide concrete examples
- List the inputs
- List the outputs
- List the implied basic interactions (the use cases for which you’ll need to design)
- List related concepts and jargon terms you may have heard, in case you want to learn more
Of course, as designers, you should be goal-oriented first and foremost. Who are the users? What are their goals? What are their contexts? How do they use the thing? Does it work? That said, you should also be familiar with the material in which you are designing, so you can match the material with the human goal as best as you can. To that end, let me introduce the four categories. These are roughly grouped around major AI technologies.
- Clusterers
- Classifiers
- Regressors
- Generators & Optimizers
More on each follows. Note that like any technology, these technologies can be used for good and ill. I’ve tried to provide good examples as much as I could, but sometimes a wicked example is easier to understand. I’ll note when I list those kinds of examples.
Clusterers

A clustering algorithm can look at a bunch of data and identify groups within the data. Use clusterers to save users the work of organizing things into useful groups manually.
Practical examples of clusterers
- Grouping pictures of people based on similar features (note that facial recognition is very problematic when used by the state or corporations, and is mentioned here as personal use, like sorting family photos)
- Identifying genres in a bunch of songs
- Suggesting folders for a bunch of text documents
- Identifying related people based on their DNA (Again, for personal use, like 23andme.com, though that example is hardly unproblematic.)
- Collaborative filtering or recommenders: People who have liked (or purchased) what you like also like…
↑ Inputs: Lots of things
↓ Outputs: Groups within the things
Clustering by itself doesn’t have interactions per se. It’s more of a behind-the-scenes thing. It just tells you what it sees. But its outputs are rarely perfect for human use, and so the basic interactions are related to modifying and curating the results.
Basic interactions with clusterers
1. Subdivide groups
2. Merge groups
3. Modify the definitions of a group
4. Add a group
5. Provide a name and details for a group
6. Delete (or hide) a group
7. Filter and sort groups (for large numbers of groupings)
Related concepts and jargon: K-means clustering, unsupervised learning
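To make that jargon concrete, here is a toy K-means clusterer in pure Python. The data points and the naive initialization are invented for illustration; a real product would lean on a library such as scikit-learn.

```python
def kmeans(points, k, iterations=20):
    """Group 2-D points into k clusters by repeatedly moving centroids."""
    # Naive initialization: the first k points. Real implementations use
    # random restarts or k-means++ seeding instead.
    centroids = list(points[:k])
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for x, y in points:
            nearest = min(range(k), key=lambda i: (x - centroids[i][0]) ** 2
                                                  + (y - centroids[i][1]) ** 2)
            groups[nearest].append((x, y))
        # Move each centroid to the mean of its assigned group.
        for i, group in enumerate(groups):
            if group:
                centroids[i] = (sum(p[0] for p in group) / len(group),
                                sum(p[1] for p in group) / len(group))
    return groups

# Two obvious blobs of points; the clusterer should find them.
data = [(1, 1), (1, 2), (2, 1), (9, 9), (9, 10), (10, 9)]
clusters = kmeans(data, k=2)
```

From there, the basic interactions above (merge, subdivide, rename) are operations on the returned groups, not on the algorithm itself.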
Clusterers are very tightly intertwined with classifiers since clusterers often create the groups, and classifiers try to put new things into the groups. Oftentimes after providing groups, users will want to look at individual examples to build trust that the clusterer worked, or to move an example from one group to another. But this is an act of classification, which is next.
Classifiers

A classifier can take a thing, compare it to categories it knows about, and say to what category or categories it thinks the thing belongs. (If it chooses between just two categories, it’s called a binary classifier; if among more, a multi-class classifier.) When said like this, classifiers sound abstract and rudimentary, but they are beneath so many concrete things. Use classifiers to help users identify what a thing is.
Practical examples of classifiers
- Text recognition / optical character recognition: One of the oldest examples of classifiers, these algorithms look at images to find the text within them, using both letter recognition and word recognition, both classifiers. (This is just one example of AI use cases building on each other. More on this below in “These basics are building blocks.”)
- Handwriting recognition: More recent than text recognition, and a harder problem, since each individual’s handwriting requires a model of that one person, even though the letters themselves are expressions of a known set of symbols.
- Speech-to-Text: Conceptually akin to text recognition, these algorithms listen to audio to detect what words it contains.
- Computer vision: These classifiers look at images and tell you what they “see,” like the AI that can review radiographs for tumors or retinopathy for diabetes. Computer vision has had a lot of attention and is now quite sophisticated and powerful, though a shrinking set of things remains hard for it to tell apart, like blueberry muffins and chihuahua faces.
- Face detection and recognition: Many modern smart devices let users authenticate themselves by letting the device match a live image of their face against a model of their face.
- Fingerprint identification: A similar biometric technique looks at live fingerprints against stored models.
- Copyright violation: Classifiers on YouTube and Facebook routinely scan uploaded videos to see if they contain copyrighted material.
- Hit predictors: Rumored gatekeepers of the pop music industry, these algorithms analyze tracks to predict how popular they will be.
- And more: This list could go on longer. Spam filters. Translation. Hopefully by now you’ve got the point that classifiers do the heaviest lifting.
↑ Inputs: An example thing
↓ Outputs: Categories of the thing, with confidences that the thing belongs in each category
Confidences bear some explanation and attention. Where a person could look at an image and say, “Yeah, that’s an orange tabby cat,” classifiers respond with both a label for the image and a confidence that the label is right. Confidences are most often expressed as a decimal between 0.00 (not confident) and 1.00 (completely confident). So the speculative cat picture might result in something like…
Domesticated cat: 0.90
One design task with classifiers is to take those computery outputs and massage them into something that is more understandable and suited to the domain. Sometimes this is a simple classification, like spam/not spam. Sometimes it is a prediction with a confidence number out of 5 stars. Sometimes it is just a prediction with high, medium, or low confidence. Sometimes you can show users a high- and a low-confidence classification side by side so they can compare them. A design task for classifiers is to find the right level of fidelity, one that helps avoid either false precision or faulty generalization.
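As a sketch of that massaging, here is a tiny mapping from raw confidence to domain-friendly language. The thresholds and phrasing are invented; a real product would tune them with users:

```python
def describe_confidence(label, confidence):
    """Translate a raw classifier confidence into a human-friendly phrase."""
    if confidence >= 0.85:
        qualifier = "almost certainly"
    elif confidence >= 0.60:
        qualifier = "probably"
    elif confidence >= 0.30:
        qualifier = "possibly"
    else:
        return "Not sure what this is"
    return f"This is {qualifier} a {label}"

print(describe_confidence("domesticated cat", 0.90))
# prints "This is almost certainly a domesticated cat"
```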
It’s also worth noting that in a lot of cases, when the confidence of a classifier is very high, and the classification is correct, interactions can be seamless. The face recognition on your smartphone, for instance, can sometimes go by without your even thinking about it. If an app translates your handwriting perfectly into text, you just move on. The interactions often come when the classification is wrong, or its confidence is lower.
Basic interactions with classifiers
8. Flagging/correcting a false positive/negative: Sometimes a classifier will just get things wrong. Like email putting a party invitation into a spam folder. Users need the ability to at least identify bad classifications. Oftentimes users need more capabilities, like being able to manually assign a new classification and tell the algorithm what to do with similar things in the future. (I wrote in-depth about these use cases in Designing Agentive Technology, where you can read more).
9. Disambiguation: When a classifier’s confidences are below some threshold, it’s best to not just move forward with the unlikely match, but for the system to work with the user to increase its confidence. This back-and-forth interaction is called disambiguation and in some cases repair. If you’ve ever had a chatbot ask you which of several possibilities you actually meant, you’ve experienced disambiguation. If you’ve ever had to repeat something to an interactive voice response system, you’ve experienced repair.
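Those thresholds can themselves be sketched in code. Assume a classifier hands back (intent, confidence) pairs; the threshold values here are invented and would be tuned per domain:

```python
def choose_interaction(candidates, act_threshold=0.8, ask_threshold=0.4):
    """Decide how to respond, given (intent, confidence) pairs from a classifier."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    best_intent, best_confidence = ranked[0]
    if best_confidence >= act_threshold:
        return ("act", best_intent)         # Confident enough: just proceed.
    plausible = [intent for intent, conf in ranked if conf >= ask_threshold]
    if plausible:
        return ("disambiguate", plausible)  # Ask "Did you mean...?"
    return ("repair", None)                 # Ask the user to rephrase.

choose_interaction([("check_balance", 0.55),
                    ("transfer_funds", 0.45),
                    ("close_account", 0.05)])
# → ("disambiguate", ["check_balance", "transfer_funds"])
```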
Related concepts and jargon: Supervised learning, Naive Bayes, Decision Trees, Random Forests, Logistic Regression, Support Vector Machines, Hidden Markov models
Sidebar: CAPTCHAs are a type of classifier, but upside-down. They present tasks that humans can accomplish but that computers find difficult-to-impossible, and in doing so they seek to classify you as not-a-computer.
Regressors

“Regression analysis” sounds like some kind of psychology thing, but it’s a math thing. These algorithms look at the relationship between two or more variables to find correlations, allowing them to predict one variable based on the others. This can include prediction of future events based on models of the past, but it doesn’t have to be about time. Use regressors to infer one variable from a bunch of others, and to sensitize users to (and act on) likely future events.
It’s easier to understand with a toy example. Consider the height and weight of chimpanzees. As they get taller, they tend to weigh more. Because of this correlation, if I give you the height of a particular chimpanzee, you could look on a data table and predict how much it probably weighs. Regressors do this, but for much more complicated sets of variables, with much more complicated relationships to each other.
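The chimpanzee example can be written as a one-variable least-squares regressor. The heights and weights below are invented and perfectly linear, just to keep the arithmetic obvious:

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Invented heights (cm) and weights (kg) for a few chimpanzees.
heights = [120, 130, 140, 150]
weights = [30, 35, 40, 45]
slope, intercept = fit_line(heights, weights)
predicted = slope * 135 + intercept  # Predicted weight for a 135 cm chimp.
```

Real regressors do exactly this kind of inference, just with many more variables and far messier relationships among them.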
Note that under the hood, regressors and classifiers are closely related tasks. But the difference from a user’s perspective is that classifiers output discrete classes, while regressors output a number along a continuous range.
Practical examples of regressors
- Estimated delivery date for your Amazon package
- Financial forecasting (for example, the likelihood that you will reach retirement goals)
- Recommended pricing for a good
- Weather predictions
- A healthcare patient’s risk, given the details of a diagnosis
↑ Input: The set of known variables
↓ Output: A number representing the predicted variable (and sometimes the range of error for that predicted variable)
Basic interactions with regressors
10. Updating: Predictions can be updated as new information comes in. Think of estimated time of arrival on a navigation app. How is the user notified? Can they reject the update? What “downstream” recommendations need to change and how?
11. Scenario-izing: Users may want to investigate how different predictions will play out and compare them. In some domains, users may want to document contingency plans for each scenario as a playbook to be used by others later.
12. Refining: After a predicted event happens, users may want to compare actual effects to predictions, in order to improve playbooks.
Related concepts and jargon: regression analysis, supervised learning (this is both related to regressors and classifiers), inference
Generators & Optimizers
The newest and arguably most unsettling capabilities are those that output new things. This is also the oddball category in this writeup, since it is not about just one kind of technology. These algorithms use some of the underpinnings of clusterers, classifiers, and regressors (and other things) to help people be creative or creatively solve some problem. (And, it must be said, to help some malefactors spread disinformation.) Offer users generators as part of creativity tools. Offer users optimizers to achieve the best and unexpected results in complicated decision spaces, like architecture.
Practical examples of generators and optimizers
- “Inception” Deepdream images: Those psychedelic images with, say, dog faces everywhere come from neural nets that modify images to maximize anything resembling a target feature, like a dog face.
- Deep style: Images in which the style from one image is applied to another, like, say, modifying a modern selfie to look like it was painted by J.C. Leyendecker or Raphael.
- Deepfakes: Images, audio, and video in which a person is replaced with another.
- Mebots: These chatbots train on recorded speech or text by an individual, and thereafter can mimic them in chat or interactions, even responding to questions and prompts they haven’t encountered before.
- This Person Does Not Exist: (https://thispersondoesnotexist.com/) A network that generates convincing faces of people who do not exist. It’s mostly seamless. (And a fine place to go for persona images, if you only ever need one.) There are also “does not exist” networks for art, cats, horses, and even chemicals.
- GPT-n: The Generative Pre-Trained Transformer algorithms from OpenAI can take a few words and generate a lot of sentences that sound related and reasonably convincing. People have used GPTs to “write” songs, poems, articles, blog posts (this is not one of them), and even movies. GPT-3 is the most recent version as of this writing, and is so sophisticated that OpenAI has not released it generally, for fear of its misuse.
- Text-to-image: Cristóbal Valenzuela has created an engine that will take text typed in real time and produce an image. The results are very crude at the moment, but like much of AI, I expect this will become more sophisticated quickly. Imagine a text-to-video or a text-to-music function.
- Generative design: Given a set of criteria, generative algorithms explore a possibility space and suggest novel designs that meet the criteria. Spacemaker, for example, can present multiple configurations of buildings on a site to optimize for quiet, sun exposure, or wind control.
↑ Input: Most work from “seed” input like a few lines of text or an image, and all work with massive data.
↓ Output: New images, sounds, or texts
Basic interactions with generators and optimizers
13. Seed/Reseed: Provide or change the source material, options, and criteria for success, if any.
14. Rerun: If users didn’t like the output.
15. Adjust “drift”: Many generators allow control of how “prim” or “wacky” the results are. A news agency may want prim headlines, but a poet may want the wackiest output they can get their hands on.
16. Explore options: Generative systems can produce a lot of results. As with any large result set, users may need help making sense of them all, or to design parameters that can reduce the number of results. (Or even to design systems that let end-users manage those parameters themselves.)
17. Lock or reweight criteria: If users are happy with some aspects of a generated result, but not others, they should be able to lock those aspects and re-run the rest for new results. For systems that are trying to optimize for certain goals, users may want to change the weightings and see what else results.
18. Save results (of course)
19. Export and extend: When users want to use the selected results as a starting point for doing more manual work, they need the ability to export in a way that lets them continue working.
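Under the hood, interaction 15 (adjusting “drift”) often maps to something like a sampling temperature. Here is a minimal sketch with invented candidate scores:

```python
import math

def temperature_probs(scores, temperature):
    """Convert raw generator scores into sampling probabilities (a softmax).

    Low temperature concentrates probability on the safest option ("prim");
    high temperature flattens the distribution ("wacky").
    """
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.5]  # Hypothetical scores for three candidate outputs.
prim = temperature_probs(scores, temperature=0.2)
wacky = temperature_probs(scores, temperature=5.0)
```

At low temperature, nearly all the probability lands on the top-scoring candidate; at high temperature, the also-rans get real chances of being picked.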
Related concepts and jargon: neural nets (CNNs, RNNs), generative adversarial networks, deep learning, deep dream, reinforcement learning.
Universal Use Cases
Some basic interactions aren’t tied to any core technology. These aren’t derived from any particular AI capability, but they are important to consider as part of the basic interactions that need to be designed. Many are required by law.
20. Explain that to me: This isn’t always easy. Some AI technologies are so complicated that, even though they’re often right, the way they figured things out is too complicated for a human to follow. But when decisions can be explained, they should be. And when they can’t, provide easy ways for users to request human intervention.
21. I need human intervention: This is vital because AI will not always reason like a human would. As AI systems are given more weighty responsibilities, users need the option of human recourse. This can often be in the form of objecting to an AI’s output.
22. Override: In systems where users are not beholden to follow an AI’s recommendations, users may need mechanisms letting them reject advice and enact a new course of action.
23. Annotate: Users may need to store their reasons for an override or correction for later review by themselves, their managers, auditors, or regulators. Alternatively, they may want to capture why they chose a particular option amongst several for stakeholders.
24. Let me help correct your recommendation for the future: If a recommendation was rejected or overridden, users may want to tell the AI to adjust its recommendations moving forward.
25. Let me help correct your recommendation for my group: One of the promises of AI is hyper-personalization. Its model may need to adjust to the idiosyncrasies of its user or group of users.
26. Show me what personal information of mine you have.
27. Let me correct personal information of mine that you have.
28. Delete my personal information from your system.
29. Graceful degradation: If the AI components of a system should fail, users still have goals to accomplish. If the AI is not critical, the system should support the user’s shift into a more manual mode of interaction.
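Interaction 29 as a sketch: wrap the AI call so that its absence or failure shifts the user into manual mode rather than a dead end. The function and structure here are hypothetical:

```python
def suggest_reply(message, ai_model=None):
    """Try the AI suggestion first; degrade gracefully to manual entry.

    `ai_model` is a hypothetical callable. Its absence, or any failure
    inside it, falls through to the non-AI path instead of blocking the user.
    """
    if ai_model is not None:
        try:
            return {"mode": "assisted", "suggestion": ai_model(message)}
        except Exception:
            pass  # Log in a real system; never crash the user's task.
    return {"mode": "manual", "suggestion": None}

suggest_reply("Can we move the meeting?")                     # manual mode
suggest_reply("Can we move the meeting?", lambda m: "Sure!")  # assisted mode
```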
There are distinctions to these use cases when the AI is acting as an agent, an assistant, a friendly adversary, or a teammate. But the domain will determine how those distinctions play out.
We’re fortunate that today many design discussions about AI involve ethical considerations. I don’t want to short-change these issues, but smarter people than me have written about ethics, and summaries risk dangerous oversimplification. For now, it is enough to remind you that understanding what AI can do is only part of understanding what it should and should not do.
This article is just about the likely interactions that can have good or bad effects depending on how they’re implemented. It is you and your team’s responsibility to work through the ethical implications of the things you design and to design with integrity — or to raise the alarms when something shouldn’t be designed at all. At the end of this article, you’ll find a list of links to better sources if you want to know more.
These basics are building blocks
The field of AI is changing all the time. New capabilities will come, but for now, these 29 are the use cases you are likely to run into when designing for narrow AI. But these also aren’t the end of it. These must be woven together and expanded upon given your particular domain and application. I expect most sophisticated products will use more than one category simultaneously. For example, the supply chain software that I work with at IBM, Sterling Supply Chain Software business assistant, built on Watson Assistant, uses all of these…
Clusterers: To analyze inputs for a given bot and identify common topics and questions that went unanswered. These become candidates for future development.
Classifiers: To understand the user’s intents from their natural language inputs, as well as the “entities” (things) that have been mentioned.
Regressors: To suggest playbooks based on ongoing conversations.
Generators: For generating natural language responses.
As you look back over the lists of basic interactions, I hope you realize that these are use cases you can handle. Disambiguation, optimizer criteria, and scenario-izing may be the most novel interactions of the bunch. But things like managing groups, flagging bad results, and exploring multiple options are things you could design outside of AI.
These four capabilities are like building blocks. They need to be embedded in larger systems that support the humans using them and the environment in which they operate, and, more to the point, that support the user’s goals for using them in the first place. That’s why user-centered design is still key to successful implementation. You got this.
Special thanks to the contributors to this article:
- Jess Holbrook, Google
- John Langton, Wolters Kluwer
- Vera Liao, IBM
- Martin Labsky, IBM
- Michael Muller, IBM
Selected resources for AI and Ethics
- IBM’s AI Ethics https://www.ibm.com/artificial-intelligence/ethics
- The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power https://www.amazon.com/Age-Surveillance-Capitalism-Future-Frontier/dp/1610395697
- Algorithms of Oppression: How Search Engines Reinforce Racism https://nyupress.org/9781479837243/algorithms-of-oppression/
- Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor https://us.macmillan.com/books/9781250074317
- Future Ethics https://nownext.studio/future-ethics
- Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy https://weaponsofmathdestructionbook.com/
A Primer of 29 Interactions for AI was originally published in Becoming Human: Artificial Intelligence Magazine on Medium.