Editor’s note: This is the latest installment in an UpTech series of video interviews and accompanying transcripts about the emerging development and uses of artificial intelligence and machine learning. YourLocalStudio.com and WRAL TechWire are working together to publish this series. Alexander Ferguson is the founder and CEO of YourLocalStudio.
Artificial intelligence, machine learning: These emerging technologies are changing the way we live, work, and do business in the world for the better. How is AI actually being applied in business today, though? In this episode of UpTech Report, I interview Chaitanya Hiremath, who also goes by Chad. He’s the founder and CEO of Scanta, whose revolutionary adaptation of augmented reality and machine learning could change the way you commute to work, play a game, and schedule your next meeting. [Scanta, which is based in San Francisco, says on its website: “Bringing empathy to machines with Virtual Humans.”]
- So, tell me Chad, how did you get to where you are today?
Wow, that takes me back. So, we started Scanta in 2016. It was September, I remember, so that’s when we did all the paperwork and we incorporated the company. And at that time—just before that—I had another startup that was closing down, and we were focused on 3D animation for VR—creating hyper-realistic visualizations for it.
I used to go around with this suitcase holding a virtual reality headset, a computer, and, like, all the wires and everything else, and try to sell that and share this technology, right? At that time, our target market was real estate. So I pitched that multiple times, lugging this big suitcase that I’d even had a sticker put on, and did all of that stuff. And that’s when I focused on integrating and working with the Google Tango project and seeing what could be made specifically for mobile AR.
In late 2016, early 2017, we worked with Coca-Cola on an AR campaign. It’s known as the most successful AR campaign for brands, and specifically for Coca-Cola; according to their innovation team, it’s been at the top of the charts for the last five years. It was the most successful campaign in South Asia. Their CEO talks about it a lot, and it was a great learning curve for us. And what we did with Coca-Cola, from a tech standpoint at that time, was like, wow, is this really happening?
And we did the campaign in India, where people had not been exposed to anything like that, you know. You’re talking about putting your phone down somewhere, robots entering your dimension, doing some crazy dance, and handing you a Coca-Cola. So it got a lot of appreciation, and it got us some revenue. That was nice, but after that we realized the business was not really scalable.
After that we started a project called Pikamoji, an AR avatar library. Over late 2017 and early 2018, we built the biggest library of AR avatars in the world, with 110 characters. Then I realized that people actually want to interact with these characters. So we’re allowing 3D characters and NLP models to be integrated with our technology, so that people can finally connect with our machines, our devices, our 3D characters.
- It’s brilliant, because machine learning and the ability, as you say, to give a personality to these interactions is a game-changer. And that’s what fascinates me when I look at your technology. On your site, you say you’ve used augmented reality and machine learning to synthesize smart animations, thereby creating a better way of communicating feelings. It’s interesting, the idea of a computer being able to have feelings. I think it’s a fascinating concept, and I wanna dig into the technology in a second. But first, coming back to the company itself: you’ve obviously evolved over time, as every business does. Would you say you found product-market fit with this pivot, or is it still in that exploration phase?
Yeah, so fundamentally, that’s something I’ve learned over the last three years. Product-market fit is something we probably should have validated before what we did earlier with the game and everything else, rather than during or after. So this time, before we started this entire process, we went to a couple of players in the market and tried to understand the need, one of them being the most famous game engine that’s out there.
So we went and had multiple discussions with their R&D team. And one of the fundamentals that came out of that, and this was a little more than a year back, is that we’re in an era where even when you’re playing a game, you’re on a live-streaming platform with 10 other friends, interacting with them. But that’s not always the case. There is a need for you to fundamentally be more engaged within the game, to have that sense of immersion.
So why are you not able to speak to your character the same way you speak to your friends? And when you have those interactions, each character needs a unique personality so that the engagement is high. There’s a higher level of immersion, right? So your car’s assistant needs its own name, and it talks in one way, while another car’s assistant talks in a different way.
Fundamentally, they might perform the same functions, but that’s the uniqueness that needs to come out. And it’s the same process in gaming. Even when you’re talking about games, right, how can you create complex NLP models for each and every one of those characters? You’re talking about hundreds of characters. How can you possibly build all of that, right?
And all of them have different personalities, right? But if you start from the same dataset and add a different layer of personality on top of each one of them, you can just tweak the personality and keep the core knowledge set, which is shared, the same. To the user, they would all feel different, because they sound different, and the way they construct sentences and react is different. So you can see how that implies scalability.
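The scheme Chad describes, one shared knowledge set with a lightweight personality layer per character, can be sketched in a few lines. This is purely an illustrative toy: none of these names come from Scanta, and the simple styling functions stand in for whatever personality model is actually used.

```python
# Illustrative sketch only: one shared knowledge base, many personalities.
# Nothing here reflects Scanta's real implementation.

# The core knowledge set, identical for every character.
SHARED_KNOWLEDGE = {
    "fuel": "The tank is about half full.",
    "route": "Traffic is light; arrival in 20 minutes.",
}

# Each personality is just a styling layer applied to the same answer.
PERSONALITIES = {
    "cheerful": lambda text: f"Great news! {text} Anything else?",
    "terse":    lambda text: text.split(";")[0].rstrip(".") + ".",
    "formal":   lambda text: f"Allow me to inform you: {text}",
}

def respond(character: str, topic: str) -> str:
    """Answer from the shared knowledge set, styled per character."""
    answer = SHARED_KNOWLEDGE[topic]
    style = PERSONALITIES[character]
    return style(answer)

print(respond("cheerful", "route"))
print(respond("terse", "route"))
```

The point of the design is the one Chad makes: adding a new character costs only a new styling layer, while the knowledge set is authored once and shared, which is where the scalability comes from.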
- What immediately comes to my mind, and I also enjoy games very much and have played several, is a massive undertaking like Skyrim, which has all of these stories and all these unique characters, but they had to script them and record voice actors for every one, and that’s probably why it’s such a massive undertaking. But if every character in a game like that could handle its own interactions through machine learning, that would scale a game instantaneously. All the developers would have to do is set a few settings. Am I getting your vision correct?
Absolutely. And also, from a “why” standpoint, one of the technologies we’ve done a lot of research on is neural voice transfer, which some people classify as part of voice cloning. We already have about 10 distinctive voices. Have you looked at the WaveNet demos from DeepMind at Google?
I mean, they have really human-like voices. And you must have seen it at the Google I/O conference, where Sundar Pichai had this demo of the assistant calling a Chinese restaurant and booking a reservation, and–
Weird, you know? And that’s all generated, right? That’s part of the whole neural voice transfer space, and that’s what we foresee: that those different voices could be integrated with different characters.
Watch for more from Chad in Part Two of this interview.