Computer Vision Tech & Analyzing Engagement Levels in Zoom Meetings with David Shim of Read

How can we make virtual meetings less soul-sucking and more productive?

In this edition of the UpTech Report, host Alexander Ferguson meets with David Shim of Read to discuss his computer vision tech for analyzing engagement levels on Zoom meetings.

When you’re looking at twenty different faces on a screen, it’s kind of hard to get a feel for what’s going on, but Read Dashboard provides metrics that allow you to actually scientifically quantify how a meeting is going.

By using deep learning, Natural Language Processing, and other exciting new technologies, we can really start to see who is paying attention and who isn’t. Read analyzes facial expressions, body language, speech patterns, voice tones, emotions, and more to provide real-time stats about how engaged people are in the conversation.

David Shim is the Co-Founder and CEO at Read, delivering better meeting experiences through a shared dashboard that measures engagement and sentiment in real-time. Read’s mission is to make every human interaction meaningfully better, smarter, and happier starting with the more than 500 million people that video conference daily.

Prior to Read, David was the CEO of Foursquare, the location layer of the internet. While at Foursquare, the company exceeded $150MM in annual revenue, achieved profitability, and acquired its two largest competitors, Placed and Factual, to create the de facto leader in location. David joined Foursquare through Snapchat, which had acquired his first startup, Placed, for more than $175MM in 2017.

Co-Founders David, Rob, and Elliott saw an opportunity to usher in the next evolution of video conferencing by augmenting interactions with measurement. By making real-time sentiment and engagement metrics available to all meeting attendees, Read (https://www.read.ai) encourages collaboration toward a shared goal of a better meeting experience.

Read was founded with a privacy- and transparency-first approach to measurement, aligning with the founder’s past experiences at Placed, Snapchat, and Foursquare. Read Dashboard, the company’s first product, encourages more authentic interactions and is available for free on leading video conferencing platforms.


TRANSCRIPT

DISCLAIMER: Below is an AI generated transcript. There could be a few typos but it should be at least 90% accurate. Watch video or listen to the podcast for the full experience!

David Shim 0:00
What I realized was it’s hard to really get a sense of the temperature of the room, like reading the room.

Alexander Ferguson 0:10
Welcome to UpTech Report. This is our applied tech series. UpTech Report is sponsored by TeraLeap. Learn how to leverage the power of video at Teraleap.io. Today, I’m joined by my guest, David Shim, who’s based in Seattle, Washington. He’s the CEO and co-founder at Read. Welcome, David. Good to have you on.

David Shim 0:28
Great to be here. Great to be here, Alexander.

Alexander Ferguson 0:30
Now Read, you guys provide meeting analytics for the leading web conferencing platforms. That’s what Read is all about, helping people understand their meetings. What was the problem that you saw and set out to solve with Read?

David Shim 0:42
Yeah, with Read, really, it started at COVID, when we were all sitting in more and more meetings. And what I realized was it’s hard to really get a sense of the temperature of the room, like reading the room. When I’ve presented before, I’m able to look around the room and see who’s paying attention, who’s not paying attention, who’s looking at their phone, who’s talking to their friend. And for me, it became very difficult to do that in video conferencing, especially once you started to get past two, three, four, five people. At that level of scale, you’re just not able to comprehend or process at the rate that you can in the physical world. And so I started to think about what can I do, what can we do, to actually make that process a little bit simpler. And what came to mind was really a dashboard. It’s like when you’re driving a car: you can’t always think about how much fuel there is, you can’t always think about how fast you’re going, you can’t think about the RPMs. But occasionally, you will look down at your dashboard while you’re driving and say, oh, I need to go to the gas station, or, hey, I probably should slow down, I’m going 20 miles over the speed limit. And that’s what we want to be able to do: give those helpful nudges during a conversation, not to distract, but to actually make the conversation better.

Alexander Ferguson 1:45
Everything is around conversations. If you can have effective communication with your team and with potential people, that is paramount, and everyone is shifting to online. There’s a sense of Zoom fatigue, it’s a common statement everyone makes, but you still need to be able to communicate effectively. Let’s look at the technology first, and then we’ll look at some of the other, more human parts of it. Is it effectively that you’re listening to the voice and looking at the face? Tell me some of the pieces to it.

David Shim 2:14
Yeah, so it’s a combination. First is computer vision. It’s looking at the faces and saying, what is that expression that you’re making, those micro-expressions? Where are your eyes looking? Are they darting around? Is there somebody else in the room that you’re probably having a conversation with? Those are all things that are nonverbal, just from looking at the face. And if you think about it from a conversational standpoint, if you’ve got 20 people in the room, 19 of those people aren’t talking. So in order to get a temperature of the room, you need to be able to actually look at the facial expressions. Then we look at the words that are being said, so we’re converting the voice into text through transcription. And then we’re running it through a number of different NLP models to say, hey, is the conversation positive or negative? Are they talking about great earnings? Are they talking about, hey, we’re going to do layoffs? It’s really about being able to understand that context. So now I’ve got facial expressions, and now I’ve got the voice, specifically the words that are being said. And then the third group is prosody. I didn’t know what prosody was until I started Read. Prosody is really the tone of your voice. How do you say something? Is there sarcasm in there? Are you joking around? Are you saying “great,” or like, “great”? There are different ways that you can say those things, and it’s important to actually pick up those tones.

Alexander Ferguson 3:27
And you’re saying that your model can pick up that variation in tones?

David Shim 3:33
Absolutely. Now, it’s a work in progress. Certain people have different layers of sarcasm, and some are better at it than others, so that is a very difficult problem. But there are obvious things. If you’re very angry, the way that you say “great” can be very different from when you’re super happy, and you’re laughing, and you’re smiling. So those are things that we’re able to pick up and add into the model to say, at the very end, in real time, how’s the call going? Is it good or bad? That’s really that context. And you might say, on a one-on-one, you don’t need it. On a one-on-one basis, most people can get a read. But when you start to go into presentation mode, you can only see three people at the top of the screen. With Zoom, you get your presentation going, you’ve got your picture and maybe two other people on the call, and you’re missing out on the other 17. We’re able to pull that information in as an attendee in the meeting. We just set it to gallery view, and we’re able to see those faces and process all that information.

Alexander Ferguson 4:26
Being able to detect, as you’re saying, the visual cues of whether someone’s paying attention, I imagine that plays a big role. But telling when someone is actually looking at their computer versus looking away, was that difficult to figure out and work through?

David Shim 4:44
It was, because at first when we started, the assumption was that the camera is usually on your laptop, so you’re looking straight ahead. But in reality, we started to say, wow, this is strange in some of our training data. People are looking this way, but it looks like they’re paying attention. Oh, it’s a second screen. So then we started to model that: are you actually looking at a second screen? When you talk, you go to this position, but then when you hear somebody else talking, you go back here. One of the basic rules is that your head stays stationary, so it doesn’t move around, because you’re looking at a screen, it’s just a second screen. Versus if you’re looking at a web page, your eyes might start darting like this, your head moves a little bit, and you’re kind of reading content. Here, you’re very stationary, and you’re looking at the screen. So that was one feature in the model to pick up that second screen.
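
As a rough illustration of the rule David describes (a still head and steady eyes that are turned away from the camera suggest a second screen, while a moving head and darting eyes suggest something else), here is a minimal Python sketch. It is not Read’s model. The inputs (per-frame head yaw and horizontal gaze position from some face tracker), the function name `classify_gaze`, and every threshold are assumptions made up for the example.

```python
from statistics import mean, pstdev


def classify_gaze(head_yaw_deg, gaze_x,
                  still_head_max=3.0, steady_gaze_max=0.05, off_axis_min=15.0):
    """Label a short window of frames for one participant.

    head_yaw_deg: per-frame head yaw in degrees (0 = facing the camera).
    gaze_x: per-frame horizontal gaze position, normalized to 0..1.
    All thresholds are hypothetical illustration values, not tuned numbers.
    """
    head_is_still = pstdev(head_yaw_deg) <= still_head_max
    gaze_is_steady = pstdev(gaze_x) <= steady_gaze_max

    if head_is_still and gaze_is_steady:
        # Stationary head and eyes: the person is watching a screen. If the
        # head is turned well away from the camera, assume a second monitor.
        if abs(mean(head_yaw_deg)) >= off_axis_min:
            return "second_screen"
        return "primary_screen"

    # Moving head or darting eyes: reading or otherwise looking elsewhere.
    return "possibly_distracted"
```

A real system would presumably learn this as one feature among many rather than hard-code thresholds; the sketch only makes the stated rule concrete.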

Alexander Ferguson 5:27
Wow, I can only imagine the minute details you’re having to fine-tune as you look at this. What was the process of beginning this, just creating this dashboard and getting those feeds in? Where did you get the training data? Were you getting initial people for that? How did that begin?

David Shim 5:48
Yeah, so training data initially was in-house, right? The co-founders worked together and we built the training data set, and as we hired more employees, we were able to build more training scenarios. We tried our best to act in certain cases, like, hey, pretend you’re angry, pretend you’re sad. Those are hard things to do, because a lot of people smile when they pretend to be, like, “I’m super angry,” with those kinds of grunts. And so we started to look at a lot of open source datasets. With open source, there’s a lot of data out there, but it’s very narrow. It’s not video conferencing specific, but it’s going in and saying, here’s a set of facial expressions that signify anger, happiness, etc. So there’s a lot of data out there, and there have been a number of great companies that have worked in that space. The same thing goes for voice. If you think about voice today, I don’t want to oversimplify it, but it has become somewhat of a commodity. Transcription 10 years ago was incredibly difficult and incredibly expensive. Now AWS, Azure, Google Cloud, all the services have transcription services, and there are startups that do transcription services. So now you’re able to go in and pull in those transcripts, and then you run those transcripts through models. And there are a ton of NLP models, thanks to the Alexas of the world, where people are used to processing that data and figuring out what did they actually mean.

Alexander Ferguson 7:03
So basically, because of the democratization of NLP, you’re able to build on top of that. And then I guess some of the data is coming out of the audio side, but the video side is probably the biggest piece that you’re having to build custom. Is that correct?

David Shim 7:23
Yeah, I’d say both. And the reason why it’s both is that a lot of the audio is one-way conversation. It’s not a conversation when the training data says, “I’m very angry, I’m very angry, I’m very angry.” That’s training data for one person, but you want to have a conversation. There’s a back and forth, there’s synchrony where you’re talking with someone, and if it’s a good flow, I’m not interrupting you, I stop, you talk, we go back and forth, we’re smiling. Those are things that you have to pick up. So I’d say open source helped us get off the ground very quickly. But what we ultimately found was that it gets us about halfway there. For the other half, we needed to build training data specific to video conferencing, which doesn’t exist at massive scale. When you have video conferencing, you have to be able to connect certain elements, like what happens when the camera is off but there’s voice? How do you treat that person versus if they had camera on and voice on? So we start to get into these little niches, but they’re very important niches when it comes to video conferencing. And that’s where we spend a lot of time now. We’ve done things like paying and hiring people to actually build out these training sets, and finding public training datasets that are out there. A good example is a political panel held over video conference that’s available on the internet. We look at that data and say, hey, let’s have 100 people watch this video, and they score and annotate different parts of the video to say, was this tense? Was this not tense? Was there laughter? Was there not laughter? Was there synchrony? Was there not synchrony? It’s really building out these labels that are refined for video conferencing.
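
One simple way to turn many annotators’ scores for a clip into a single training label is a majority vote with an agreement threshold. The Python sketch below shows that idea only; the interview does not say how Read aggregates its annotations, and the function name `aggregate_labels` and the 60% threshold are hypothetical.

```python
from collections import Counter


def aggregate_labels(votes, min_agreement=0.6):
    """Collapse annotator votes for one video segment into a single label.

    votes: list of label strings, one per annotator (must be non-empty).
    Returns (label, agreement). The label is None when the winning label's
    share of votes is below min_agreement (a hypothetical cutoff), so that
    ambiguous segments can be dropped or sent back for review.
    """
    label, count = Counter(votes).most_common(1)[0]
    agreement = count / len(votes)
    return (label if agreement >= min_agreement else None), agreement
```

For example, if 70 of 100 annotators mark a segment as tense, the segment is labeled tense with 0.7 agreement; a 50/50 split yields no label.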

Alexander Ferguson 9:01
That is quite an undertaking. Is it your desire, do you truly feel that this is a needed tool in today’s environment, with Zoom meetings, being able to quickly look at that dashboard?

David Shim 9:14
100%, because when you think about it, half of all meetings are considered not productive, or not satisfactory. So it’s like half of the Zoom calls that we have are not good. Now you can’t go in and just say, I’m gonna cut half of the meetings. You need to be able to understand which half, and what made it bad. What’s bad for certain people might be a good meeting for others. I’ve been on calls where there’s 13 people on the call, three people are going back and forth, and they’re just jamming away. They’re having a great time. The other 10 people are like, why am I even here? This doesn’t matter. So the call technically could be bad, but it was good for those three people. And so now there’s an opportunity to say, we should probably do this call again, but maybe not invite those 10 people, not because we don’t want to be inclusive, but because we don’t want to waste their time, because 10 people for one hour is one full-time employee for a day. And once you start to quantify those things, it’s not about taking away opportunity, it’s actually about giving people time back.
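
The arithmetic behind “10 people for one hour is one full-time employee for a day” can be written out directly: 10 attendees times 1 hour is 10 person-hours, which is a little over one workday if a workday is taken to be 8 hours. The 8-hour workday and the function name `meeting_cost` are assumptions for this sketch.

```python
def meeting_cost(attendees, minutes, workday_hours=8.0):
    """Return (person_hours, workdays) consumed by one meeting.

    workday_hours is an assumed length of a full-time working day.
    """
    person_hours = attendees * minutes / 60
    return person_hours, person_hours / workday_hours


# Ten disengaged attendees in a one-hour call.
hours, days = meeting_cost(attendees=10, minutes=60)
print(f"{hours:.0f} person-hours, about {days:.2f} workdays")
```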

Alexander Ferguson 10:07
See, what pops in my head is, I think this information is valuable, but I’m wondering if people are going to automatically know how to use it. You just had a great insight right there: three people are talking, but there’s 10 people on the call and seven aren’t. Have you thought about how you will be providing that insight? Like, “by the way, you should do this.” Or do you think people should be able to work this out themselves, like, wow, okay, we really shouldn’t invite those seven people anymore, because we know they’re not needed for this call?

David Shim 10:40
No, there’s a big education hurdle. It would be like if someone drove a car for the first time and there wasn’t a dashboard. Then all of a sudden, 20 years later, they said, here’s a dashboard, and you’re like, what is this? I’ve been able to drive without this thing. Why do I need this? This doesn’t make any sense. And the way that I like to compare it is, right now we’re at the Rand McNally stage. Rand McNally is a mapping company, for folks listening in, it’s old school. They used to sell maps, and they still sell maps, but these are the maps where you’re driving the car, and if you watch those ’80s movies, they pull out the map. That’s what they did. They understood where the streets were, where the buildings were, where the mountains were, where the rivers were. That’s a very hard problem, and you need that baseline. And I think everyone is kind of using Rand McNally right now when it comes to video conferencing. They’re trying to take what they’ve learned in real-world interactions and apply it to the digital world, where they’ve only done it for about two years. In the real world, you’ve had 20, 30, 40, 50, 60 years of human interaction where you’ve been able to pick up certain tells and know how to respond to people. In the digital world, you’ve only had two years. So that’s the first step. That’s where it’s like Rand McNally.

Alexander Ferguson 11:45
Got you. You’re saying basically, as humans, we’ve gotten used to in-person meetings, knowing how to read people and understand the room. And since it went virtual, with the big emphasis of the last two years, we don’t know how to do that anymore.

David Shim 11:59
100%. And I think certain things will always stay the same. Being rude is not good. Being funny is great. There are things like that. But imagine if you were in a room where all of a sudden someone took a blank sheet, put it over their head, and didn’t say anything that entire meeting. You’d be like, this is weird. But that’s what actually happens on video conferencing. Someone turns off the camera like this, and then you’re still supposed to interact with this person as though you’re there in person. So these are traits where we’re just not used to this. And there was a study that came out a couple of years ago that said the reason why we are so stressed out with video conferencing is that our human nature isn’t designed for it. If you’re within two feet of someone, that is close, and in the real world, you’re either trying to date them or you’re trying to fight them. And imagine, you’ve got eight hours of that. It’s just not human nature. People talk about close talkers, like, get away, right? I’m sure there’s a Curb Your Enthusiasm episode about that. But here, with the laptop, we’re always that close, and that just isn’t normal for us. So that’s why we’re always in this heightened state of, do I need to do something? What’s going on here? I think you need to retrain yourself, and one of the best ways to do that is to have something like a dashboard, something that nudges you along the way.

Alexander Ferguson 13:17
That gives you insights to know how this is going, what the things going on are. Interesting, interesting. I’m curious about your own feelings: in-person meetings or online meetings, which is more soul-sucking?

David Shim 13:34
Well, I think both can be just as soul-sucking in different ways. And I think that’s what we’re trying to solve. Our vision really is not to solve just for digital. The hybrid workforce is going to go into the office. We’re already seeing that across a number of our users, where they have meetings with four or five people actually in the office and seven people dialing in, and they need to make sure that connection exists. This is another reason why it’s important to have something to help nudge things along. If I’m not in the room, I can’t pick up certain things. Someone says a joke, I might feel excluded. Someone’s in the office and they have a conversation that I can’t hear, I might feel excluded. In a hybrid world, you want to make everyone feel like they’re on a level playing field. And so for us, it’s those nudges, where we can identify, oh, hey, there’s some conversation going on in this room that’s muffled, and it’s probably a distraction to everybody else. Giving that room just a slight nudge that says, hey, maybe slow down on the side chatter, or the sidebar, could actually make the meeting so much better for everybody who is remote.

Alexander Ferguson 14:38
Wow. So you’re suggesting, whether or not this feature is active yet, that in a combined remote and in-person hybrid meeting, it could listen and tell that there was not enough audible sound but enough chatter going on, and an alert would come up for those in that meeting saying, hey, quit the side chat?

David Shim 14:58
Yes, though I think not today. We have partial features that do alerts on that point, but we want to get to the recommendation. So to use that Rand McNally analogy: today, it’s just maps. Over time, we want to give directions. Think of that like MapQuest: you print out the directions, and this is how you get from point A to point B in the fastest way, generically. But we want to get to recommendations, which is Waze, and Waze is giving you nudges. The beauty of Waze is that it’s so subtle, but you trust it so much that even though you’ve taken this route 90 times over the last 90 days, if it tells you the one time, hey, cut across four lanes of the freeway, take the off-ramp, and you’re going to save 15 minutes, you’re going to go across those four lanes, even if you’ve only got less than a quarter mile to get there. And you’ll do it because it tells you that it’s going to give you a better outcome. I think that’s what technology needs to do. It’s not about replacing people. It’s not about AI that replaces someone in the conversation or tells you what to say. It’s just giving you that information in a really bite-sized way so that you can make a decision. And you can decide, hey, no, I’m going to stick with my own route. But I appreciate that. Oh, I got stuck in traffic? Maybe next time I’m going to listen to you, the AI, a little bit more.

Alexander Ferguson 16:09
Another thought that pops in my head is that this dashboard is like the check engine light or the tire pressure light. Similarly, it’s trying to give you clues that there potentially could be an issue, that someone on here is not being included, or that you need to pay attention. What are some of the things you’ve already seen that cause sentiment and engagement to drop in a meeting? Maybe just a tip for doing meetings, since you’re so focused on this industry.

David Shim 16:39
I think not having the camera on. Certain people have gone to the culture of, hey, it doesn’t matter if you have a camera on or off. And it might not impact the person who has the camera off. But we see definitive data showing that if you have the camera on and you’re one of the few people who do, you organically feel like the others aren’t paying attention, because now I’m in this empty room where I’m willing to show my face and everybody else isn’t. So you’re gonna go to that worst-case scenario. Are they even paying attention? Are they surfing the web? Are they watching TV? Are they doing their laundry? You don’t know why they decided to not turn on that camera. So from an engagement perspective, if you want to be a strong participant, and this is not a webinar, in webinars they can’t even see you, but if you’re in a meeting with six or eight people, turn on the camera. I think that’s a bar: if you don’t need to turn on the camera, do you really need to be on the call? That’s a question you should ask, not in a bad way. It’s like, hey, if they feel they don’t need to read you, if they don’t need your facial expressions to understand what your response is, do you need to be on the call? Or is this more just an update that you could skip, and they send you an email?

Alexander Ferguson 17:40
Wow, that’s a great question to ask yourself, for the many folks that are doing meetings. What do you see as the biggest roadblock to adoption, to saying, hey, let’s try this, let me start looking at dashboards all the time in my meetings?

David Shim 17:58
Yeah, there’s a couple. I think one is education. People don’t know they need it. This is a big problem, in the sense that Slack had the same problem, too. There’s a number of companies that have been very successful, and a number that have failed, where you have to be able to say, this is a problem that you didn’t know you had, but now I can explain why the value proposition is so strong that you should utilize this solution. So I think education is going to be key for us: how do you use it, and how do you use it in a simple manner versus getting thrown in the deep end? If you look at our analytics today, it’s really about the metrics. But we want to soften that up in future iterations to make it easier to digest when you first come in. It’s not, hey, there’s a bunch of bar charts and a bunch of graphs going on. It’s, hey, is the call going well, yes or no, good or bad? We want to make it a little bit simpler there, and then let people dive into the details. So I think that’s going to be one hurdle. I’d say the other one is the privacy aspect. People are concerned, like, hey, if you’re measuring my calls, is something going to happen here where I get in trouble because I’m not talking enough, I’m not participating enough? And that is not the goal here. So the things that we’re doing right off the bat are, we delete video and audio data 24 hours after the conversation. There is no transcript, there is no video, there’s no playback. We wanted to make it very Snapchat-esque, so you can have a more authentic conversation. This is not a policing tool. This is more about, hey, we’re giving you directions in real time on how to make that conversation better. Think of it just like Waze. You never go back to Waze and say, hey, let me go in, are you actually tracking where I’m going, and this is the direction here, and I disagree with it, and this is a problem.
No, you’re just like, hey, give me the value, and make me feel comfortable that you’re not doing anything with it that could cause me problems, which is absolutely what we’re saying. And the simplest way is, we just delete your data. So we just put that out there. We do need to message that strongly and make sure that people are comfortable in that aspect. As another example, when we join a call, we go into the chat and let anybody opt out. If you type “opt out” when we’re on a call, we will immediately leave the call. You don’t have to be the host. You can be the 14th person on the call or the fourth person, and we will immediately leave the call and we won’t measure anything. So the goal here is, we want to make it as simple as possible for people who aren’t comfortable to opt out. But we also want to make people feel comfortable that we’re doing the right thing with the data. We’re trying to make conversations better.

Alexander Ferguson 20:19
What you were sharing earlier, about the fact that you’re paying attention to the way people’s eyes are looking and everything, that’s a big privacy fear overall. So it sounds like you’re definitely honing in on that realization: let’s just delete all the data, nothing’s kept, right?

David Shim 20:35
Absolutely. And that’s important, because you want that authentic conversation. Whenever you know you’re on camera, it’s like, I’m gonna stand up a little bit straighter, I’m not gonna say the more natural words that might get me in trouble. There’s a certain aspect of that that we just naturally do. And for us, we want to have authentic conversations, where we give you advice to say, “Hey, David, you’ve been talking a little bit too long, give Alexander a chance to talk,” because we’re being natural. It’s not tattletaling on me and saying, you did this, and I’m gonna tell Alexander later. It’s more like, hey, have a really good call, and this is one thing that you just might have missed because you were so excited about telling the story that you went on probably a little bit too long, and you can throttle it back.
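
The “you’ve been talking a little bit too long” nudge can be pictured as a check on each speaker’s share of total talk time. This Python sketch is only an illustration of that idea; the function name `over_talkers`, the input format, and the 70% cutoff are assumptions, not anything Read has described.

```python
def over_talkers(seconds_by_speaker, max_share=0.7):
    """Return the speakers whose share of total talk time exceeds max_share.

    seconds_by_speaker: dict mapping speaker name to seconds spoken so far.
    max_share: hypothetical cutoff above which a gentle nudge would be shown.
    """
    total = sum(seconds_by_speaker.values())
    if total == 0:
        return []  # nobody has spoken yet, so there is nothing to flag
    return [name for name, secs in seconds_by_speaker.items()
            if secs / total > max_share]
```

With `{"David": 540, "Alexander": 60}`, David holds 90% of the talk time and would be the one nudged to hand over the floor.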

Alexander Ferguson 21:17
And when we did our prep call before, you brought the app in, and it actually shows up as another participant, so everyone can see this happening. Is that right?

David Shim 21:28
100%. So everyone can see it, and everyone has access to the same reporting. It’s not that someone has information over you. It’s actually democratized, so everyone has the same information. The way that I think about it is, it’s almost like a baseball game or a basketball game or a football game. You want the score on top of the screen, because people can see how the game is going. They can make decisions based on the score, based on the hits, based on the runs. All those information points make it a better interaction. Versus if you watch baseball and there’s no score, you’d be like, well, I don’t know what’s going on right now. Is this good or bad? I don’t know.

Alexander Ferguson 22:02
Oh, bad. Before we get into the business side, because everyone who’s listened to the series knows I like the two parts, the technology and the business part of it, coming back to the technology a little bit more: these algorithms that detect these nuances and understand and provide this data, have they really developed that far in the last two years? Would you have been able to do this two or three years ago? Or has it been around for a while and you’ve now just applied it?

David Shim 22:35
It wouldn’t have been possible two or three years ago, for a couple of reasons. Certain parts still have not been solved from the broad data science perspective, so we’re solving it as we go. But there are things like natural language processing, where now you’ve got GPT-3, where it’s like, okay, I can actually write sentences for you based on your first three sentences. So you can see how far that technology has come along. Computer vision is not all the way there yet. It’s about reading the emotion on the face, and there are a lot of biases that have existed in the older models. From a racial perspective, let’s say, if you had a lot of Caucasians in the training dataset, well, that’s very different from someone who’s in Africa versus Asia versus Europe versus South America. All of those groups have different ways that they express themselves. And in an international culture, you’re going to have different people on different calls, and they’re going to be interacting with one another. You just can’t take one group or one gender or one race and say this is the model that applies to everybody. So that is something that is a little bit newer, where a lot of the training datasets have had built-in biases. It’s important to actually fill in the gaps there to make sure that it is representative.

Alexander Ferguson 23:44
Getting sentiment from such a diverse group of people, because the training data doesn’t exist, sounds like one of the bigger hurdles to solving both your problem and many other problems related to this. On the idea of sentiment, the other thing that pops in my head, and I’m curious what you see around this, is that some people are just different. I did an interview a little while ago with someone, and I thought he was angry at me. I thought he was just mad at me or something for most of the interview. But by the end of it, I realized he’s just a very intense individual. He’s just very here. And when he gets passionate, he gets... Whereas I’m probably firing my brows too much; I’m a very expressive person. Have you thought about being able to account for the variety of people’s own expressions in that sentiment?

David Shim 24:35
Yes, that’s a very good point. Initially, we had conversations where it’s like, your resting face can vary. My resting face has a slightly more negative score than some of my coworkers’, but we’re able to see what your resting face is. So you’re able to build that into the models, where you have a baseline. It’s like, Hey, I’m not saying anything, I’m on screen listening to you talk, so I can start to build a baseline, and that baseline I can apply over the time of the call. Not across calls, because we don’t carry that information. But even in a 30-minute call, in the first four or five minutes I can start to build a baseline, then go in and say, okay, David is smiling, but that’s a smile for David.

Alexander Ferguson 25:13
You’re actually building a baseline per person, not a general one. All right, this is the baseline of an individual, let’s apply it to this person that we see on camera. It’s a baseline of that individual.

David Shim 25:22
For the call, correct. So we have the generic overall models, and we’re able to apply those. But as we start to build those baselines, we leverage that to say, Hey, David’s smile so far, his max smile is this. Okay, that’s all right, that’s his max smile, so we’re going to keep an eye out for that. So it’s going to be very different. Or the words that he says are very flat, not intentionally, but just like, Alexander, this has been good, appreciate your time. That tone, etc., that’s my baseline. So now, if I ever go, Oh, this is great, all of a sudden it should spike in terms of sentiment and engagement, right? Because my baseline was so flat.

Alexander Ferguson 25:58
That is fascinating. I remember when I did this test with my team member. I’m just naturally very expressive, so it was going up and down a little bit. But this other person isn’t as expressive, so when they did something, I saw that. It’s really fascinating that you’ve tied it to the individual; that makes it much more usable than I would have first assumed the product was. Did you have that in your mind from the beginning, that it would be per individual? Or was that more of an evolution?

David Shim 26:29
It’s an evolution. We knew that we wanted to classify, is a meeting good or bad, at a very high level; we wanted to be able to identify that. Then we said, Okay, now we need to figure out what parts of the meeting are bad so that it can be better, or what parts of the meeting were good so that you do more of those things. And it ultimately came down to cohorts and individuals. From a cohort perspective, let’s say you’re doing a sales pitch: you don’t care as much about your team and their sentiment and engagement. You want them to be good, but it’s not as important as the clients. So you want to build cohorts during a call to say, these four people are the most important people that I want a read on; I don’t care about myself or the rest of my team. And that has become very important. As we talk with different partners, and as we talk with customers utilizing the solution, it’s been important to actually dive into what the use case is that you’re trying to get to.
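
The per-call baseline and cohort ideas from this exchange can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not Read’s actual method: the function names, the made-up scores, and the choice to measure a later reading in standard deviations from the warm-up readings are all hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical helper names and numbers; an illustration, not Read's method.


def build_baseline(warmup_scores):
    """Summarize one participant's raw scores from the first minutes of a call."""
    center = mean(warmup_scores)
    spread = max(pstdev(warmup_scores), 1e-6)  # floor avoids dividing by zero
    return center, spread


def relative_score(raw_score, baseline):
    """Express a raw score as a deviation from that participant's own baseline."""
    center, spread = baseline
    return (raw_score - center) / spread


def cohort_average(scores_by_person, cohort):
    """Average the relative scores of only the people in the chosen cohort."""
    return mean(scores_by_person[name] for name in cohort)


# Made-up raw expressiveness scores collected while each person just listens.
baselines = {
    "flat_speaker": build_baseline([0.20, 0.22, 0.18, 0.20]),
    "expressive_speaker": build_baseline([0.50, 0.70, 0.30, 0.50]),
}

# The same raw reading later in the call means different things per person.
later = {name: relative_score(0.60, b) for name, b in baselines.items()}

print(later["flat_speaker"] > later["expressive_speaker"])
print(cohort_average(later, ["flat_speaker"]))
```

With these invented numbers, the same raw reading of 0.60 sits far above the flat speaker’s baseline and only slightly above the expressive speaker’s, which is the effect described in the conversation. The cohort helper simply restricts the average to the people the caller cares about, such as the clients on a sales call.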

Alexander Ferguson 27:19
I wonder if people can get used to this on sales calls, that the person selling to you is going to be using this, wanting to know: do you like my product, or do you not like my product? Is that going to be one of the use cases?

David Shim 27:32
Absolutely, because if you think about it, today there’s Gong, there’s Highspot, there’s Outreach. These are billion-dollar companies with hundreds of millions in revenue, okay?

Alexander Ferguson 27:42
Do they focus on video as well, or just audio?

David Shim 27:45
Just audio today. It wouldn’t surprise me if they go down that path, but their focus really is on sales enablement. So it’s not just the initial conversation; it’s more the research, the follow-up, the emails that are sent. With that said, they’re really diving in and saying, Hey, a bot will join the call. So with Gong, you’ll have a bot joining the call and recording the conversation, and it’ll declare itself, and a lot of people will be okay with it. They’ve kind of set the table, where they’ve trained prospects that, Hey, Gong is joining the call to make sure that it’s a satisfactory call. And the pitch there is, Hey, it enables me to do a better pitch, and if something goes wrong in this call, I want to make sure I have the right follow-ups. There’s enough of a value proposition there that people are opting in.

Alexander Ferguson 28:26
Let’s transition more into the business side, then. What’s your focus? Are you also focusing on the sales arena? What are the main categories of use cases you’re looking at?

David Shim 28:38
Yeah, today we’re wide open. By that I mean we have certain assumptions, where we think certain markets will be very favorable for us. But when we launched on Zoom, when we launched on WebEx, we actually made it completely free. So anyone can go into their app stores or go to read.ai, create an account, and use Read, unlimited, no questions asked, no credit card required, etc. And the reason why I’m doing this is that in my past life at Placed, my former startup, we spent a lot of time trying to figure out who the right customer was, and we tried to sell into them, and we limited who had access to the product. That was a bit of a mistake, in the sense that we found product-market fit, but it took four or five iterations. And those four or five iterations would have happened a lot faster had we made the first version more accessible, versus a six-figure contract. Here with Read, we’re going after an audience where half a billion people use the top four platforms every single day. So it’s a huge market. What we want to do is push this out into the marketplace, make it readily available, and start to see different groups adopt it and say, this is very interesting to us. From there, we can take their feedback, and then we will start to go in and say, Hey, what’s the right pricing model for this? Our assumption is that it’s going to be perceived just like Zoom, WebEx, Teams, etc., and it’s going to be reasonably priced. We’re thinking somewhere around the same price as the video conferencing solutions themselves. So we want to make it very easy to adopt in the short term, to drive adoption across more users and more use cases.

Alexander Ferguson 30:14
You guys launched in 2021, right?

David Shim 30:17
Yes. We officially launched on September 29; that’s when we launched our first product into the market. So it’s very early days today. But we’ve measured millions of hours of conversation so far. We deliver the analytics, and we’re getting the use cases. And this is where I think, to your prior question, we’ve seen a lot of surprising use cases. One is international corporations that want to interact with American customers, and they’re not able to pick up some of the social cues. The way that an American might talk, the vocals, the movement of the hands, the eyes, etc., those things don’t necessarily translate to someone in Singapore, and they don’t necessarily translate to someone in Dubai. So it’s important to actually bridge that gap of those social cues, those facial cues, those vocal cues. We’ve actually seen a really large uptick in international users having calls with Americans, and we’ve started conversations with a few of these groups to say, Hey, how do we scale this out? So that’s one. But that was not an unexpected one that

Alexander Ferguson 31:25
That actually popped into my head. If anything, the internet is making the world smaller, and people want to hire people in many different places. So the number of different cultures that are part of your team is only going to increase. Have you thought about this playing a role in actively educating you on people’s different ways of interacting? Because it sounds like it works one way, toward Americans, but outward as well.

David Shim 31:56
Oh, absolutely. A good example is when I was at Foursquare: we had a number of partners in Japan, and I don’t know Japanese culture as well as I should. In some of those conversations, I couldn’t get a read. I thought, am I bombing this conversation? This is horrible. No one’s talking; only one person is talking, and they’re all looking at this person. What’s going on here? Later, I talked to some of those colleagues, and it was like, What are you talking about? I said, Only one person was actually talking, and you were really slow to respond. And it’s like, No, this is the way that they do it: it’s the senior person that talks first, and they’re digesting it. And I was like, Okay, I would have loved to know that in real time, because I would have kept doing what I was doing. So I think there’s also a use case to go in and say, as I talk with other cultures, how do I bridge that gap? Because I might be doing something that is unintentionally offensive as well. Maybe I’m super excited, I’m talking fast, and I’m talking really loudly; I might not understand that that might not be the right approach for a different culture. I would love a nudge, because it’s not intentional that I’m doing that. It’s just the way that I speak today. But I wouldn’t have any issue with being told, Hey, slow down a bit, lower the volume a little bit, wait 30 seconds if they have a question so you’re not interrupting them. Those are things that are easily solvable. But after the call, if you’re a sales rep and you offended a customer, you’re done, and you had only just gotten that lead. Say you’ve got those emails where 100 emails get you 10 leads, those 10 leads get you one meeting, and that one meeting is cancelled 50% of the time. You’ve done all this work to get there. Don’t you want to have as many tools as possible to make that 15-minute or 30-minute call as good as possible?
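
The funnel figures quoted in this answer can be worked through in a few lines of Python. This is just the arithmetic implied by the quoted rates, assuming they multiply independently; the function and parameter names are made up for illustration.

```python
from fractions import Fraction


def emails_per_held_meeting(leads_per_email, meetings_per_lead, cancel_rate):
    """Emails needed, on average, for one meeting that actually takes place."""
    held_per_email = leads_per_email * meetings_per_lead * (1 - cancel_rate)
    return 1 / held_per_email


# Rates as quoted: 100 emails -> 10 leads -> 1 meeting, cancelled half the time.
needed = emails_per_held_meeting(
    leads_per_email=Fraction(10, 100),
    meetings_per_lead=Fraction(1, 10),
    cancel_rate=Fraction(1, 2),
)
print(needed)  # 200
```

Under those rates, roughly 200 emails stand behind every call that actually happens, which is why a single call going badly is so costly.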

Alexander Ferguson 33:35
I feel like this is a term you shared with me earlier: augmenting reality with data. Not with glasses; it’s not augmented reality. Though actually, I stopped myself mid-sentence, because I could imagine that eventually, when AR glasses do come out, your solution could be used to read the sentiment in a real environment, how people are doing. Is that a potential future use case?

David Shim 34:06
Oh, absolutely. That’s what we’re very excited about. I think for a couple of years it’s video conferencing, a huge platform. But over time, it’ll become closer and closer, where the real world and physical world interact. Google Glass, Apple glasses, Microsoft glasses, Snap glasses, don’t forget: all these people are coming out with these glasses. There are going to be apps on those glasses, and those apps can be something like Read, where we can go in during a conversation and actually help give nudges to make that conversation better. We actually had one of our investors come out and say, Hey, I could definitely see a use case for autism, where if someone is on the autism spectrum and they aren’t able to pick up certain social cues, this could be infinitely valuable to them. Not just to anybody else who’s interacting with them, but to them, to say, Hey, did you notice this social cue? Maybe you should stop talking about this topic. Or maybe you should go into more detail, because they want more information. Just getting that knowledge makes that person’s life so much better, because they’re able to read the people around them where they wouldn’t have been able to previously. And the person on the other side is able to have a better interaction as well. So that, I think, is holistically where we want to get to longer term. It’s really cool to say, we can make every interaction better. So that’s what I’m excited about. With that said, there are also real business use cases. Okay, imagine that you are working at a $15-per-hour job, your first job out of high school. Someone comes in, and nothing against Karens, but I’m going to stereotype: a Karen comes in and complains about X, Y, and Z. They’re starting to yell at you, and you’ve never experienced this in your whole life. Wouldn’t it be great if something, if you were wearing Google Glass or Apple glasses, said, Hey, take a breath, this is what you need to do next? All of a sudden, it’s like, this is phenomenal, right? This is training in real time. This is making that outcome better, where you might have been berated by this person, and they would have left unsatisfied, and you would have been angry too, versus, Hey, let’s de-escalate the situation.

Alexander Ferguson 36:05
I see the future. I love it. So you have two other co-founders, right? Elliott and Rob, is that right?

David Shim 36:15
Correct. Yeah, so VP of data science and VP of engineering. (Alexander: How did you guys meet?) So we’ve worked together, Elliott and I, for almost 20 years. We met at a company called Farecast, which did airfare price predictions. What we did was scrape the internet for airline ticket prices, and then we built models (I didn’t build them; Elliott built the models) that said, Hey, based on the price movement, the route, the time of day, and the day of the week, we can predict whether airline ticket prices will go up or down whenever you do a search. When you’re on Kayak, they have this feature; Expedia used to have this feature; other places have this feature, where you type in a route and it’ll tell you whether you should buy the ticket now or wait because the price is going to drop. That company, where we worked together, was bought out by Microsoft, I can’t remember when, I think early 2000s. And then I got back together with Elliott and Rob at my next startup, called Placed. What Placed did was measure the physical world, and our core product was attribution: did someone see an ad or hear an ad that then drove them into the physical store? What we were doing was similar to this in a lot of ways: we were taking really noisy location data from smartphones and trying to identify whether a store visit occurred. Not, did you walk by the store, but did the store visit actually occur, and doing it with a high enough level of accuracy that you can actually decide how you want to spend advertising dollars. If you ever open up Google Maps, it gets more and more accurate, but sometimes it shows you across the street, or you might be at a gym but it shows you at a McDonald’s right next door. There are all these variations that can occur, and if you get that wrong, that can have severe implications in a lot of different areas. So we worked together there for a decade, through Snap, through Foursquare. Then, over time, I left, and they left at different points. I took a little bit more time off; I was just enjoying a break from location. And then we ultimately connected this summer, and we started to say, What are you thinking about? What are you working on? And this came to the top of the list.

Alexander Ferguson 38:19
For you as a CEO, what do you think is your biggest strength?

David Shim 38:25
Let’s pick one. I think the first one, previously, was energy, unlimited energy. This was in my 20s and 30s, where I could work 18 or 19 hours, sleep for a couple of hours, and then go right back into the work. That’s something that I was able to do in my 20s and 30s, but as I’ve gotten older, it’s become harder to do, because that level of energy, that level of commitment, that level of sacrifice becomes harder. You have friends, you have family, you have children, you have a house, you have all these other responsibilities. I’m not married, no kids yet, but there are other things there. It’s important, though, that that unlimited energy lets you go in and say, if there’s a problem, I can throw as much energy as possible at trying to solve it. That was very important during the Placed days. It was very much like, I need to find more customers for the product. What I could have done at night was say, I’m done at five, I’m not going to reach out. What I did instead was go in and write 300 emails over the night, find all these people’s email addresses, and draft them up. Then in the morning, when it was the right time, I’d start sending them out. It was kind of like, I’m going to take this block of time and actually use it for something rather than sleep, so that I can make the business grow faster. And that worked out well. But now I’d say it’s the operational experience: understanding all the mistakes that I’ve made over time, and I’ve made a lot of them, and saying, Hey, I made a lot of mistakes, let’s not replicate those mistakes. As I run into the same type of situation, how do I solve it differently from what I would have done before? And then also leaning on the successes that I’ve had, and leveraging those. That’s really been a differentiator with Read: really focusing on taking the learnings from past companies and applying them here. That’s one of the reasons why we launched so quickly. Our goal was really to get it into people’s hands, to get feedback, to figure out what that product-market fit is with the consumer, versus trying to build product-market fit in a silo and then coming out two years down the road saying, Hey, we’ve built this product. People might say, I love it, this is great, I want to buy it. Or they might be like, Oh, this is not that great. And then it’s like, Oh, we spent two years on this. Well, that didn’t matter.

Alexander Ferguson 40:46
Yeah, it’s taking that knowledge you need and getting there faster. Are there any lessons learned, where you made a mistake in the past and won’t do it again, with an easy fix that you could recommend or share with other founders?

David Shim 41:02
Yeah, I think being quick to market; we’ve already talked about that. I would also say relying on your team a little bit more. You want to let people run a little bit more. Especially as a founder, it’s hard sometimes. At Placed I was a solo founder; at Read I have co-founders. That’s been night and day, where you’re able to focus on what you’re best at, and then you’re able to trust those around you to actually execute and own it at the end of the day. That’s been really important, and it’s given me a lot more work-life balance than I had the last time.

Alexander Ferguson 41:37
So, coming back to Read. I always love it when you can use the company name in a sentence, and you’re like, oh, he said the company name, but he didn’t, but he did. That’s fun. When somebody can read the room, I think that’s what you said exactly. Was it easy to come up with the name, by the way?

David Shim 41:59
It was hard, because what it comes down to is domain names. I know some really awesome names, right? But can you actually get the domain name? Take read.com: nothing’s on that page, but it would probably cost $5 to $10 million to buy it. Would the VCs say that’s a good use of money? Probably not. So it was about figuring out, what’s the right name? What describes what you’re doing? But it doesn’t have to be perfect. There are a lot of companies whose names have nothing to do with what they do. I think with Placed, a location company, it made total sense; people got that. With Read, we’re reading the room; they get that. We tried a lot of different ideas in between too, but this one really stuck. We were able to get read.ai for a premium, but not a crazy premium. So that was the path, where it’s like, okay, we can get the domain, we like the name, it aligns with what we do, and it describes it very quickly. We say, Hey, we’re reading the room, and someone says, Okay, I sort of get what you’re doing there.

Alexander Ferguson 42:56
Yeah. Do you currently support, or near-term will you support, hybrid meetings, where there are, like, four people in a meeting room and three people online? Can you support that? And when would you support that?

David Shim 43:14
We support that today, but we’re getting better at it, and we’re putting a significant amount of resources into improving that process. One piece is things like, if there are four people in a conference room and you can’t see someone’s head, or their head comes in and out, how do you handle that scenario? When a fifth person comes in, how do you handle that? Then there’s the diarization of who’s actually speaking, because a lot of the time, when people are in a conference room, they’re looking at the other people in the room and not at the camera. So how do you pick up who’s actually talking? Those are all things that we’re solving for, and we’ve solved them in some way, but we could absolutely get better at that. And the hardware is actually getting better too. Microsoft is coming out with something, Logitech has this, Zoom has this, where they’re taking conference rooms and, if you have the right type of high-fidelity camera (you can’t do it with a really bad camera), they’re able to take everybody in the conference room, take their facial expression and their shoulders, and put them into a box. So now, instead of a conference room, you’ve got a four by four, and the people in the conference room each have their own box.

Alexander Ferguson 44:17
It’s recreating what you would normally see if everyone was online.

David Shim 44:20
Exactly. And so I think that’s a place where, right time, right place, we’re working on that solution, but there are also others working on that solution. That’s why we decided not to build a video conferencing platform. A lot of companies and startups in the space, a lot of great companies, mind you, are building video conferencing platforms to take on Zoom, to take on WebEx, to take on Teams, etc. That’s a very hard battle, because you have to take down incumbents that have half a billion users. They’re built into the meetings; licenses are set. I think there’ll be some challenges that come along with that. What we said was, we’re going to treat this more like Apple and Android, where there are platforms that fit the ecosystem today. They don’t want to build all the applications. They don’t want to build every single feature that’s out there. They want companies to come in and build. That’s why you see Zoom Apps, WebEx apps, Teams apps. And that’s where we went. We said, we don’t need to recreate video conferencing; let’s make video conferencing better by adding analytics to it.

Alexander Ferguson 45:21
When did app stores on these platforms start to come about? Has it just been the last year?

David Shim 45:26
Just the last year, yeah. And if you think about it, with the app stores, I’m surprised it took that long, because you have half a billion people across the top four platforms, reached in less than 18 months. In that period of time, it’s faster than iOS growth and Android growth combined, because you’re looking at 18 months in which it went from 30 to 40 million people using video conferencing every day to half a billion people.

Alexander Ferguson 45:50
It’s just crazy to think about those numbers, going to half a billion. And there’s obviously an opportunity here for providing additional value in these app stores. For you guys, when you look at what’s coming up, it’s quite a hurdle, I guess, in some ways, to still solve these problems. That’s why it doesn’t exist just yet, why other people haven’t done it: because it’s so nuanced. Are you worried about the competition now, or, because the challenge is so hard, are you not too worried about it?

David Shim 46:22
We’re not too worried about it. I took the same approach at Placed, where competition is good. Competition is actually good for a really new market, because that gets education out there. We can’t educate 500 million people. I’d love to say that I’ve got aspirations where I can figure out a way to educate the half a billion people that use video conferencing on the benefits of analytics, but that’s not the case. The more companies there are celebrating the value of analytics, the easier it is to then go in and ask, who has the best solution? Who has the simplest solution? And I think that’s where we will win. I think we will have the best tech, but it’s always nice to have people nudging and educating. That’s what happened at Placed: we had probably two dozen competitors in the location space, and they were educating and talking with ad agencies, corporations, marketers, etc., to say, this is the value of location. When we started, we had to explain what the value of location was, and then the value of Placed. It’s so much easier when people say, Hey, I’ve already bought into the value of better video conferencing and an analytics solution, tell me more about yours, versus, Oh, I didn’t know you could measure video calls, spend an hour and tell me about that.

Alexander Ferguson 47:32
Competition is not a bad thing. Absolutely, it helps everyone rise up. I love that mentality. So, just a fun question to end on: is there any technology that you would love to come into existence that isn’t here yet? Any futuristic tech where you’re like, man, if I could have it right now, I would?

David Shim 47:57
I think it’d be AR. I think it’s the most realistic, but it’s still two to four years out at scale. And by AR I mean AR in a glasses environment, where you’re able to look at anything out there and actually start to get information, but only when you want it. That’s the key: how do you actually bring AR to glasses where it doesn’t distract, but enhances? I think that’s the important part. It’s got to enhance the experience, not distract. We’ve all had this on the smartphone, where it’s like, Hey, you can see the real world and you see all these bubbles on it. Google has this, Apple has this, you get all this information, and it’s like, I’m never going to hold my phone up like this in a real-world scenario to find anything. It has to be something that’s very elegant and natural. I think you start with AR glasses, where you’re able to go out and just wear something very simple. And then it’s, how do you actually, from a user experience and design perspective, do those nudges? It might even be something along the lines of Neuralink at some point, which is Elon Musk’s chip in your head. I’m not saying that that’s what happens. But to be able to have some type of subtlety: when you look at something, from a facial expression standpoint, this is where technology can come in. It’s like, Oh, you’re focusing on that? Or, you look confused. Wouldn’t it be great, when you’re looking at something and you’re confused, say you’re at a museum and you’re like, what is that, if you saw a little note with the information?

Alexander Ferguson 49:15
UX powered by facial reactions and people’s sentiment.

David Shim 49:20
Exactly. But you’ve got to make it seamless, right? It has to be like a third arm that just moves naturally. Because if you have it where it’s a distraction, it doesn’t work. But like I said, if you’re looking at a car and you look at the price tag, wouldn’t you automatically want six other prices from dealers nearby? That’d be great. That would be, all day long, give it to me.

Alexander Ferguson 49:40
A salesperson pays attention to those cues: Oh, I’d better go over there and help them out. And technology is going to be able to do the same: if you have an issue or a question, here’s a potential answer. I’ve never thought about UX being powered by humans’ reactions and their responses. The future.

David Shim 50:00
So there’s a lot of fun stuff coming.

Alexander Ferguson 50:02
I love it. Thank you so much for sharing the journey that you’re on. For those that want to learn more about Read, go to read.ai. You can download it right now, it’s free, and try it out. Thank you so much, David.

David Shim 50:16
Excellent. Thanks, Alexander.

Alexander Ferguson 50:18
I will see you on the next episode of UpTech Report. Have you seen a company using AI, machine learning, or other technology to transform the way we live, work, and do business? Go to UpTechReport.com and let us know.

Written by Alexander Ferguson
