We now live in a data-driven society, and that’s probably a good thing. Data is the ultimate gut-check, the final arbiter of our best guesses. But interacting with data can be exceedingly complicated—and often slow.
Kendall Clark, the founder and CEO of Stardog, has developed a faster, more efficient technology that allows you to query data directly, without any copying or migrating. He calls it “data fabric,” and claims it’s the only solution of its kind.
The applications are wide-ranging, and the technology is in use at eBay, NASA, Bosch, and Cisco.
More information: https://www.stardog.com/
TRANSCRIPTION
DISCLAIMER: Below is an AI generated transcript. There could be a few typos but it should be at least 90% accurate. Watch video or listen to the podcast for the full experience!
Kendall Clark 0:00
You know, if you’re too quick to market, and you try to, you know, sort of take your shot before the market’s ready, that’s in some ways worse. It’s more heartbreaking than being too late to the market.
Alexander Ferguson 0:15
Welcome to UpTech Report. This is our applied tech series. UpTech Report is sponsored by TeraLeap. Learn how to leverage the power of video at Teraleap.io. Today, I’m excited to be joined by my guest, Kendall Clark, who’s based in the DC area, specifically Arlington, Virginia. He’s the CEO and founder at Stardog. Welcome. Good to have you on, Kendall.
Kendall Clark 0:34
Hi, Alex. Thanks for having me. I’m excited to chat today.
Alexander Ferguson 0:23
Now, Stardog is an Enterprise Data Fabric platform. On your site, you actually state that Data Fabric is the future of data integration. Help me understand, Kendall: what was the problem you initially saw and set out to solve with Stardog?
Kendall Clark 0:51
Sure. So we like to say Stardog, the Data Fabric platform, is the only platform that lets large enterprises query and connect to their data, no matter where it is, on prem or in the cloud, without moving or copying that data first. And that leads, you know, in some sense to the value proposition, which is to lower time to insight and lower time to value by being data driven and making data-driven decisions. And so the underlying problem that we’re trying to address in the platform is the idea of data silos. The sort of natural, native state of enterprise data is to be very disconnected and fragmented. And that’s an impediment to analytics and AI and machine learning. In fact, it’s a big impediment to digital transformation in large enterprises. And so the focus of our platform is to connect that data in a very agile, flexible, and timely manner, so that businesses can use their data to drive profit and efficiency savings, and to be more competitive.
Alexander Ferguson 1:57
One of the things we talked about earlier is a bigger concept here: the quantity of data being created every day is only increasing. I think the number you gave me before is like 30 zettabytes every year, 2.5 quintillion bytes per day, and it’s only going to grow. And so the performance across networks isn’t going to keep up with that, and how we integrate and use the data, or get access to it, has to change. Help me understand: is that correct? Am I on the right point here?
Kendall Clark 2:27
Yeah, that’s exactly right. From a slightly technical point of view, the underlying idea behind data integration technology, really all of it, and this has been a pretty stable idea for the last 50 years, is to physically move or copy data between different systems in order to integrate that data. That’s what we do in a data warehouse that aggregates lots of data and then provides analytic services on that data. Consider something like Snowflake: you move your data into Snowflake, physically, and then Snowflake, as a query answering system, acts upon the data in Snowflake. But as you say, there’s just too much data; data volumes are nowhere near any kind of limit of growth. And network performance is relevant here, because when you’re moving and copying data, you’re of course doing it over networks. So those curves, a data volume curve that’s ever growing and a flat network performance curve, don’t really go together. So our approach is to address this problem by removing the requirement that the data be copied or moved in order to be queried. We play the data where it lays, as it were.
Alexander Ferguson 3:41
So the solution up to this point has been: oh, we have data over here, let’s just copy it and bring it over here, and now we need it over there, so keep duplicating. But the point is, you don’t want any migrations, no ripping, no replacing, no copying. And the term you use is “fabric.” Is that a new term? I don’t think anyone else is using that term.
Kendall Clark 3:58
Well, it’s not unique to us. And when you said earlier that Data Fabric is the future of data management, that’s actually the title of a Gartner analyst piece that came out in December. So we’re agreeing with him, of course.
Alexander Ferguson 4:11
Because you’re seeing the trend, you’re like, yes, I agree.
Kendall Clark 4:14
That’s exactly right. When people say nice things about you, the general thing to do is to agree with them, of course. But it’s not a term that’s unique to us. It’s a term that’s been used at various times in the history of data management to mean different things. What it really means now, and the focus in the marketplace now, is Data Fabric as a means or a technique of doing data integration. And again, the key point here for us is to do that data integration in the new way, without, as you said, requiring the moving or copying of data. And the point really to be made here for the audience to understand is that it’s not as if moving and copying data doesn’t work. It does work. But it’s slow, it’s inefficient, and it also tends to cause other problems in the enterprise. At some point people throw up their hands. Everybody who’s worked in a big company has said, which one of these many copies of this data source should we trust? And then you get the notion of reference data and gold data and master data. And that’s because we’ve made so many copies, we’ve sort of confused ourselves at the enterprise level about what’s the good stuff.
Alexander Ferguson 5:15
For a CIO or Chief Digital Officer at an enterprise, I mean, this is not an uncommon problem or challenge that they’ve been facing; it’s just data spread across everywhere. Moving forward, being able to use your type of solution, help me understand how it works, the underlying technology, and what makes it a better solution than what they’ve been using.
Kendall Clark 5:34
Sure. So it’s a great question. I talked about querying and connecting the data. Querying for us means querying and searching the data; they both happen in the Stardog platform. The underlying idea is to have a different kind of data model; we use what’s called a semantic graph data model. A graph data model is different from the traditional relational data model in that it’s much simpler, and it’s much more powerful for representing every kind of data. In fact, when you think about it, when people naturally go to the whiteboard in an IT context and do a little data management, a little data modeling, they naturally draw circles to represent things, and they draw lines between the circles to represent connections between those things. What they’re doing, the intuitive, natural thing, is drawing a graph data model of some problem that they’re trying to solve. So Stardog uses that same technique directly to model the data. And then we add to that this mapping capability, which is really a way to say that when a query comes to Stardog from some application or some user, we’re going to turn some remote data into the graph and query it as if it were always inside of Stardog. But we’re actually just going to do that at query time, just at the last second, so that we don’t have to move or copy the data until we actually need it, and we only move enough data to answer a particular question. So it’s a completely different approach from physically bulk moving or copying all of the data into a place and querying it there. It’s in fact, in some ways, the exact opposite approach.
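The query-time mapping described here can be illustrated with a minimal Python sketch. It is not Stardog’s implementation or API: the `CUSTOMERS` table, `fetch_customers`, `map_customer`, and `query_names_in_region` are all hypothetical names invented for illustration, and an in-memory list stands in for a remote system. The sketch only shows the general idea: rows are turned into graph edges (triples) when a query arrives, and only the rows the query needs are fetched.

```python
from typing import Dict, Iterator, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object): one edge in the graph

# Hypothetical remote source; in practice this would be a database or API elsewhere.
CUSTOMERS: List[Dict[str, object]] = [
    {"id": 1, "name": "Acme", "region": "EU"},
    {"id": 2, "name": "Globex", "region": "US"},
]

# Records what was actually requested from the source, to show nothing is bulk-copied.
fetch_log: List[str] = []


def fetch_customers(region: str) -> List[Dict[str, object]]:
    """Stand-in for a remote query; the region filter is pushed down to the source."""
    fetch_log.append(f"customers where region={region}")
    return [row for row in CUSTOMERS if row["region"] == region]


def map_customer(row: Dict[str, object]) -> Iterator[Triple]:
    """The mapping: turn one remote row into graph edges at query time."""
    node = f"customer:{row['id']}"
    yield (node, "name", str(row["name"]))
    yield (node, "region", str(row["region"]))


def query_names_in_region(region: str) -> List[str]:
    """Answer a graph query by fetching and mapping only the rows that are needed."""
    triples = [t for row in fetch_customers(region) for t in map_customer(row)]
    return sorted(obj for (_, pred, obj) in triples if pred == "name")


print(query_names_in_region("EU"))
```

In this sketch the graph never exists ahead of time; it is built for one question and discarded, which is the contrast with bulk copying that the answer above draws.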
Alexander Ferguson 7:05
Do you have any simple stories or analogies to help us see this in action?
Kendall Clark 7:11
That’s a great question. You know, when you’re bringing a new capability to market, that’s the secret sauce: getting that picture in people’s heads that everybody can understand. The way I like to talk about it is that Stardog is a Data Fabric platform, and that primarily means that we answer queries, just like a database is a system that answers queries using some data, right? But traditional or conventional databases combine the idea of query answering with storing the data. They do both of those things: they store the data, and they answer queries with the data that they’ve stored. So the mental picture for us is that Stardog is like a database without any storage, right? It’s just the query answering part. So it’s able to reach out and query other systems where the data really lives, again, without moving or copying it. In essence, it acts almost like a database without storage. And for people who are in the IT business, I think that may be a new idea, but it’s relatively straightforward. We’re querying data that we’re not storing.
Alexander Ferguson 8:16
It’s a place that effectively knows where everything is, but it’s not storing it itself. It just has the knowledge.
Kendall Clark 8:22
That’s right. That’s exactly right. So in this graph I talked about, with the circles representing entities and lines between the circles representing relationships between those entities, we extend that model with some circles and lines that represent almost like promissory notes. Remember Wimpy saying, I will gladly pay you Tuesday for a hamburger today, right? So effectively, the system says, at some point in the future, when you need this data to answer this query, you will be able to go get it in this other system. And all that happens automatically.
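One way to picture the “promissory note” idea is as a lazily evaluated placeholder in the graph. The Python sketch below is an illustration under assumptions, not a description of Stardog’s internals: the `PromissoryNote` class, the `load_balance` function, and the `graph` dictionary are hypothetical, and caching the value after the first fetch is a choice made only for this sketch.

```python
from typing import Any, Callable, Dict, Tuple


class PromissoryNote:
    """A placeholder in the graph: it holds a way to get the data, not the data."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader
        self._value: Any = None
        self._redeemed = False
        self.fetch_count = 0  # how many times the other system was contacted

    def redeem(self) -> Any:
        """Fetch the value from the other system the first time it is needed."""
        if not self._redeemed:
            self._value = self._loader()
            self._redeemed = True
            self.fetch_count += 1
        return self._value


def load_balance() -> int:
    """Hypothetical stand-in for a call to the system where the data really lives."""
    return 250


# The edge (account:42) --balance--> ? exists in the graph, but its value is a promise.
graph: Dict[Tuple[str, str], PromissoryNote] = {
    ("account:42", "balance"): PromissoryNote(load_balance),
}

note = graph[("account:42", "balance")]
print(note.fetch_count)  # nothing has been fetched yet
print(note.redeem())     # the fetch happens here, when a query needs the value
```

Until `redeem` is called, the graph knows only where the value can be found, which matches the “knows where everything is, but is not storing it” description above.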
Alexander Ferguson 8:52
So your whole solution basically underlies the other apps, solutions, or analytics that the actual enterprise builds on top of it.
Kendall Clark 9:03
That’s exactly right. That’s the other point. I like the fabric metaphor, the Data Fabric metaphor. I’ll give you a silly analogy: you’re throwing a dinner party for friends, and you pull out your table extension, because you want to set up a buffet maybe. And the table underneath is maybe a little messed up, chipped, old, or whatever. You put a nice tablecloth over it, right? And suddenly it’s this homogenous thing; you can’t see what’s underneath it. That’s the idea. We want to put the data fabric over the data sources. And then, as you say, above that, on top of the fabric, you get applications, reporting, machine learning, AI, what we would call the business application layer.
Alexander Ferguson 9:45
Now, how are you able to keep that updated? What if something moves or changes? How does the fabric adapt?
Kendall Clark 9:53
Yeah, that’s a great question. And there’s an analogy I like here as well that helps you understand this. When you’re trying to manage forestry, when you’re trying to manage a forest fire, we cut these things called fire breaks, right? We cut pathways through a forest, so that the forest itself is not one continuous flammable thing. We cut these channels through it, and that contains the fire to particular sections, and then you can fight the fire at the fire breaks. So we think about Data Fabric as, in some ways, a kind of fire break for the enterprise. Underneath the Data Fabric, that’s where you do change management. As you say, changes still occur. But now, in what technical folks will call a loosely coupled architecture, the business layer (reporting and AI and compliance and the application layer) isn’t directly connected to the layer underneath that changes. And so that lets you manage those changes in a way that’s opaque or hidden to the application layer. And it’s that opaqueness or hiddenness that, again, is the point of the fabric: you can’t really see what’s underneath it, but you can sit on top of it, right? You can rely on it. So you’re right, things still change. But one of the benefits of a Data Fabric approach at the enterprise level is that you’re able to manage change in a more rational, coherent way. It’s just easier, because now, instead of having to take account of those changes in many places, you can maybe take account of those changes in only one place, in the fabric itself.
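The loosely coupled architecture described here can be sketched in a few lines of Python. Everything in the sketch is hypothetical (the `Fabric` class, the two source versions, and the column names are invented for illustration); it only shows how a change in a source can be absorbed in one place, the mapping, while the application code stays untouched.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def source_v1() -> List[Row]:
    """Hypothetical source before a change: the column is called 'cust_nm'."""
    return [{"cust_nm": "Acme"}]


def source_v2() -> List[Row]:
    """The same source after a change: the column was renamed to 'customer_name'."""
    return [{"customer_name": "Acme"}]


class Fabric:
    """The single place that knows how a source is laid out."""

    def __init__(self, fetch: Callable[[], List[Row]], name_field: str) -> None:
        self._fetch = fetch
        self._name_field = name_field

    def customer_names(self) -> List[str]:
        return [row[self._name_field] for row in self._fetch()]


def report(fabric: Fabric) -> str:
    """Application layer: depends only on the fabric's interface, never on the source."""
    return ", ".join(fabric.customer_names())


before = report(Fabric(source_v1, "cust_nm"))
after = report(Fabric(source_v2, "customer_name"))
print(before == after)  # the report is unaffected by the rename underneath
```

Only the arguments passed to `Fabric` change when the source changes; `report` is never edited, which is the “take account of those changes in only one place” point.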
Alexander Ferguson 11:31
Again, let me understand what they are doing right now. What would a CIO or Chief Digital Officer be using or not using? Is this a major shift, or is it just a small incremental move forward that makes a big impact?
Kendall Clark 11:45
Yeah, it’s actually both. At the underlying level of how we think about it, the architecture, it is a big shift; it’s a different way to manage data. But the point of the fabric is that we, as you said, don’t require any rip and replace. Actually, all of our big customers, and we’re really focused on the enterprise, on large organizations, incrementally adopt and implement the Data Fabric one use case at a time, which is really just to say a few data silos at a time. So the approach here is to be incrementally implementable, and to sort of slide the fabric in between the storage and the application layer slowly over time. Because really, the only realistic way to halt and turn directions in a big ship is slowly, one step at a time. These enterprises, you know, they’re large and powerful; they’re not necessarily agile.
Alexander Ferguson 12:42
So it’s not something where they have to suddenly shut everything down until they make this giant shift; it can be a slow change into using it.
Kendall Clark 12:50
That’s exactly right. And in fact, some of the interest that we’re starting to see in the market recently is around using the data fabric to enable cloud migrations. So you’re looking at all of your data assets, you know, thousands of data assets at the enterprise level; you’re the CIO. And you have maybe 60% that are on prem and 40% that are already in the cloud. And you’re identifying, in an ongoing fashion, operational efficiencies that are available by shifting more data assets from on prem to the cloud. Now, that process itself is a change process that has to be managed, and you have to do that without having any outage or downtime or business interruption, right? So one of the things we’ve been talking with the market about recently is using the data fabric as a kind of, for lack of a better word, pontoon bridge, a temporary structure to get you over the migration, right? Because again, it gives this kind of decoupling effect. And it’s that decoupling effect that lets the change underneath be managed rationally, without having so much interruption, or ideally any interruption, to the business above.
Alexander Ferguson 13:57
Now, your platform is specifically for enterprise, just because of the scale and complexity of the data. Obviously, if they’re smaller, they could just figure it out and do it themselves. But are there any particular industries or segments that you’re focused on within enterprise, or just overall?
Kendall Clark 14:12
Yeah, absolutely. I think as the market matures, there’ll be opportunities in the SMB market, the small and medium business market. I’m struck by a statistic I read recently that 85% of all businesses, regardless of size, have data assets in more than one cloud. So this is a problem for smaller businesses too. But as you say, our focus is on the large enterprises. And within that focus, we definitely have particular concentration in financial services, where we’re doing a lot of work with banks on operational risk management. Second, the pharmaceutical and life sciences industry, where our customers are using the Data Fabric platform for drug discovery. And then third, supply chain or manufacturing, where we’re focused on digital twin and supply chain management use cases. So there’s definitely focus. But what each of those cases shares is an underlying business challenge where the data that answers some critical or strategic business problem exists in many different places, and those places are disconnected. So the thing that all of those have in common, from our point of view, is again this idea of a connected enterprise using the data fabric.
Alexander Ferguson 15:22
If you’re looking at the space overall and where it’s headed, you already gave a few mentions of it. What do you see coming up that you think other CIOs and chief digital officers should be aware of in this space?
Kendall Clark 15:36
Ah, well, so there’s definitely a lot of attention on this topic, I think, on this idea of taking a new approach to data management strategy. One thing that’s happened recently that’s very interesting is that people had sort of taken the view that data management was kind of a solved problem. I’ve mentioned Snowflake before. And look, it makes sense; it was a quiet space for a long time. But then you see Snowflake come along and be just a crushing financial and commercial success, innovating in a space that we all more or less assumed was quiet and sort of finished. You know, data warehouses were what they were, right? And then they come along and move all that to the cloud and see huge benefit for their customers and shareholders by doing that. So I think one of the things that I’m excited about is this growing awareness, largely, I think, fueled by the pandemic, that data management strategy is strategic to the enterprise, not just to its IT function; it’s strategic and central to the idea of digital data transformation. And everybody understands that that’s key to being competitive for large companies. The key to them not being disrupted by small startups is to really monetize and drive efficiencies and innovation with their data. And to do that, you really need to take a fresh look at data management strategy. As a data management practitioner, I’m excited by that turn of events, right?
Alexander Ferguson 17:05
For you, this wasn’t a case of, hey, let’s build this whole thing in a year. You’ve been working on this for a while. What can you share of this experience that you’ve been on, and maybe even some of the difficulties you’ve had over the years that you’ve been able to overcome in building Stardog?
Kendall Clark 17:23
Yeah, so I’ll say two things super quickly. The first thing, from my perspective as a person, is that I like to say patience is the most underrated virtue of an entrepreneur. Because the stereotype is that everybody’s got to be like Elon Musk, very hard charging, and it has to be done tomorrow. And in some situations, that’s the right posture, right? In fact, I think the Data Fabric opportunity is such that I’m starting to adopt more of that posture. But if you’re too quick to market, and you try to sort of take your shot before the market’s ready, that’s in some ways worse. It’s more heartbreaking than being too late to the market, in some ways. So I do think patience is a hard virtue to learn, but it’s an important lesson for other startup folks who may be listening to this. And the other thing I’ll say quickly is that we have to be strategically nimble and keep our minds open. My co-founders and I have been working on the underlying technology for some time, but that doesn’t mean we’ve gotten everything right. This is an infrastructure play, and infrastructure plays are typically headless; it’s a back-office capability, so there’s not necessarily a user experience. But we had to acknowledge a couple of years ago that we needed to build the user experience layer in the platform. The world changed, and even headless technologies now really require a user experience, so that business unit users can really interact with the platform and feel some sense of ownership. And so that was a lesson. It was kind of a tough lesson to learn; I didn’t want to build that part of it. But at some point, you have to realize, look, there’s a fine line between grittiness and stubbornness, and you need to make sure that you’re on the right side of it.
Alexander Ferguson 19:04
I’m seeing this growing trend of software built on top of software built on top of software. It’s like a whole other kind of revolution, kind of like industrialization, when everyone started to specialize. Do you see that also as a trend?
Kendall Clark 19:20
Yeah, absolutely. I mean, if we just remember the lessons we all learned in Macroeconomics 101 in college, right: specialization, and then hyper-specialization. This is the engine of modern economic growth. This is how our society works. And it’s definitely happening in the IT space, and even in the data management space, right? So for instance, we’ve seen in the last 10 or 15 years a real sort of Cambrian explosion of different types of databases. This was called the NoSQL movement, which, remember, wasn’t “no to SQL”; it was “not only SQL.” Other data types need data management systems that are specialized or customized. So now we have time series databases, we have graph databases. We have a lot of different kinds of databases, where 15 years ago we really only had SQL relational databases. So yeah, we see specialization all up and down the economy, I think, particularly in IT. I think that’s a good thing. It’s a sign of progress. It’s a mark of civilization. So we’re excited to, you know, play our part in that.
Alexander Ferguson 20:20
Looking ahead from here, on your own roadmap, what can you share of where you see the company in the next two, three, five years?
Kendall Clark 20:29
Oh, sure. So from the point of view of product, we’re really focused, as I said before, on user experience: making the Data Fabric literally visible, being able to interact with it, explore it, explore inside the data. As I mentioned, one of our key use cases, a focus this year, is drug discovery. Drug discovery is literally the scientific method at work. As we’re all very sensitive to these days, mRNA vaccine therapies are changing the world and giving us, you know, society back. And the ability for our customers, pharmaceutical researchers, to sort of get in and inhabit and roam around and know the data, in that sort of unstructured browsing way that leads to insight and those eureka moments, that’s a big focus for us. And then, going forward, we’re really paying a lot of attention to and investing heavily in automation in the back end. So, underlying techniques around knowledge graph embeddings and machine learning, to make the implementation of the Data Fabric just easier and faster.
Alexander Ferguson 21:35
One thing we can count on is things will stay the same: technology will continue to develop and grow and in some ways get more complex, hence the specialization. I’m just curious, for you personally, what kind of upcoming technology are you most excited about?
Kendall Clark 21:48
Ah, well, not to be a homer, I’m super excited about data management strategy. But I’ll give you something a little bit outside my space that I think is adjacent to us. There’s a bunch of really interesting groundbreaking work recently, some of it coming out of IBM, some of it coming out of the research community, around notions of homomorphic encryption and differential privacy. These are algorithmic techniques that allow companies like ours, or cloud-based data management companies, to basically possess and operate on the data of our customers while that data at all times is still encrypted. So this is a big step forward in making the cloud a safe environment even for regulated industries or very sensitive data. And so, while it’s kind of a very nerdy thing that only math geeks would be interested in, I’m excited. I won’t claim to understand it all, of course, but I’m really excited about the potential that it has to sort of flatten out the differences between the cloud and other computing environments. I think that’s going to be a good, exciting trend.
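To make “operating on data while it is still encrypted” concrete, here is a toy Python sketch of the Paillier cryptosystem, a classical scheme that is additively homomorphic: multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts. This is an illustration only. It is not the IBM or research work mentioned above, it supports addition rather than arbitrary computation, and the tiny primes (61 and 53) make it completely insecure. The function names are chosen for this sketch, and the modular inverse via `pow(x, -1, n)` requires Python 3.8 or later.

```python
import math
import random
from typing import Tuple


def keygen(p: int, q: int) -> Tuple[int, Tuple[int, int]]:
    """Build a toy Paillier key pair from two primes (far too small to be secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse of lam mod n (valid because g = n + 1)
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt m (0 <= m < n) with generator g = n + 1 and a random blinding factor r."""
    n_sq = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(n: int, private_key: Tuple[int, int], c: int) -> int:
    """Recover the plaintext using the private values (lam, mu)."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Add two plaintexts by multiplying their ciphertexts; no decryption is needed."""
    return (c1 * c2) % (n * n)


n, private_key = keygen(61, 53)
c_sum = add_encrypted(n, encrypt(n, 20), encrypt(n, 22))
print(decrypt(n, private_key, c_sum))  # 42
```

The party calling `add_encrypted` needs only the public value `n`, so in this toy setting a service could total encrypted numbers without ever being able to read them.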
Alexander Ferguson 22:52
Well, Kendall, I really appreciate both the time and understanding more about this future of Data Fabric and the role that it can play in enterprise and eventually SMB. For those who want to learn more, they can go over to Stardog.com. What’s a good first step they can take?
Kendall Clark 23:08
There are Stardog training videos, there are lots of white papers, there’s a very lovely product features overview. All sorts of fun stuff over there. If you’re in this business and this is intriguing, it’s awesome to be able to dig into.
Alexander Ferguson 23:22
Thank you again, Kendall, good having you on the series. It’s great.
Kendall Clark 23:26
Thanks, Alex. I appreciate it.
Alexander Ferguson 23:27
We’ll see you all on the next episode of UpTech Report. Have you seen a company using AI, machine learning, or other technology to transform the way we live, work, and do business? Go to UpTechReport.com and let us know.