Clearing the Smoke from the Cloud | Nitzan Shapira at Epsagon

Cloud computing for many companies has become more than a trend—it’s a way of life. And yet, the logs and metrics used for these distributed systems are often headaches. Engineers can take hours searching through logs and troubleshooting issues that should only take minutes to resolve.

This was the opinion of Nitzan Shapira when he co-founded Epsagon, which gives teams automated instrumentation and tracing for containers, Kubernetes, and virtual machines.

On this edition of UpTech Report, Nitzan explains this technology and tells us how it’s helping people easily see the complexities of their systems from first line of code to deployment.

More information:

Nitzan is the CEO and a co-founder of Epsagon. He is a software engineer with 15 years of experience in software development, management, and cybersecurity from the Israeli Intelligence unit. He also enjoys playing the piano and is a traveling enthusiast, an experienced chess player, and is addicted to sports.

Epsagon enables teams to instantly visualize, understand and optimize their microservice architectures. With our comprehensive lightweight auto-instrumentation, gaps in data and manual work associated with other APM solutions are eliminated, providing significant reductions in issue detection, root cause analysis, and resolution times. Increase development velocity and reduce application downtime with Epsagon.

DISCLAIMER: Below is an AI-generated transcript. There could be a few typos but it should be at least 90% accurate. Watch the video or listen to the podcast for the full experience!

Nitzan Shapira 0:00
I want my solution to be open source, I want it to be not tied to any vendor, but people forgot that vendors are actually helping your business by saving a lot of time on things you don’t have to do.

Alexander Ferguson 0:20
Nitzan, I’m excited to be able to chat with you today and hear more about Epsagon. To begin, can you share in five seconds, very briefly, what is Epsagon?

Nitzan Shapira 0:33
Sure, Epsagon is a SaaS platform helping developers and DevOps teams to monitor and troubleshoot their applications in production, in cloud environments and microservices technologies. So that’s the type of applications our customers are running.

Alexander Ferguson 0:54
Where did this begin for you? What problem did you initially see? Were you in DevOps yourself? And then you saw this problem? How did it begin?

Nitzan Shapira 1:02
So both me and my co-founder were in the engineering and technology space for about 15 years or so. A lot of our experience is from the Israeli intelligence, where we did a lot of work in cybersecurity, embedded systems, reverse engineering, like very low-level techie stuff. And as R&D managers there and cybersecurity managers, we developed very complex systems, and we had very limited tools to actually go ahead and figure out what’s going on there and troubleshoot issues and all that stuff. So we definitely know the problem firsthand, not necessarily in cloud environments or stuff like Kubernetes; the technologies we used there were a bit less advanced. But, you know, we definitely know the pain and the value of having the right solution. So when we started the company, we looked into cloud infrastructure as a huge market with a lot of technology challenges. And we spoke with probably 50 different companies, from small startups to large enterprises. And it was very clear that those people using technologies like containers, Kubernetes, and serverless are still using the same logs and metrics to figure out what’s going on. And these are not the right tools for a distributed system. So they basically said over and over again, you know, that it just takes them too much time to troubleshoot a production problem. And they don’t understand what’s going on there. And they want to do it in minutes, but it can take them even hours. So the impact on the business and on the developer velocity is very large. That’s why it’s such an important problem.

Alexander Ferguson 2:52
Would you say that, because there are now so many different components and segments and elements that need to be integrated, you really do need some sort of system to be able to manage and troubleshoot at this point? Did I capture that correctly, or no?

Nitzan Shapira 3:10
Yeah, the new environment requires the right solutions. Like, you can’t keep using the same dashboards and logs when you have 50 or 500, or even 5000 components in your system. It’s not the same as having one or two services, which is, you know, what people used to have.

Alexander Ferguson 3:33
What would you say is a real differentiator for your platform, where someone would say, “Wow, okay, I see where I would want to use it”? Maybe give me a good use case.

Nitzan Shapira 3:44
Yeah, so a use case can be a company that’s now launching their new digital platform, for example, an e-commerce website. It can be a retail company or any company with a web application. So they decided to go into the cloud, for example, AWS, and use microservices, Kubernetes, containers, and so on. So Epsagon is there for them from the beginning, when they just develop the code, to the moment they deploy, and especially in production, when the code is running and, you know, generating revenue. And the reason they would like to use it is the alternatives. One option is not to change anything and keep using the same tools. But we can very easily show that the troubleshooting time, or the MTTR, mean time to resolution, goes up extremely fast when you don’t have the right solution, as well as the developer velocity going down. So you really need to change something. And the other alternative is to build something in-house, which is, for example, coding, using something like OpenTracing, to go into your code and really build this distributed tracing solution yourself. But this can take weeks, months, or even more, unless you’re a company that knows exactly what they need. And those companies are Netflix, Uber, Airbnb; those companies are building a lot in-house. But for most companies, it doesn’t make sense to do that, and it’s better to get an automated solution out of the box that will just work. So Epsagon is exactly that. You know, you plug it in in a few minutes, and you’re up and running with very little incremental cost and maintenance cost.
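
For readers unfamiliar with the metric mentioned above, MTTR (mean time to resolution) is simply the average time between an incident being detected and being resolved. The following is a minimal Python sketch of that arithmetic; the function name and the sample incident timestamps are invented for illustration and do not come from Epsagon or from the interview.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average of (resolved - detected) over (detected, resolved) pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),  # start value so sum() adds timedeltas, not ints
    )
    return total / len(incidents)


# Hypothetical incidents (detected, resolved); illustrative data only.
incidents = [
    (datetime(2020, 1, 6, 9, 0), datetime(2020, 1, 6, 9, 20)),    # 20 minutes
    (datetime(2020, 1, 8, 14, 0), datetime(2020, 1, 8, 16, 0)),   # 120 minutes
    (datetime(2020, 1, 9, 11, 0), datetime(2020, 1, 9, 11, 40)),  # 40 minutes
]

mttr = mean_time_to_resolution(incidents)
print(mttr)  # 1:00:00
```

A single slow incident (the two-hour one here) pulls the average up sharply, which is one way to read the point that MTTR climbs quickly as troubleshooting gets harder.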

Alexander Ferguson 5:31
What feature are you most excited about that you guys have recently released?

Nitzan Shapira 5:37
I’m excited about the ability to generate dashboards in Epsagon based on tracing data, which is something we released a few weeks ago. So of course, in every monitoring product, you can create dashboards of some metrics. But Epsagon collects not just metric data, but actually what we call trace data, which is actually payloads of different APIs, requests, responses, and so on. So that means you can have Epsagon plugged into your system in a matter of five minutes, for example. And then what you’re going to get is really high-context data about your system, for example, JSON messages of HTTP request and response information for different API calls, like Stripe or Auth0. So you can now have all this data in Epsagon with very little effort, and then create a dashboard that shows you, for example, how much money is going through your Stripe API calls. So that is something very unique, because the alternative is really a very manual tool that you would send all this data to. But having someone who can easily create both performance-related dashboards and business-related dashboards, that can be pretty revolutionary, actually.
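
To make the idea of a business dashboard built from trace payloads concrete, here is a small Python sketch of the kind of aggregation described above: summing the amounts seen in successful Stripe calls. The trace record shape, field names, and sample values are all assumptions invented for this illustration; this is not Epsagon’s actual trace format or API.

```python
import json
from collections import defaultdict

# Hypothetical trace records, one JSON document per captured API call.
# The fields ("service", "status", "request", ...) are invented for this sketch.
# Amounts are assumed to be in the currency's smallest unit (e.g. cents).
RAW_TRACES = [
    '{"service": "stripe", "operation": "create_charge", "status": 200, "request": {"amount": 2500, "currency": "usd"}}',
    '{"service": "stripe", "operation": "create_charge", "status": 200, "request": {"amount": 1000, "currency": "usd"}}',
    '{"service": "stripe", "operation": "create_charge", "status": 402, "request": {"amount": 4000, "currency": "usd"}}',
    '{"service": "auth0", "operation": "login", "status": 200, "request": {}}',
    '{"service": "stripe", "operation": "create_charge", "status": 200, "request": {"amount": 700, "currency": "eur"}}',
]


def charged_amount_by_currency(raw_traces):
    """Sum request amounts of successful Stripe calls, grouped by currency."""
    totals = defaultdict(int)
    for line in raw_traces:
        trace = json.loads(line)
        # Skip other services and failed calls (non-200 status).
        if trace.get("service") != "stripe" or trace.get("status") != 200:
            continue
        request = trace.get("request", {})
        totals[request["currency"]] += request["amount"]
    return dict(totals)


print(charged_amount_by_currency(RAW_TRACES))  # {'usd': 3500, 'eur': 700}
```

The point of the sketch is that the business number comes straight from request payloads that were already captured for troubleshooting, so no separate reporting pipeline has to be written.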

Alexander Ferguson 7:03
What tip or knowledge would you want to share with DevOps teams out there, something they should be thinking about when it comes to this space and how they’re managing their different services?

Nitzan Shapira 7:19
I think there is some kind of hype today about open source solutions and this kind of thing. Many people are talking about it: yeah, I want my solution to be open source, I want it to be not tied to any vendor. But people forgot that vendors are actually helping your business by saving a lot of time on things you don’t have to do. And while we see all the companies talk about, you know, those open source options like OpenTracing, we are actually using it under the hood ourselves. But what we see is that people are talking about it, but very rarely do they go ahead and do a large-scale project and deployment using OpenTracing across the board. It’s just too much work, too much implementation and maintenance. So instead of, you know, getting hammered with all the recent buzzwords, it’s better to think about the business and what’s really a good ROI. And with a monitoring tool, it’s actually very easy to show ROI. Because, you know, every problem has a cost. Every time a developer troubleshoots an issue, it has a cost; their time has a price. So usually, it’s very easy to justify buying a solution versus building it, just like you’re not going to build your own finance or HR software, right? There is no reason that you will spend time on things that are not core to your business.

Alexander Ferguson 8:56
Time is a finite resource that you can’t get back, and you need to be very cautious of where you’re spending it. So it sounds like it’s nice for DevOps teams to have a tool like this, to make sure they’re not wasting their time on something they don’t need.

Nitzan Shapira 9:08
Yeah, definitely.

Alexander Ferguson 9:11
All right, what’s your business model? Is it a monthly or yearly subscription? How does it work?

Nitzan Shapira 9:18
Yeah, we do both. Typically, in the larger contracts, it’s going to be yearly. But we do have a self-serve option through the website that everyone can just go into and use. Starting from like 100 bucks a month, it gets you all the way to where you can actually run, you know, production-scale applications. However, you know, our enterprise accounts, or the big growth accounts, typically need much more than that. So that’s why we have another, you know, enterprise tier, which will also provide different levels of support, custom implementations, scale, and so on, and more flexible pricing too.

Alexander Ferguson 9:56
Got it. Where can people go to learn more, and what can they do as a first step?

Nitzan Shapira 10:03
Of course, the website. We try to have it as detailed as possible, with all the information about the product. We also have a live demo environment that you can play with on the website. We have a demo video available on YouTube, and many webinars and talks that we did. Some customer case studies are actually a really good way to understand the value of the tool. We have some great names like the loi, Vonage, many big companies, as well as startups. And then you can just, you know, sign up and try it yourself. You get two weeks of free trial at scale, which you can use. So that’s what many of our customers did initially.

Alexander Ferguson 10:45
That concludes the audio version of this episode. To see the original and more, visit our UpTech Report YouTube channel. If you know a tech company we should interview, you can nominate them at Or if you just prefer to listen, make sure you’re subscribed to this series on Apple Podcasts, Spotify, or your favorite podcasting app.


