Jamstack Radio
32 MIN

Ep. #105, Real-time Data with DeVaris Brown and Ali Hamidi of Meroxa

about the episode

In episode 105 of JAMstack Radio, Brian Douglas speaks with DeVaris Brown and Ali Hamidi of Meroxa. They discuss the future of real-time data, the pain points in the connector ecosystem, developing skills through open source projects, and their personal experiences growing their team during the COVID-19 pandemic.

DeVaris Brown is the CEO and co-founder of Meroxa, a VC-backed company enabling teams of any size and level of expertise to build real-time data pipelines in minutes, not months. Prior to founding Meroxa, DeVaris was a product leader at Twitter, Heroku, VSCO, and Zendesk. When he’s not sitting in front of a computer, you can find DeVaris behind a camera capturing moments in time, at the stove whipping up the finest delicacies, or behind a set of turntables, moving a sea of people through music.

Ali Hamidi is the CTO and co-founder of Meroxa. Prior to starting Meroxa, Ali was the Lead Engineer on the Heroku Data Team at Salesforce where he worked on the core control plane that managed millions of databases for tens of thousands of customers globally. When he’s not hacking away at code or gaming with his sons, Ali is likely to be mountain biking on local trails or at the bike park, racing downhill.

transcript

Brian Douglas: Welcome to another installment of JAMstack Radio. On the line we've got DeVaris Brown and Ali Hamidi, both from Meroxa. So DeVaris, why don't you introduce yourself first and then we'll hear from Ali?

DeVaris Brown: Yeah. Hi, everybody. I'm DeVaris Brown, CEO and co-founder of Meroxa.

Ali Hamidi: Hi. Ali Hamidi, I'm the CTO and the other co-founder at Meroxa.

DeVaris: Thanks for having us, B.

Brian: Yeah, honestly I've been hearing about Meroxa so much. Y'all open sourced some stuff pretty recently as well, so that's been in my feed. I've been hearing about Meroxa from VCs as well, which we don't need to get into, how I'm talking to VCs right now. I guess we'll start with, what is Meroxa? What are y'all working on?

DeVaris: Yeah, we are a real-time data platform that allows people to solve problems with real-time data very easily.

Our whole shtick is that we believe the world will be real-time by default in the future, and people shouldn't have to care about the infrastructure; they should care about bringing customer value in real-time. And so we're building a platform that allows them to build data products faster with real-time data.

Brian: Excellent. So Ali, I'm curious, is this something that y'all are working together on? How did you get involved?

Ali: Not necessarily this sort of space. But yeah, previously both DeVaris and I were working at Heroku. I was a lead engineer on the Heroku data team, mainly focused on the Kafka and the Cassandra offerings, and so really working around large scale streaming data. Yeah, DeVaris and I were partnered fairly frequently on these customer success summits or meetings where we'd get to talk to customers and learn about issues that they were facing and the problems that they wanted us to help them solve.

And so Heroku and many other companies provide really good managed data services, so it's trivial to get a database or a Kafka cluster hosted and managed for you, and it's essentially hands-free operations.

But we kept hearing a really common theme: running the infrastructure is handled, but the data integration part is missing. What's the experience beyond that? How do we get data from PostgreSQL to Kafka, or from Kafka to Snowflake, or any of these permutations? That tooling was really the missing piece.

So we dug into the problem space a bit more, and what you find is that real-time specifically is really poorly served. The tooling sucks, the experience is pretty awful, and no one had really addressed it that well. That was the opportunity for us: "If we were to build something that really solves this problem well, what would that look like?"

Brian: Awesome. Yeah. DeVaris, you had mentioned that data is going to be real-time by default, that the infrastructure shouldn't matter, and that's what you're trying to solve for customers. Can you expand on that? What are we talking about with real-time data? Who needs real-time data? What are some examples that people should be thinking about?

DeVaris: Yeah. Everybody needs real-time data, right? Because you want to provide your customers the most contextual, relevant experience in the moment that they need it. I've talked about the weekend in Vegas before. You get there on a Friday and you leave on a Sunday, but because all of the hospitality management systems can't get hooked up, they don't have any visibility into when their most valued VIP customers are checking in or doing activities until well after they've left. And so they're losing revenue simply because their data can't get processed fast enough.

You think about people in finance looking at leading indicators and things like that. I always remember watching Billions and seeing Bobby Axelrod reading the news and then trying to make a trade based on something that he saw happen in New Zealand and all this other stuff. You need that data to help you make the most informed decision in the moment that you need it. Or think about a large fintech company.

We go on vacation, we swipe our card once and it works, and then you swipe it again and it doesn't. That's because they're updating their fraud detection models, and it takes two to three hours to do so, right? If you can look at anomalies in real-time, that's one of the things real-time data can help with: building and rebuilding those models in real-time. So there are so many of these use cases out there that can benefit from real-time, but the biggest distinction for me is to not conflate event-driven and real-time data like so many other people do.

They think that because the event happens you have to act on it in the moment, and what we're saying is, "No, you can just capture the data in real-time. You can decide what to do with it in the moment, or you can put it into a static place where you can do analysis later." I think that's where you need better tooling, better frameworks, better rails to help people get there.

That's really why we started Meroxa, to usher in that new world of experiences. There are so many things that we could be doing today where we need data in the moment. Man, I can give you a real simple example. If I'm running an eCommerce website and I want to figure out, "All right, what's the time it takes people to add something to the cart and check out?"

Well, that's a really easy thing to do when you have the granularity of this real-time event happening inside of your database, and so there's any number of solutions you can build around that, or intelligence you can glean from it, including building your own cart retention or cart abandonment software. That's something a ton of companies and vendors are doing today, but it's literally like five lines of code if you have that data available to you.
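For illustration, here's a minimal sketch of that kind of calculation over a stream of change events, written in Go. The event shape and field names are hypothetical, just to show how little logic is involved once the real-time events are available:

    package cartmetrics

    import (
        "fmt"
        "time"
    )

    // CartEvent is a hypothetical change event emitted whenever a cart changes.
    type CartEvent struct {
        CartID string
        Action string // e.g. "item_added" or "checked_out"
        At     time.Time
    }

    // TimeToCheckout remembers when each cart first had an item added and
    // reports the elapsed time once that cart checks out.
    func TimeToCheckout(events <-chan CartEvent) {
        firstAdd := make(map[string]time.Time)
        for ev := range events {
            switch ev.Action {
            case "item_added":
                if _, seen := firstAdd[ev.CartID]; !seen {
                    firstAdd[ev.CartID] = ev.At
                }
            case "checked_out":
                if start, ok := firstAdd[ev.CartID]; ok {
                    fmt.Printf("cart %s: add-to-cart to checkout took %s\n", ev.CartID, ev.At.Sub(start))
                    delete(firstAdd, ev.CartID)
                }
            }
        }
    }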

Those are the things where we just feel there's going to be a whole new set of experiences that get built because now real-time data is the default. We don't necessarily have to depend on a database tracking a thing and capturing a snapshot in time. Now we can see all the events that led up to that snapshot, and who knows what's going to happen next? That's really what we're banking on.

Brian: You started the explanation with stock trades, and I know folks who did stock trading in Chicago. You're originally from Chicago, right?

DeVaris: Yeah, man. All the Jump Trading folks and Renaissance Technologies and all those cats, yeah.

Brian: Yeah, there was a race in commodities trading where they actually ran a cable from Chicago to New York so that whoever got the results fastest had the competitive edge. Because as we saw with folks like Robinhood when they had issues last year, whoever can have the deal go through quicker is going to win, and milliseconds matter. I'm intrigued by the product and intrigued by what you're offering.

I guess my ask, or my question rather, is about customers. I have a PostgreSQL database for the product that I'm working on, there's tons of GitHub data, and my goal is to provide insights for maintainers and product owners and OSPOs to identify what's happening in their product. The challenge is that with most GitHub data services you get 24 hours of data, mainly because everything is built on the GitHub Archive. How would I approach leveraging Meroxa? I know you have a couple of different products and open source projects as well, but how would I approach it as a developer who has data?

Ali: Yeah, I guess we can start with the Meroxa platform proper. This is our commercial offering, and essentially what we focus on is giving direct access to your data, so your PostgreSQL data, through CDC (Change Data Capture). The platform makes it really, really easy to point it at your database, and we'll set up a CDC stream that will push all of the changes into the platform. From there you can deploy any kind of custom logic you want and have it triggered based on changes.

I don't know the details of your product, obviously, but I'm imagining something like: something happens on a GitHub repo, that change ends up in your database somehow, that gets filtered into the Meroxa platform, and you can write some custom logic to say, "All right, if I see one of these events, transform it in this particular way, enrich it with some additional data, maybe hit the GitHub API, get some richer information, do whatever you want really."

Then pass that down the pipeline and do something else with it. Maybe you're posting it to another web API, maybe you're writing it to another database, maybe you're writing it back to the original database. You can essentially do anything you like. But the key is that we make CDC access really, really trivial. You just tell us about your database and then we figure out how to do it.
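For anyone new to CDC, each change typically arrives as a small, self-describing record. The example below is an illustrative Debezium-style shape, not Meroxa's exact payload: the "op" field marks a create, update, or delete, and "before"/"after" carry the row's state on either side of the change.

    {
      "op": "u",
      "source": { "table": "repositories" },
      "before": { "id": 42, "stars": 120 },
      "after": { "id": 42, "stars": 121 },
      "ts_ms": 1652140800000
    }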

Then you can deploy arbitrary code and have it act on your data in real-time. That's really the core offering of what we're building. Recently we launched Turbine, which is our code-first interface for building these data apps and data products, and so that's what you'd be using. You'd write a Turbine app that says, "Whenever I see this type of thing in my data, manipulate it in this particular way, or enrich it, or do whatever you need, and then pass it down the line."
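As a rough sketch of what a code-first data app like that can look like, here's a small Go example. The types, function names, and resource identifiers are illustrative only, not the actual Turbine SDK; the point is the shape: read records from a source, apply custom logic, write to a destination.

    package app

    // Record is an illustrative representation of one change event in the pipeline.
    type Record struct {
        Payload map[string]any
    }

    // Source and Destination stand in for whatever the framework provides.
    type Source interface {
        Records(collection string) ([]Record, error)
    }

    type Destination interface {
        Write(records []Record, collection string) error
    }

    // Enrich is the custom logic: decorate every record that flows through,
    // for example with extra data pulled from the GitHub API.
    func Enrich(records []Record) []Record {
        for i := range records {
            records[i].Payload["enriched"] = true // placeholder for real enrichment
        }
        return records
    }

    // Run wires the pipeline together: source -> custom logic -> destination.
    func Run(src Source, dst Destination) error {
        records, err := src.Records("github_events")
        if err != nil {
            return err
        }
        return dst.Write(Enrich(records), "enriched_events")
    }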

Brian: Excellent, yeah. That's genius. It sounds genius to me because the infrastructure world is not one I spend a lot of time in, and I always look for tools like this that I can just pick off the shelf because, to be honest, I didn't choose the frontend developer life. It was chosen for me; at my last job I stopped writing Go code and started writing JavaScript, so I spent less and less time working on the problems you're working on. I love benefiting from solved problems, things that I can just install into my project or just point at through API endpoints. So yeah, I'm trying to say I'm sold right now.

DeVaris: I mean, that's good, man.

Brian: I want to hear more about the product though.

DeVaris: That's the whole point, right? You as a developer shouldn't have to worry about all of that infrastructure. To your point about standing on the shoulders of giants, that's what we do. The infrastructure that we use is all open source components and we just figured out the best way to stitch these things together and then we put our secret sauce of automation and optimization on top of that.

For us it's really about when you connect to something, when you start streaming it, when you transform, when you distribute, all of that type of stuff. Notice we didn't say anything about, "Man, you've got to learn about this random ingestion tool, learn the intricacies of Kafka, learn the intricacies of Spark to do stream processing, learn the intricacies of all these other tools." It's just like, "No, man. If you've got an idea, just point Meroxa at the data source and we figure out how to get data out of it."

Now you have this consistent data model for you to go interact with that you can just transform, augment, enrich, mask, whatever it is you want to do with this stream and then distribute it wherever it is that you want to. But you as a frontend engineer don't have to worry about the intricacies of the configuration and the infrastructure, and that's really what we want to do, is democratize that access.

The cool part about it is because it's all open source, we give you the ability to configure the infrastructure however it is that you need it, once you get more and more complex with your understanding.

Brian: Yeah, that's wild too, because the open source element itself is so intriguing. I'm a big open source fan and I love that I can also stand on the shoulders of giants with open source. Companies have come down from their large pedestals and gifted us open source projects to slice and dice to our will. So what's the distinction between your enterprise offering and what people can do in open source?

Ali: So in our case, our main open source offering is a tool called Conduit. That's the main thing that we're really putting our weight behind, and Conduit itself is a lower level tool for data integration. Conduit's main focus is really taking data from a source and pushing it into a destination, and doing it easily and efficiently and in a portable way. It's an alternative to tools like maybe Kafka Connect where it's much more focused on just the pure data movement part.

The Meroxa platform as a whole is an abstraction on top of that, and so the Meroxa platform uses Conduit. The benefit there is obvious: if Conduit succeeds, as it grows, as more people use it and build more connectors, then the Meroxa platform inherently benefits because now it has more connectors and more usage. Conduit is really the main open source component. Beyond that we have a number of other things that we have contributed to and that are open source as well as part of the platform itself.

Yeah, there's a number of components. But I think the really interesting part for us is Conduit itself. We made a very early decision to make it open source, from the get go we knew that it should be open source. It's even split into its own GitHub organization, we have a separate team that works on it. Really we approach that as, "Here is a legitimate open source product," and we wanted to really get the support of the community and we wanted to telegraph that, yes, we're taking this seriously, we are treating it as a proper open source project.

We have a public roadmap, we develop in the open, we accept PRs, we're really treating it like a proper open source project because I think that's what we need. In order for it to be successful it needs to be embraced by the community, and in order to do that we need to be clear about how we plan to work with the community and how we plan to support the community.

Brian: That's interesting as well, having a separate organization. I don't know if this is a good comparison, but I think of Kubernetes and GKE. There's the open source project that most people know, but if you're a large enterprise you're probably reaching for GKE pretty quickly because you don't want to be that involved, or licensing is an issue, and stuff like that. So do you feel like you're going to move into the same pattern, where you have this ecosystem of open source contributors and community, perhaps events and collaboration, even with other companies, and then you have the paid service which is Meroxa?

DeVaris: Yeah, absolutely. I think for us, Conduit really scratches our own itch. We needed it because developing on Kafka Connect was not the most desirable experience. The other piece of that is that outside of the Confluent folks there really weren't a lot of other voices in the streaming data world, and for us it was like, if Confluent didn't do it, then nobody else would. It took them, what?

Six years before somebody in the open source world gave Kafka Connect a UI. It's like, yo, with Conduit that was the first thing that we did, because we were just like, "Yeah, we're tired of troubleshooting and not knowing where topics were going," and all this other stuff. The other thing that we ended up doing is making it a single Go binary, so instead of, what is it?

The resource hog that's on the JVM, it's like 30 megabytes, and so for us... Ali can explain it better, but we needed this, we needed to develop this in order to realize our true vision for it as a platform. Then we realized, as a community... Our philosophy is like, "Look, man. We don't think everybody should be paying an arm and a leg for automated copy and paste, especially when you have to do so much to get copy and paste going, right?"

I've got to set up Debezium, Kafka, Kafka Connect, Kubernetes, Docker, all this stuff. Yeah, I can run a Docker Compose and get it all going, but that Docker Compose is looking pretty gnarly at some point with all the dependencies and all this stuff. Troubleshooting that was a pain, so think about it like, "Yo, if I'm testing this in development or in staging, and then I have to do this in production, I have two behemoth infrastructures that I have to manage and maintain."

For us it was like, "No, just go install Conduit, and now you've got a binary on your machine. We can simulate the traffic locally, it works the same in production, it's only 30 megabytes, it's written in Go, the transforms are written in JavaScript, and connectors can be written in pretty much anything because we have a plugin interface."

There are just so many different things that we do. We were very, very intentional about the experience because we needed it, and we knew from our experience that it was an itch that we needed to scratch, and we've seen a good number of people using it in the wild. Ali, did you want to talk about our motivations for building Conduit?

Ali: Yeah, for sure. Initially when we set out to build the platform we focused pretty heavily on Kafka and Kafka Connect, obviously my experience at Heroku and Salesforce enabled me to build that out. But we almost immediately realized that running Kafka Connect and Kafka at scale is not an easy thing, and Kafka Connect in particular, it has some fairly sharp edges.

We really thought about it: if we want to do this, we want it to be successful, we want it to make financial sense for us, and Kafka Connect is the pathological worst-case scenario for resource usage in a managed service. A big motivation for us was, all right, we want this to succeed and we want it to be financially feasible, so what do we do to fix this? And so we set about looking at different technologies, and fundamentally what made sense for us was to build it in something that's a little bit more, or significantly more, resource efficient.

Then along the way we collected this laundry list of issues that we'd faced running Kafka and Kafka Connect at scale, and so we set out to address them all, basically. Having a single binary, we actually relaxed the constraint on Kafka. At a minimum, if you want to write a Kafka Connect connector, then you need Kafka Connect, Kafka itself, and ZooKeeper, so that's three distributed systems just to test a connector.

That doesn't even include the thing that you're connecting to. So if you want to do PostgreSQL, that's like four things that you're deploying. In our case we relaxed that constraint, so you just have Conduit and you can use an in-memory buffer to connect your source and destination. Super useful for testing and development, it's pretty straightforward.

Then when you want to write connectors, with Kafka Connect you're limited to basically writing them in Java, which is great if you love Java, but if you don't then you're kind of stuck. So in our case we made the decision to make that a little bit more flexible, building on top of HashiCorp's Go plugin mechanism, so essentially you can write a plugin in any language, and as long as you implement that interface you can connect it.

A side effect of that, which is pretty awesome for us, is that we have a Kafka Connect connector wrapper. Essentially, if you already use Kafka Connect you can bring those connectors with you, drop them into the wrapper, and use them with Conduit. And so we have a nice migration path: now you can remove the constraint of Kafka, use Conduit, benefit from all the other changes and improvements that we've made to Conduit, but protect your investment in the existing connector ecosystem. So yeah, we basically set out to address all of these pain points for us.
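To make the plugin idea concrete, here's a minimal sketch in Go of what a source connector behind that kind of interface could look like. The interface and types below are illustrative, not the actual Conduit connector SDK; they just show that a source only has to know how to open, read, and tear down.

    package connector

    import (
        "context"
        "io"
    )

    // Record is an illustrative record produced by a source connector.
    type Record struct {
        Key     []byte
        Payload []byte
    }

    // Source is an illustrative plugin interface: anything that can open a
    // connection, read records, and tear itself down can act as a source.
    type Source interface {
        Open(ctx context.Context, config map[string]string) error
        Read(ctx context.Context) (Record, error)
        Teardown(ctx context.Context) error
    }

    // sliceSource is a toy source that emits records from an in-memory slice,
    // standing in for a real system like PostgreSQL or an HTTP API.
    type sliceSource struct {
        events []string
        pos    int
    }

    func (s *sliceSource) Open(ctx context.Context, config map[string]string) error {
        s.events = []string{"first event", "second event"}
        return nil
    }

    func (s *sliceSource) Read(ctx context.Context) (Record, error) {
        if s.pos >= len(s.events) {
            return Record{}, io.EOF // no more data in this toy example
        }
        rec := Record{Payload: []byte(s.events[s.pos])}
        s.pos++
        return rec, nil
    }

    func (s *sliceSource) Teardown(ctx context.Context) error { return nil }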

Brian: Excellent, yeah. I love it, I love the story, I love what y'all are working on. I also totally understand why so many people have reached out to me and said I need to talk to the Meroxa team about what I'm working on. I can get into it on the picks, to be honest. But yeah, I guess now my question is y'all have been around for a couple of years at Meroxa, right?

DeVaris: Yeah, man. Almost three years, it's crazy.

Brian: That's amazing. So you got started right before the pandemic then?

DeVaris: Literally right before.

Brian: Okay. So how has growing the team been during the past couple years?

DeVaris: Knock on wood, the pandemic hasn't really affected us because we've been remote first. We set out to build an extremely diverse team from jump, and so for us it's like, "Yo, we'll just go find the dopest engineer wherever they may be." I used to say, "Yo, we want the illest dude in Iowa," until we found the illest dude in Iowa from a Twitter DM. That's the type of stuff that we wanted to do, really democratize access to opportunity, and I think that's what we've done.

We have one of the most skilled, diverse teams out there. You look at this call, I don't know how many founders who look like us you've had on this show as technical founders, but expand that out to the rest of the team: we're 60% Black and brown, we're over 50% women, the exec team is over 80% underrepresented. We're building a great culture, we're building a great product, and we're building a great business, and doing so in a way that embraces the community and makes people feel welcome. I think that's something that has given us the ability to do the things that we've done in the pandemic.

Brian: Yeah, that's awesome to hear and that's extremely impressive as well, having such a diverse team. No knock on other founders and companies, but the assumption is that diverse teams usually don't come with such a deep technical background, and you're bringing that too, so that's pretty amazing.

DeVaris: People don't realize where me and you met, B Dougie. Me and you spoke on a Lesbians Who Tech panel together.

Brian: That is true, years ago. Yeah.

DeVaris: Years ago. Because we just understood the power of creating these safe spaces. I think for me and Ali, the biggest thing that we can do is show that, yeah, you can be a founder of color and be technical. You can do all of this and be excellent. We're doing something that I don't think even the big companies are able to do, and it's a combination of all of our experiences. Really, that's the thing that separates us from everybody else.

Ali doesn't like to thump his chest much, but the Heroku Kafka offering was doing billions of events a minute across tens of thousands of Kafka clusters for thousands of customers, five-nines for years, and at its height the team was six people. To do that you need to be on your stuff, right? You need to know how to do this, and so that's really the thing where it's like, all right.

Then you have me on the other side as the developer experience guy, and if you look at all the things that we've done together, between us and our teams we've really figured out the way that developers should be writing software for years. You can't go to anybody without them saying, "Yo, where's the Heroku for X?"

It's really the ethos that we've ingrained in our team. I know that it's tough, and there's some random ML engineer at some big company that's like, "Man, I want to go start a startup and do something," but open source is a great way to get started, build your expertise up and then take those ideas and try to commercialize them at some point.

Brian: Excellent. Yeah, I mean if y'all are listening and you haven't already signed up to use this product or signed up to apply for an open role, y'all need to get in on this right now.

DeVaris: Appreciate that.

Brian: I'm excited. I'm about to apply for one of your jobs right now.

DeVaris: All right, man. All right.

Brian: Excellent. Yeah, so I did want to round up the conversation on Meroxa now and transition us to picks. I do appreciate y'all sharing the vision and the product, and I'm confident folks are probably reaching out and starting to install this stuff and check out the open source repos. But Jam Picks, these are things that we're jamming on. It could be music, food, tech related, it's all relevant. My first pick is PostHog. I haven't really used Google Analytics for a minute, and I know there have been issues with GDPR and how you handle that with Google Analytics, and I think mostly that's just not possible.

So PostHog, I don't need analytics to be quite honest, what I want to know is what onboarding looks like for people to use my platform. What happens after you've logged in? As you know, starting up a company and running this for a couple of years, you've got to know what the conversion looks like, so how many people, how many monthly active users, daily active users.

That was all stuff I did six, seven years ago when I set up those platforms, but Mixpanel was the go-to at that time, I think. But PostHog is open source, it's a YC-backed company as well, and they're providing all that stuff, but you also get to export and walk away with your data. That's what really had me interested in this, and the reason for that is the product I'm working on. Everybody listening has heard that I've been working on OpenSauced for a while.

At the time you're listening to this, I'll be working on OpenSauced full time, stepping away from my role at GitHub in the next couple of weeks. So when you listen to this on your iPod... I don't know if people have iPods. iPhone.

DeVaris: iPod? I was about to say, man. Bring out your Motorola Razr.

Brian: Honestly, the way the kids are dressing today, people are bringing out the old school stuff. I saw a Microsoft Zune not too long ago on Twitter.

DeVaris: That's true, that's true.

Brian: Yeah, so I've got this product, I'm partnering with the founder of GitSense who's also been a guest on this show. We're going to co-found OpenSauced together, and if you check out OpenSauced.GitSense.com, that is our data. We've been basically indexing GitHub projects and are going to provide insights for folks. That's part of the reason why people keep reaching out to me to talk to you, because it sounds like we could probably leverage some of your tooling.

DeVaris: That's pretty dope, man. Yeah, I'd like to share some things I've picked recently, man. I don't know if you saw that Kendrick Lamar video that came out yesterday.

Brian: No, I didn't. I did not see that.

DeVaris: Yeah, he has a new album coming out May 13th, this Friday. It's called Mr. Morale & The Big Steppers. That is something that I am very, very interested in. I don't really have a lot of time, so my time is mostly dedicated to the Conduit ecosystem, and these days my pick is definitely Conduit and writing connectors. That's all we do, man, is just try to find ways to write connectors for things. Check it out, Conduit.io. C-O-N-D-U-I-T.IO. Yeah, man. That's really what it is, trying to help grow this thing.

Brian: Excellent, yeah. I'm definitely checking that out after we jump off this call. Ali, you got picks for us?

Ali: Yeah, one of the things that I've been playing around with recently is Litestream. I think it made the front page on Hacker News. It's an interesting idea: for anyone who's not familiar with it, it basically builds on top of SQLite and turns it into a reliable data store with replication to object storage. I just think it's refreshing to see a 180-degree turn away from bigger and more powerful, more elaborate, more distributed systems. It takes it back; SQLite is about as simple as you can get in terms of lightweight databases.

It seems like an interesting approach to take something so simple and then try to make it reliable and gear it up towards more enterprise usage. I just think it's a cool approach, so yeah, I've been playing around with that, that's super, super fun. Maybe more of a general category, but our development environment is pretty elaborate and it's a beast to deal with because a developer basically has to stand up two Kubernetes clusters and a bunch of other systems. So we've been looking into these tools where you have hosted developer environments; we've been working with a YC startup called Nimbus. I think their website is UseNimbus.com, but there are a bunch of other ones. What was the one that we spoke to, DeVaris? I'm drawing a blank on the name.

DeVaris: Okteto.

Ali: Yeah, Okteto. Yeah, these products are amazing. I think especially now, as applications are getting much more elaborate and you're going deep on microservices and all these other dependencies, that developer experience is beginning to suck pretty bad. So these tools kind of rein it in and make it sane again. Obviously we pitch ourselves as a developer experience company, that's what we're trying to deliver, a better developer experience, so yeah, I like these tools and I think it's interesting to see this push towards bringing sanity back to developer experience.

DeVaris: I forgot the other thing that I was checking out too, man. Charm. Charm.SH.

Brian: Oh yeah, I've seen that.

DeVaris: Yeah, shout out to Toby and Christian and the fete guys, right? There's been this renaissance on the CLI that I think is really, really dope, for two reasons. One, because I learned on MS-DOS back in the day how to build things, just trying to run these full-featured apps and having to know commands to do so. Even my mom knew MS-DOS commands.

The terminal was a thing and then it just went away, and now it's starting to come back. I think as developers we spend most of our time in the terminal, so yo, it's pretty cool to see people actually building tools for us, building these better experiences. Yeah, I forgot about that until Ali mentioned the Okteto guys, and I'm like, "Oh man, yeah. Charm and Fig." Yeah, I dig those.

Brian: Yeah, I do like the terminal tools and these services to bring... Heroku had a very good developer experience where everybody knew the patterns of how to deploy and manage your deployment even in production, and I like that now we're bringing that to things like Kafka and all these other services that I just haven't had time to learn because, again, I've been so far removed and working in JavaScript that when I need to go reach for something it's like, "Ah, I have no idea what I'm doing."

DeVaris: You aren't alone, man. You aren't alone. That's why we're here, we're here to help.

Brian: Yeah, there's definitely a pattern. Excellent, yeah. So appreciate the help and talking through this, and listeners, keep spreading the jam.