Building A Business On Building Data Driven Businesses - Episode 246

Summary

In order for an organization to be data driven they need easy access to their data and a simple way of sharing it. Arik Fraimovich built Redash as a way to address that need by connecting to any data source and building attractive dashboards on top of them. In this episode he shares the origin story of the project, his experiences running a business based on open source, and the challenges of working with data effectively.

Do you want to try out some of the tools and applications that you heard about on Podcast.__init__? Do you have a side project that you want to share with the world? Check out Linode at linode.com/podcastinit or use the code podcastinit2020 and get a $20 credit to try out their fast and reliable Linux virtual servers. They’ve got lightning fast networking and SSD servers with plenty of power and storage to run whatever you want to experiment on.



Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Arik Fraimovich about Redash, an open source business intelligence platform that helps you make sense of your data.

Interview

  • Introductions

  • How did you get introduced to Python?

  • Can you start by describing what Redash is and its origin story?

    • What are the primary ways that it is used?

    • The business intelligence market is quite mature and has many commercial and open source projects to choose from. What are the aspects of Redash that have allowed you to be successful?

    • What would you consider to be your closest competitors?

  • What was your background with data before starting on Redash?

    • What are some of the most notable lessons that you have learned about business intelligence since starting the project?
    • How has the landscape for business intelligence and data analysis changed since you began the project?
  • Beyond just accessing data, Redash focuses on enabling visualization of the results. What types of visualizations do you support and how do you support users in choosing the most effective ways to represent the information?

  • What are some of the common challenges that your users and customers encounter when communicating with data?

  • One of the critical aspects of enabling data access in an organization is the ability to collaborate on asking and answering questions. How do you approach that challenge in Redash?

  • How is Redash implemented and how has the overall design and architecture evolved since you first started working on it?

    • How do you manage the complexity of supporting so many different data sources?
    • If you were to start over today, what would you do differently?
  • Beyond the code of Redash, you also have a business around providing it as a hosted service. What are some of the most interesting, challenging, or unexpected lessons that you have learned in the process of building and growing that service?

  • How do you approach the direction and governance of the open source project and balance that against the wants and needs of the community?

  • What are some of the most interesting, innovative, or unexpected ways that you have seen Redash used?

  • When is Redash the wrong platform to use?

  • What do you have planned for the future of the Redash business and project?

Keep In Touch

Picks

Closing Announcements

  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Raw Transcript

Tobias Macey
0:00:13
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 gigabit private networking, scalable shared block storage, node balancers, and a 40 gigabit public network, all controlled by a brand new API, you've got everything you need to scale up. For your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. They also have a new object storage service to make storing data for your apps even easier. Go to pythonpodcast.com/linode, that's L-I-N-O-D-E, today to get a $20 credit and launch a new server in under a minute, and don't forget to thank them for their continued support of this show. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers, you don't want to miss out on this year's conference season. We have partnered with organizations such as O'Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference, the Strata Data Conference, and PyCon US. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host, as usual, is Tobias Macey, and today I'm interviewing Arik Fraimovich about Redash, an open source business intelligence platform that helps you make sense of your data. So Arik, can you start by introducing yourself?
Arik Fraimovich
0:01:50
Yeah, hi, I'm Arik. I created Redash. I'm a software engineer who happened to become the CEO of a small SaaS company. I still try to code, so I guess I'm still more of a software engineer than a CEO.
Tobias Macey
0:02:03
And do you remember how you first got introduced to Python?
Arik Fraimovich
0:02:05
Yes. So I've been thinking about that question. And I realized that was when Google App Engine was released. So I guess that's when I picked up Python. So that's 2008.
Tobias Macey
0:02:15
Can you start by describing a bit about what the product is and some of its origin story and why you started building it in the first place?
Arik Fraimovich
0:02:23
Sure. So Redash is basically what you can call a BI tool. In simple terms, it's a web interface that you can use to connect to your databases, or actually data sources, because we support more than just databases: Google Spreadsheets, JIRA, Salesforce, or any JSON API. Once you're connected, you can query those sources and visualize the results in different forms, like charts, maps, or whatever, or just a plain table is fine as well. Then you group that in a dashboard and share it within your team, company, or organization. Redash was actually born as a hackathon project a bit over six years ago at the previous company I was working at, EverythingMe. We were just starting to use Redshift, and we needed a tool to share the data from Redshift. We didn't find anything at the time that would work well with Redshift, and we had a hackathon. So at that hackathon, I created the first iteration of Redash, and that's how it started.
Tobias Macey
0:03:22
That's funny that it started as just sort of an accident. And now it's become your main source of revenue.
Arik Fraimovich
0:03:28
Yeah, I mean, it's not like I have any BI background or anything. I basically stumbled into it, and I happen to really like the field. I really enjoy the product, and I really enjoy seeing that people use it for actual stuff. It's not like a game; it's driving their business. So it's really fun.
Tobias Macey
0:03:47
And so in the context of the hackathon, how long did you have to work on it, and what did the end result look like by the time that you were done? And what was your decision point where you thought that, okay, this is something that I can actually make a business out of, or something that I can keep hacking on? What was the story beyond just that point, and how did it lead you to where you are now?
Arik Fraimovich
0:04:12
The hackathon proved that, yeah, we can build something that will be useful for us. Then I kept working on it as a sort of 20% project at the company. About, I think, two months after the hackathon, we open sourced it, and that's when we started seeing adoption from other companies. It was slow at first, but then we saw more and more adoption. Around two years into the project, EverythingMe was shut down. It was a startup that didn't find a business model. At that time, I saw enough adoption of Redash by real companies that used it for their daily work that I figured there was a good opportunity here. I wanted to make sure that Redash has a sustainable future, and that's how I decided to start a company.
Tobias Macey
0:05:00
So in terms of the product itself, what are some of the primary ways that it's used? Given that you're able to use it for gaining access to all these different data sources and build dashboards and visualizations around them?
Arik Fraimovich
0:05:13
Yeah, so [unintelligible]. The main use cases, I guess, are BI and analytics. By analytics, I mean both usage analytics of different products and operational analytics, like understanding the business. From my personal experience, what we use Redash for internally at Redash is analyzing our revenues. We basically have our charges table, and we put Redash on top and build different dashboards that show our revenues and churn and stuff like that. Because it connects to so many data sources, the usage is very diverse. I guess the main usage is by business users, but there are some people using it for operations purposes, just to visualize their infrastructure. I think that's not a great use of Redash; there are better tools for that. But it has the benefit that you have all your sources in one place.
Tobias Macey
0:06:15
And in terms of the business intelligence market, there are a number of products that have been available for a while, both commercial and open source, and they've gone through their own evolutions over the years. But I'm curious how you view Redash in the context of the business intelligence market in particular, and what are the elements of the Redash product specifically that have allowed it to be successful, given the maturity of the market and the number of competitors that there are?
Arik Fraimovich
0:06:40
Yeah, there are so many solutions in the market, but I think that what helped Redash is a combination of two things. One is the fact that it's very easy to start with. You can either use our SaaS, and that takes a few minutes, or you can use one of the cloud images we provide, and that also takes a few minutes to start. Once you have Redash ready, you can connect it to your database, and you can have dashboards ready to share with your team in an hour or even less. Sometimes more; it really depends on the kind of questions you're trying to answer. So the fact that it's so easy to use is one aspect that helped. The other thing is the fact that we support so many data sources, and especially that we were early to support things like Redshift, BigQuery, and Amazon Athena, which didn't have much support from more traditional tools. That really helped with adoption, especially with BigQuery. I'm not sure if it's still the case, but the developer evangelist for BigQuery used to use Redash for his demos. So that definitely helped with people getting to know the project.
Tobias Macey
0:07:47
Yeah. In terms of the competitors that I view as being closest to Redash, there are things like Metabase and Apache Superset, which are good projects in their own right, but you're right that Redash definitely has the edge in terms of the number of data sources that are supported. That's actually a big reason why I started using it in the first place, particularly for things like Elasticsearch that weren't supported by a number of other projects. So that's definitely one thing that continues to be the case. I'm curious what your approach was to enabling connections to so many different data sources, as far as how the plugin interface was defined, or what's involved in adding data sources to Redash that has allowed you to maintain that velocity as you continually add new data sources with each release.
Arik Fraimovich
0:08:33
I think it's based on the fact that to add a new data source in Redash, there is a very, very simple API you need to implement. You basically need to implement how to execute the query, and then you can add support for getting the schema of the database that you're querying, but that's optional. So it's basically two things that you need to implement. It's quite simple most of the time. Sometimes, with things like Elasticsearch, it's actually not that simple, because we work with a table concept of results, so you need to massage the data set that comes back if it's nested and stuff like that. But usually it's not really that complex. And I think this model, plus the fact that we don't have our own query language, really helps. Whatever you use Redash with, that's the query language you will use. That lets us add almost any data source very fast and very easily. I guess that's the thing.
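The two-method shape described above (a required "execute the query" hook, an optional "get the schema" hook, and results forced into a table) can be sketched roughly as follows. This is an illustrative sketch only, not Redash's actual code: the names `QueryRunner`, `SQLiteRunner`, `run_query`, `get_schema`, and `flatten`, as well as the `{"columns": ..., "rows": ...}` result shape, are assumptions made for the example, and SQLite stands in for a real data source.

```python
import sqlite3
from abc import ABC, abstractmethod


class QueryRunner(ABC):
    """Hypothetical plugin interface: one required method, one optional."""

    @abstractmethod
    def run_query(self, query):
        """Execute `query` in the source's native language.

        Returns a table-shaped dict: {"columns": [...], "rows": [{...}, ...]}.
        """

    def get_schema(self):
        """Optional hook: describe tables and columns. Default is no schema."""
        return []


class SQLiteRunner(QueryRunner):
    """Example data source backed by SQLite (chosen only for illustration)."""

    def __init__(self, path=":memory:"):
        self.connection = sqlite3.connect(path)

    def run_query(self, query):
        cursor = self.connection.execute(query)
        # cursor.description is None for statements that return no rows.
        columns = [d[0] for d in (cursor.description or [])]
        rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
        return {"columns": columns, "rows": rows}

    def get_schema(self):
        tables = self.run_query(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )["rows"]
        schema = []
        for table in tables:
            # Table names come from sqlite_master here, not from user input.
            info = self.connection.execute(f'PRAGMA table_info("{table["name"]}")')
            schema.append(
                {"name": table["name"], "columns": [row[1] for row in info]}
            )
        return schema


def flatten(record, prefix=""):
    """Flatten a nested document into one table row with dotted column names."""
    row = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, prefix=f"{name}."))
        else:
            row[name] = value
    return row
```

A new source would subclass `QueryRunner` and implement only `run_query`; `flatten` illustrates the extra massaging a document store with nested results would need before its output fits the table shape.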
Tobias Macey
0:09:29
Yeah, the fact that you're using the native syntax is definitely beneficial, as anybody who's familiar with that data source can pick up Redash and start being effective without having to learn the specific peculiarities of whatever interface is being exposed in the other tools. A lot of older business intelligence suites will actually use more of a drag and drop editor for defining queries and answering questions. So I think the fact that you're relying on just the native interface helps, particularly in this day and age where people are more likely to want to use the code interface than anything else, at least from a developer perspective.
Arik Fraimovich
0:10:07
Yeah, exactly. I mean, the trigger for me to look into developing Redash was trying to use Tableau. Tableau has this really rich interface and all the drag and drop. But when we connected it with Redshift, it was so, so slow, and the queries it was generating were not performant. It was a really bad experience. I mean, it looked all shiny and nice, but when you tried to create something, it was a really terrible experience. So yeah, just having a SQL box that I can type my query into was super helpful.
Tobias Macey
0:10:39
And you mentioned that you didn't really have much of a background in business intelligence before you got started on this project. So I'm curious, what are some of the most notable lessons that you've learned about that landscape and about the market since you first started the project, and what are some of the ways that learning has been reflected in the product?
Arik Fraimovich
0:11:00
I can't say I've learned that much about BI. I'm just trying to build the product based on our customers' feedback. But something that I did realize at some point is that all the big tools promise self-service BI, basically the idea that the business users can ask questions themselves, but in practice, almost nobody really delivers it. Looker is quite close to that, but it requires a huge effort to set up, and if the organization doesn't invest that effort, they won't get self-service. So I decided early on that we're not going to promise self-service BI. We are not going to have a drag and drop interface. We're just going to let you use your database. It's probably going to change at some point. I have a certain roadmap in my head, and I think that at some point we will be ready to introduce a concept of drag and drop, but I'm pretty sure we will do it differently from how it's usually done. Usually, they try to stitch a drag and drop interface onto a really bad schema, and that doesn't allow you to ask the interesting questions. That's when people go back to writing SQL. So that's why we want to give a really great SQL experience and then add on top of that.
Tobias Macey
0:12:17
And in terms of the overall market itself, what are some of the shifts that you have seen as somebody who's working within that market and the types of things that businesses are asking for from their Business Intelligence Suite and the types of data that they're dealing with?
Arik Fraimovich
0:12:33
I think that there are three things I've noticed. One is the fact that organizations have more data, and that makes tools that work with extracts, like Tableau or Power BI, a bit obsolete. Now we have really powerful data warehouses, or even data lakes with Athena or Spark and stuff like that, and really you want the raw power of these tools and not something put on top of them. So that's one trend that really helped us: the fact that people have these massive data sets that they want to analyze using their databases. The other one is the rise of open source. There have been some open source solutions in the past, but I don't think that any of them was really open source in the sense of having a big community and stuff like that. That's something quite new. You mentioned Metabase and Superset; they're probably the biggest examples aside from Redash. None of us is at the size of the commercial solutions yet, but that will change. I think that in the future we will see the open source solutions becoming as massive as the commercial solutions, if not more, and then interesting things will happen. Another thing that I'm noticing is that more companies want to expose data to their customers, basically embedded analytics use cases. We see more and more people ask about that.
Tobias Macey
0:14:03
And then beyond just the different data sources that are supported, the other core element of Redash is actually visualizing the results of the queries that are being executed. You can just use the tabular results and parse through that visually, but the primary way that I've used it, and that I've seen others use it, is by actually creating these visualizations and combining them into a dashboard to try and tell an overall story of the data, so that it's accessible to those business users who don't necessarily want to dive into the SQL queries or whatever the particular syntax is. I'm curious, how do you support users in choosing effective visualizations for the answers that they're trying to provide and the context that they're trying to create for those different dashboards?
Arik Fraimovich
0:14:47
So unfortunately, I can't say that we are doing a great job in helping the user choose an effective way to show their data. We have a few experiments, things like our funnel visualization, where you provide us with funnel data and we show it in a meaningful way. It goes a bit beyond just letting you define the visualization; it has some presets of its own. But aside from that, with most of our visualizations we just try to give you the options you might need to create effective visualizations, and we don't really help you with that. I recently picked up a few books on the topic, actually, by Stephen Few, and started exploring more of what it really means to create meaningful visualizations, things that not just look good but actually communicate the data better. I hope that we will start providing better guidance in this area, but it will take time.
Tobias Macey
0:15:45
And in terms of the challenges in communicating with those visualizations, what do you see as being some of the common stumbling blocks that your users and customers encounter when they are trying to build these dashboards and use Redash as a communication tool?
Arik Fraimovich
0:16:01
One thing that is not unique to Redash is basically knowing what question to ask and what data to look at. On top of that, there is also sometimes the challenge of really knowing your database schema and the different issues that you might have in your data. Sometimes you might create a query that shows the data you want, but if you dig into it, you realize that you have something that you need to clean up or exclude, like, I don't know, test users or whatever. Another challenge that is actually a bit unique to Redash is the fact that you need to know the query language of your database. Sometimes it might seem that SQL doesn't let you answer the question you want, but actually SQL is quite powerful, so there is usually a solution. It might be tricky, but I guess that's the kind of challenge you sometimes see.
Tobias Macey
0:16:52
And then the other element of communicating with data in the context of Redash is the ability for people who are familiar with the query language, and also people who are using the output of that, to collaborate on asking and answering the different questions. I'm curious what your approach is to that challenge in Redash, or what you have seen as being effective patterns for people who are leveraging Redash to try to fulfill those goals.
Arik Fraimovich
0:17:20
Yeah, that's a great one. I mean, collaboration around data is something that's really been on my mind since I started with Redash. It's actually reflected in our logo: it's a speech bubble and a sort of chart, which is supposed to communicate collaboration and data. Obviously, we still have a way to go here. But something that really helps is the fact that you have the query itself next to the data that you're looking at. That allows others who look at what you're doing to really understand how you got the results that you got, and then, if they want to explore it further, they can fork your query and dig further. We are going to expand the collaboration features we have. It's probably going to start with allowing collaboration on the same query, and that's basically very easy to enable; it's just changing some logic. But I think that to make it effective, we need to make sure that we have good versioning of the queries or the other objects that you can edit in Redash, and then commenting and stuff like that. But it takes time, and we have lots of things to do.
Tobias Macey
0:18:28
And so digging deeper into the platform itself, can you describe how Redash is implemented, and some of the overall design and architecture changes that have happened since you first began working on it?
Arik Fraimovich
0:18:39
Yeah, sure. So basically, something that I try to follow is to use boring technologies. That's both to make it easy for the people who deploy Redash for their own usage to maintain it and support it, so we try to pick things that they will probably be familiar with, and so that if people want to contribute to our code base, they will probably find technologies and patterns they're very familiar with. We basically use Python for the backend, which uses Flask and SQLAlchemy, and then Redis and Postgres to support that. We were also using Celery, but we are actually switching to RQ now. Practically, it's not really an architecture change; it's just swapping a library. So our architecture has been quite stable for the past, I think, three years at least. At the beginning, I experimented with various stuff. It used things like Tornado, and actually Django's ORM at some point. But in the past three or so years, there were no big changes in terms of the kind of tools we use. Another big part of the code base is our front end, and there we started with AngularJS six years ago. About a year ago, we started the transition from Angular to React. We were trying to decide between React and Vue.js, but we picked React mainly because there was a really nice way to keep working with a dual, or hybrid, code base where we have both AngularJS code and React code side by side. That's what we've been releasing for the past year. Basically, with every release, there was more code that was in React. And now we are really, really close to the finish line. Practically all that's left is to switch the router, and we are Angular free. So that's nice.
Tobias Macey
0:20:35
And because of the fact that Redash was born out of a hackathon, it seems somewhat obvious that Python would be the language that you chose for implementing it as a way of just being able to get something done quickly. But I'm curious, if you were to start over today, whether there are any design elements or foundational pieces of the code that you would do differently, either choosing different languages or different frameworks or just an overall different system architecture.
Arik Fraimovich
0:21:00
That's an interesting one. I might choose to use Node for the backend, just so that we have a single-language codebase. You sometimes start putting semicolons in your Python code when you switch too much. And Node has the benefit of being asynchronous, which is really helpful when what you're doing most of the time is I/O. Now, obviously, Python has some async support today, but it's very different when you have some libraries that support asynchronous code and some that don't, whereas in Node it's all asynchronous. But I really like Python, so I'm not sure I would really do something different in that sense today.
Tobias Macey
0:21:37
I think that Redash has also benefited from being in Python because of the fact that there are so many different libraries to support the various data sources that you're working with, and I think you'd be hard pressed to find that same level of support in other ecosystems, though Node might be a close competitor in that regard.
Arik Fraimovich
0:21:53
Yeah, that's for sure. Although we have, I think, a bit over 40 types of data sources supported today, something like 10 of them are the most used, and the rest is a long tail. So if that was the only reason, I'm not sure it's a big reason to choose Python over Node. You can always start another process and just delegate to it, and it can be written in any language. But I don't know, I find the libraries and the tools nicer on the Python side.
Tobias Macey
0:22:27
And then in terms of the system design of how it's implemented and how the queries are executed. I'm curious what you have seen as far as challenges, particularly when it comes to people trying to execute queries that are returning large volumes of data and being able to represent that back on the front end or being able to handle the data in the query execution.
Arik Fraimovich
0:22:49
Yeah, so that's something that we don't really handle well, and it's not intentional. In hindsight, I might have implemented it differently, and it would probably have been easier to implement it differently from the get go than to try to redo it now. We definitely don't support large data sets well. But considering the use cases that Redash is used for, it's not a big issue, because most of the time the data you're going to visualize is not going to be big. There is no point in having lots of data points when you create a visualization; people don't see the [unintelligible]. And if a person is going to look at the results, they're also not going to review lots of results. People do want larger data sets usually when they're trying to connect Redash with some other system, like using Redash as an API, or when they want to download the data set and crunch it in Excel. So yeah, we don't really support large result sets that well. You can basically just give Redash more memory, and then it will be fine. But the main issue is the fact that we load everything into memory, then convert it to JSON, and only then dump it into our local cache. I mean, that's not super great.
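To make the memory concern above concrete, the sketch below contrasts the "load everything, then convert to JSON" pattern with one possible alternative that serializes rows in batches. It is an illustration under assumptions, not how Redash is implemented or how it plans to change: `dump_all_at_once`, `stream_rows`, and `batch_size` are hypothetical names, and a DB-API cursor (here from `sqlite3`) stands in for any data source.

```python
import json
import sqlite3


def dump_all_at_once(cursor):
    """Materialize every row in memory, then serialize the whole result."""
    columns = [d[0] for d in cursor.description]
    rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
    return json.dumps({"columns": columns, "rows": rows})


def stream_rows(cursor, out, batch_size=1000):
    """Write the same JSON document incrementally, one batch at a time.

    Only `batch_size` rows are held in memory at once; `out` is any object
    with a write() method (a file, a socket wrapper, an io.StringIO).
    """
    columns = [d[0] for d in cursor.description]
    out.write('{"columns": ' + json.dumps(columns) + ', "rows": [')
    first = True
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            if not first:
                out.write(", ")
            out.write(json.dumps(dict(zip(columns, row))))
            first = False
    out.write("]}")
```

Both functions produce equivalent JSON; the difference is peak memory, which grows with the full result size in the first case and with `batch_size` in the second.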
Tobias Macey
0:24:04
Yeah. But as you said, the use case that Redash is designed for isn't really one where you want to be processing large volumes of data, because you want your queries to be structured in a way that they're actually going to condense the information down into something that's digestible by somebody who's trying to gain some insight from that information, rather than just saying, here is all of the data for this query of, you know, 10 million rows.
Arik Fraimovich
0:24:28
Yeah, although one thing that's happened over time is this: for a long time, my message was to use your database to crunch your data and use Redash to visualize it. Basically, we don't store your data; we just visualize it. But I'm not sure when exactly it was, a few years back, we introduced the query results feature, basically the ability to run queries on top of other query results. The idea here was to allow different use cases where you might want to join data from different data sources, or where it's a bit easier to run another computation on top of existing query results, or whatever. But people are very creative, and people tend to abuse the tools you give them. Obviously, you need to be mindful of that. And it's good when people abuse your product. It means that it brings them value, and you just need to look at what they do and try to give them a better solution. Basically, once we gave them this feature, people started using Redash, sometimes, as a form of database, and then they wanted the ability to load larger result sets into Redash. That's something that I've been looking into recently, trying to figure out what we can do better there.
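One way to picture "queries on top of other query results" is to load each earlier result into a temporary in-memory SQL table and run a new query across them. The sketch below does that with SQLite; it is an assumption-laden illustration, not a description of Redash's internals. The function names `load_result` and `query_over_results`, and the `{"columns": ..., "rows": ...}` result shape, are hypothetical.

```python
import sqlite3


def load_result(connection, table_name, result):
    """Create a table from an earlier query result and copy its rows in."""
    columns = result["columns"]
    column_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    connection.execute(f'CREATE TABLE "{table_name}" ({column_sql})')
    connection.executemany(
        f'INSERT INTO "{table_name}" VALUES ({placeholders})',
        [tuple(row[c] for c in columns) for row in result["rows"]],
    )


def query_over_results(results, sql):
    """Run `sql` against earlier results, each exposed as a table by name."""
    connection = sqlite3.connect(":memory:")
    try:
        for table_name, result in results.items():
            load_result(connection, table_name, result)
        cursor = connection.execute(sql)
        columns = [d[0] for d in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        connection.close()
```

For example, a `charges` result from a billing database and a `users` result from an application database could be joined with ordinary SQL even though the two sources cannot be joined directly. Because every row is copied into memory, the approach also shows why larger result sets become a problem once people start treating the tool as a database.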
Tobias Macey
0:25:46
Yeah, it's definitely interesting the ways that people will work around the sharp edges of a tool and make it do things that it was never actually intended to do, just because it's the tool that they have, rather than looking outside for a tool that's better suited to the particular problem, because it's only 5% of their use case and the other 95% is filled by the tool that they have. Beyond just the open source code base of Redash, there's the business that you've built around it. And so I'm curious what you have seen as the benefits of having a hosted solution in terms of the adoption of the product, and how you balance the needs of the business against the desires and needs of the open source community that is using and contributing to Redash.
Arik Fraimovich
0:26:33
Yeah, so having a hosted solution really helps, because when you have an open source product, it's really hard to know what people use it for and in what ways. Having a hosted solution gives us a way to look into how people use it, what kinds of organizations use it, and what kinds of data sources they use. Obviously, it's not a perfect representation of how the general population uses Redash, but it definitely gives you some idea of what's more common and what's less. You need to be mindful, because, for example, we support IBM DB2, and that's less commonly used in a cloud environment, I guess, so you won't see it on the SaaS, but there might be people who use it with the open source version. Still, you get so much visibility into how people might use the product. And whenever we make a release, it's always after a significant time in which we had that code base running on the SaaS and stumbled on different stupid bugs and mistakes. So it helps us make more stable releases. It's harder for people to upgrade often, so we try to make sure that when we make a release, it's worth their time to upgrade.
Tobias Macey
0:27:50
And that's an interesting point, too. Because you have this hosted platform, my guess is that you're deploying the current state of the master branch, and I'm wondering what your decision points are as far as when to say that this particular point in the code is a major release that the open source users are going to deploy as an artifact, versus people who want to just deploy straight from master themselves.
Arik Fraimovich
0:28:15
So it's usually a combination of: okay, enough time has passed since the previous release, and there is enough interesting stuff in this release for people to upgrade. I was hoping to have regular releases every month, but it just so happens that it's really, really hard for us to maintain that schedule, so it's usually, I think, a release every three or four months. Basically, when we feel that we have enough stuff there that it's worth upgrading, and it feels stable enough, meaning we've had this version running for quite some time and nothing major came up, we make a beta release. The beta release helps with all the early adopters who might deploy on-prem and find issues that we don't experience on the SaaS version. We then make another round of fixes, and we make the final release.
Tobias Macey
0:29:09
The other thing that I'm interested in on the business side is the overall business model that you have and some of the ways that it has grown or evolved since you first started the company, and just your overall lessons learned in terms of managing the business behind the product.
Arik Fraimovich
0:29:26
Yeah, so when I started, I researched the business models that people use for open source projects, and what I learned is that people are doing basically everything. The bigger companies definitely do all of it: support, SaaS, different versions, and so on. But I took inspiration from what Sentry has been doing, which is basically a SaaS offering of the same code base you have on the open source side. I really liked that, because it's very simple and there is no conflict of interest. So I figured, yeah, let's do that. Now, because I was bootstrapping, and I really quickly burned all my savings on that experience, and SaaS takes time to ramp up, I was hustling for any stream of income at the beginning. So we do have a few companies that pay us for support. Those are the bigger users that reached out and really wanted someone to be able to answer their questions when they need it. At the beginning, it felt like, wow, SaaS is such a terrible business model and support is so much better, because we were making so much more money from support for those few customers than from the SaaS platform. But luckily, I was patient enough to wait. Today, SaaS makes most of our revenue, something like over 90% of our revenue comes from SaaS, and I definitely see the benefits. It's a very stable business model, in a way, especially when you deal with lots of small customers versus a few big ones. So that was nice. Every year, I've been telling myself, yeah, this year we're going to introduce some offering for the enterprise users, because basically all the big companies that use Redash use the open source version. And they use it not because they want to save money or anything; they use the open source version because they're not going to trust some SaaS vendor with their database.
So it made sense to offer them something they can pay for which isn't support, because I want to be a software company and I want to sell software. I don't want to sell support. I want to make my product easy enough to use that people don't really need support. So every year I've been telling myself, okay, this year we're going to do something for the bigger customers that deploy on-prem, and every year it was pushed back because we were so busy building the product itself and working on the SaaS stuff. Over time, I think what happened is that the world changed a bit, and more and more companies are comfortable with SaaS offerings. Now obviously, I don't know, Bank of America will not adopt a SaaS offering anytime soon, but that's fine. I don't really need to serve all the customers in the world. More and more companies are definitely willing to use SaaS offerings, even for things like their database access. So I'm not sure if we will ever have some kind of an enterprise offering. But on the other hand, you never know.
Tobias Macey
0:32:36
It's definitely good that you held out with the SaaS approach because, as you said, you can scale it much more readily and you're much less susceptible to customer churn if somebody drops off, versus having a smaller number of support contracts where you're gaining more revenue per customer. If one of those customers decides to go with a different solution, or they go out of business, or for whatever reason they no longer maintain that support contract, that's a much bigger hit to you, and then you have to scramble to find somebody to replace them. And I would imagine, too, that a direct support contract is a much bigger burden of time on your end than somebody who comes and signs up for the SaaS platform and then just uses the aggregate support network that you have built around that product.
Arik Fraimovich
0:33:22
So to be honest, and I hope that none of our support customers are listening, the support contracts have been great. I mean, they don't reach out that much, and usually their questions are very reasonable. But I don't think that scales. If we try to scale that, eventually we will need to scale people instead of servers to handle the load. Well, sometimes people would like support just for their peace of mind, knowing that if they ever have a question, someone can answer them. But sometimes it's needed because the product isn't easy enough, and I don't want to be there. I want to make it super easy. If today you want to deploy Redash, you go to our website, click Getting Started, go to our setup page, and there are links to the popular clouds like AWS, Google, and DigitalOcean (we should probably add support for Azure), and a few minutes later you have Redash running. That's something that we might not want to have if we were building our business around support and people having to reach out to us. I don't want to have this conflict.
Tobias Macey
0:34:29
And yeah, that's definitely a good point. If your business is built around support, then you end up making it harder to actually use the open source product, which is never going to benefit anybody because it will just create a conflict between you and your users. Whereas you want it to be as friction-free as possible to help adoption, so that if somebody comes in on the open source channel and then decides that they don't want to actually be in the business of running their own server, they can just easily switch over to the SaaS platform. So yeah, I definitely appreciate your clarity on that point. In terms of the uses of the platform, you mentioned that you've seen some people abusing it for various cases. I'm curious what you have seen as some of the most interesting, innovative, or unexpected ways that people are leveraging Redash.
Arik Fraimovich
0:35:15
So people are a bit shy about sharing how they use Redash, so I don't really have good visibility into that. But I do hear stories from time to time. I think the one that I'm most proud of is an organization that does cancer research and uses Redash to support their efforts. That's awesome, and I really hope they're successful in whatever they do. And I guess the most unexpected usage is the French Navy, the people that do sea rescues. They use Redash to analyze their efforts, and that's really unexpected.
Tobias Macey
0:35:48
And then when is Redash the wrong platform to use, where somebody would be better suited by going with a different solution?
Arik Fraimovich
0:35:55
So right now, the first thing is when you don't have anyone in the organization who knows SQL, or whatever query language your database uses. It doesn't have to be SQL, but most of the time it is. If there is no such person in your organization, then yeah, Redash is not a good fit. It doesn't mean that everybody needs to know SQL to benefit from Redash, but there has to be at least a single person. Another case is when you want to support self-service. Then you might want to choose Looker and invest in defining your data models and stuff like that. But you need to be really honest with yourself and make sure that you really need that full self-service BI thing, because many times there are still people who create the reports, and in that case you could just go with Redash. I guess another case is when people are trying to use Redash as some sort of an admin tool. We actually do that ourselves, because it's very easy for us: it's already connected to the database, so we can create different views for our support use cases. But eventually, if you really need an admin tool, you will be better served with a full CRUD tool like Retool, Forest Admin, stuff like that. People also sometimes use Redash for operations, like infrastructure stuff. Sometimes Redash can be a good solution there, and again, we use Redash for that as well, but we are a bit biased. I guess for these use cases Grafana would probably be a better solution, especially if you connect it with one of the time series databases that they support.
Tobias Macey
0:37:34
And for the future of Redash, what do you have planned, both in terms of the business and the open source project?
Arik Fraimovich
0:37:40
So I try not to make commitments, because life is surprising. Right now we are focused on finishing the two big efforts of migrating to Python 3 and RQ on the backend. That takes our focus, to make sure that we deliver a stable version on that end, and then, on the front end side, to finish the React migration. Once those are done, we're finally free to really dig into developing some stuff that has been waiting for a long time. We actually did deliver some new features this year, but mostly we've been focused on the React migration. I guess that when we come back from that effort, we'll just have to review the kind of feedback we have and try to assess what's really the next thing. There's some interesting stuff that we really want to experiment with and deliver. But I'm really trying not to make commitments, because you can see in our GitHub tracker that we have this email reports issue from six years ago, and people are like, hey, when's that going to be available? So I prefer to only commit to stuff that we're actually working on.
Tobias Macey
0:38:50
All right. Well, are there any other aspects of the Redash product, or the business you've built around it, or the overall business intelligence market that we didn't discuss yet that you'd like to cover before we close out the show?
Arik Fraimovich
0:39:01
Oh, wow, that's a big one. No, I think we covered lots of stuff. If you have any questions, I'm happy to keep discussing. Otherwise, I'm good.
Tobias Macey
0:39:09
Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And so with that, I'll move us into the picks. Because we've been talking about a lot of things having to do with business intelligence and data and managing it, I'm going to pick my other show, the Data Engineering Podcast, so you can listen to interviews about a number of the different projects, tools, and topics that we've been talking about here in a little bit greater depth. So I'll plug that again. And with that, I'll pass it to you, Arik. Do you have any picks this week?
Arik Fraimovich
0:39:41
Yeah, so one pick is Peewee. It's spelled P, double E, W, double E. It's a Python ORM that we were using before we switched to SQLAlchemy, and that switch is probably one of the decisions that I regret the most. I wish we had stayed with Peewee. I think it's the most Pythonic ORM there is. It's just really great engineering, super easy to use, and I miss it. Another one is Amazon ECS. Everybody seems to use either serverless or Kubernetes these days, and ECS is sometimes overlooked. But it has really matured a lot in the past years, it makes it very easy to have a very resilient infrastructure, and it really helps us sleep better at night.
Tobias Macey
0:40:22
All right, well, thank you very much for taking the time today to join me and discuss your experience of building and managing the Redash project and the business that you've built on top of it. It's definitely a useful tool that I have been using for a while, so I appreciate all the efforts on that front, and I hope you enjoy the rest of your day.
Arik Fraimovich
0:40:38
Sure, you too. Have a great day and thank you for having me.
Tobias Macey
0:40:44
Thank you for listening. Don't forget to check out our other show, the Data Engineering Podcast, at dataengineeringpodcast.com for the latest on modern data management, and visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it. Email hosts@podcastinit.com with your story. To help other people find the show, please leave a review on iTunes and tell your friends and co-workers.