Summary
There are a large and growing number of businesses built by and for data science and machine learning teams that rely on Python. Tony Liu is a venture investor who is following that market closely and betting on its continued success. In this episode he shares his own journey into the role of an investor and discusses what he is most excited about in the industry. He also explains what he looks at when investing in a business and gives advice on what potential founders and early employees of startups should be thinking about when starting on that journey.
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Your host as usual is Tobias Macey and today I’m interviewing Tony Liu about his perspectives on the landscape of Python in the data ecosystem from his role as an investor.
Interview
- Introductions
- How did you get introduced to Python?
- Can you start by sharing your background in the data ecosystem?
- What led you to your current role as a venture investor?
- What is your current area of focus in your investments?
- What do you see as the major strengths of Python in the current landscape for data and analytics?
- What are the areas where the ecosystem is still lacking?
- Where are you seeing growth in the space and what do you see as the motivating factors?
- As an investor, what are the qualities that you look for in a startup that is trying to compete in the data ecosystem?
- What is your process for learning about and identifying companies that demonstrate the potential to succeed?
- Do you focus on a particular problem domain and research a grouping of companies that are focused on that problem, or do you start from a given company to determine where to place your bets?
- How has COVID changed the competitive landscape?
- Can you share some of the companies that you have invested in?
- What was notable about their respective businesses that provided you with the confidence that they were worth investing in?
- What are some of the most interesting, unexpected, or challenging lessons that you have learned from your experience as a venture investor?
- What are some of the companies that you are keeping a close eye on, whether as potential investments or as competitors to your existing portfolio?
- What are some of the problem spaces that you would like to see companies try to tackle?
- What advice do you have for engineers who might be considering building a new business?
- Do you have any advice for engineers who are working at a startup as to how best to compete in the current market?
Keep In Touch
Picks
- Tobias
- The Sleepover movie
- What do ya do with a Bernie Sanders? music video
- Tony
Closing Announcements
- Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
- Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links
- Costanoa Ventures
- Sports Analytics
- Turo
- Databricks
- Koalas
- DataRobot
- Faust
- Oozie
- Azkaban
- Airflow
- Prefect
- Dagster
- Kubeflow
- MLFlow
- Metaflow
- Pandas
- Spark
- DBT
- SnowflakeDB
- Coiled
- Noteable
- Dask
- Data Engineering Podcast Episode About Notebooks at Netflix
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it. So take a look at our friends over at Linode. With the launch of their managed Kubernetes platform, it's easy to get started with the next generation of deployment and scaling powered by the battle tested Linode platform, including simple pricing, node balancers, 40 gigabit networking, dedicated CPU and GPU instances, and worldwide data centers.
Go to pythonpodcast.com/linode, that's L-I-N-O-D-E, today and get a $100 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show.
[00:00:57] Unknown:
Your host as usual is Tobias Macey. And today, I'm interviewing Tony Liu about his perspectives on the landscape of Python in the data ecosystem from his role as an investor at Costanoa Ventures,
[00:01:08] Unknown:
which is an early stage seed and Series A fund investing across the stack. I primarily invest in data infrastructure and machine learning infrastructure companies. And do you remember how you first got introduced to Python? It was in my first ever computer science class in my undergraduate degree. It was a Java class,
[00:01:28] Unknown:
and the last class was like a bonus section on Python. But the funny thing now is that the class, I think, is primarily taught in Python. Yeah. It's funny how a lot of the coursework has flipped from being Java heavy to being Python heavy because of its approachability and the fact that it has grown in terms of its overall use in the industry.
[00:01:45] Unknown:
Yep. Absolutely. The increase in usage just over the last 5 to 10 years has been crazy. I don't think people 10 years ago would have seen this. Yeah. But I think my first programming class when I was doing my degree was in C++ and then we went to Java.
[00:01:58] Unknown:
Well, actually C, because I did computer engineering, and I actually had to find Python on my own sort of toward the end of my degree program.
[00:02:05] Unknown:
It makes sense. Once I did the C intro class, I decided that software engineering was not for me, and I wanted to stick with statistics.
[00:02:14] Unknown:
And so digging a bit more into your experience with Python and some of the backstory of how you ended up today, can you give a bit of your background in the data ecosystem
[00:02:23] Unknown:
and why that's a particular area of interest for you now? I was always interested in data. Early on in college, the thing that really got me into data was sports analytics, to be honest, specifically NBA analytics, which at the time was very immature. And I spent, you know, summers working on this with, you know, professors in the statistics department, and it really got me excited about, you know, the applications of data essentially. And since then, my first industry experience was working on core search ranking at Amazon. In hindsight, I didn't know, like, how nice people had it there. Everything is just easy to use. I've learned the hard way that actually doing data in the real world is really hard. After that, I joined a small startup at the time, Turo, where I ended up building their ID fraud model, which essentially stopped more cars from getting stolen. So that was a cool, like, real world project to work on. But I found myself wanting to get, you know, closer to the business side of things. And before joining Costanoa, I was a product manager at Databricks, where I led the workspace, which is the interactive data science platform, and a few other data science initiatives. I was a product manager for Koalas as well, which is the Pandas compatible
[00:03:35] Unknown:
Spark API. Given your interest in getting closer to the business, it makes sense that you would end up as a venture investor. But I'm wondering if you can maybe give a bit more context about how you ended up where you are and maybe some of the selection process that made you decide to end up at Costanoa versus one of the other funds? I think that that's exactly right. Venture investing marries two of my, like,
[00:03:59] Unknown:
main interests: data and, you know, understanding how businesses work. I think in the venture world, you have to kind of grasp how the entire ecosystem works. I think there will be less depth in your knowledge. I probably was much deeper in one particular part when I was at Databricks than I am in any part now. But it's a really fun intellectual challenge to try to piece together things that, half the time, still don't really make sense to me. There's just so much overlap and so much going on in the entire data infrastructure ecosystem.
In terms of how I successfully landed in this role, I'd say that my background is probably less traditional for a typical VC. I think you'll probably see more people come from finance and, you know, consulting backgrounds. Though for early stage venture firms, I have noticed that there are more product managers entering the space. And my sense is that it's because today, when you're investing in, you know, a seed stage company or a Series A company, because of how the market's evolved, oftentimes, they don't really have that much revenue or even product.
So what's really left is to understand how the product works, to understand if customers really like it, to get a sense of the market. And I think that, you know, this background actually lends itself decently to that type of endeavor. In terms of picking, you know, Costanoa specifically, you know, there are, I'd say, not that many opportunities in VC, and each opportunity that comes up has a very specific person in mind and, you know, vice versa. And I think it was a great situation where, you know, they were looking for someone with a heavy data infrastructure background who wanted to focus on it, versus other funds that were looking for more general investors. So that's how I ended up landing in my specific role.
[00:05:52] Unknown:
In terms of the actual day to day, what is involved in actually working as a venture investor, particularly somebody who is focused on the data ecosystem? And how do you keep up to date with the opportunities in the market? Yeah. It's a great question and, you know, something that I think we all try to continually figure out.
[00:06:13] Unknown:
I think that different people have very specific approaches to venture investing, and it takes a long time to figure out how you wanna approach it. Even now, you're always figuring it out. In this particular case, my, you know, preference is to go really deep in a space, so to be very intentional with how I spend time. And to keep up to date with things, I, you know, regularly chat with companies. Obviously, new companies are forming, but also with, you know, people who are working at, you know, later stage companies or even public companies. So in the data space, DataRobot, Databricks, these are all great people to talk to who have deep insight into their particular focus area. And the more you dig in, I think, the more you understand the types of problems that exist and also might even run into different projects that are really interesting to pursue.
So in the latter case, talking to data scientists and data engineers is actually also very helpful and a part of my, I wouldn't say day to day, but, like, I do a lot of these conversations on a weekly basis.
[00:07:16] Unknown:
In terms of Python specifically, it's obviously a very strong contender in the data science ecosystem, and I've been seeing it leak into other aspects of data. But from your perspective, as somebody who has worked in the space and been a product manager for a company where Python was actually one of the core projects that you were involved with, and somebody who's working as an investor who is talking to data scientists and deciding whether to invest in these various companies, what do you see as being the major strengths of Python in the current landscape for data and analytics?
[00:07:50] Unknown:
Yeah. I think there are several things. The first one is that it's very flexible, especially when compared to something like SQL. The types of, you know, computational analysis you can do, machine learning, and even, like, data wrangling are just much harder to do in SQL. The second part is that it's a very usable language. I know several people who, you know, came from a SQL background, and with that understanding of SQL were actually able to pick up Python without too much difficulty. So those are, you know, two parts on the usability side. And then the third part, I guess, is just its exploding popularity, probably because of those two things as well as the ecosystem that has grown. So I think all this in aggregate makes the foundation a very usable and, you know, rich environment to work in. Are there any particular
[00:08:39] Unknown:
problem domains or use cases that you see it being used for more often than others? And for the cases where you don't see it used as often, what are some of the other dominant languages?
[00:08:51] Unknown:
So in terms of use cases where, you know, it's more prevalent, I think that the obvious one is data science workloads. Like, whether it's, you know, computational analysis, coming up with charts that are much more sophisticated than what you can do in some BI tool, and also in machine learning. I think that, you know, where it's lacking and expanding into are the data engineering land and also the BI space. I think that a lot of Python users actually also know SQL and view them as interchangeable when they're doing work. And there are more and more tools coming out that kind of support both as almost first class, that push support for SQL so that people can have this more blended experience. Now on the other hand, there's more tooling coming out to make Python a more robust, like, data engineering platform. An example is a project like Faust, which enables, like, Python stream processing.
[00:09:48] Unknown:
It's filling the gaps that, like, make it an ETL platform on par with Spark. I've also been seeing it come up a lot in the data orchestration and workflow management space where in the initial stages of the big data revolution with the Hadoop ecosystem, there were projects like Oozie and Azkaban that were written in Java that tied into the Hadoop platform. And then Airflow was kind of the breakout success in the Python ecosystem for a while, and then that is being somewhat superseded by newcomers such as Prefect and Dagster as well.
[00:10:22] Unknown:
Yep. That makes sense. That's an area that's received a lot of attention. And I think along those lines of, like, maybe abstracting away DevOps from the data science experience, where you have the tools, like, on the machine learning side, like Kubeflow, MLflow, Metaflow that aim to help data scientists focus on just the machine learning, the actual, like, data science part of their work, rather than having to care about the other parts of ensuring reproducibility, ensuring scalability.
[00:10:53] Unknown:
And in the broader data ecosystem, are there any areas where you see Python support still being fairly lacking and that you see potential for growth in the coming years? I think that there are a couple of themes that I look out for rather than specific use cases.
[00:11:09] Unknown:
I'd say one is collaboration. I think so much of Python tooling has been built for, you know, single player mode. For example, if you think about Jupyter Notebooks, wildly popular, you know, very useful and powerful, but it's designed to work with your laptop and with your local file system. And when, you know, you take that tool and just, you know, host it in a cloud setting, there are all sorts of challenges that arise because of what it was initially designed for. So I think that there are a lot more tools coming out now that are bringing collaboration, like making it a first class citizen in the Python experience, which has just been lacking, especially in the early days of Python.
And another aspect, related to Python tools being designed for, you know, your laptop, is that a lot of tools are designed for single node usage. So when it comes to scaling your code to, you know, larger datasets, that becomes very challenging. And in many of these cases, people might, you know, push the limits of pandas and decide that they need a new solution and have no choice but to go somewhere else and use something like Spark, which is, you know, a completely different architecture and, you know, that has more limited support for Python.
So I'd say that on one hand, there is the collaboration, there's scalability, and I guess the third challenge is just around making it really usable. I mean, if you think about package management today, it's always a nightmare when people are migrating from one environment to another, and this greatly impacts collaboration. So I think this is a whole overall experience that needs to be reimagined, but the core is there. Like, Python is extremely usable. There are many great libraries that people use every day to do their work, but now it's about tying those together in an enterprise setting. In terms of the
[00:13:08] Unknown:
growth in the space, what do you see as being some of the motivating factors for Python finding its way into some of these other areas that have largely been dominated by other languages? With data in particular, it's been a lot of Java, but in high performance computing or real time machine learning, there's been a lot of, like, Fortran or C++ code, but Python has been able to edge its way in because of a lot of the integrations that it offers. But what do you see as being the driving factors that push people to trying to use it in these different use cases that have traditionally been a little more difficult? Yeah. I think it's really
[00:13:47] Unknown:
tied to both a huge growth in adoption, and as we mentioned, like, CS classes are now teaching Python first. There's just a lot more people who are learning Python at an earlier stage and, you know, an extensive ecosystem around Python that just makes it so flexible that when you think about having an overall workflow that makes sense, you would like tools to be compatible and to be, you know, interoperable. And, you know, given that Python is emerging as the, you know, de facto language for data science, presumably, you'd want the tools and use cases that, you know, touch on data science to also be compatible with Python. And I think too, one of the
[00:14:28] Unknown:
driving forces is kind of in a couple of directions where, particularly in small companies, a lot of data scientists end up having to do their own data engineering. And so they don't want to have to jump over into a Java framework. They just wanna stick with the Python they're familiar with. And then in the other direction, data engineers want to be able to collaborate more fluidly with the data scientists. And so having all of the tooling in Python makes that an easier bridge to cross because everybody's working in the same language without having to do translations back and forth. Absolutely. I think that makes a lot of sense. And, you know, in
[00:15:06] Unknown:
business analysts and other, you know, potential other less technical users who know SQL into the fold. In your role as an investor,
[00:15:15] Unknown:
as you're looking at different companies who are operating in the data space, what are some of the qualities that you look for in a given startup that's trying to compete within that ecosystem?
[00:15:26] Unknown:
There are, I think, lots of things to look out for. I'd say that the ones that are top of mind are for the product that they're working on. Data scientists have to love that product. If it's a product that they are spending a lot of time in on a daily or, you know, weekly basis, I think that is a starting point. And then I think from an investor perspective, you also want to know that the problem that the company is solving at the early stage is a significant enough wedge into some platform opportunity. So if the product is solving a critical problem and is also heavily adopted by individual data scientists, then that's a great sign that there are probably ways that this product can expand into more parts of the workflow.
On a similar note, like, if the product is core to the data infrastructure and also, you know, touches the end user experience, that's also a good sign in terms of being able to expand your platform opportunity and move up the stack and build more applications. And then probably the most important point actually is the team. I think there are so many companies that are emerging in the space, and understandably so. You know, you see, like, pretty crazy valuations in the data infrastructure space these days. So, and it sounds cliched as well, but you have to believe that the team has some kind of unfair advantage over others.
And this could come in many forms. It could be the case that, you know, this team has built this thing at some large organization before and where it's been battle tested, and now they're able to bring that same experience to the rest of the world. It could be the case that, you know, it's the leader of an open source project, and that differentiates, you know, that founder from other people. So once you have
[00:17:19] Unknown:
identified a potential company and decided that you want to dig a bit more into what they're building and their overall potential within the market. How do you approach that company? How do you go about learning more about their opportunity that they're pursuing, the product that they're trying to build, and just getting a better understanding of the area in which they're competing and some of the either market factors or competing businesses that they're trying to
[00:17:52] Unknown:
operate within the constraints of? Yeah. So I think there are multiple levels to this. I think there is, like, the initial kind of selection process of which spaces or which projects or companies that you wanna focus on, and then there are progressive layers of getting into more depth. So at the top level, as I said earlier, like, I try to meet people and hear about projects. You know, I also keep up to date with, you know, many data publications and communities. Once I've identified some company or project that I'm really interested in, the most obvious thing to do is just to talk to the users, talk to the customers, and see how that product fits into, you know, a data scientist's workflow, if it's a product that he or she loves using and uses every day and, you know, kinda replaces something else.
So that type of qualitative information is really important. There's also the part that's around digging into the community. In the case of open core companies, often they have Slack communities, Discord communities, so really drilling into those and seeing how vibrant those communities are. In many cases, you know, the most vibrant ones, you know, dbt is a good example, have users who are solving each other's problems and, like, having conversations that are outside of data too. I think that's when you really have, you know, built a community that is really great. So that's one side, the user side. There's also, you know, you wanna understand for open core companies what kind of traction it's getting among developers, and you wanna see meaningful activity on GitHub, not just, like, stars or whatever, but, like, engagement and, you know, filing issues and the number of, you know, developers contributing to the project.
[00:19:37] Unknown:
And so once you've identified a company with a decent amount of potential, what is the actual process of working with that company to determine whether you're going to invest and whether they would like to work with you versus any of the other investment funds available for actually
[00:19:55] Unknown:
securing a round of investment. And it's a very delicate balance of buying and selling at the same time as you can imagine. Now at the same time, you wanna gain more confidence. At some point, you'll reach enough confidence where you're in full selling mode. But, you know, early on especially, you want to learn as much as possible while presenting, you know, yourself in a positive light. So that's a very delicate skill to handle that I imagine many others are constantly working on. In terms of differentiation, that's a great question, because the reality is that the market, especially for early stages, is flooded with capital.
And finding ways to differentiate, you know, becomes harder and harder. You know, that's partly why, you know, in some cases, it helps to specialize, just because by spending so much time in this space beyond my previous data experience, I just know more people in data and am much more engaged with the community. So I can, you know, hopefully bring insights into conversations that an investor who is not focused on the space would not be able to bring.
[00:21:02] Unknown:
As you're working with the companies and exploring a given problem domain, in terms of your overall approach, do you typically start with a list of companies that are interesting and then try to branch out into the broader problem domain to determine what are the opportunities there, or do you go in the reverse where you start with a problem domain and then look to see what companies are operating in that space?
[00:21:27] Unknown:
I think it goes both ways. Especially in the case of data, there are just so many projects that are brewing inside some organization, whether it's closed source or open source. And I can't claim that I just, like, you know, magically think of gaps that exist. More often than not, it's seeing these use cases in companies that might not have been commercialized yet. So I'd say that, in terms of going very early in investing, that's the main approach. And then, you know, once, you know, I get excited about some space, I'll inevitably spend more time and, like, understand the market and, you know, learn who the competitors are and really educate myself on that opportunity.
[00:22:09] Unknown:
And one of the interesting balancing acts too when identifying what to commercialize and what to offer as your given product is how broad or narrow to go. You know, do you go narrow and deep on a particular pain point, or do you try to go broad and orchestrate a platform play for a given subset of the market? And I'm curious what you have seen as being the most broadly applicable strategy and some of the challenges that founders run into
[00:22:41] Unknown:
as they're trying to strike this balancing act? I think that particularly in the earliest stages, which is in terms of investments where I spend my time, I think the focus is to solve a single problem deeply. You know, there's a class of companies that fall into that space. There will be some companies or projects where it's harder to see how it branches into a platform opportunity, where it's able to move up the stack and build applications on top or, you know, vice versa, move lower in the stack. I've seen, you know, both types of, you know, companies. I'd say that my preferred path, and it's obviously up for debate, is that you wanna solve an important problem deeply.
And once you are in the space of companies or projects that are solving this type of problem, there are some that are in the more strategic position to become a platform where they're able maybe to build applications on top of the core project. You know, in some cases, it might be the reverse. Maybe they're able to build down the stack. Then there are also projects that solve a particular problem really well, but, you know, where it's harder to see that platform opportunity. So I think that the starting point is to solve a problem deeply, but then, you know, whether or not it's a venture-backable business depends on the ability to expand into other use cases.
[00:24:09] Unknown:
Another interesting thing to dig into, given the current state of the world, is how you have seen the pandemic play into the viability of different companies and different markets and how that changes the business strategy for the companies that you're investing in and working with? The pandemic, I think, had many
[00:24:31] Unknown:
effects on the early stage venture investing broadly and also on the strategies that some, you know, early stage companies adopt. In terms of the impact on early stage venture investing, I think some, you know, sectors have actually become more competitive and valuable than ever, including data infrastructure. You see that in the public markets and in the very late growth stage side of investing with Snowflake and Databricks reaching pretty astronomical valuations. In terms of the impact on strategies that companies adopt, we sensed early on in the pandemic that the bottoms up adoption plays make more sense in this climate.
And, you know, by that I mean word-of-mouth, you know, viral, like, adoption of some product. And the reason was that there was so much uncertainty across the budgets of many potential customers, who, particularly early in the pandemic, were not sure what the budget for data engineering, data science would be. So the idea is that early stage startups, during this period of time, given the potentially more cautious buyers, should, you know, work on a bottoms up go to market motion. So that's an example of an impact on strategies that an early stage company might have adopted in light of the pandemic.
[00:26:00] Unknown:
Turning now to the businesses that you in particular have invested in and are working with, can you give a bit of an overview about the businesses that you have decided are worthy of backing and that you see the potential for success and maybe discuss some of the aspects of their businesses that were particularly notable and gave you the confidence necessary to decide to invest capital and place a bet on those companies.
[00:26:27] Unknown:
Absolutely. I think two that are very relevant for this podcast are Coiled and Noteable. So Coiled is a company that is commercializing Dask, led by the lead maintainer, Matt Rocklin. And it addresses one of the issues that I've seen, you know, and at the same time, opportunity in the Python ecosystem, which is the scalability side of the equation. So as I, you know, talked about earlier, a lot of data scientists in organizations, once they push the limits of, you know, single node compute, they're often faced with a tough decision of what do we do next? Do we get a beefier machine? Do we switch to Spark? And Dask was a project that, you know, many people adopted to, you know, bridge that gap. And, you know, it seemed like there was a really interesting opportunity to make Dask the centerpiece of a Python company where a user that hits that point of, like, I need to scale now can actually say, there's this Python native approach to doing this that is, you know, backed by some company. And on the team point, Matt Rocklin, you know, fits perfectly in the mold of a founder with an unfair advantage of being the lead maintainer and, like, almost a spiritual leader of the community.
So that was one. And then in the case of Noteable, it's really, I'd say, taking Jupyter Notebooks to the next level with a focus on exactly the parts that Jupyter is not well suited for. So the focus is to make, you know, collaboration a first class citizen rather than to focus just on the single player mode. This means that instead of, you know, collaborating by sending a Jupyter Notebook file to someone else or doing slower collaboration through Git on Jupyter Notebooks, you can work and collaborate in a notebook together where you're, you know, viewing the same results. You're able to interact with the charts that have been created. And when you want to collaborate with external stakeholders, it provides a really easy interface for you to work with even less technical users that can engage with the different charting capabilities rather than, you know, screenshotting something and slapping it on a slide, which is, you know, what takes place a lot of the time today. And another point there is that it's built with, you know, enterprise, like, security in mind. And, again, Jupyter Notebooks were designed for local use and just don't have that type of architecture built in, so Noteable is taking a very careful approach there. Again, on the team front, you know, we believe that the team had an unfair advantage. They were led by Michelle Ufford. They were the notebooks team at Netflix, where they built the internal notebooks platform there. And Michelle is also a thought leader in this space as well, and that has its own advantages for building out a leads list now that she's starting this company. Once you have invested in a company, what is your involvement from that point forward? I think that particularly at Costanoa, we like to be as hands on as the founder would like. In the case of, you know, Coiled, at the time we invested, it was one person about to hire the second person.
Now they've scaled to 10, 15. So early on, you know, lots of conversations, you know, with Matt and investors about who to hire, you know, where to go next. And, you know, I think as much as you provide advice, you also provide candidates for hiring, which is like the most practical use for the company. So, you know, that's a way that investors also work, you know, pretty closely with portfolio companies.
[00:30:11] Unknown:
In terms of your experience of working as an investor in this space, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process of transitioning into this role and any interesting aspects of the learning curve of getting involved in venture investing based on your experience as an engineer and a product manager?
[00:30:34] Unknown:
So something I heard before in the industry was that you'll have no idea what you're doing for at least your first 6 months, probably longer. That was absolutely true in my case. I think that because it's such a unique environment where you need to differentiate yourself from other investors, you need to have really your own personal approach to the role. And, you know, I'm constantly still trying to evolve and see how I can improve and can pretty definitively say that the comment that people made about the 6 months is very true as you're trying to figure things out. I think another challenge that has come up even more now is just how brutally competitive the market is. I don't know if that's surprising, but it's the reality of things now.
And as an early stage investor, that means I want to, you know, go even earlier if possible. You know, as we're seeing pretty astronomical growth stage rounds and even Series A rounds, it's, you know, sharpened my typical focus on investing in a company at the earliest stage possible.
[00:31:42] Unknown:
In terms of the broader market, what are some of the problem areas that you see as ripe for opportunity or that you would like to see companies try to tackle that aren't currently being addressed?
[00:31:52] Unknown:
I think there is a lot of focus now on SQL, which is really interesting because I think a few years ago, machine learning seemed like all the rage, but it turns out that most organizations did not have mature enough data infrastructure to handle that. And, you know, it seems like we're at the point where now there's this very robust environment around SQL, which is still, you know, developing. Might be lame to say this, but I do think that Python does represent a really interesting opportunity as, like, the next set of users to really unlock in an enterprise setting.
So that's why, you know, I've been very excited and, you know, focused on the different technologies that really enable, you know, data science to uplevel into doing stuff that they wouldn't have been able to do otherwise in a reliable production grade setting.
[00:32:41] Unknown:
For engineers who are considering going out on their own and building a business around a particular problem space, what are some of the pieces of advice that you have for them that will help to make sure that they set off on the right foot rather than, you know, maybe hitting a stumbling block early on because they don't have the right product market fit or they're not sure how to address the marketing or how to architect the interface for being approachable and just the, you know, infinite decision space that's available for them when they have this greenfield and they wanna decide, okay, I'm gonna go into business and solve this problem. Like, what are some of the useful pieces of advice that you might have for them?
[00:33:25] Unknown:
I think that at the core, you want to build something that an end user loves. And what that means is often, you know, getting out of your comfort zone and just talking to a lot of users and, you know, iterating closely with users. I would say that, as you mentioned, there's, like, kind of an infinite, like, space of different configurations of companies, but I'd, you know, be very intentional about the type of problem you're solving and the type of opportunity that it represents. I think that given some problem you're solving, it's worth thinking about, you know, is this, or do I wanna make this, a venture-backable business or not? That's, like, one dimension.
Go to market, what's more suited? Is it more of a typical enterprise top down sales motion or a bottoms up motion? And making that kind of decision also has direct impacts on, you know, the type of founding team that you'd like to have. You'd wanna find someone with complementary skills. In the case of a bottoms up play, maybe, like, another technical leader makes sense to complement you because you're engaging with a very technical community. If you wanna sell to enterprises, maybe, you know, that's not the best fit. It could still be, but maybe someone who has had that experience makes more sense. So I think it's really just about, don't wanna scare people off, but, like, being very deliberate about each decision that's made and being, you know, pretty clear eyed about it. So there's, you know, one part that's talking to customers, but the other part is just talking to people who have, you know, done this before, whether it's founders, operators, and even investors, although, like, investors probably offer the least useful advice at the early stages.
[00:35:02] Unknown:
For any employees or engineers who might be working at a startup or considering moving to an early stage venture, what are some of the pieces of advice that you might have for them as to how best to contribute to competing in the current market and just advice for considerations that they might have going into that type of opportunity?
[00:35:24] Unknown:
I would, again, you know, highlight being close to the user. Especially at a small startup where things are moving very quickly, there's not much overhead, and lots of decisions have to be made all the time. I'd argue that an engineer who is very empathetic towards the end user will be able to be much more productive than someone who is not. So that would be my number one recommendation: to really engage with the user, not just in terms of, you know, let's say, the energy of the data centers, helping that person figure out some bug issue or walking them through something, but actually understanding, you know, who they are, you know, what they do for their jobs, and why they've chosen to do these jobs. So I think that getting that deep empathy would really take that engineer to the next level.
[00:36:11] Unknown:
Are there any other aspects of the work that you're doing as a venture investor or the overall opportunities and use cases for Python and the data ecosystem and for building businesses around it, or anything tangential to what we've already discussed that you'd like to cover before we close out the show?
[00:36:32] Unknown:
The one thing I'd reiterate is that it's such an interesting problem space where you have huge adoption of this language and also have a very extensive ecosystem and a very usable tool in many situations, but not in others, including many enterprise settings. So I just think that reiterating the point that it's a really exciting space. We're seeing, like, you know, data mature at, like, a slower pace than some might have predicted in the past, but it seems like now things are changing. Like, SQL has, like, matured significantly more, and Python is, you know, taking the next step as well.
[00:37:03] Unknown:
Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And with that, I will move us into the picks. And this week, I got a couple. So I watched the movie The Sleepover recently on Netflix. Hilarious. Just a lot of good fun, really well delivered. And then I serendipitously came across a song on Spotify and then found out that there was a music video to go with it called "What Do You Do With a Bernie Sanders" that's just playing off of the meme that has refused to die since the inauguration. So just another few minutes of humor to spend during your day. So with that, I'll pass it to you, Tony. Do you have any picks this week?
[00:37:46] Unknown:
Yeah. I will go in a very different direction from your movie. A film that left a lasting impression on me during, you know, this past year and a half where I've watched a lot of movies at home is Uncut Gems, which is very fast paced, kind of gory, a bit of a crazy film, totally the opposite of The Sleepover. In terms of a music video, following your lead of an entertaining music video, I really enjoyed Stu from Saturday Night Live, which is basically Pete Davidson's version of Stan, the Eminem song, but Christmas themed. If you haven't seen it, highly recommend you check it out even though it's not Christmas anymore.
[00:38:27] Unknown:
Well, thank you very much for taking the time today to join me and share your experience working in the venture space with data oriented businesses and particularly those who are leaning on Python for being able to solve some of the open problems in the ecosystem. Definitely appreciate your perspective and the energy that you spend on that space, and I hope you have a good rest of your day.
Thanks so much. It was a pleasure, and you too.
[00:38:49] Unknown:
Thank you for listening. Don't forget to check out our other show, the Data Engineering Podcast at dataengineeringpodcast.com for the latest on modern data management. And visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email host at podcastinit.com with your story. To help other people find the show, please leave a review on iTunes and tell your friends and coworkers.
Introduction to Tony Liu and Costanoa Ventures
Tony's Journey into Data and Python
Transition to Venture Investing
Day-to-Day as a Venture Investor
Strengths of Python in Data Science
Challenges and Opportunities for Python
Motivating Factors for Python's Growth
Qualities of a Successful Data Startup
Process of Investing in a Startup
Balancing Narrow and Broad Focus in Startups
Impact of the Pandemic on Venture Investing
Notable Investments: Coiled and Noteable
Lessons Learned as a Venture Investor
Opportunities in the Data Market
Advice for Aspiring Entrepreneurs
Advice for Engineers in Startups
Closing Thoughts and Future Opportunities