Open Source Automated Machine Learning With MindsDB - Episode 218

Summary

Machine learning is growing in popularity and capability, but for a majority of people it is still a black box that we don’t fully understand. The team at MindsDB is working to change this state of affairs by creating an open source tool that is easy to use without a background in data science. By simplifying the training and use of neural networks, and making their logic explainable, they hope to bring AI capabilities to more people and organizations. In this interview George Hosu and Jorge Torres explain how MindsDB is built, how to use it for your own purposes, and how they view the current landscape of AI technologies. This is a great episode for anyone who is interested in experimenting with machine learning and artificial intelligence. Give it a listen and then try MindsDB for yourself.

Clubhouse LogoThis episode of Podcast.__init__ is brought to you by Clubhouse, the first project management platform for software development that brings everyone together so that teams can focus on what matters – creating products their customers love. Clubhouse provides the perfect balance of simplicity and structure for better cross-functional collaboration. Its fast, intuitive interface makes it easy for people on any team to focus-in on their work on a specific task or project, while also being able to “zoom out” to see how that work is contributing towards the bigger picture. With a simple API and robust set of integrations, Clubhouse also seamlessly integrates with the tools you use everyday, getting out of your way so that you can deliver quality software on time.

Listeners of Podcast.__init__ can sign up for two free months of Clubhouse by visiting pythonpodcast.com/clubhouse.


linode-banner-sponsor-largeDo you want to try out some of the tools and applications that you heard about on Podcast.__init__? Do you have a side project that you want to share with the world? Check out Linode at linode.com/podcastinit or use the code podcastinit2019 and get a $20 credit to try out their fast and reliable Linux virtual servers. They’ve got lightning fast networking and SSD servers with plenty of power and storage to run whatever you want to experiment on.



Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. With such an intuitive tool it’s easy to make sure that everyone in the business is on the same page. Podcast.init listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall is the combined events of Graphorum and the Data Architecture Summit. The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing George Hosu and Jorge Torres about MindsDB, a framework for streamlining the use of neural networks

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what MindsDB is and the problem that it is trying to solve?
    • What was the motivation for creating the project?
  • Who is the target audience for MindsDB?
  • Before we go deep into MindsDB can you explain what a neural network is for anyone who isn’t familiar with the term?
  • For someone who is using MindsDB can you talk through their workflow?
    • What are the types of data that are supported for building predictions using MindsDB?
    • How much cleaning and preparation of the data is necessary before using it to generate a model?
    • What are the lower and upper bounds for volume and variety of data that can be used to build an effective model in MindsDB?
  • One of the interesting and useful features of MindsDB is the built in support for explaining the decisions reached by a model. How do you approach that challenge and what are the most difficult aspects?
  • Once a model is generated, what is the output format and can it be used separately from MindsDB for embedding the prediction capabilities into other scripts or services?
  • How is MindsDB implemented and how has the design changed since you first began working on it?
    • What are some of the assumptions that you made going into this project which have had to be modified or updated as it gained users and features?
  • What are the limitations of MindsDB and what are the cases where it is necessary to pass a task on to a data scientist?
  • In your experience, what are the common barriers for individuals and organizations adopting machine learning as a tool for addressing their needs?
  • What have been the most challenging, complex, or unexpected aspects of designing and building MindsDB?
  • What do you have planned for the future of MindsDB?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA