Declarative Deep Learning From Your Laptop To Production With Ludwig and Horovod
November 21st, 2021
1 hr 4 mins 48 secs
About this Episode
Deep learning frameworks encourage you to focus on the structure of your model ahead of the data that you are working with. Ludwig is a tool that uses a data-oriented approach to building and training deep learning models so that you can experiment faster based on the information that you actually have, rather than spending all of your time manipulating features to make them match your inputs. In this episode, Travis Addair explains how Ludwig is designed to improve the adoption of deep learning for more companies and a wider range of users. He also explains how the Horovod framework plugs in easily to allow for scaling your training workflow from your laptop out to a massive cluster of servers and GPUs. The combination of these tools allows for a declarative workflow that starts off easy but gives you full control over the end result.
- Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Your host as usual is Tobias Macey, and today I’m interviewing Travis Addair about building and training machine learning models with Ludwig and Horovod.
- How did you get introduced to Python?
- Can you describe what Horovod and Ludwig are?
- How do the projects work together?
- What was your path to being involved in those projects and what is your current role?
- There are a number of AutoML libraries available for frameworks such as scikit-learn. What are the challenges that are introduced by applying that workflow to deep learning architectures?
- What are the use cases that Ludwig is designed to enable?
- Who are the target users of Ludwig?
- How do the workflows change/progress for the different personas?
- How is the underlying framework architected?
- What are the available extension points to provide a progressive exposure of complexity?
- How have the goals and design of the project changed or evolved as it has gained more widespread adoption beyond Uber?
- What was the motivation for migrating the core of Ludwig from TensorFlow to PyTorch?
- Can you describe the workflow of building a model definition with Ludwig?
- How much knowledge of neural network architectures and their relevant characteristics is necessary to use Ludwig effectively?
- What are the motivating factors for adding Horovod to the process?
- What is involved in moving from a single machine/single process training loop to a multi-core or multi-machine distributed training process?
- The combination of Ludwig and Horovod provides a shallower learning curve for building and scaling model training. What do you see as their potential impact on the availability and adoption of more sophisticated ML capabilities across organizations of varying scale?
- What do you see as other significant barriers to widespread use of ML functionality?
- What are the most interesting, innovative, or unexpected ways that you have seen Ludwig and/or Horovod used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Ludwig and Horovod?
- When are Ludwig and/or Horovod the wrong choice?
- What do you have planned for the future of both projects?
Keep In Touch
- @TravisAddair on Twitter
- tgaddair on GitHub
- Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email email@example.com with your story.
- To help other people find the show, please leave a review on iTunes and tell your friends and co-workers.
- Gradient Boosted Trees
- Vision Transformer Architecture
- Nvidia Collective Communications Library (NCCL)
- Training Epoch
- Raft Consensus Algorithm
- Transfer Learning
- Gordon Bell Prize
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA