Exploring The Patterns And Practices For Deep Learning With Andrew Ferlitsch - Episode 317

Summary

Deep learning is gaining an immense amount of popularity due to the incredible results that it is able to offer with comparatively little effort. Because of this, a number of engineers are trying their hand at building machine learning models with the wealth of frameworks that are available. Andrew Ferlitsch wrote a book to capture the useful patterns and best practices for building models with deep learning to make it more approachable for newcomers to the field. In this episode he shares his deep expertise and extensive experience in building and teaching machine learning across many companies and industries. This is an entertaining and educational conversation about how to build maintainable models across a variety of applications.

Do you want to try out some of the tools and applications that you heard about on Podcast.__init__? Do you have a side project that you want to share with the world? With Linode’s managed Kubernetes platform it’s now even easier to get started with the latest in cloud technologies. With the combined power of the leading container orchestrator and the speed and reliability of Linode’s object storage, node balancers, block storage, and dedicated CPU or GPU instances, you’ve got everything you need to scale up. Go to pythonpodcast.com/linode today and get a $100 credit to launch a new cluster, run a server, upload some data, or… And don’t forget to thank them for being a long time supporter of Podcast.__init__!


Databand.ai is a unified Data Observability Platform that helps DataOps teams catch and solve data health issues fast. Databand.ai’s platform helps data engineers pinpoint pipeline issues and quickly identify their root cause so DataOps can begin working on a resolution before bad data is delivered. Whether you’re using Apache Spark, Apache Airflow, Databricks, Amazon S3, self-hosted Python scripts, or combinations of these, Databand.ai allows you to monitor data health along every step of its journey. Powerful integrations to 20+ tools give you full visibility of your stack. Our mission is to help businesses trust their data with the most powerful Data Observability Platform. Experience unified observability with a free trial today: www.databand.ai


Census is the operational analytics platform that syncs your cloud warehouse with all the SaaS applications used by your Sales, Marketing & Success teams. If you need to get your company data into Salesforce, Marketo, Hubspot, Intercom, Zendesk, and other tools, Census is the easiest way to do so. Just write SQL (or plug in your dbt models), set up the sync frequencies, and voila, your data is now available to be used by all of your teams. No need to worry about incremental sync, backfilling, API quota management, API versioning, monitoring, and maintaining custom scripts. Just SQL. Start your free 14-day trial now.


Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial.
  • Scaling your data infrastructure is hard. Maintaining data quality standards as you scale is harder. Databand solves this. Their Unified Data Observability platform gives data engineers visibility over their stack without changing existing pipeline code. Get end-to-end visibility on your pipelines, and identify the root cause of issues before bad data is delivered. Seamlessly integrate with over 20 tools like Apache Airflow, Spark, Snowflake, and more. Use customizable dashboards to see where pipelines are broken and how that impacts delivery downstream. Get alerts on leading indicators of pipeline failure. Open up your pipeline and see exactly which code strings are broken – so you can fix the issue immediately. Create more reliable data products. Go to pythonpodcast.com/databand today to start your free trial!
  • Your host, as usual, is Tobias Macey, and today I’m interviewing Andrew Ferlitsch about the patterns and practices for deep learning applications.

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing the major elements of a model architecture?
  • What is the relationship between the specific learning task being addressed and the architecture of the learning network?
  • In your experience, what is the level of awareness of a typical ML engineer or data scientist with respect to the most current design patterns in deep learning?
  • You’re currently working on a book about deep learning patterns and practices. What was your motivation for starting that project?
    • What are your goals for the book?
  • How have advancements in the operability of machine learning influenced the ways that the models are designed and trained?
    • How do recent approaches such as transfer learning impact the needs of the supporting tools and infrastructure?
  • Can you describe the different design patterns that you cover in your book and the selection process for when and how to apply them?
  • What are the aspects of bringing deep learning to production that continue to be a challenge?
    • What are some of the emerging practices that you are optimistic about?
  • What are some of the industry trends or areas of current research that you are most excited about?
  • What are the most interesting, innovative, or unexpected patterns that you have encountered?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on the book?
  • What are some of the other resources that you recommend for listeners to learn more about how to build production ready models?

Keep In Touch

Picks

Closing Announcements

  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show, please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat

Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
