Federated Learning For All With Flower - Episode 314

Summary

Machine learning is a tool that has typically been performed on large volumes of data in one place. As more computing happens at the edge on mobile and low power devices, the learning is being federated which brings a new set of challenges. Daniel Beutel co-created the Flower framework to make federated learning more manageable. In this episode he shares his motivations for starting the project, how you can use it for your own work, and the unique challenges and benefits that this emerging model offers. This is a great exploration of the federated learning space and a framework that makes it more approachable.

Census LogoCensus is the operational analytics platform that syncs your cloud warehouse with all the SaaS applications used by your Sales, Marketing & Success teams. If you need to get your company data into Salesforce, Marketo, Hubspot, Intercom, Zendesk, and other tools, Census is the easiest way to do so. Just write SQL (or plug in your dbt models), set up the sync frequencies, and voila, your data is now available to be used by all of your teams.  No need to worry about incremental sync, backfilling, API quota management, API versioning, monitoring, and maintaining custom scripts. Just SQL. Start your free 14-day trial now.


Do you want to try out some of the tools and applications that you heard about on Podcast.__init__? Do you have a side project that you want to share with the world? With Linode’s managed Kubernetes platform it’s now even easier to get started with the latest in cloud technologies. With the combined power of the leading container orchestrator and the speed and reliability of Linode’s object storage, node balancers, block storage, and dedicated CPU or GPU instances, you’ve got everything you need to scale up. Go to pythonpodcast.com/linode today and get a $100 credit to launch a new cluster, run a server, upload some data, or… And don’t forget to thank them for being a long time supporter of Podcast.__init__!


Hightouch LogoHightouch is the leading Reverse ETL platform. Your data warehouse is your source of truth for customer data. Hightouch syncs this data to the tools that your business teams rely on. Hightouch has a catalog of flexible destinations including Salesforce, HubSpot, Zendesk, NetSuite, and ad platforms like Facebook or Google. Hightouch is built for data engineers and is a natural extension to the modern data stack with out-of-the-box integrations with your favorite tools like dbt, Fivetran, Airflow, Slack, PagerDuty, and DataDog.

It’s simple — connect your data warehouse, paste a SQL query, and use our visual mapper to specify how data should appear in downstream tools. No scripts, just SQL. Get started for free at pythonpodcast.com/hightouch


Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial.
  • Are you bored with writing scripts to move data into SaaS tools like Salesforce, Marketo, or Facebook Ads? Hightouch is the easiest way to sync data into the platforms that your business teams rely on. The data you’re looking for is already in your data warehouse and BI tools. Connect your warehouse to Hightouch, paste a SQL query, and use their visual mapper to specify how data should appear in your SaaS systems. No more scripts, just SQL. Supercharge your business teams with customer data using Hightouch for Reverse ETL today. Get started for free at pythonpodcast.com/hightouch.
  • Your host as usual is Tobias Macey and today I’m interviewing Daniel Beutel about Flower, a framework for building federated learning systems

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what federated learning is?
  • What is Flower and what’s the story behind it?
  • What are the trade-offs between federated and centralized models of machine learning?
  • What are some of the types of use cases or workloads that federated learning is used for?
  • Federated learning appears to be a growing area of interest. How would you characterize the current state of the ecosystem?
  • What are the most complex or challenging aspects of federating model training?
    • How does Flower simplify the process of distributing the model training process?
  • Can you describe how Flower is implemented?
    • How have the goals and/or design of Flower changed or evolved since you first began working on it?
  • One of the design principles that you list is "understandability". What are some of the ways that that manifests in the project?
  • It also mentions extensibility. What are the interfaces that Flower exposes for integration or extending its capabilities?
  • For someone who has an existing project that runs in a centralized manner, what are some indicators that a federated approach would be beneficial?
    • What is involved in translating the existing project to run in a federated fashion using Flower?
  • What is involved in building a production ready system with Flower?
  • How does your work at Adap inform the design and product direction for Flower?
  • What are some of the most interesting, innovative, or unexpected ways that you have seen Flower used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned from your work on and with Flower?
  • When is Flower the wrong choice?
  • What do you have planned for the future of the project?

Keep In Touch

Picks

Closing Announcements

  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com) with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Liked it? Take a second to support Podcast.__init__ on Patreon!