Archives: Episodes

Building A Business On Serverless Technology - Episode 214

Summary

Serverless computing is a recent category of cloud service that provides new options for how we build and deploy applications. In this episode Raghu Murthy, founder of DataCoral, explains how he has built his entire business on these platforms. He explains how he approaches system architecture in a serverless world, the challenges that it introduces for local development and continuous integration, and how the landscape has grown and matured in recent years. If you are wondering how to incorporate serverless platforms in your projects then this is definitely worth your time to listen to.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. With such an intuitive tool it’s easy to make sure that everyone in the business is on the same page. Podcast.init listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial.
  • Bots and automation are taking over whole categories of online interaction. Discover.bot is an online community designed to serve as a platform-agnostic digital space for bot developers and enthusiasts of all skill levels to learn from one another, share their stories, and move the conversation forward together. They regularly publish guides and resources to help you learn about topics such as bot development, using them for business, and the latest in chatbot news. For newcomers to the space they have the Beginners Guide To Bots that will teach you the basics of how bots work, what they can do, and where they are developed and published. To help you choose the right framework and avoid the confusion about which NLU features and platform APIs you will need they have compiled a list of the major options and how they compare. Go to pythonpodcast.com/discoverbot today to get started and thank them for their support of the show.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall is the combined events of Graphorum and the Data Architecture Summit. The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
  • The Python Software Foundation is the lifeblood of the community, supporting all of us who want to run workshops and conferences, run development sprints or meetups, and ensuring that PyCon is a success every year. They have extended the deadline for their 2019 fundraiser until June 30th and they need help to make sure they reach their goal. Go to pythonpodcast.com/psf2019 today to make a donation. If you’re listening to this after June 30th of 2019 then consider making a donation anyway!
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Raghu Murthy from DataCoral about his experience building and deploying a personalized SaaS platform on top of serverless technologies

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by giving a brief overview of DataCoral?
  • Before we get too deep can you share your definition of what types of technologies fall under the umbrella of "serverless"?
  • How are you using serverless technologies at DataCoral?
    • How has your usage evolved as your business and the underlying technologies have evolved?
  • How do serverless technologies impact your approach to application architecture?
  • What are some of the main benefits for someone to target services such as Lambda?
    • What is your litmus test for determining whether a given project would be a good fit for a Function as a Service platform?
  • What are the most challenging aspects of running code on Lambda?
    • What are some of the major design differences between running on Lambda vs the more familiar server-oriented paradigms?
    • What are some of the other services that are most commonly used alongside Function as as Service (e.g. Lambda) to build full featured applications?
  • With serverless function platforms there is the cold start problem, can you explain what that means and some application design patterns that can help mitigate it?
  • When building on cloud-based technologies, especially proprietary ones, local development can be a challenge. How are you handling that issue at DataCoral?
  • In addition to development this new deployment paradigm upends some of the traditional approaches to CI/CD. How are you approaching testing and deployment of your services?
    • How do you identify and maintain dependency graphs between your various microservices?
  • In addition to deployment, it is also necessary to track performance characteristics and error events across service boundaries. How are you managing observability and alerting in your product?
  • What are you most excited for in the serverless space that listeners should know about?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

A Data Catalog For Your PyData Projects - Episode 213

Summary

One of the biggest pain points when working with data is getting is dealing with the boilerplate code to load it into a usable format. Intake encapsulates all of that and puts it behind a single API. In this episode Martin Durant explains how to use the Intake data catalogs for encapsulating source information, how it simplifies data science workflows, and how to incorporate it into your projects. It is a lightweight way to enable collaboration between data engineers and data scientists in the PyData ecosystem.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Martin Durant about Intake, a lightweight package for finding, investigating, loading and disseminating data

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what Intake is and the story behind its creation?
    • Can you outline some of the other projects and products that intersect with the functionality of Intake and describe where it fits in terms of use case and capabilities? (e.g. Quilt Data, Arrow, Data Retriever)
  • Can you describe the workflows for using Intake, both from the data scientist and the data engineer perspective?
  • One of the persistent challenges in working with data is that of cataloging and discovery of what already exists. In what ways does Intake address that problem?
    • Does it have any facilities for capturing and exposing data lineage?
  • For someone who needs to customize their usage of Intake, what are the extension points and what is involved in building a plugin?
  • Can you describe how Intake is implemented and how it has evolved since it first started?
    • What are some of the most challenging, complex, or novel aspects of the Intake implementation?
  • Intake focuses primarily on integrating with the PyData ecosystem (e.g. NumPy, Pandas, SciPy, etc.). What are some other communities that are, or could be, benefiting from the work being done on Intake?
    • What are some of the assumptions that are baked into Intake that would need to be modified to make it more broadly applicable?
  • What are some of the assumptions that were made going into this project that have needed to be reconsidered after digging deeper into the problem space?
  • What are some of the most interesting/unexpected/innovative ways that you have seen Intake leveraged?
  • What are your plans for the future of Intake?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Hardware Hacking Made Easy With CircuitPython - Episode 212

Summary

Learning to program can be a frustrating process, because even the simplest code relies on a complex stack of other moving pieces to function. When working with a microcontroller you are in full control of everything so there are fewer concepts that need to be understood in order to build a functioning project. CircuitPython is a platform for beginner developers that provides easy to use abstractions for working with hardware devices. In this episode Scott Shawcroft explains how the project got started, how it relates to MicroPython, some of the cool ways that it is being used, and how you can get started with it today. If you are interested in playing with low cost devices without having to learn and use C then give this a listen and start tinkering!

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Scott Shawcroft about CircuitPython, the easiest way to program microcontrollers

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what CircuitPython is and how the project got started?
    • I understand that you work at Adafruit and I know that a number of their products support CircuitPython. What other runtimes do you support?
  • Microcontrollers have typically been the domain of C because of the resource and performance constraints. What are the benefits of using Python to program hardware devices?
  • With the wide availability of powerful computing platforms, what are the benefits of experimenting with microcontrollers and their peripherals?
  • I understand that CircuitPython is a friendly fork of MicroPython. What have you changed in your version?
    • How do you structure your development to avoid conflicts with the upstream project?
    • What are some changes that you have contributed back to MicroPython?
  • What are some of the features of CircuitPython that make it easier for users to interact with sensors, motors, etc.?
  • CircuitPython provides an easy on-ramp for experimenting with hardware projects. Is there a point where a user will outgrow it and need to move to a different language or framework?
  • What are some of the most interesting/innovative/unexpected projects that you have seen people build using CircuitPython?
    • Are there any cases of someone building and shipping a production grade project in CircuitPython?
  • What have been some of the most interesting/challenging/unexpected aspects of building and maintaining CircuitPython?
  • What is in store for the future of the project?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Building A Privacy Preserving Voice Assistant - Episode 211

Summary

Being able to control a computer with your voice has rapidly moved from science fiction to science fact. Unfortunately, the majority of platforms that have been made available to consumers are controlled by large organizations with little incentive to respect users’ privacy. The team at Snips are building a platform that runs entirely off-line and on-device so that your information is always in your control. In this episode Adrien Ball explains how the Snips architecture works, the challenges of building a speech recognition and natural language understanding toolchain that works on limited resources, and how they are tackling issues around usability for casual consumers. If you have been interested in taking advantage of personal voice assistants, but wary of using commercially available options, this is definitely worth a listen.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Adrien Ball about SNIPS, a set of technologies to make voice controlled systems that respect user’s privacy

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what the Snips is and how it got started?
  • For someone who wants to use Snips can you talk through the onboarding proces?
    • One of the interesting features of your platform is the option for automated training data generation. Can you explain how that works?
  • Can you describe the overall architecture of the Snips platform and how it has evolved since you first began working on it?
  • Two of the main components that can be used independently are the ASR (Automated Speech Recognition) and NLU (Natural Language Understanding) engines. Each of those have a number of competitors in the market, both open source and commercial. How would you describe your overall position in the market for each of those projects?
  • I know that one of the biggest challenges in conversational interfaces is maintaining context for multi-step interactions. How is that handled in Snips?
  • For the NLU engine, you recently ported it from Python to Rust. What was your motivation for doing so and how would you characterize your experience between the two languages?
    • Are you continuing to maintain both implementations and if so how are you maintaining feature parity?
  • How do you approach the overall usability and user experience, particularly for non-technical end users?
    • How is discoverability handled (e.g. finding out what capabilities/skills are available)
  • One of the compelling aspects of Snips is the ability to deploy to a wide variety of devices, including offline support. Can you talk through that deployment process, both from a user perspective and how it is implemented under the covers?
    • What is involved in updating deployed models and keeping track of which versions are deployed to which devices?
  • What is involved in adding new capabilities or integrations to the Snips platform?
  • What are the limitations of running everything offline and on-device?
    • When is Snips the wrong choice?
  • In the process of building and maintaining the various components of Snips, what have been some of the most useful/interesting/unexpected lessons that you have learned?
    • What have been the most challenging aspects?
  • What are some of the most interesting/innovative/unexpected ways that you have seen the Snips technologies used?
  • What is in store for the future of Snips?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Hacking The Government With The USDS - Episode 210

Summary

The U.S. government has a vast quantity of software projects across the various agencies, and many of them would benefit from a modern approach to development and deployment. The U.S. Digital Services Agency has been tasked with making that happen. In this episode the current director of engineering for the USDS, David Holmes, explains how the agency operates, how they are using Python in their efforts to provide the greatest good to the largest number of people, and why you might want to get involved. Even if you don’t live in the U.S.A. this conversation is worth listening to so you can see an interesting model of how to improve government services for everyone.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • Bots and automation are taking over whole categories of online interaction. Discover.bot is an online community designed to serve as a platform-agnostic digital space for bot developers and enthusiasts of all skill levels to learn from one another, share their stories, and move the conversation forward together. They regularly publish guides and resources to help you learn about topics such as bot development, using them for business, and the latest in chatbot news. For newcomers to the space they have the Beginners Guide To Bots that will teach you the basics of how bots work, what they can do, and where they are developed and published. To help you choose the right framework and avoid the confusion about which NLU features and platform APIs you will need they have compiled a list of the major options and how they compare. Go to pythonpodcast.com/discoverbot today to get started and thank them for their support of the show.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing David Holmes about his work at the US Digital Services organization

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what the USDS is and how you got involved with it?
  • The terminology that is used around "Tours of Service" is interesting. Can you explain what that entails?
    • relocation
    • what if you have a house and career?
  • Can you explain the model of how the USDS works?
    • What is involved in staffing a new project?
    • What is your typical toolkit, and how does that vary with the specific departments that you are working with?
  • What are some of the most interesting projects that you and the team at USDS have worked on?
  • What are some of the most challenging projects that you have been involved with?
  • What are some projects that you hope to be asked to work on?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Probabilistic Modeling In Python (And What That Even Means) - Episode 209

Summary

Most programming is deterministic, relying on concrete logic to determine the way that it operates. However, there are problems that require a way to work with uncertainty. PyMC3 is a library designed for building models to predict the likelihood of certain outcomes. In this episode Thomas Wiecki explains the use cases where Bayesian statistics are necessary, how PyMC3 is designed and implemented, and some great examples of how it is being used in real projects.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Thomas Wiecki about PyMC3, a project for probabilistic programming in Python

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what probabilistic programming is?
  • What is the PyMC3 project and how did you get involved with it?
  • The opening line for the project README is packed with a slew of terms that are rather opaque to the lay-person. Can you unpack that a bit and discuss some of the ways that PyMC3 is used in real-world projects?
  • How much knowledge of statistical modeling and Bayesian statistics is necessary to make effective use of PyMC3?
  • Can you talk through an example use case for PyMC3 to illustrate how you would use it in a project?
    • How does it compare to the way that you would approach the same problem in a deterministic or frequentist modeling framework?
  • Can you describe how PyMC3 is implemented?
  • There are a number of other projects that build on top of PyMC3, what are some that you find particularly interesting or noteworthy?
  • What do you find to be the most useful features of PyMC3 and what are some areas that you would like to see it improved?
  • What have been the most interesting/unexpected/challenging lessons that you have learned in the process of building and maintaining PyMC3?
  • What is in store for the future of PyMC3?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Exploring Indico: A Full Featured Event Management Platform - Episode 208

Summary

Managing an event is rife with inherent complexity that scales as you move from scheduling a meeting to organizing a conference. Indico is a platform built at CERN to handle their efforts to organize events such as the Computing in High Energy Physics (CHEP) conference, and now it has grown to manage booking of meeting rooms. In this episode Adrian Mönnich, core developer on the Indico project, explains how it is architected to facilitate this use case, how it has evolved since its first incarnation two decades ago, and what he has learned while working on it. The Indico platform is definitely a feature rich and mature platform that is worth considering if you are responsible for organizing a conference or need a room booking system for your office.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • Bots and automation are taking over whole categories of online interaction. Discover.bot is an online community designed to serve as a platform-agnostic digital space for bot developers and enthusiasts of all skill levels to learn from one another, share their stories, and move the conversation forward together. They regularly publish guides and resources to help you learn about topics such as bot development, using them for business, and the latest in chatbot news. For newcomers to the space they have the Beginners Guide To Bots that will teach you the basics of how bots work, what they can do, and where they are developed and published. To help you choose the right framework and avoid the confusion about which NLU features and platform APIs you will need they have compiled a list of the major options and how they compare. Go to pythonpodcast.com/discoverbot today to get started and thank them for their support of the show.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Adrian Mönnich about Indico, the effortless open-source tool for event organisation, archival and collaboration

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Indico is and how the project got started?
    • What are some other projects which target a similar use case and what were they lacking that led to Indico being necessary?
  • Can you talk through an example workflow for setting up and managing an event in Indico?
    • How does the lifecycle change when working with larger events, such as PyCon?
  • Can you describe how Indico is architected and how its design has evolved since it was first built?
    • What are some of the most complex or challenging portions of Indico to implement and maintain?
  • There are a lot of areas for exercising constraint resolution algorithms. Can you talk through some of the business logic of how that operates?
  • Most of Indico is highly configurable and flexible. How do you approach managing sane defaults to prevent users getting overwhelmed when onboarding?
    • What is your approach to testing given how complex the project is?
  • What are some of the most interesting or unexpected ways that you have seen Indico used?
  • What are some of the most interesting/unexpected lessons that you have learned in the process of building Indico?
  • What do you have planned for the future of the project?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Exploring Python's Internals By Rewriting Them In Rust - Episode 207

Summary

The CPython interpreter has been the primary implementation of the Python runtime for over 20 years. In that time other options have been made available for different use cases. The most recent entry to that list is RustPython, written in the memory safe language Rust. One of the added benefits is the option to compile to WebAssembly, offering a browser-native Python runtime. In this episode core maintainers Windel Bouwman and Adam Kelly explain how the project got started, their experience working on it, and the plans for the future. Definitely worth a listen if you are curious about the inner workings of Python and how you can get involved in a relatively new project that is contributing to new options for running your code.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host is Tobias Macey and today I’m interviewing Adam Kelly and Windel Bouwman about RusPython, a project to implement a new Python interpreter in Rust

Interview

  • Introduction
  • How did you get introduced to Python?
  • Can you start by explaining what Rust is for anyone who isn’t familiar with it?
  • How did RustPython got started and what are your goals for the project?
  • Can you discuss what is involved in implementing a fully compliant Python interpreter?
  • What are some of the challenges that you face in replicating the capabilities of the CPython interpreter?
    • Are you attempting to maintain bug parity?
    • How much of the stdlib needs to be reimplemented?
    • Can you compare and contrast the benefits of Rust vs C?
    • Will the end result be compatible with libraries that rely on C extensions such as NumPy?
  • What is the current state of the project?
    • What are some of the notable missing features?
  • Can you talk through your vision of how the WebAssembly support will manifest and the types of applications that it will enable?
    • How much effort have you put into size optimization for the webassembly target to reduce client-side load time?
    • Are there any existing options for minification of Python code so that it can be delivered to users with less bandwidth?
  • What have been some of the most interesting/challenging/unexpected aspects of implementing a Python runtime?
  • What do you have planned for the future of the project?
  • What are the risks that you anticipate which could derail the project before it becomes production ready?

Contact Info

Picks

Links

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Version Control For Your Machine Learning Projects - Episode 206

Summary

Version control has become table stakes for any software team, but for machine learning projects there has been no good answer for tracking all of the data that goes into building and training models, and the output of the models themselves. To address that need Dmitry Petrov built the Data Version Control project known as DVC. In this episode he explains how it simplifies communication between data scientists, reduces duplicated effort, and simplifies concerns around reproducing and rebuilding models at different stages of the projects lifecycle. If you work as part of a team that is building machine learning models or other data intensive analysis then make sure to give this a listen and then start using DVC today.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • Bots and automation are taking over whole categories of online interaction. Discover.bot is an online community designed to ​serve as a platform-agnostic digital space for bot developers and enthusiasts of all skill levels to learn from one another, share their stories, and move the conversation forward together. They regularly publish guides and resources to help you learn about topics such as bot development, using them for business, and the latest in chatbot news. For newcomers to the space they have the Beginners Guide To Bots that will teach you the basics of how bots work, what they can do, and where they are developed and published. To help you choose the right framework and avoid the confusion about which NLU features and platform APIs you will need they have compiled a list of the major options and how they compare. Go to pythonpodcast.com/discoverbot today to get started and thank them for their support of the show.
  • You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Dmitry Petrov about DVC, an open source version control system for machine learning projects

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what DVC is and how it got started?
  • How do the needs of machine learning projects differ from other software applications in terms of version control?
  • Can you walk through the workflow of a project that uses DVC?
    • What are some of the main ways that it differs from your experience building machine learning projects without DVC?
  • In addition to the data that is used for training, the code that generates the model, and the end result there are other aspects such as the feature definitions and hyperparameters that are used. Can you discuss how those factor into the final model and any facilities in DVC to track the values used?
  • In addition to version control for software applications, there are a number of other pieces of tooling that are useful for building and maintaining healthy projects such as linting and unit tests. What are some of the adjacent concerns that should be considered when building machine learning projects?
  • What types of metrics do you track in DVC and how are they collected?
    • Are there specific problem domains or model types that require tracking different metric formats?
  • In the documentation it mentions that the data files live outside of git and can be managed in external storage systems. I’m wondering if there are any plans to integrate with systems such as Quilt or Pachyderm that provide versioning of data natively and what would be involved in adding that support?
  • What was your motivation for implementing this system in Python?
    • If you were to start over today what would you do differently?
  • Being a venture backed startup that is producing open source products, what is the value equation that makes it worthwile for your investors?
  • What have been some of the most interesting, challenging, or unexpected aspects of building DVC?
  • What do you have planned for the future of DVC?

Keep In Touch

Picks

  • Tobias
  • Dmitry
    • Go outside and get some fresh air 🙂

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Building Scalable Ecommerce Sites On Saleor - Episode 205

Summary

Ecommerce is an industry that has largely faded into the background due to its ubiquity in recent years. Despite that, there are new trends emerging and room for innovation, which is what the team at Mirumee focuses on. To support their efforts, they build and maintain the open source Saleor framework for Django as a way to make the core concerns of online sales easy and painless. In this episode Mirek Mencel and Patryk Zawadzki discuss the projects that they work on, the current state of the ecommerce industry, how Saleor fits with their technical and business strategy, and their predictions for the near future of digital sales.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Check out the Practical AI podcast from our friends at Changelog Media to learn and stay up to date with what’s happening in AI
  • You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
  • Your host as usual is Tobias Macey and today I’m interviewing Mirek Mencel and Patryk Zawadzki about their work at Mirumee, building ecommerce applications in Python, based on their open source framework Saleor

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing the types of projects that you work on at Mirumee and how the company got started?
  • There are a number of libraries and frameworks that you build and maintain. What is your motivation for providing these components freely and how does that play into your overall business strategy?
  • The most substantial project that you maintain is Saleor. Can you describe what it is and the story behind its creation?
    • How does it compare to other ecommerce implementations in the Python space?
    • If someone is agnostic to language and web framework, what would make them choose Saleor over other options that would be available to them?
  • What are some of the most challenging aspects of building a successful ecommerce platform?
    • How do the technical needs of an ecommerce site differ as it grows from small to medium and large scale?
  • Which components of an online store are often overlooked?
  • One of the common features of ecommerce sites that can drive substantial revenue is a well-built recommender system. What are some best practice strategies that you have discovered during your client work?
  • What are some projects that you have seen built with Saleor that were particular interesting, innovative, or unexpected?
  • What are your predictions for the future of the ecommerce industry?
  • What do you have planned for the future of the Saleor framework and the Mirumee business?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA