Summary
One of the most common causes of bugs is incorrect data being passed throughout your program. Pydantic is a library that provides runtime checking and validation of the information that you rely on in your code. In this episode Samuel Colvin explains why he created it, the interesting and useful ways that it can be used, and how to integrate it into your own projects. If you are tired of unhelpful errors due to bad data then listen now and try it out today.
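As a rough illustration of the runtime checking described above, here is a minimal sketch. The `Episode` model and its field names are hypothetical and not taken from the episode; only `BaseModel` and `ValidationError` come from Pydantic itself.

```python
from pydantic import BaseModel, ValidationError


class Episode(BaseModel):
    # Hypothetical model for illustration; the field names are assumptions.
    number: int
    title: str


# Compatible input is parsed into the annotated types: "42" becomes the int 42.
episode = Episode(number="42", title="Pydantic")

# Input that cannot be converted raises a ValidationError naming the bad field.
error_message = None
try:
    Episode(number="not a number", title="Pydantic")
except ValidationError as exc:
    error_message = str(exc)
```

The error text names the offending field, which is the kind of helpful failure the summary alludes to.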
Did you know data science is a fast-growing career field, with a 650% growth in jobs since 2012 and a median salary of around $125,000? Springboard has identified that data careers are going to shape the future, and has responded to that need by creating the Springboard School of Data: comprehensive, end-to-end data career programs that encompass data science, data analytics, data engineering, and machine learning.
Each Springboard course is 100% online and remote, and each course curriculum is tailored to fit the schedule of working professionals. This means flexible hours and a project-based methodology designed to get real world experience: every Springboard student graduates with a portfolio of projects to showcase their skills to potential employers. Springboard’s unique approach to learning is centered on the very simple idea that mentorship and one-on-one human support is the fastest and most efficient way to learn new skills. That’s why all of Springboard’s data courses are supported by a vast network of industry expert mentors, who are carefully vetted to ensure the right fit for each program. Mentors provide valuable guidance, coaching, and support to help keep Springboard students motivated through weekly, 1:1 video calls for the duration of the program.
Before graduation, Springboard’s career services team supports students in their job search, helping prepare them for interviews and networking, and facilitates their transition into the tech or data industry. Springboard’s tuition-back guarantee allows students to secure the role of their dreams and invest in themselves without risk, meaning students are not charged if they don’t get a job offer in the field they study. Springboard’s support does not end when students graduate. All Springboard graduates benefit from an extensive support network encompassing career services, 1:1 career coaching, networking tips, resume assistance, interview prep, and salary negotiation.
Since Springboard was founded in 2013, around 94% of eligible graduates secured a job within one year, earning an average salary increase of $26,000. Want to learn more? Springboard is exclusively offering up to 20 scholarships of $500 to listeners of Podcast.__init__. Simply go to pythonpodcast.com/springboard for more information.
Do you want to try out some of the tools and applications that you heard about on Podcast.__init__? Do you have a side project that you want to share with the world? With Linode’s managed Kubernetes platform it’s now even easier to get started with the latest in cloud technologies. With the combined power of the leading container orchestrator and the speed and reliability of Linode’s object storage, node balancers, block storage, and dedicated CPU or GPU instances, you’ve got everything you need to scale up. Go to pythonpodcast.com/linode today and get a $100 credit to launch a new cluster, run a server, upload some data, or… And don’t forget to thank them for being a long time supporter of Podcast.__init__!
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API, you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
- You listen to this show because you love Python and want to keep your skills up to date. Machine learning is finding its way into every aspect of software engineering. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. Podcast.__init__ is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to pythonpodcast.com/springboard and apply today! Make sure to use the code AISPRINGBOARD when you enroll.
- Your host as usual is Tobias Macey and today I’m interviewing Samuel Colvin about Pydantic, a library for enforcing type hints at runtime.
Interview
- Introductions
- How did you get introduced to Python?
- Can you start by describing what Pydantic is and what motivated you to create it?
- What are the main use cases that benefit from Pydantic?
- There are a number of libraries in the Python ecosystem to handle various conventions or "best practices" for settings management. How does Pydantic fit in that category and why might someone choose to use it over the other options?
- There are also a number of libraries for defining data schemas or validation such as Marshmallow and Cerberus. How does Pydantic compare to the available options for those cases?
- What are some of the challenges, whether technical or conceptual, that you face in building a library to address both of these areas?
- The 3.7 release of Python added built-in support for dataclasses as a means of building containers for data with type annotations. What are the tradeoffs of Pydantic vs the built-in dataclass functionality?
- How much overhead does Pydantic add for doing runtime validation of the modelled data?
- In the documentation there is a nuanced point that you make about parsing vs validation and your choices as to what to support in Pydantic. Why is that a necessary distinction to make?
- What are the limitations in terms of usage that you are accepting by choosing to allow for implicit conversion or potentially silent loss of precision in the parsed data?
- What are the benefits of punting on the strict validation of data out of the box?
- What has been your design philosophy for constructing the user facing API?
- How is Pydantic implemented and how has the overall architecture evolved since you first began working on it?
- What have you found to be the most challenging aspects of building a library for managing the consistency of data structures in a dynamic language?
- What are some of the strengths and weaknesses of Python’s type system?
- What is the workflow for a developer who is using Pydantic in their code?
- What are some of the pitfalls or edge cases that they might run into?
- What is involved in integrating with other libraries/frameworks such as Django for web development or Dagster for building data pipelines?
- What are some of the more advanced capabilities or use cases of Pydantic that are less obvious?
- What are some of the features or capabilities of Pydantic that are often overlooked which you think should be used more frequently?
- What are some of the most interesting, innovative, or unexpected ways that you have seen Pydantic used?
- What are some of the most interesting, challenging, or unexpected lessons that you have learned through your work on or with Pydantic?
- When is Pydantic the wrong choice?
- What do you have planned for the future of the project?
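For context on the dataclass question above, the following standard-library sketch shows that a plain dataclass records type annotations without checking them at runtime, and what a hand-rolled check might look like. The class names and the `__post_init__` check are illustrative assumptions, not Pydantic's implementation or anything described in the episode.

```python
from dataclasses import dataclass, fields


@dataclass
class PlainPoint:
    # Annotations are stored on the class but never checked at runtime.
    x: int
    y: int


@dataclass
class CheckedPoint:
    x: int
    y: int

    def __post_init__(self):
        # Hand-rolled check (an assumption for illustration): compare each
        # value against its annotation and reject mismatches outright.
        for field in fields(self):
            value = getattr(self, field.name)
            if not isinstance(value, field.type):
                raise TypeError(
                    f"{field.name} must be {field.type.__name__}, "
                    f"got {type(value).__name__}"
                )


# The plain dataclass silently accepts a string where an int is annotated.
plain = PlainPoint(x="1", y=2)

# The checked version rejects the same input instead of converting it.
checked_error = None
try:
    CheckedPoint(x="1", y=2)
except TypeError as exc:
    checked_error = str(exc)
```

Note that this check only rejects bad input. It does not attempt the conversion of compatible values that the parsing vs validation question refers to.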
Keep In Touch
- samuelcolvin on GitHub
- Website
- @samuel_colvin on Twitter
Picks
- Tobias
- Samuel
- Flash Boys by Michael Lewis
- Algorithms To Live By by Brian Christian and Tom Griffiths
- NGrok.com
Closing Announcements
- Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
- Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links
- Pydantic
- Matlab
- C#
- FastAPI
- Marshmallow
- Cerberus
- 12 Factor App
- Django
- Python Type Hints
- Cython
- MyPy
- Duck Typing
- Haskell
- Higher Order Types
- PyCharm Pydantic Plugin
- Django Rest Framework
- Avro
- Parquet
- Dagster
- Starlette
- Flask
- Ludwig
- Deep Pavlov
- Fast MRI
- Reagent
- Pynt
- Open Source Has Failed article
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA