Intelligent Dependency Resolution For Optimal Compatibility And Security With Project Thoth - Episode 367

Summary

Building any software project requires relying on dependencies that you and your team didn’t write or maintain, and many of those will have dependencies of their own. This has led to a wide variety of potential and actual issues, ranging from developer ergonomics to application security. To provide a higher degree of confidence in the optimal combinations of direct and transitive dependencies, a team at Red Hat started Project Thoth. In this episode, Fridolín Pokorný explains how the Thoth resolver uses multiple signals to find the best combination of dependency versions to ensure compatibility and avoid known security issues.
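
To make the idea of resolving on multiple signals concrete, here is a minimal Python sketch. It is not Thoth's implementation or API: the package names, the `INDEX` data, and the scoring rule are all invented for illustration, and the brute-force search stands in for whatever search strategy a real resolver would use. It only shows the general shape of the problem: discard incompatible combinations, then rank the rest by a score that prefers fewer known-vulnerable releases and, after that, newer versions.

```python
from itertools import product

# Hypothetical toy package index (all names and values are invented).
# "requires" maps a dependency name to the set of versions it accepts.
INDEX = {
    "app-framework": {
        "1.0": {"requires": {"http-lib": {"1.0", "1.1"}}, "vulnerable": False},
        "2.0": {"requires": {"http-lib": {"1.1", "2.0"}}, "vulnerable": False},
    },
    "http-lib": {
        "1.0": {"requires": {}, "vulnerable": True},
        "1.1": {"requires": {}, "vulnerable": False},
        "2.0": {"requires": {}, "vulnerable": True},
    },
}


def version_key(version):
    """Turn '1.10' into (1, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(candidate):
    """Check that every pinned package accepts the pinned versions of its dependencies."""
    for name, version in candidate.items():
        for dep, allowed in INDEX[name][version]["requires"].items():
            if candidate.get(dep) not in allowed:
                return False
    return True


def score(candidate):
    """Assumed scoring rule: fewer vulnerable releases first, then newer versions."""
    vulnerable = sum(INDEX[n][v]["vulnerable"] for n, v in candidate.items())
    versions = tuple(version_key(candidate[n]) for n in sorted(candidate))
    return (-vulnerable, versions)


def resolve():
    """Brute-force every version combination and keep the best-scoring compatible one."""
    names = sorted(INDEX)
    best = None
    for versions in product(*(INDEX[n] for n in names)):
        candidate = dict(zip(names, versions))
        if not is_compatible(candidate):
            continue
        if best is None or score(candidate) > score(best):
            best = candidate
    return best


print(resolve())
```

In this toy data the newest `http-lib` release is flagged as vulnerable, so the sketch picks an older one even though a "latest wins" strategy would not. Exhaustive search only works for tiny examples like this; the number of combinations grows multiplicatively with each package added.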

Does everyone in your team ask you which database table they should use? Or if you can help them with their SQL query? If so, check out Select Star! It’s an automated data discovery portal that can save you hours of time every week.

From analyzing your metadata, query logs, and dashboard activities, Select Star will automatically document your datasets. For every table in Select Star, you can find out where the data originated from, which dashboards are built on top of it, who’s using the data in the company, and how they’re using it, all the way down to the SQL queries. Best of all, it’s simple to set up, and easy for both engineering and operations teams to use.

With Select Star’s data catalog, a single source of truth in data is built in minutes, even across thousands of datasets.

Try it out for free at pythonpodcast.com/selectstar. If you’re a Podcast.__init__ subscriber, we’ll double the length of your free trial and send you a swag package when you continue on a paid plan.


Do you want to try out some of the tools and applications that you heard about on Podcast.__init__? Do you have a side project that you want to share with the world? With Linode’s managed Kubernetes platform it’s now even easier to get started with the latest in cloud technologies. With the combined power of the leading container orchestrator and the speed and reliability of Linode’s object storage, node balancers, block storage, and dedicated CPU or GPU instances, you’ve got everything you need to scale up. Go to pythonpodcast.com/linode today and get a $100 credit to launch a new cluster, run a server, upload some data, or… And don’t forget to thank them for being a long time supporter of Podcast.__init__!


Shipyard is an orchestration platform that helps data teams build out solid data operations from the get-go by connecting data tools and streamlining data workflows. Shipyard offers low-code templates that are configured using a visual interface, replacing the need to write code to build workflows while enabling engineers to get their work into production faster. If a solution can’t be built with existing templates, engineers can always automate scripts in the language of their choice to bring any internal or external process into their workflows.

Observability and alerting are built into the Shipyard platform, ensuring that breakages are identified before being discovered downstream by business teams. With a high level of concurrency, scalability, and end-to-end encryption, Shipyard enables data teams to accomplish more without relying on other teams or worrying about infrastructure challenges, while also ensuring that business teams trust the data made available to them. Go to pythonpodcast.com/shipyard to get started automating powerful workflows with their free developer plan today!


Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Need to automate your Python code in the cloud? Want to avoid the hassle of setting up and maintaining infrastructure? Shipyard is the premier orchestration platform built to help you quickly launch, monitor, and share Python workflows in a matter of minutes with zero changes to your code. Shipyard provides powerful features like webhooks, error handling, monitoring, automatic containerization, syncing with GitHub, and more. Plus, it comes with over 70 open-source, low-code templates to help you quickly build solutions with the tools you already use. Go to dataengineeringpodcast.com/shipyard to get started automating with a free developer plan today!
  • Your host as usual is Tobias Macey, and today I’m interviewing Fridolín Pokorný about Project Thoth, a resolver service that computes the optimal combination of versions for your dependencies.

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you describe what Project Thoth is and the story behind it?
  • What are some examples of the types of problems that can be introduced by mismanaged dependency versions?
  • The Python ecosystem has seen a number of dependency management tools introduced recently. What are the capabilities that Thoth offers that make it stand out?
    • How does it compare to e.g. pip, Poetry, pip-tools, etc.?
    • How do those other tools approach resolution of dependencies?
  • Can you describe how Thoth is implemented?
    • How have the scope and design of the project evolved since it was started?
  • What are the sources of information that it relies on for generating the possible solution space?
    • What are the algorithms that it relies on for finding an optimal combination of packages?
  • Can you describe how Thoth fits into the workflow of a developer while selecting a set of dependencies and keeping them up to date over the life of a project?
  • What are the opportunities for expanding Thoth’s application to other language ecosystems?
  • What are the interfaces available for extending or integrating with Thoth?
  • What are the most interesting, innovative, or unexpected ways that you have seen Thoth used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Thoth?
  • When is Thoth the wrong choice?
  • What do you have planned for the future of Thoth?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
