Open Source

Let The Robots Do The Work Using Robotic Process Automation with Robocorp - Episode 310

One of the great promises of computers is that they will make our work faster and easier, so why do we all spend so much time manually copying data from websites, or entering information into web forms, or any of the other tedious tasks that take up our time? As developers our first inclination is to “just write a script” to automate things, but how do you share that with your non-technical co-workers? In this episode Antti Karjalainen, CEO and co-founder of Robocorp, explains how Robotic Process Automation (RPA) can help us all cut down on time-wasting tasks and let the computers do what they’re supposed to. He shares how he got involved in the RPA industry, his work with Robot Framework and RPA framework, how to build and distribute bots, and how to decide if a task is worth automating. If you’re sick of spending your time on mind-numbing copy and paste then give this episode a listen and then let the robots do the work for you.

Keep Your Code Clean And Maintainable Using Static Analysis With Flake8 - Episode 309

When you are writing code it is all too easy to introduce subtle bugs or leave behind unused variables, unused imports, or overly complex logic. If you are careful and diligent you can find these problems yourself, but isn’t that what computers are supposed to help you with? Thankfully Python has a wealth of tools that will work with you to keep your code clean and maintainable. In this episode Anthony Sottile explores Flake8, one of the most popular options for identifying those problematic lines of code. He shares how he became involved in the project and took over as maintainer, and explains the different categories of code quality tooling and how Flake8 compares to other static analyzers. He also discusses the ecosystem of plugins that has grown up around it, including some detailed examples of how you can write your own (and why you might want to).
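
To make the idea of static analysis concrete, here is a minimal sketch of one such check, finding unused imports, using only the standard library `ast` module. It is an illustration of the general technique and not how Flake8 itself is implemented; the function name `find_unused_imports` and the sample source are hypothetical.

```python
import ast


def find_unused_imports(source: str) -> list[str]:
    """Return imported names that are never referenced in `source`.

    Illustrative only: a real linter also handles scopes, `__all__`,
    star imports, and many other edge cases that this sketch ignores.
    """
    tree = ast.parse(source)
    imported = set()  # names bound by import statements
    used = set()      # names referenced anywhere in the module

    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the top-level name `a`
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)

    return sorted(imported - used)


sample = "import os\nimport sys\nfrom json import loads\nprint(os.getcwd())\n"
print(find_unused_imports(sample))  # ['loads', 'sys']
```

The code is parsed but never executed, which is what makes the analysis "static": the check works even on a module that would fail at import time.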

Be Data Driven At Any Scale With Superset - Episode 307

Becoming data driven is the stated goal of a large and growing number of organizations. In order to achieve that mission they need a reliable and scalable method of accessing and analyzing the data that they have. While business intelligence solutions have been around for ages, they don’t all work well with the systems that we rely on today and a majority of them are not open source. Superset is a Python powered platform for exploring your data and building rich interactive dashboards that gets the information that your organization needs in front of the people that need it. In this episode Maxime Beauchemin, the creator of Superset, shares how the project got started and why it has become such a widely used and popular option for exploring and sharing data at companies of all sizes. He also explains how it functions, how you can customize it to fit your specific needs, and how to get it up and running in your own environment.

Go From Notebook To Pipeline For Your Data Science Projects With Orchest - Episode 304

Jupyter notebooks are a dominant tool for data scientists, but they lack a number of conveniences for building reusable and maintainable systems. For machine learning projects in particular there is a need for being able to pivot from exploring a particular dataset or problem to integrating that solution into a larger workflow. Rick Lamers and Yannick Perrenet were tired of struggling with one-off solutions when they created the Orchest platform. In this episode they explain how Orchest allows you to turn your notebooks into executable components that are integrated into a graph of execution for running end-to-end machine learning workflows.

Write Your Python Scripts In A Flow Based Visual Editor With Ryven - Episode 303

When you are writing a script it can become unwieldy to understand how the logic and data are flowing through the program. To make this easier to follow you can use a flow-based approach to building your programs. Leonn Thomm created the Ryven project as an environment for visually constructing a flow-based program. In this episode he shares his inspiration for creating the Ryven project, how it changes the way you think about program design, how Ryven is implemented, and how to get started with it for your own programs.

CrossHair: Your Automatic Pair Programmer - Episode 302

One of the perennial challenges in software engineering is to reduce the opportunity for bugs to creep into the system. Some of the tools in our arsenal that help in this endeavor include rich type systems, static analysis, writing tests, well defined interfaces, and linting. Phillip Schanely created the CrossHair project in order to add another ally in the fight against broken code. It sits somewhere between type systems, automated test generation, and static analysis. In this episode he explains his motivation for creating it, how he uses it for his own projects, and how to start incorporating it into yours. He also discusses the utility of writing contracts for your functions, and the differences between property based testing and SMT solvers. This is an interesting and informative conversation about some of the more nuanced aspects of how to write well-behaved programs.

Exploring Literate Programming For Python Projects With nbdev - Episode 300

Creating well designed software is largely a problem of context and understanding. The majority of programming environments rely on documentation, tests, and code being logically separated despite being contextually linked. In order to weave all of these concerns together there have been many efforts to create a literate programming environment. In this episode Jeremy Howard of fast.ai fame and Hamel Husain of GitHub share the work they have done on nbdev. They explain how it allows you to weave together documentation, code, and tests in the same context so that it is more natural to explore and build understanding when working on a project. It is built on top of the Jupyter environment, allowing you to take advantage of the other great elements of that ecosystem, and it provides a number of excellent out of the box features to reduce the friction in adopting good project hygiene, including continuous integration and well designed documentation sites. Regardless of whether you have been programming for 5 days, 5 years, or 5 decades you should take a look at nbdev to experience a different way of looking at your code.

Making The Sans I/O Ideal A Reality For The Websockets Library - Episode 299

Working with network protocols is a common need for software projects, particularly in the current age of the internet. As a result, there are a multitude of libraries that provide interfaces to the various protocols. The problem is that implementing a network protocol properly and handling all of the edge cases is hard, and most of the available libraries are bound to a particular I/O paradigm which prevents them from being widely reused. To address this shortcoming there has been a movement towards “sans I/O” implementations that provide the business logic for a given protocol while remaining agnostic to whether you are using async I/O, Twisted, threads, etc. In this episode Aymeric Augustin shares his experience of refactoring his popular websockets library to be I/O agnostic, including the challenges involved in how to design the interfaces, the benefits it provides in simplifying the tests, and the work needed to add back support for async I/O and other runtimes. This is a great conversation about what is involved in making an ideal a reality.
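
As a rough illustration of the sans I/O idea, the sketch below implements a toy newline-delimited protocol that never touches a socket: the caller feeds bytes in and gets messages out, so the same class could sit behind asyncio, threads, or a test. The `LineProtocol` class and its method names are hypothetical and are not taken from the websockets library.

```python
class LineProtocol:
    """Toy newline-delimited protocol with no I/O of its own (hypothetical)."""

    def __init__(self) -> None:
        # Bytes received so far that do not yet form a complete message
        self._buffer = b""

    def receive_data(self, data: bytes) -> list[str]:
        """Feed raw bytes from any transport; return the completed messages."""
        self._buffer += data
        # Everything before the last newline is complete; the rest is kept
        *lines, self._buffer = self._buffer.split(b"\n")
        return [line.decode("utf-8") for line in lines]

    def send_message(self, message: str) -> bytes:
        """Return the bytes the caller should write to its transport."""
        return message.encode("utf-8") + b"\n"


protocol = LineProtocol()
print(protocol.receive_data(b"hel"))        # []
print(protocol.receive_data(b"lo\nwor"))    # ['hello']
print(protocol.receive_data(b"ld\n"))       # ['world']
print(protocol.send_message("ping"))        # b'ping\n'
```

Because the protocol object only transforms bytes, it can be tested with plain function calls like the ones above, with no network, event loop, or mocking involved.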

Project Scaffolding That Evolves With Your Software Using Copier - Episode 297

Every software project has a certain amount of boilerplate to handle things like linting rules, test configuration, and packaging. Rather than recreate everything manually every time you start a new project you can use a utility to generate all of the necessary scaffolding from a template. This allows you to extract best practices and team standards into a reusable project that will save you time. The Copier project is one such utility that goes above and beyond the bare minimum by supporting project _evolution_, letting you bring in changes from the source template after you have already invested significant work in a project. In this episode Jairo Llopis explains how the Copier project works under the hood and the advanced capabilities that it provides, including managing the full lifecycle of a project and composing multiple project templates, and how you can start using it for your own work today.

Add Anomaly Detection To Your Time Series Data With Luminaire - Episode 293

When working with data it’s important to understand when it is correct. If there is a time dimension, then it can be difficult to know when variation is normal. Anomaly detection is a useful tool to address these challenges, but a difficult one to do well. In this episode Smit Shah and Sayan Chakraborty share the work they have done on Luminaire to make anomaly detection easier to work with. They explain the complexities inherent to working with time series data, the strategies that they have incorporated into Luminaire, and how they are using it in their data pipelines to identify errors early. If you are working with any kind of time series then it’s worth giving Luminaire a look.
