Programming

Managing Application Secrets with Brian Kelly - Episode 181

Summary

Any application that communicates with other systems or services will at some point require a credential or sensitive piece of information to operate properly. The question then becomes how best to securely store, transmit, and use that information. The world of software secrets management is vast and complicated, so in this episode Brian Kelly, engineering manager at Conjur, aims to help you make sense of it. He explains the main factors for protecting sensitive information in your software development and deployment, ways that information might be leaked, and how to get the whole team on the same page.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at podcastinit.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Brian Kelly about how to store, deploy, and use sensitive information in your applications

Interview

  • Introductions
  • How did you get introduced to Python?
  • To begin with, how do you define a secret in the context of an application?
  • What are the broad categories for solutions to secrets management?
  • What are the different aspects of secrets management in the lifecycle of developing, deploying, and maintaining an application?
  • How does the scale of a project or organization impact the strategies that are reasonable for secrets management?
  • What are some of the most challenging aspects of secrets management at the different stages of usage?
    • What are some of the common reasons that secrets management strategies fail?
    • What are some of the vulnerabilities or attack vectors that development teams should be thinking about when working with credentials?
  • What are your thoughts on versioning of secrets?
  • Beyond storing and deploying sensitive information, what are some of the secondary concerns around secrets management that development teams should be thinking about?
  • How does the use of multiple environments (e.g. dev, QA, production, etc.) affect the strategies used for secrets management?
  • What are some of the most useful resources that you have found for anyone looking to learn more about this subject?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Don't Just Stand There, Get Programming! with Ana Bell - Episode 175

Summary

Writing a book is hard work, especially when you are trying to teach such a broad concept as programming. In this episode Ana Bell discusses her recent work in writing Get Programming: Learn To Code With Python, including her views on how to separate the principles from the implementation, making the book evergreen in its appeal, and how her experience as a lecturer at MIT has helped her maintain the perspectives of beginners. She also shares her views on the values of learning about programming, even when you have no intention of doing it as a career and ways to take the next steps if that is your goal.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute.
  • As you know, Python has become one of the most popular programming languages in the world, due to the size, scope, and friendliness of the language and community. But, it can be tough learning it when you’re just starting out. Luckily, there’s an easy way to get involved. Written by MIT lecturer Ana Bell and published by Manning Publications, Get Programming: Learn to code with Python is the perfect way to get started working with Python. Ana’s experience as a teacher of Python really shines through, as you get hands-on with the language without being drowned in confusing jargon or theory. Filled with practical examples and step-by-step lessons to take on, Get Programming is perfect for people who just want to get stuck in with Python. Get your copy of the book with a special 40% discount for Podcast.__init__ listeners at podcastinit.com/get-programming using code: Bell40!
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at podcastinit.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Ana Bell about her book, Get Programming: Learn to code with Python, and her approach to teaching how to code

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing your motivation for writing a book about learning to program?
    • Who is the target audience for this book?
    • What level of competence do you want the reader to have when they have completed it?
  • What were the most challenging aspects of writing a book for beginning programmers?
    • What did you do to recapture the “beginner mind” while writing?
  • There are a large variety of books on learning to program and at least as many approaches. Can you describe the techniques that you use in your book to help readers grasp the concepts that you cover?
  • One of the problems of writing a book about technology is that there is no stationary target to aim for due to the constant advancement of the industry. How do you reconcile that reality with the need for a book to remain relevant for an extended period of time?
    • How do you decide what to include and what to leave out when writing about learning how to program?
  • What advice do you have for people who have read your book and want to continue on to a career in development?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Destroy All Software With Gary Bernhardt - Episode 159

Summary

Many developers enter the market from backgrounds that don’t involve a computer science degree, which can lead to blind spots in how to approach certain types of problems. Gary Bernhardt produces screencasts and articles that aim to teach these principles with code to make them approachable and easy to understand. In this episode Gary discusses his views on the state of software education, both in academia and bootcamps, the theoretical concepts that he finds most useful in his work, and some thoughts on how to build better software.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute.
  • Finding a bug in production is never a fun experience, especially when your users find it first. Airbrake error monitoring ensures that you will always be the first to know so you can deploy a fix before anyone is impacted. With open source agents for Python 2 and 3 it’s easy to get started, and the automatic aggregations, contextual information, and deployment tracking ensure that you don’t waste time pinpointing what went wrong. Go to podcastinit.com/airbrake today to sign up and get your first 30 days free, and 50% off 3 months of the Startup plan.
  • To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]
  • Your host as usual is Tobias Macey and today I’m interviewing Gary Bernhardt about teaching and learning Python in the current software landscape

Interview

  • Introductions
  • How did you get introduced to Python?
  • As someone who makes a living from teaching aspects of programming what is your view on the state of software education?
    • What are some of the ways that we as an industry can improve the experience of new developers?
    • What are we doing right?
  • You spend a lot of time exploring some of the fundamental aspects of programming and computation. What are some of the lessons that you have learned which transcend software languages?
    • Utility of graphs in understanding software
    • Mechanical sympathy
  • What are the benefits of ‘from scratch’ tutorials that explore the steps involved in building simple versions of complex topics such as compilers or web frameworks?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Mycroft with Steve Penrod - Episode 82

Summary

Speech is the most natural interface for communication, and yet we force ourselves to conform to the limitations of our tools in our daily tasks. As computation becomes cheaper and more ubiquitous and artificial intelligence becomes more capable, voice becomes a more practical means of controlling our environments. This week Steve Penrod shares the work that is being done on the Mycroft project and the company of the same name. He explains how he met the other members of the team, how the project got started, what it can do right now, and where they are headed in the future.

Brief Introduction

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable.
  • When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app.
  • You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan.
  • Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
  • To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers
  • Join our community! Visit discourse.pythonpodcast.com to talk to previous guests and other listeners of the show.
  • Your host as usual is Tobias Macey and today I’m interviewing Steve Penrod about the company and project Mycroft, a voice controlled, AI powered personal assistant written in Python.

Interview with Steve Penrod

  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Mycroft is and how the project and business got started?
  • How is Mycroft architected and what are the biggest challenges that you have encountered while building this project?
  • What are some of the possible applications of Mycroft?
  • Why would someone choose to use Mycroft in place of other platforms such as Amazon’s Alexa or Google’s personal assistant?
  • What kinds of machine learning approaches are being used in Mycroft and do they require a remote system for execution or can they be run locally?
  • What kind of hardware is needed for someone who wants to build their own Mycroft and what does the install process look like?
  • It can be difficult to run a business based on open source. What benefits and challenges are introduced by making the software that powers Mycroft freely available?
  • What are the mechanisms for extending Mycroft to add new capabilities?
  • What are some of the most surprising and innovative uses of Mycroft that you have seen?
  • What are the long term goals for the Mycroft project and the business that you have formed around it?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Annapoornima Koppad - Episode 81

Summary

Annapoornima Koppad is a director of the PSF, founder of the Bangalore chapter of PyLadies, and is a Python instructor at the Indian Institute of Science. In this week’s episode she talks about how she got started with Python, her experience running the PyLadies meetup, and working with the PSF.

Brief Introduction

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable.
  • When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app.
  • You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan.
  • Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
  • To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers
  • Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas.
  • Your hosts as usual are Tobias Macey and Chris Patti
  • Today we’re interviewing Annapoornima Koppad about her career with Python and her experiences running the PyLadies chapter in Bangalore, India and being a director for the Python Software Foundation.

Interview with Annapoornima Koppad

  • Introductions
  • How did you get introduced to Python? – Tobias
  • I noticed that you have been freelancing for several years now. How much of that has been in Python and how has that fed back into your other activities? – Tobias
  • While preparing for this interview I came across the book that you self-published on Amazon. What was your motivation for writing it and who is the target audience? – Tobias
  • Can you tell us about your experience with starting the PyLadies group in Bangalore? What were some of the biggest challenges that you encountered and how have you approached the task of growing awareness and membership of the group? – Tobias
  • You recently started teaching Python at the Indian Institute of Science. What kinds of subject matter do you cover in your lessons? – Tobias
  • What is it about Python and its community that has inspired you to dedicate so much of your time to contributing back to it? – Tobias
  • In what ways would you like to see the Python ecosystem improve? – Tobias
  • You were voted in as a director of the Python Software Foundation in the most recent election. Can you share what responsibilities that entails? – Tobias
  • What would you like to achieve with your time in the PSF? – Tobias

Keep In Touch

Picks

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Python for GIS with Sean Gillies - Episode 80

Summary

Location is an increasingly relevant aspect of software systems as we have more internet connected devices with GPS capabilities. GIS (Geographic Information Systems) are used for processing and analyzing this data, and fortunately Python has a suite of libraries to facilitate these endeavors. This week Sean Gillies, an author and contributor of many of these tools, shares the story of his career and contributions, and the work that he is doing at Mapbox.

Brief Introduction

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable.
  • When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app.
  • You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan.
  • Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
  • To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers
  • Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas.
  • Your host as usual is Tobias Macey
  • Today I’m interviewing Sean Gillies about writing Geographic Information Systems in Python.

Interview with Sean Gillies

  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Geographic Information Systems are and what kinds of projects might take advantage of them?
  • How did you first get involved in the area of GIS and location-based computation?
  • What was the state of the Python ecosystem like for writing these kinds of applications?
  • You have created and contributed to a number of the canonical tools for building GIS systems in Python. Can you list at least some of them and describe how they fit together for different applications?
  • What are some of the unique challenges associated with trying to model geographical features in a manner that allows for effective computation?
    • How does the complexity of modeling and computation scale with increasing land area?
  • Mapping and cartography have an incredibly long history with an ever-evolving set of tools. What does our digital age bring to this time-honored discipline that was previously impossible or impractical?
  • To build accurate and effective representations of our physical world there are a number of domains involved, such as geometry and geography. What advice do you have for someone who is interested in getting started in this particular niche?
  • What level of expertise would you advise for someone who simply wants to add some location-aware features to their application?
  • I know that you joined Mapbox a little while ago. Which parts of their stack are written in Python?
  • What are the areas where Python still falls short and which languages or tools do you turn to in those cases?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

K Lars Lohn - Episode 79

Summary

K Lars Lohn has had a long and varied career, spending his most recent years at Mozilla. This week he shares some of his stories about getting involved with Python, his work with Mozilla, and his inspiration for the closing keynote at PyCon US 2016. He also elaborates on the intricate mazes that he draws and his life as an organic farmer in Oregon.

Brief Introduction

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com
  • Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project
  • We also have a new sponsor this week. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan.
  • Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
  • To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers
  • Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas.
  • Your host as usual is Tobias Macey
  • Today we’re interviewing K Lars Lohn about his career, his art, and his work with Mozilla

Interview with K Lars Lohn

  • Introductions
  • How did you get introduced to Python?
  • You have an interesting pair of articles on your website that attempt to detail how you perceive code and why you think that formatting should be configured in a manner analogous to CSS. Can you explain a bit about how your particular perception affects the way that you program?
  • On your website you have some images of incredibly detailed artwork that are actually mazes. Can you describe some of your creation process for those?
  • What is it about mazes that keeps you interested in them and how did you first start using them as a form of visual art?
  • At Mozilla you have helped to create a project called Socorro which utilizes complexity analysis for correlating stacktraces. How did you conceive of that approach to error monitoring?
  • Can you describe how Socorro is architected and how it works under the covers?
  • At this year’s PyCon US you presented the closing keynote and it was one of the most engaging talks that I’ve seen. Where did you get the inspiration for the content and the mixed media approach?
  • For anyone who hasn’t seen it, you managed to weave together a very personal story with a musical performance, and some applications of complexity analysis into a seamless experience. How much did you have to practice before you felt comfortable delivering that in front of an audience?
  • In addition to your technical career you are also very focused on living in a manner that is sustainable and in tune with your environment. What kinds of synergies and conflicts exist between your professional and personal philosophies?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Lorena Mesa - Episode 78

Summary

One of the great strengths of the Python community is the diversity of backgrounds that our practitioners come from. This week Lorena Mesa talks about how her focus on political science and civic engagement led her to a career in software engineering and data analysis. In addition to her professional career she founded the Chicago chapter of PyLadies, helps teach women and kids how to program, and was voted onto the board of the PSF.

Brief Introduction

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable.
  • Check out our sponsor Linode for running your awesome new Python apps. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project
  • You want to make sure your apps are error-free so give our other sponsor, Rollbar, a look. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan.
  • Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
  • Leaving a review on iTunes or Google Play Music makes it easier for other people to find us.
  • Join our community! Visit discourse.pythonpodcast.com to help us grow and connect our wonderful audience.
  • Your host as usual is Tobias Macey
  • Today we’re interviewing Lorena Mesa about what inspires her in her work as a software engineer and data analyst.

Interview with Lorena Mesa

  • Introductions
  • How did you get introduced to Python?
  • How did your original interests in political science and community outreach lead to your current role as a software engineer?
  • You dedicate a lot of your time to organizations that help teach programming to women and kids. What are some of the most meaningful experiences that you have been able to facilitate?
  • Can you talk a bit about your work getting the PyLadies chapter in Chicago off the ground and what the reaction has been like?
  • Now that you are a member of the board for the PSF, what are your goals in that position?
  • What is it about software development that made you want to change your career path?
  • What are some of the most interesting projects that you have worked on, whether for your employer or for fun?
  • Do you think that the bootcamp you attended did a good job of preparing you for a position in industry?
  • What is your view on the concept that software development is the modern form of literacy? Do you think that everyone should learn how to program?

Keep In Touch

Twitter

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Podbuzzz with Kyle Martin - Episode 77

Summary

Podcasts are becoming more popular now than they ever have been. Podbuzzz is a service for helping podcasters to track their reviews and improve SEO to reach a wider audience. In this episode we spoke with Kyle Martin about his experience using Python to build Podbuzzz and manage it in production.

Brief Introduction

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable.
  • You need a place to run your awesome new Python apps, so check out our sponsor Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project.
  • You want to make sure your apps are error-free so give our next sponsor, Rollbar, a look. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan.
  • Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
  • Leaving a review on iTunes or Google Play Music makes it easier for other people to find us.
  • Join our community! Visit discourse.pythonpodcast.com to help us grow and connect our wonderful audience.
  • Your hosts as usual are Tobias Macey and Chris Patti
  • Today we’re interviewing Kyle Martin about Podbuzzz

Interview with Kyle Martin

  • Introductions
  • How did you get introduced to Python? – Chris
  • Can you start by explaining what Podbuzzz is? – Tobias
  • Why did you end up choosing Python as the language for building this service? – Tobias
  • What have been the biggest engineering challenges in building Podbuzzz? – Tobias
  • How did you conceive of the idea to build Podbuzzz and what inspired you to provide it as a service? – Tobias
  • Part of the service that you are building is a widget that encourages listeners to rate a podcast on iTunes. Why is that important and what are some of the techniques that you have leveraged to determine the most effective messaging? – Tobias
  • What are some of the features that you plan on adding to your service? – Tobias
  • Do you intend to run Podbuzzz as a side project or do you envision it becoming a company with its own staff? – Tobias
  • In addition to your work with Podbuzzz as a way for podcasters to gain visibility for their shows, you’re also working on an analytics platform for the same target audience. Can you explain a bit about that and the problems that you’ve had to overcome? – Tobias
  • What is it about podcasting that makes it hard to gain useful metrics and what is your strategy for overcoming some of those obstacles? – Tobias

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

GenSim with Radim Řehůřek - Episode 71

Summary

Being able to understand the context of a piece of text is generally thought to be the domain of human intelligence. However, topic modeling and semantic analysis can be used to allow a computer to determine whether different messages and articles are about the same thing. This week we spoke with Radim Řehůřek about his work on GenSim, which is a Python library for performing unsupervised analysis of unstructured text and applying machine learning models to the problem of natural language understanding.

Brief Introduction

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com
  • Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project
  • We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry’s real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com and use the code podcastinit at signup to get a $50 credit on your account.
  • Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
  • To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers
  • Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas.
  • Your hosts as usual are Tobias Macey and Chris Patti
  • Today we’re interviewing Radim Řehůřek about Gensim, a library for topic modeling and semantic analysis of natural language.

Interview with Radim Řehůřek

  • Introductions
  • How did you get introduced to Python? – Chris
  • Can you start by giving us an explanation of topic modeling and semantic analysis? – Tobias
  • What is Gensim and what inspired you to create it? – Tobias
  • What facilities does Gensim provide to simplify the work of this kind of language analysis? – Tobias
  • Can you describe the features that set it apart from other projects such as NLTK or spaCy? – Tobias
  • What are some of the practical applications that Gensim can be used for? – Tobias
  • One of the features that stuck out to me is the fact that Gensim can process corpora on disk that would be too large to fit into memory. Can you explain some of the algorithmic work that was necessary to allow for this streaming process to be possible? – Tobias
    • Given that it can handle streams of data, could it also be used in the context of something like Spark? – Tobias
  • Gensim also supports unsupervised model building. What kinds of limitations does this have and when would you need a human in the loop? – Tobias
    • Once a model has been trained, how does it get saved and reloaded for subsequent use? – Tobias
  • What are some of the more unorthodox or interesting uses people have put Gensim to that you’ve heard about? – Chris
  • In addition to your work on Gensim, and partly due to its popularity, you have started a consultancy for customers who are interested in improving their data analysis capabilities. How does that feed back into Gensim? – Tobias
  • Are there any improvements in Gensim or other libraries that you have made available as a result of issues that have come up during client engagements? – Tobias
  • Is it difficult to find contributors to Gensim because of its advanced nature? – Tobias
  • Are there any resources you’d like to recommend our listeners explore to get a more in depth understanding of topic modeling and related techniques? – Chris

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA