Illustrating The Landscape And Applications Of Deep Learning - Episode 234


Deep learning is a phrase that is used more often as it continues to transform the standard approach to artificial intelligence and machine learning projects. Despite its ubiquity, it is often difficult to get a firm understanding of how it works and how it can be applied to a particular problem. In this episode Jon Krohn, author of Deep Learning Illustrated, explains the general concepts and useful applications of this technique, and shares some of his practical experience in using it for his work. This is definitely a helpful episode for getting a better comprehension of the field of deep learning and when to reach for it in your own projects.

Do you want to try out some of the tools and applications that you heard about on Podcast.__init__? Do you have a side project that you want to share with the world? With Linode’s managed Kubernetes platform it’s now even easier to get started with the latest in cloud technologies. With the combined power of the leading container orchestrator and the speed and reliability of Linode’s object storage, node balancers, block storage, and dedicated CPU or GPU instances, you’ve got everything you need to scale up. Go to today and get a $100 credit to launch a new cluster, run a server, upload some data, or… And don’t forget to thank them for being a long time supporter of Podcast.__init__!


  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. Upcoming events include the combined events of the Data Architecture Summit and Graphorum, the Data Orchestration Summit, and Data Council in NYC. Go to to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Jon Krohn about his recent book, Deep Learning Illustrated


  • Introductions
  • How did you get introduced to Python?
  • Can you start by giving a brief description of what we’re talking about when we say deep learning and how you got involved with the field?
    • How does your background in neuroscience factor into your work on designing and building deep learning models?
  • What are some of the ways that you leverage deep learning techniques in your work?
  • What was your motivation for writing a book on the subject?
    • How did the idea of including illustrations come about and what benefit do they provide as compared to other books on this topic?
  • While planning the contents of the book what was your thought process for determining the appropriate level of depth to cover?
    • How would you characterize the target audience and what level of familiarity and proficiency in employing deep learning do you wish them to have at the end of the book?
  • How did you determine what to include and what to leave out of the book?
    • The sequencing of the book follows a useful progression from general background to specific uses and problem domains. What were some of the biggest challenges in determining which domains to highlight and how deep in each subtopic to go?
  • Because of the continually evolving nature of the field of deep learning and the associated tools, how have you guarded against obsolescence in the content and structure of the book?
    • Which libraries did you focus on for your examples and what was your selection process?
      • Now that it is published, is there anything that you would have done differently?
  • One of the critiques of deep learning is that the models are generally single purpose. How much flexibility and code reuse is possible when trying to repurpose one model pipeline for a slightly different dataset or use case?
    • I understand that deployment and maintenance of models in production environments is also difficult. What has been your experience in that regard, and what recommendations do you have for practitioners to reduce their complexity?
  • What is involved in actually creating and using a deep learning model?
    • Can you go over the different types of neurons and the decision making that is required when selecting the network topology?
  • In terms of the actual development process, what are some useful practices for organizing the code and data that goes into a model, given the need for iterative experimentation to achieve desired levels of accuracy?
  • What is your personal workflow when building and testing a new model for a new use case?
  • What are some of the limitations of deep learning and cases where you would recommend against using it?
  • What are you most excited for in the field of deep learning and its applications?
    • What are you most concerned by?
  • Do you have any parting words or closing advice for listeners and potential readers?

Keep In Touch


Closing Announcements

  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at


The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Raw transcript:
Tobias Macey
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app, or you want to try out a project you hear about on the show, you'll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 gigabit private networking, scalable shared block storage, node balancers, and a 40 gigabit public network, all controlled by a brand new API, you've got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode, that's L-I-N-O-D-E, today to get a $20 credit and launch a new server in under a minute, and don't forget to thank them for their continued support of this show. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don't want to miss out on this year's conference season. We have partnered with organizations such as Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. Upcoming events include the combined events of the Data Architecture Summit and Graphorum, the Data Orchestration Summit, and Data Council in New York City. Go to pythonpodcast.com/conferences today to learn more about these and other events, and take advantage of our partner discounts to save money when you register. Your host, as usual, is Tobias Macey, and today I'm interviewing Jon Krohn about his recent book, Deep Learning Illustrated. So Jon, can you start by introducing yourself?
Jon Krohn
Thank you very much, Tobias. It's an honor to be on the show. I am the chief data scientist at a machine learning company called untapt, and our focus is automating aspects of business operations, particularly related to human resources and recruiting. So that kind of human side of things, where we need to be building algorithms that carefully remove bias that might exist in the training data, for example. So that's my day job. But I also, on the side, had been writing a book that just came out a couple of weeks ago called Deep Learning Illustrated. It was published by Pearson. And that book is a product of me teaching deep learning in a number of different ways. So I've been running a deep learning study group community in New York for a number of years. I teach graduate electrical engineers at Columbia University every once in a while, and I also have my own curriculum, which is a 30 hour deep learning curriculum that I offer at a professional academy here in New York called the New York City Data Science Academy. So I'm wearing a lot of different hats, and I recently added one final thing to that, which is that as of September 2019 I have a National Institutes of Health grant to work with medical researchers at Columbia University to automate aspects of diagnosing infant brain scans. So lots of different things going on, but generally the thread that ties them all together is deep learning.
Tobias Macey
And do you remember how you first got introduced to Python?
Jon Krohn
Yes, I do remember being first introduced to Python. I was doing my PhD at Oxford University at the time, and I was working in MATLAB and R, and somebody who I respected a lot, a postdoc in my lab, came up to me and said, you know, there's really not any point in working in R anymore, everything's moving over to Python. And that began my quest.
Tobias Macey
And I noticed too, that you have a background in neuroscience. So I'm curious how that has played into your overall understanding and engagement with the field of deep learning.
Jon Krohn
Exactly. So that PhD was in neuroscience, and I did specialize in machine learning a bit even then. So while the data that I was working with were neuroscience data, so brain imaging data and genome data, genetic data, I was learning how to apply machine learning techniques as the kind of primary focus of that PhD. It has been something post-PhD, in the last few years, that these artificial neural networks, which form the basis of deep learning networks, started to become useful enough in a lot of applications. And that's related to compute becoming a lot cheaper in recent years, and data storage becoming a lot cheaper in recent years. And so this deep neural network approach, which is inspired by the way that biological brain cells work, by the way that biological neural systems work, has started to become useful. And so, because of that neuroscience background, I really took to learning about deep neural networks after my PhD, and wherever I can, I draw threads between the biological inspirations that are behind many of the innovations in neural network and deep learning research.
Tobias Macey
And before we go too much further, can you just give your description of how you would define the term deep learning for somebody who's not familiar with it?
Jon Krohn
That is a great question, Tobias. I'm glad you're asking it at this point. So deep learning is a very specific technique. While it gets used in the popular press as kind of a synonym for artificial intelligence, artificial intelligence is an almost impossible to define term, whereas deep learning is a very specific term which can be defined quite concretely. So since the 1950s, computer scientists, inspired by the way that the biological brain works and biological brain cells work, have been creating computer simulations, simple algorithms that are inspired by the way biological brain cells work. And so we call those algorithms artificial neurons. Those artificial neurons can be linked together so that the output from one artificial neuron can form the input to several other artificial neurons, and in that way we can have a network of artificial neurons. So in these artificial neural networks, you have an input layer that contains whatever your input into your model is, and then you have an output layer that represents whatever prediction you're trying to make with your model. And then in between that input and that output, you have as many of what we call hidden layers of artificial neurons as you like. And if you layer it in this way, and you have at least three hidden layers, so a total of five layers when you include that input and that output layer, you can call this a deep neural network, or a deep learning network.
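The layered structure described here, an input layer feeding several hidden layers of artificial neurons and then an output layer, can be sketched in a few lines of NumPy. This is a minimal illustration only: the layer sizes, the ReLU and sigmoid activations, and the random weights are assumptions made for the sketch, not an example taken from the book.

```python
# A toy forward pass through a "deep" network: an input layer,
# three hidden layers of artificial neurons, and an output layer.
import numpy as np

rng = np.random.default_rng(42)

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 8 inputs -> hidden layers of 16, 16, and 8 neurons -> 1 output neuron.
layer_sizes = [8, 16, 16, 8, 1]
weights = [rng.normal(0, 0.1, (m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    """Each artificial neuron sums its weighted inputs, adds a bias,
    and applies a nonlinearity; its output feeds the next layer."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = relu(a @ W + b)
    # Output layer: a single probability-like prediction.
    return sigmoid(a @ weights[-1] + biases[-1])

x = rng.normal(size=8)   # a stand-in for one input example
y_hat = forward(x)
print(y_hat.shape)       # (1,)
```

In a real model the weights would of course be learned by backpropagation rather than left random; the point of the sketch is only the topology: five layers in total, three of them hidden, which is the threshold Jon gives for calling a network "deep".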
Tobias Macey
And you mentioned that the majority of the work that you're doing and your day job is around machine learning applications for being able to apply it to these various use cases. And I'm wondering how you are leveraging deep learning techniques in your own work?
Jon Krohn
Yeah, great question. So that structure that I just described, having many layers of these artificial neurons, allows these deep learning networks to automatically extract the most important aspects of the data that you're inputting into your model for predicting whatever the outcome is that you're trying to predict. To use a biological visual system analogy, the way that this works is, if you build a machine vision algorithm with a deep learning network, then your input layer will have the pixels of an image as the input, and the output layer of your neural network might then be the class that that image corresponds to. So let's say you're building an image classifying machine vision system that is designed to distinguish cats from dogs. In that case, you might have 100 images of dogs that you input and 100 images of cats that you input, and you label all of those images as being either cats or dogs. So we're setting up our deep learning model in a way so that it can learn to associate pixels that represent a cat with the label cat, and pixels that represent a dog with the label dog. And so the hidden layers of this machine vision network automatically learn how to extract the most important information about those pixels in order to represent a cat or a dog, or more specifically, to distinguish a cat from a dog. The first layer of artificial neurons in this many layered network will come to represent very, very simple aspects of the pixels, essentially just straight lines at particular orientations. So some of the artificial neurons in that first layer will represent vertical lines, some of them horizontal lines, and 45 degree angles, and so on. And then the second layer of artificial neurons in this deep learning network can take in that information about straight line detection.
And those straight lines can be nonlinearly recombined, so that that second layer of artificial neurons can detect curves and corners. And then you can have a third layer after that, that does even more complex abstraction on the curves and corners, and so on and so on. You can have many, many such layers of artificial neurons in your deep learning network, and each one, as you move deeper, can handle more complex, more abstract representations of the input data. And the really, really cool thing about deep learning models is that they are able to figure out what these important high level abstract representations are fully automatically, from the training data alone. So you don't need to program any of that specifically. And that's what's made deep learning models so popular suddenly: as we've had the compute power and the availability of data in the last few years to be training these relatively beefy models, they can then, on their own, extract all of these features from the raw data and solve all kinds of complex problems. Extending that visual analogy, you can imagine how this applies in my line of work in my day job at untapt. We're concerned with various models related to human resources. A really common model is predicting the fit of a given job applicant for a particular job. So we have clients, big corporate clients or recruitment agencies, that handle millions of applications a year to thousands of different roles. And instead of sifting through all of those applicants with, say, a Boolean keyword search, our model can rank all of the applicants, the million applicants that you had over the last year, for any one of the roles that you are hiring for. And it does that based on the natural language of the job descriptions and the natural language of the applicant's resume. And we've trained this up on hundreds of millions of decision data points where
where a client, be that a hiring manager or a recruiter, has said, okay, based on this candidate profile and based on this job description, yes, I would like to speak to this candidate, or no, this candidate is not appropriate for this role. So by having this huge data set, and then a deep learning model that's taking in that natural language from the job descriptions and the resumes at one end, and then this outcome that we're trying to predict, is this person a good fit or not a good fit for the role, we then have this deep learning architecture in the middle, where the earliest levels of the architecture can look for very simple aspects of the natural language, and as you move deeper and deeper into the network, we can model increasingly complex, increasingly abstract aspects of the natural language that is being used on the resumes and the job descriptions. So because of the way that that works, you could end up in a situation where two candidates who have no overlapping words whatsoever on their resumes could be the top two candidates for a given job description, because this deep learning hierarchy is able to distill, from individual words, the contextual, holistic meaning of an entire candidate profile.
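As a toy illustration of the first-layer edge detectors from the machine vision analogy above, the sketch below convolves a made-up image with hand-written vertical and horizontal edge kernels. In a real deep learning network these kernels are learned from the training data rather than hand-coded; the Sobel-style kernels and the 8x8 test image here are assumptions made purely for the example.

```python
# Hand-rolled first-layer "feature detectors": oriented edge kernels
# applied to a tiny synthetic image via 2D convolution.
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, the core op of a convolutional layer."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Sobel-style kernels: one responds to vertical edges, one to horizontal.
vertical = np.array([[-1, 0, 1],
                     [-2, 0, 2],
                     [-1, 0, 1]], dtype=float)
horizontal = vertical.T

image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0    # a bright square on a dark background

v_map = np.abs(conv2d(image, vertical))
h_map = np.abs(conv2d(image, horizontal))
# The vertical detector fires along the square's left and right edges,
# the horizontal one along its top and bottom, and both are silent in
# the flat interior. Deeper layers would recombine such feature maps
# nonlinearly into corners, curves, and ever more abstract features.
```

The same intuition carries over to the natural language case Jon describes: early layers respond to simple local patterns (here, edges; there, individual words and phrases), and only the deeper layers represent the holistic meaning.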
Tobias Macey
So with your background in neuroscience, and your practical applications of deep learning techniques, and your engagement in the education space of helping to upskill people who are trying to understand how to use these same technologies for their own purposes, it seems like a natural progression to then write a book about it. But I'm wondering if you can just talk a bit more about your motivations for doing that, some of the decision making process that went into figuring out how to approach it, and how the idea of including illustrations came about and the benefits that those provide.
Jon Krohn
Excellent question. So by running the deep learning study group that I run, and by teaching, say, at the New York City Data Science Academy, I developed a pretty good understanding of what topics needed to be covered in order to give somebody a wide ranging education in deep learning. So this is covering the fundamentals of how deep learning works, as well as the applications that people are most interested in, which are machine vision, natural language processing, these generative adversarial networks that can create what appears to be artwork, and then as well these game playing algorithms, deep reinforcement learning algorithms. So I gradually became more and more familiar with this body of knowledge. And by teaching it to students, I started to understand where they were most easily able to understand the content and where things were tricky. And what I found was, by using a whiteboard, and this is actually something that I've always been doing, I love teaching on whiteboards, drawing figures that represent concepts. For a lot of people, an equation can be a lot easier to understand if I can draw it visually, in terms of how the matrices of data are being used and transformed, and how these operations are happening, in kind of a visual way. So that's always been kind of a natural thing to me, and it became clear to me through teaching that this is something that works for a large number of students. It's a way that they really take to learning this relatively complex content. So at brunch one day, on a Sunday in New York, I was out with one of my best friends, who has been at Alphabet, working at Google and YouTube, for about 12 years. And at the time his girlfriend, now his wife, Aglaé Bassens, she is a professional artist.
And I pitched her this idea over brunch: you know, I think if we made this as a book, if we had an illustrated approach to learning about deep learning, this is something that a lot of people would really benefit from. What do you think about that? And perhaps because, through her now husband, she had been exposed so much to machine learning techniques at Alphabet, she was immediately very interested, and she was an absolute joy to work with over the entire process. So yeah, that's how it all came about.
Tobias Macey
And I'm curious if you can give a bit of a comparison to some of the other books that you've encountered on the subject of machine learning or deep learning and some of the benefits that you see your book providing comparatively. And some of the some of the ways that the target audience that you're focusing on would gain better understanding or better value than some of the other books that they might be able to pick up. Perfect. So
that's not a question I've been asked before. And it's an interesting one, because all of the books that I guess my book, quote, unquote, competes with, you know, there's some there's some benefit to them, relative to my book, and, you know, there's just there's always some kinds of trade off with all the great books and deep learning. The the seminal academic text in deep learning is called Deep Learning, and it's by Ian Goodfellow and Joshua Ben geo and Aaron Carville. So these are these are academics between the University of Montreal and the Google brain team. And they developed this. Yeah, this academic textbook that covers the the mathematical theory of deep learning very thoroughly. However, it doesn't have any hands on examples. So it's so that so that book is all about learning theory. And and so our book is completely distinguished from that by by being focused on application. And while we do cover the essential parts of the theory, we do that in a way that is quite different from that MIT Press book. So where we have to use equations, they are in full color. So we have, so any variables that are used in the book, their coloring is continuous throughout the entire book, and you see those colors replicated in both the body of the text and in the equations and in the illustrations. into that kind of continuity. And that color can make it easier to understand the underlying theory. And then of course, just are having lots of hands on applications means that it can have a lot more value to somebody who's a hands on practitioner. And that those kinds of hands on examples can also give you a great sense of how these things work in practice, in a way that just looking at the equations and understanding the equations might not. So that's, you know, that's kind of, you know, one of our primary quote unquote, competitors in the space. The most popular book probably today in terms of hands on applications is, are really in journals, hands on machine learning book. 
And that book, while introducing machine learning in general, also does talk a lot about deep learning, which is a type of machine learning approach in particular, and it is a really, really great book. His second edition is coming out shortly, and I was one of the technical reviewers of that second edition. It's no surprise that it's the most popular book in machine learning today, because it offers such a wide ranging look at all of the kinds of machine learning approaches that are out there, and is replete with hands on examples. So that book is really great. And where we're distinguished from that is, again, we're focused specifically on deep learning. So while that book is focused on being a general machine learning introduction, our book is very specifically about deep learning, and so we can go into that in more depth than Aurélien had time for. And then again, of course, we do have these colorful illustrations, and the way that we tie together all of the variables in full color throughout the figures, the equations, and the body text. So it is a different kind of book. To my knowledge, there is no book on the market that makes use of color for explaining any kind of theory, any kind of mathematical or statistical theory, in the way that we have. And actually, that's kudos to my co-author Grant Beyleveld, who had that insight. He suggested early on that wherever we have to include equations, it would be beneficial to the reader to have those be in full color.
Tobias Macey
Yeah, it definitely helps to pick apart the most notable components of the equation, rather than just seeing everything in plain black and white, because then it all just sort of blends together, and it requires a lot more effort to be able to parse it and understand how the different pieces are interacting with each other.
Yeah, exactly. That's the idea, Tobias. I'm glad that you that you see it that way as well. So when we you know, when you see an equation in the book, and then below it, there's an explanation of what this equation is, you can just very quickly say, Ah, well, there's the purple part of the equation and here's, you know, that purple part in the text, and you can very quickly make that connection and then in some cases where we've been on top of that, say okay here, here's a figure that kind of explains, you know how these pieces fit together visually. And you can, again, just at a glance see across the figure, the equation and the body of the text very clearly, this is the purple part and it I'm glad that you see the value in that too.
Tobias Macey
And for the target audience of the book, I'm curious how much background understanding of programming or statistics or machine learning is necessary and to what level of facility you expect them to get to by the end of the book.
Jon Krohn
So I deliberately designed the book so that the first four chapters of the 14 chapters have no code and no equations. So the first four chapters of the book are intended for any kind of interested learner, anybody with an interest in how deep learning or artificial intelligence works, and who is interested in getting exposed to the range of applications that it has. So whether it's machine vision, natural language processing, creativity, or complex decision making, regardless of which of those you're interested in, or all of them, and just seeing what it means to have artificial intelligence today, where this field is going, and what's possible in your field, anybody can get that from reading the first four chapters. In chapter five, we begin introducing Python code, and then through the rest of the book, all the way through to chapter 14, there are examples in Python. Especially in the earlier chapters, these examples are fairly straightforward. So if you have experience with any object oriented programming language, not necessarily Python, then it should still be quite straightforward to see what's happening in these code examples. And I went to great lengths to make sure that I explained in detail every single line of code in the body of the text. So even if you're not already familiar with Python, or even if this is your first exposure to object oriented programming, those thorough explanations should make it possible, though maybe not as easy as for somebody who does have Python experience, to follow along and see what's happening in these examples. So some Python experience, or at least object oriented programming language experience, would definitely make taking in chapters five and onwards easier. And then it's the same kind of thing for machine learning or statistics experience.
So if you happen to have experience with statistics, or some other machine learning approaches like regression modeling, support vector machines, random forests, or what have you, or just the scikit-learn library in general, then the book would definitely be easier. But again, I went to great lengths to make sure that I was explaining everything as clearly as I could, so that even if you didn't have experience in machine learning or statistics, you should be able to follow along at a high level. And then I provide lots of resources in footnotes, so that if there's something that you need to dive deeper on, you can do that on your own time.
Tobias Macey
And one of the challenges that exists anytime somebody is trying to encapsulate a technical topic in printed form is the idea of timeliness, and how you guard against the information becoming obsolete as new techniques evolve, new libraries come about, and the libraries themselves evolve. And so I'm curious how you approached that particular problem, and your selection process for the technologies and techniques that you ultimately decided to incorporate.
Jon Krohn
I love that question, Tobias. So that is a tricky one. Things move very quickly in the machine learning field, and I expect that there will be a second edition of this book coming in the next few years that will be updated to the latest libraries, you know, the latest TensorFlow, PyTorch, or Keras library, or whatever is the in-vogue deep learning library of the day a couple of years from now. So the specifics of the particular packages that get used will definitely change. The nice thing about deep learning is that the vast majority of the theory is quite old already. The theory around the artificial neurons that make up a deep learning network has been around since the 1950s and hasn't changed very much. And then, in terms of actually networking those artificial neurons together into a deep learning network, most of that theory was figured out in the 80s, and a little bit in the 90s. And then in the early 2010s we had a few key breakthroughs, but those breakthroughs through the 90s and more recent years kind of tack on to the earlier theory. And so in deep learning, at least so far, we're not seeing old theory wiped away entirely and replaced with a completely new approach to some theoretical concept. Instead, what we've been seeing from the 1950s through today is that we build upon existing theory. And so in that sense, I think the vast majority of the content in this book is future proof, at least for a decade or so. It's possible that some completely different kind of approach will make deep learning obsolete in the coming years, but there's no sign of that yet. And so when I sit down to write the second edition a couple of years from now, I'm not going to need to rewrite all of the theory.
Instead, I'll just be tacking on more of the new techniques, new approaches that have come about in the intervening couple of years,
Tobias Macey
And now that it has been published, I'm curious if there are any elements of the topics that you covered, or the specifics of the code examples, that you think you would have done differently, or that you think might need updating in the near future.
There isn't anything that I look at now that I feel would need a complete overhaul, or that I wish had been done completely differently. The main thing that I look forward to being able to do, as I sit down to write a second edition, is being able to add more. This book is already a fair bit more dense than the publisher, Pearson, was looking for: they were hoping for at least 250 pages, and the book is now 416 pages. So it does have a ton of detailed content, but there's so much more that I would like to add. So no, there isn't really anything that I would like to do differently; I just look forward to having the time to add in even more information. And that's also the kind of thing that we saw with Aurélien Géron's book, which I mentioned earlier. That first edition was already so comprehensive as an introduction to machine learning, but with his second edition he was able to add in even more detail on so many different topics and make a much thicker book. So I look forward to being able to do that with my second edition as well.
Tobias Macey
And you covered a few different problem domains where deep learning can be applied, such as natural language processing, computer vision, and generative adversarial networks. So I'm wondering what your selection process was for the specific problem domains, and how you approached determining the sufficient level of depth to cover each one appropriately, so that the reader could get a good understanding of it without exploding the length of the book beyond what would really be tenable for somebody to consume in a reasonable amount of time.
So the initial seed for what content went into the book was the content that we were covering in the deep learning study group that I run. At the end of every study group session, our final agenda item was always: all right, now let's talk about what else we should be learning. What should we be learning for next time, or what should we be putting on the list to learn at some point? And these particular applications, computer vision, natural language processing, generative adversarial networks, and deep reinforcement learning for complex sequential decision making, stood out as clearly the most important areas. So that's how I came up with the initial list of high-level topics to cover in the book. Then, in terms of depth: for every one of those topics, there is at least one very deep and detailed hands-on code example. For some of these techniques, like generative adversarial networks or deep reinforcement learning, which are the two most complex topics covered in the book, having one thorough code-notebook example and covering it from beginning to end was more than enough material for the reader, in my opinion. The other topics, machine vision and natural language processing, are areas with a lot of different things that we can be doing in them. With machine vision, we can be classifying images as being in a particular category, or we can be segmenting images, pixel by pixel, into the different elements of the image. In natural language processing there's a huge variety of tasks that can be handled: classifying documents, auto-generating content, translation between languages, chatbots. And some of those start to get way too complex to cover in this kind of overview book, things like machine translation or chatbots.
Well, everything that we cover in the book serves as a great foundation for those applications, but they would need whole chapters to cover properly. And so, with those machine vision and natural language processing topics, what I did was to include several complete, thorough examples of intermediate-complexity topics, and then I say: hey, if you're interested in these even more complex topics, here are a few paragraphs that summarize what's possible today, and here are links to the key papers and GitHub repositories so that you can go off and learn about those things on your own. And that natural language processing topic in particular, because that is what I do at my day job at untapt, is of particular interest to me. I also know that it's of particular interest to readers, because I teach online on O'Reilly Safari twice a month, where I do a three-hour tutorial, and I do lectures around New York at various meetups and conferences. At the end of each of those, and some of these have hundreds of people in the audience, I ask: okay, what are you most interested in learning about next? Is it machine vision? And some hands go up. Is it generative adversarial networks? And some hands go up. Is it deep reinforcement learning? And some hands go up. But when I ask, is it natural language processing, or is it handling time-series information, because natural language is just an example of time-series data (whether it's words on a page or audio of speech, it flows in one dimension over time), it's that topic where you see a huge number of hands go up. So I know that that's a huge area of interest. And so my next book, and I have a verbal agreement with Pearson on this already, they're awaiting my full proposal, is going to be focused entirely on natural language processing.
And so that will give me the opportunity to expand more on that particular topic.
Tobias Macey
And one of the other things that I liked in the way that you structured the book is that at the tail end of it you had some examples of what else you can do, ways that you can continue your learning, and some different project ideas or categories that the reader can engage with. I especially liked the fact that you were encouraging people to do things that will have a beneficial social impact, and provided some resources for them to find ideas for that and engage with different organizations that would benefit from that technical acumen.
I'm really glad that you enjoyed that part of it. For me, that was a really important chapter to write. The ideas behind that final chapter were spurred largely by my experience teaching this content at the New York City Data Science Academy. It's a 30-hour curriculum that I do over five Saturdays, and this textbook really is the accompanying content to the lectures and exercises that I do over those 30 hours at the Academy. I knew from doing that teaching that what students want is not just to be able to go through the examples that you've done in class; people want to be able to devise their own projects, they want to be creative with deep learning, and they want to be able to apply deep learning to their particular field of interest. So that final chapter comes out of my experience mentoring students on developing their own deep learning projects. A big part of that 30-hour course, from the very first week, is that I say: okay, you don't have to do your own self-guided project, but it will really help you cement the ideas that we're covering in this course, so I highly recommend that you do. And from the very first week I have a framework for initially ideating on, and then later refining, a particular project and executing on it over the course of the course. So that final chapter is that process, where I outline: okay, if you don't have a particularly creative idea of something that you'd like to do, here are some relatively easy ideas and off-the-shelf data sets you can use; if you want to be doing something with your own data, here are some tips for doing that; if you want to be exploring more complex data sets, or scraping your own data sets off the web, here are some resources for doing that. And then there's the final piece about the kind of social impact that you can have.
Making that clear was something I didn't have to include, and I'm not aware of many other textbooks that make that kind of social-impact summary or recommendation at the end. But for me, as a relatively young person at this time in the history of our planet, we have a terrific opportunity. In so many quantitative ways, life has never been better on this planet, for humans at least, in terms of lifespan and quality of life; we live today in a way that kings a century ago couldn't have imagined. So on the one hand, we should definitely be happy and positive about where we are in the world. But there's also a lot of uncertainty about where we are in the world. There are far more people on this planet than there ever have been in history, and each one of those people is constantly demanding more and more energy and resources. The burden that we are placing on the ecosystem of our planet is tremendous, and it looks like it's going to become heavier and heavier. And so machine learning, combined with the Internet of Things, the prevalence and cheapness of sensors being everywhere, in my view has the potential to allow us to continue the wonderful trend that we've had over the last 150 years toward prolonging human life and making human life more satisfying than ever before, while at the same time allowing us to coexist peacefully and indefinitely on this planet. So that's a bit of the inspiration behind suggesting that people tackle social-impact projects. And then I include resources in there: if you're looking for something to do with your time, or your machine learning skills, here are some serious problems that we're facing today that could be worth focusing your attention on.
Tobias Macey
One of the things that I'm curious about, in terms of people coming out of this book or your course with the fundamentals of being able to build these neural networks, having built out some sample models, is the ability to repurpose some of that same code, or some of the model pipelines, for different applications or different data sets. My understanding is that one of the critiques of deep learning is that it is largely single-purpose: once you build a model, it is great at that particular use case, but it is generally difficult to repurpose it to a slightly different context. I'm wondering what your experience has been with that, and what recommendations you have for practitioners and engineers to make it easier to componentize the model pipeline so it is more reusable and more flexible.
Outstanding question, Tobias.
So neural networks are interesting in that they are actually highly flexible and can be retrained for particular tasks. While any given deep learning network at any given point in time might be very highly specialized to a particular task, you can use what we call transfer learning to take that existing network and repurpose it to some related task quite effectively. In chapter 10 of the book, we go over a machine vision example where we take a very deep neural network that was trained on a huge data set of millions of images, called the ImageNet data set. It would be very expensive computationally, it would take weeks on a high-end deep learning server with GPUs, to train up such a deep machine vision model on such a large data set. But with modern deep learning frameworks, including the Keras API in TensorFlow, you can trivially easily, in a line of code, load in that deep, very nuanced machine vision model and then adapt it to your own particular use case. So in Deep Learning Illustrated, we use transfer learning in order to be able to distinguish images of hot dogs from images that are not hot dogs, so other types of fast food. That's a funny idea inspired by the HBO series Silicon Valley, where one of the characters on that show builds a "hot dog, not hot dog" detector. So in that sense, deep learning models are quite flexible. Now, of course, you can't take a model that was built for a machine vision task, one that reads in pixels and outputs whether the image is a hot dog or not, and have it read in résumés and job descriptions and predict whether a given person is a good fit for a given role. So there's a limit to what this transfer learning can accomplish.
So I guess overall I don't agree with the point that deep learning models are quite fixed or not usable for other purposes; they actually are quite easily repurposed to related kinds of tasks, and this kind of transfer learning can be a very powerful thing to do. In fact, to give one final example from the work that we do here: our core model, the one model that we've applied for a patent for, is this job-to-candidate matching algorithm, which we trained on hundreds of millions of data points. But many of our clients are interested in other human-resources or recruiting-related models. And so often what we do is take that starting point, this beefy model trained on a huge data set, and repurpose parts of it for other human-resources-related tasks, like matching a candidate to a pool of other candidates, for example.
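The transfer-learning idea Jon describes (freeze a network trained on one task, then train only a small new "head" on your own task) can be sketched without any deep learning framework at all. In the book's actual example the Keras API loads a large pretrained ImageNet model and marks its layers non-trainable; in the sketch below, a fixed linear map stands in for that frozen feature extractor, and only a logistic "head" is trained, on an invented toy data set. All the numbers here are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Stand-in for a frozen, pretrained feature extractor. In real transfer
# learning this would be a deep network trained on something like
# ImageNet, with its layers marked non-trainable; it is never updated.
FROZEN_W = [[0.2, -0.5, 0.8, 0.1],
            [0.7, 0.3, -0.4, 0.6],
            [-0.1, 0.9, 0.5, -0.7]]

def extract_features(x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in FROZEN_W]

# The new trainable "head" for our task (hot dog vs. not hot dog, say).
head_w = [0.0, 0.0, 0.0]
head_b = 0.0

# Tiny invented labelled data set: raw inputs and binary labels.
data = [([1, 0, 1, 0], 1), ([0, 1, 0, 1], 0),
        ([1, 1, 1, 0], 1), ([0, 0, 0, 1], 0)]

lr = 0.5
for _ in range(300):                      # train only the head
    for x, y in data:
        f = extract_features(x)
        p = sigmoid(sum(w * fi for w, fi in zip(head_w, f)) + head_b)
        err = p - y                       # gradient of log-loss w.r.t. the logit
        head_w = [w - lr * err * fi for w, fi in zip(head_w, f)]
        head_b -= lr * err

preds = [round(sigmoid(sum(w * fi for w, fi in zip(head_w, extract_features(x)))
               + head_b)) for x, _ in data]
print(preds)
```

Because the expensive part (the feature extractor) is reused as-is, only a handful of head parameters need training, which is why transfer learning is so much cheaper than training the full network from scratch.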
Tobias Macey
And another issue that I've seen identified with machine learning in general, but particularly with deep learning, is the fact that it is excellent at identifying correlations, but that it is difficult to draw causal inferences about the results that it produces. I'm wondering what your thoughts are on that, and whether there are any useful references that you might be able to point people to, to dig deeper on that particular problem space.
It is definitely a shortcoming of deep learning models today, and indeed of most statistical or machine learning science, that the vast majority of techniques that exist for modeling data, whether it's statistics, deep learning, or some other machine learning process, are great at identifying correlations but have little, and usually no, capacity to say anything about a causal direction. That isn't the case with everything. Actually, a big part of my PhD was using particular machine learning models and Bayesian statistical models to infer causality, which in the case of my PhD research was in some cases relatively straightforward. For example, if you find a correlation between a gene and some behavior, well, we know that genomes are fixed over a person's lifespan, except for random mutations, and so there's no way that the causal direction could run from somebody being anxious to their genetic sequence changing into the profile of somebody who's anxious. So there are some problem spaces where you can define the causal direction based on your knowledge of the data, but of course the model itself is not aware of that underlying understanding. Gary Marcus is a researcher at New York University who has called out a lot of shortcomings of deep learning models today, and one of the big ones is this inability to infer causality. I don't actually have specific resources on how to resolve that, but Gary Marcus might, and I cite a Gary Marcus paper in chapter 14 of my textbook that could probably point you in the direction of resources on pursuing models that have more causality in them; that paper is one resource.
Another resource I could point people in the direction of is an author named Judea Pearl, J-U-D-E-A, Pearl like pearls from an oyster, who has written extensively on causality. He has several books on it, including one called Causality, and that might provide people with some techniques for identifying causal direction in the data that they're working with. But yes, as deep learning models typically stand, there's no capacity to infer causal direction, and that is one of the shortcomings we'll have to overcome, as Gary Marcus himself points out, in order to bridge the gap from the narrowly defined artificial intelligence systems that we have today, which are able to identify what's in an image accurately, to a general intelligence that is more like the broad intellectual capacities that you and I have as human beings. So there's a huge amount of work required in that space, and I imagine there are going to be tens of thousands of deep learning engineers over the coming decades tackling that problem.
Tobias Macey
And what are some of the other limitations of deep learning as a practice, and some of the cases where you would recommend against using it?
Another really great question, Tobias. In the same vein as causality being a difficult thing for deep learning algorithms to identify, it is often difficult to explain why a deep learning model has made a particular decision. If you use a linear regression model to solve a particular problem, then your regression model might have a dozen inputs, and each of those dozen inputs is associated with a very specific weight. For any prediction that the regression model makes, we can look at those weights and say: okay, the reason why this linear regression model made this prediction is because of factors X, Y, and Z, and those factors are weighted by these very specific amounts that the linear regression model has identified. In deep learning, because we can have millions or even billions of parameters in our networks, it can be difficult to put your finger on exactly why the model produced some particular output, and so some people talk about deep learning models as being a black box. Now, there is quite a bit of research being done on so-called "explainable AI," which gives us some insight into what is happening in the black box. Because of my work at untapt building these human-resources models, this is something that all of us here are very familiar with, because when you're building a model that can recommend a particular individual for a particular role, it's absolutely imperative to our clients that they know that it isn't happening based on some demographic factor like gender, age, or race.
And so it is possible to begin to distill the important parts. If you're building a model to predict the suitability of a given candidate for a given role, for example, one thing that we've done is to say: okay, we have this big pool of female applicants for a role and this big pool of male applicants for the role; what are their relative scores, and how did they score on average for this role? And we see that, with our model and the modeling process that we followed, regardless of gender, the distribution of probabilities looks identical. So although we might not understand how every single neuron amongst the millions of neurons in our artificial neural networks is behaving in order to produce the outcome, we can be mindful about what training data we're using to train the network and remove biased or problematic data from the inputs. And then, after we've done training, we can run these kinds of tests, like the one I just described, to make sure that males and females are getting the same scores for a given role, and so ensure that the precautions that we took have been effective in preventing bias. So even though there's a black box, we understand its behavior sufficiently that we're comfortable using it.
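The kind of post-training audit Jon describes, comparing score distributions across demographic groups, can be sketched simply. The scores below are synthetic stand-ins, drawn from the same distribution for both groups, which is what you would hope to see from an unbiased model; in practice they would come from the trained model's outputs on held-out applicants.

```python
import random
from statistics import mean

random.seed(42)

# Hypothetical model scores for two applicant groups. A real audit would
# take these from the trained matching model, not from a random draw.
scores_a = [random.betavariate(5, 3) for _ in range(1000)]
scores_b = [random.betavariate(5, 3) for _ in range(1000)]

# Compare the group means. A real audit would also compare the full
# distributions, not just a single summary statistic.
gap = abs(mean(scores_a) - mean(scores_b))

# Compare selection rates at some hypothetical decision cut-off (0.7 here).
rate_a = sum(s > 0.7 for s in scores_a) / len(scores_a)
rate_b = sum(s > 0.7 for s in scores_b) / len(scores_b)

print(f"mean gap: {gap:.4f}, selection rates: {rate_a:.2f} vs {rate_b:.2f}")
```

The point of the check is exactly what Jon describes: even when the individual neurons are inscrutable, the model's behavior at the group level can still be measured and compared.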
Tobias Macey
And in terms of the ongoing research and applications of deep learning, and the direction that the field is going, I'm wondering what you're most excited by and what you're personally concerned about.
So it's easy for me to answer the most-excited-by question, because I kind of already answered it earlier in this podcast: the thing that most excites me about deep learning is its capacity to allow us to continue to make exponential strides in human quality of life, and maybe even quality of life for other animals on the planet, while simultaneously avoiding an ecosystem catastrophe. That potential for machine learning techniques, and particularly deep learning techniques, is what I'm most excited about. Those are applications that I can foresee being possible over the coming decades. Beyond that, there is this possibility, and it's only a theoretical possibility, that we can engineer an intelligent system that is as intelligent or more intelligent than a person or any group of people. And that's potentially exciting; people call that the singularity. I think Ray Kurzweil coined that term to refer to the moment when we build a machine that is more intelligent than humans. That's kind of exciting in a way too, although it brings about a huge amount of fear as well, because we have no idea how such a system would treat humankind or how we would interact with it; it's impossible for us to even imagine. There are some great resources if you're interested in thinking about that problem. Yuval Noah Harari, who is most famous for his book Sapiens, also has a great book called Homo Deus, a Latin term he coined: if Homo sapiens is "thinking man," then Homo deus is "god man." A big part of that book is talking about what could happen, and how we might be treated by an intelligent life form on this earth, one that we create, that is much, much more intelligent than us.
That's kind of a dense read, though. If you're interested in a relatively quick introduction to the topic, Tim Urban, who writes the blog Wait But Why, and who really kindly gave us an endorsement for Deep Learning Illustrated that appears on the back of the book and inside the front cover, does a great long-form series of blog posts: it's two blog posts that cover what artificial intelligence is today and what could happen as we approach or go past that singularity. So in the short term, over the coming decades, the thing that I'm most excited about is the capacity for machine learning, deep learning, and the Internet of Things to make life on this planet more wonderful and peaceful than ever before, continuing the trends that we've seen over the past century. And in the longer term, I'm in equal measures excited and afraid of what could happen if the singularity happens.
Tobias Macey
Are there any other aspects of the field of deep learning or your work on the book that we didn't discuss yet that you'd like to cover before we close out the show or any other parting words that you'd like to give to the listeners and potential readers?
That's a great opening, Tobias, but nothing actually comes to mind. I got to talk about so many of the things that excite me most about deep learning in our conversation today, and I even got to talk about the social-impact concepts a couple of times. So I'm really satisfied, Tobias. I really enjoyed this podcast today, and I hope your listeners take away some interesting tidbits from it.
Tobias Macey
Well, for anybody who wants to get in touch with you or follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And with that, I'll move us into the picks. This week I'm going to choose a website called Spurious Correlations. It's a list of different data sets that correlate but are obviously not causally related. One of the examples is that the divorce rate in Maine correlates with per-capita consumption of margarine. So it's a list of charts with hilarious correlations that are obviously not causally related, as a sort of warning not to read too much into the fact that two data sets happen to relate to each other, and to make you think twice about that. And so with that, I'll pass it to you, Jon. Do you have any picks this week?
That is a really fun website; I've come across it before. A recommendation I have, that I use primarily to keep up with innovation in data science and deep learning in particular, is a great newsletter called Data Elixir, spelled E-L-I-X-I-R. It's a one-man blog with between half a dozen and a dozen articles in it each week, and I have never come across anything that so simply captures all the major events that you need to keep an eye on in the world of data.
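The Spurious Correlations pick makes a point that is easy to demonstrate: any two series that merely trend in the same direction over time will show a strong Pearson correlation, causation or not. The numbers below are invented for illustration, in the spirit of the site's margarine example.

```python
# Two made-up "data sets" that both simply trend downward over ten years,
# loosely in the spirit of the Maine-divorce-rate / margarine example.
years = list(range(10))
divorce_rate = [5.0 - 0.1 * t for t in years]   # steadily falling
margarine = [8.0 - 0.2 * t for t in years]      # also steadily falling

def pearson(xs, ys):
    # Pearson correlation coefficient: covariance over the product of
    # the standard deviations (up to a common factor of n).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

r = pearson(divorce_rate, margarine)
print(r)  # two perfectly linear trends correlate at r = 1.0
```

A correlation of 1.0 between two series that obviously have nothing to do with each other is exactly the warning the site is making: shared trend, not shared cause.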
Tobias Macey
Well, thank you very much for taking the time today to join me and discuss your work on the book and your experience both working in and teaching deep learning. It's definitely a fascinating field. I've enjoyed the time I have spent with the book, and I definitely plan to read it in its entirety. So thank you for all of your efforts on that front, and I hope you enjoy the rest of your day.
Awesome, Tobias, it's great to hear that, and it's been an absolute pleasure being on your show. I had never come across such a thoughtful and thorough list of questions, so thank you very much for the time.
Tobias Macey
Thank you for listening. Don't forget to check out our other show, the Data Engineering Podcast, at dataengineeringpodcast.com for the latest on modern data management. And visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it: email host@podcastinit.com with your story. To help other people find the show, please leave a review on iTunes and tell your friends and coworkers.