Simplifying Social Login For Your Web Applications - Episode 247
January 27, 2020
A standard feature in most modern web applications is the ability to log in or register using accounts that you already own on other sites such as Google, Facebook, or Twitter. Building your own integrations for each service can be complex and time consuming, distracting you from the features that you and your users actually care about. Fortunately the Python social auth library makes it easy to support third party authentication with a large and growing number of services with minimal effort. In this episode Matías Aguirre discusses his motivation for creating the library, how he has designed it to allow for flexibility and ease of use, and the benefits of delegating identity and authentication to third parties rather than managing passwords yourself.
Do you want to try out some of the tools and applications that you heard about on Podcast.__init__? Do you have a side project that you want to share with the world? Check out Linode at linode.com/podcastinit or use the code podcastinit2020 and get a $20 credit to try out their fast and reliable Linux virtual servers. They’ve got lightning fast networking and SSD servers with plenty of power and storage to run whatever you want to experiment on.
Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
Your host as usual is Tobias Macey and today I’m interviewing Matías Aguirre about Python social auth and the complexities of third-party authentication.
How did you get introduced to Python?
Can you start by describing what the Python social auth project is and your motivation for starting it?
Why might someone want to integrate with or rely on a third-party identity provider in their projects?
What are some of the tradeoffs or drawbacks of implementing
Can you describe the current architecture of the library and how it has evolved since you first began working on it?
There are a number of pre-built integrations with different web frameworks in the social auth github organization, but Django is the only one that has seen any commits recently. What are the contributing factors for that state of affairs?
There are a number of authentication protocols that you support. What are the common capabilities that they each support and what are some of the more challenging differences between them?
How have you implemented the interface for plugging different authentication mechanisms to allow for the variation between them while keeping the library code maintainable?
What is involved in adding support for a new authentication provider or protocol?
Many times authorization and authentication are conflated or used interchangeably. How does Python social auth address those concerns and what are the limitations of different mechanisms for defining permissions?
For someone who is using Python social auth, what is the workflow for integrating it with their application as a consumer?
What are some of the most interesting/unexpected/innovative ways that you have seen Python social auth used?
What are some of the most interesting/useful/unexpected lessons that you have learned in the process of building and maintaining Python social auth?
When is Python social auth more effort than it’s worth?
What do you have planned for the future of the project?
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 gigabit private networking, scalable shared block storage, node balancers, and a 40 gigabit public network, all controlled by a brand new API, you've got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. They also have a new object storage service to make storing data for your apps even easier. Go to pythonpodcast.com/linode, that's L-I-N-O-D-E, today to get a $20 credit and launch a new server in under a minute. And don't forget to thank them for their continued support of this show. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don't want to miss out on this year's conference season. We have partnered with organizations such as O'Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference, the Strata Data Conference, and PyCon US. Go to pythonpodcast.com/conferences to learn more about these and other events and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey, and today I'm interviewing Matías Aguirre about Python social auth and the complexities of third-party authentication. So Matías, can you start by introducing yourself?
Yeah, well, it has been a long time since then. I got my hands on Python back in the day when Django 0.96 was a thing; it was quite a popular version back then. With a friend, we were looking for something to move away from PHP. So we split our research, one focusing solely on Rails and the other focusing on Django, which were the popular solutions at that point. In the end, Django won the contest.
Yes, so I worked for several years at my previous workplace, where Python was not the main language we used. That was for around eight years or so, and sometimes I got the chance to use some Python. As of today I'm working at a new place that is fully Python for everything. Over the years I had the chance to work with Python on personal projects, mostly. Python social auth is one of the results of that.
Yeah, so Python social auth is a small library, not quite small by now, but it aims to simplify the developer's life when they want to integrate social-based authentication and authorization in their projects. Everybody's familiar with the "log in with X" social buttons around the web. Python social auth wants to hide the complexity behind that functionality while giving you enough room to define a solution that better fits your project. Like many projects, it got started out of frustration with the solutions that were available at that time. I was working on a personal project, and the solutions that were popular back then had complicated setups or not enough room to add my requirements. So I started my own project, Django social auth, back then, which fully focused on Django. It became a very popular solution, and a couple of years later it morphed into Python social auth.
And at this point, it seems to have become somewhat of the de facto standard for being able to handle these third-party login integrations. So I'm wondering what you think are some of the contributing factors to the popularity that it has gained, and maybe give a bit of a description of some of the overall landscape of available options, particularly now, since as you said at the beginning, when you first started the project, there wasn't really a lot to choose from.
So I'd say that the main contributing factor that made it so popular was the simplicity of integrating it into Django projects. With a really small set of settings and small changes in your code base, you would have a fully working authentication process using a social website as the source of information for authentication and the user profile. Then the second key factor in the popularity, I would say, is a particular feature of the library which is called the pipeline. This feature allows you to extend the authentication process as much as you want with any requirements that are needed for your project. Back then this feature didn't exist in the other solutions. As I recall, with the popular library at that time you were required to define templates, buttons, and forms, and that didn't seem like the proper solution for an authentication flow where you usually just click a button, go to the authentication provider to put your credentials in there, and then you are sent back to your site and you have your account up and running. That was the main focus of Python social auth, and it did quite well. It still does today.
And then, in terms
of some of the use cases of when somebody might want to enable integration with these third-party identity providers, what are some of the motivations there? And what are some of the tradeoffs or drawbacks of delegating authentication and identity to those third parties versus controlling it entirely in your application?
Almost every
application requires users to be registered; at the end of the day it's a common pattern. Everybody wants users on their site using their application. It's a very common problem, but it demands quite a lot of work to solve. You need to define forms, validations, email confirmation, profile picture uploads, details, etc. So it's a very complex project and demands a lot of work. And then there's the security side of all this, which is handling passwords: ensuring that they're properly hashed in your database, that they use a salt and a good algorithm. Then there's the potential of leaking these passwords, especially when there are so many bad practices used by users, like using the same password for every site. There's a security risk in this. Social authentication solves most of these problems. There are no forms, you just put a link on your site. Profile details are already populated from the authentication provider. Email addresses are usually already validated by the authentication provider. And especially, there's no password on your end; you don't need to store a password in your database at all. So there are many problems that are reduced simply by adding a link or a button to your site.
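To make the point about salting and hashing concrete, here is a minimal sketch of the work a site takes on when it stores passwords itself. It uses only the Python standard library; the function names and the iteration count are assumptions chosen for illustration, not anything from Python social auth or from the episode.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

# Assumed work factor for this sketch; a real deployment should tune it.
ITERATIONS = 200_000


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for a new password using PBKDF2-HMAC-SHA256."""
    salt = secrets.token_bytes(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, expected)
```

With social authentication none of this lives in your code base: the provider holds the credential and your database only keeps a reference to the user.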
For a number of projects, they'll actually have both options, where you can create an account using an email or a username and password, but then also have the option of using different social providers, whether that's Google or Facebook or Twitter or GitHub or what have you. And I'm wondering what you have found to be some of the best practices or common trends among people using Python social auth as to whether they prefer to have that option of maintaining identity on their platform, or if they tend to prefer just relying on these social providers as the primary or only authentication mechanisms.
Frankly, I
on my side really dislike passwords. I even prefer the authentication mechanism where I put in my email address and a temporary password or link is mailed to my account. There's no need for a password at all. Passwords have proved to be a real security problem since the beginning, so avoiding them as much as possible is a must for me. As for the reasons why sites provide both options, I fail to understand them. I tend to assume that's mostly for recovery, but there are other recovery methods today that don't need a password on the site to accept authentication.
Yeah, the magic link method that you're referring to, as far as not having a password at all but just taking an email account and then sending an email with a one-time-use authentication token, I think is definitely gaining a lot of popularity because of the weaknesses that we've seen in passwords and passphrases that you mentioned. So does Python social auth also support that type of mechanism? Or is that something where you would just rely on a different library and then include that in the authentication pipeline that Python social auth provides?
also support
Can you talk a bit more about how Python social auth itself is actually implemented, and some of the current architecture of the library and how it's evolved since the first stages of when you began working on it in the form of Django social auth?
Yeah. So again, in the beginning it was Django social auth. Its only purpose was to solve the social authentication problem for Django projects. So in the beginning it was really highly coupled with the framework, but even then some basic blocks of what is today Python social auth already existed. A key part of the implementation was the backends. Backends are the parts that handle the communication with the authentication provider. So a key part of the implementation was defining these backends with the goal of hiding as much as possible the complexity of the authentication provider while offering a clear and simple interface to the rest of the code that calls it. Then there are the models, where we store the user ID at the authentication provider and the reference to the user in your database. And then there was the pipeline feature, which was introduced really early into the code base. The pipeline is a list of functions that get called in a given order; the output of the previous function is passed to the next one until the last one is executed, and the goal of the pipeline is to have a valid user in your database that it's okay to authenticate. Those basic blocks made it possible to evolve into Python social auth, where the Django-related bits were moved into a new concept called strategies. Strategies are the glue between the framework particularities and the Python social auth core. That's what allowed Python social auth to support Django, Flask, and Pyramid, and you can add more integrations if you want.
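The pipeline idea described here, an ordered list of functions where each one's output is handed on to the next, can be sketched in a few lines of plain Python. This is an illustration of the concept only, not the library's code: every name below (`run_pipeline`, the step functions, the `USERS` dictionary standing in for a database) is hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional

Step = Callable[..., Optional[Dict[str, Any]]]

# Stand-in for a users table, keyed by email (assumption for this sketch).
USERS: Dict[str, Dict[str, Any]] = {}


def extract_details(response, **kwargs):
    """Turn the provider's raw response into a small, common set of details."""
    return {"details": {"email": response["email"], "username": response["login"]}}


def find_user(details, **kwargs):
    """Look for an existing user that matches the details."""
    return {"user": USERS.get(details["email"])}


def create_user(details, user=None, **kwargs):
    """Create the user only when an earlier step did not find one."""
    if user is not None:
        return None  # nothing new to add to the pipeline state
    user = {"email": details["email"], "username": details["username"]}
    USERS[user["email"]] = user
    return {"user": user, "is_new": True}


def run_pipeline(steps: List[Step], **state: Any) -> Dict[str, Any]:
    """Call each step in order, merging whatever it returns into the state."""
    for step in steps:
        result = step(**state)
        if result:
            state.update(result)
    return state


PIPELINE: List[Step] = [extract_details, find_user, create_user]
```

Calling `run_pipeline(PIPELINE, response={"email": "ana@example.com", "login": "ana"})` ends with a `user` entry in the returned state, which is the goal the speaker describes: a valid user that is okay to authenticate. Extending the flow means adding another function to the list.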
Yeah, those are the basic blocks today: you have the authentication backends, hiding the complexity of the providers; models for storage of the authentication details; pipelines to extend your particular functionality; and the strategies to hide the framework complexities.
And what have you found to be some of the most complex or challenging aspects of designing and building this library, particularly given the number of different identity providers and the variances in the authentication protocols that they support?
For sure, the differences in the protocols were a problem for the framework to solve, but in the end I found that defining a really simple interface of what the application needed from the provider paid off very well, because this simple interface allows me to hide the complexity of the provider while still fulfilling the requirements of the library. For instance, there's a method called get_user_details. The rest of the code doesn't care about the particular implementation that this method has for the different providers. It only cares about the output, which is a simple structure holding the user data, simple enough to build a user and store it in the database. So clearly defining these interfaces for the rest of the code was key to hiding all the complexity behind the protocols or the different authentication providers. There are differences between the protocols: OpenID works a bit differently than OAuth, they communicate differently, and the providers' APIs differ as well. There are these key differences between them, but it was not as big a problem as it seemed when adding them. It was mostly a matter of defining the parameters that need to be sent to the different providers; the rest was already available. It wasn't complicated overall, and the authentication flow is quite similar. You click a button on your site, you get redirected to the provider for your credentials, and then you are sent back to your site to continue the authentication process, which is usually hidden
behind the scenes.
And I know that in different implementations or different workflows, there can be cases where you have to suspend the authentication flow and then resume it either from links in your email or from a different computer. And so I'm wondering how you approach that challenge of being able to ensure that the current state of the authentication for a given user is maintained across sessions or across browsers or machines.
Yeah, that's provided by the partial pipeline feature. It's something built on the pipeline feature: it allows you to stop the authentication process at any given point, where you can handle scenarios such as sending an email, or rendering a form where you need extra details from the user, etc. It was a very welcome feature. In the beginning, it wasn't perfect. It was session based, so if you stopped a pipeline right in the middle because, for example, you needed confirmation from an email, and somebody else sat at the computer and tried to log in on your site, they would resume the authentication flow from the previous user. So it wasn't perfect in the beginning. Then it was improved to store the pipeline state in the database. A token is generated, and you can send that token in an email or share the link with the user one way or another. That unique token identifies your authentication process, and the session is not involved anymore. So you can start the authentication process on one computer and continue the process on your home computer.
In terms of the different protocols and implementations of those protocols for different authentication providers, they'll all have a different set of attributes that they're able to support and that they'll provide.
And so I'm wondering how you approach that challenge in terms of determining what the minimum set of common parameters is, and how to take advantage of the additional attributes and merge that all into a cohesive user profile on the end of the site that's implementing Python social auth.
Yeah. So Python social auth tries to solve the bare minimum of the problem. It focuses on a really small set of attributes: email, username, first and last name. With that small set of attributes it is able to create a user and store it in the database. These are very basic; they are based on the Django user model, because the library was based on Django social auth and inherited that from it. Still, you are able to define which attributes you want to pull from the different authentication providers. Some of them can be pulled automatically by the library, because the API that is used by default is able to provide these attributes. In most cases, you need to actually extend the pipeline in order to retrieve the extra details you want to store in your project. Like, if you want to download a profile picture from Google, then you need to add a method to the pipeline that does that particular API request. That's where Python social auth is out of the picture in terms of what it offers: you can extend it, you can put in anything you want, but you need to add the implementation yourself.
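A small sketch may help show how a common attribute set and an extra pipeline step fit together. The method name `get_user_details` comes from the conversation; everything else here (the two example backends, the shape of their responses, the `save_avatar_url` step) is hypothetical and only illustrates the design, it is not the library's implementation.

```python
from typing import Any, Dict, Optional


class BaseBackend:
    """Interface the rest of the code relies on, whatever the provider."""

    name = "base"

    def get_user_details(self, response: Dict[str, Any]) -> Dict[str, str]:
        raise NotImplementedError


class ExampleOAuthBackend(BaseBackend):
    """Hypothetical provider that returns a flat JSON profile."""

    name = "example-oauth"

    def get_user_details(self, response):
        first, _, last = response.get("name", "").partition(" ")
        return {
            "username": response.get("login", ""),
            "email": response.get("email", ""),
            "first_name": first,
            "last_name": last,
        }


class ExampleOpenIdBackend(BaseBackend):
    """Hypothetical provider whose attributes use different key names."""

    name = "example-openid"

    def get_user_details(self, response):
        return {
            "username": response.get("nickname", ""),
            "email": response.get("contact_email", ""),
            "first_name": response.get("first", ""),
            "last_name": response.get("last", ""),
        }


def save_avatar_url(backend, response, **kwargs) -> Optional[Dict[str, str]]:
    """Extra pipeline step: keep a provider-specific attribute the core ignores."""
    if backend.name == "example-oauth" and response.get("avatar_url"):
        return {"avatar_url": response["avatar_url"]}
    return None
```

Both backends return the same four keys, so the code that creates the user never needs to know which provider it is talking to. Anything beyond that minimum, such as the avatar here, is something the project adds as its own step.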
And so it's probably worth digging a bit more into the user's perspective of implementing Python social auth and the overall workflow there, and maybe dig into the pipeline mechanism that you have for being able to handle these different stages of authentication and how you manage the attributes and permissions from the identity providers that you're getting.
Yeah, so from a developer perspective, usually all you need to do is go to the authentication provider's administration panel or page, there's usually one dedicated to developers, and create an application with certain values, like a redirect URL. You will get an application ID and an application secret, or an application ID for OpenID. Then you go to your project and put these particular keys in your settings so they are available to the backend that will do the authentication. Then you can include a URL or a button in your template. Once the user clicks on this button, the authentication flow kicks in and you end up with a newly created user or an existing user logged in on your site. So it tries to be really simple. There's the complexity of going to the provider to create this application, and certain requirements from these providers. Then on the Python side, it's a set of settings and a URL link, and that's all. If you want to extend the functionality, like the example of downloading a profile picture, it will be a little more involved. You need to figure out which permission names or scope names you need to use to be able to access this particular feature. With the right scope from Google, for example, you will get granted access to the API that provides the picture, and then you can call this API to download it and store it in your storage or wherever it needs to be. So the really simple solution usually is enough for everybody, or for most projects. If you want a more involved solution, there's room for that. That's what the project is about.
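For a Django project, the "set of settings and a URL link" could look roughly like the sketch below. The setting names follow the social-auth-app-django documentation as I understand it, so check them against the version you install; the environment variable names and the custom step `myproject.pipeline.save_avatar` are hypothetical.

```python
import os

# settings.py (fragment): only the parts related to social login.
INSTALLED_APPS = [
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "social_django",  # the Django integration of Python social auth
]

AUTHENTICATION_BACKENDS = (
    "social_core.backends.google.GoogleOAuth2",
    "django.contrib.auth.backends.ModelBackend",
)

# Keys obtained from the provider's developer console.
# Reading them from environment variables is an assumption of this sketch.
SOCIAL_AUTH_GOOGLE_OAUTH2_KEY = os.environ.get("GOOGLE_OAUTH2_KEY", "")
SOCIAL_AUTH_GOOGLE_OAUTH2_SECRET = os.environ.get("GOOGLE_OAUTH2_SECRET", "")

# Extra scopes to request; the names are provider-specific.
SOCIAL_AUTH_GOOGLE_OAUTH2_SCOPE = ["profile"]

SOCIAL_AUTH_PIPELINE = (
    "social_core.pipeline.social_auth.social_details",
    "social_core.pipeline.social_auth.social_uid",
    "social_core.pipeline.social_auth.auth_allowed",
    "social_core.pipeline.social_auth.social_user",
    "social_core.pipeline.user.get_username",
    "social_core.pipeline.user.create_user",
    "social_core.pipeline.social_auth.associate_user",
    "social_core.pipeline.social_auth.load_extra_data",
    "social_core.pipeline.user.user_details",
    "myproject.pipeline.save_avatar",  # hypothetical custom step
)

# urls.py would then include the library's URLs, for example:
#   path("", include("social_django.urls", namespace="social"))
# and a template links to the login entry point with:
#   {% url "social:begin" "google-oauth2" %}
```

Everything before the last pipeline entry is the stock sequence; the final line is where a project-specific step, such as fetching a profile picture, would be appended.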
And another concept that's often conflated with authentication is that of authorization. And I'm wondering how you handle that aspect of the overall process of identification and permissioning in Python social auth, or if that's something that you leave entirely to the developers as a separate concern from what you're trying to tackle in this library.
Yeah, Python social auth actually doesn't care about that at all. I understand that it is somewhat of an abuse of authorization protocols to use them for authentication. Even if they are meant only to authorize access to an API, they can also be used to create a user on your site and authenticate a user on your site. So there's a fine line there when you use them for authentication as well. In the end, it's up to the developer to decide whether to use OAuth for authentication or for authorization to access an API on behalf of the user. My experience is that OAuth, the most common protocol around the web today, is usually used for both authentication and authorization.
Yeah, and in terms of the scope of the application that is granting these identities to the user, authorization also takes on a different scope of what they're allowed to do within the application, which is not something that you necessarily want to rely on third-party identity providers and their scopes for determining, and is something that you would have to build as a primary concern for your own application. And I know that there are a number of other libraries available for Django and other frameworks for handling that concern, which is definitely a separate and differently complex beast to tackle. And then, in terms of the protocols that you support, as you said, a number of these identity providers are now using OAuth 2, but some of them are also using OAuth 1. And then there are things like OpenID and SAML. And they all have different variations of how they handle the overall flow. And I'm wondering what you have found to be some of the most interesting or unexpected or challenging lessons that you've learned in the process of building support for all these different authentication protocols into this library.
So I was lucky enough to be able to stand on the shoulders of other projects, like the OpenID and OAuth libraries. There's also a SAML library that it integrates with, which provides access to the protocol in a simple way. So it wasn't that difficult to integrate the different protocols, since Python social auth actually cares about the bare minimum solution to the problem. It actually requests very little from these libraries: usually, the URL the user is going to be sent to in order to grant permissions, input their credentials, confirm, etc. And then there's a secondary API that gets called most of the time to get some extra details. OpenID is a bit different: the user details are already provided in the response back to your site. But this complexity is all hidden in the libraries used, so there is only a small amount of complexity that I need to implement in Python social auth in order to access the needed attributes. So in the end the authentication flow is similar across the different protocols: you send the user to a site, the site sends them back to you, you exchange a few tokens or IDs, and that's all. Yeah, and that's all Python social auth actually cares about.
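The shared flow summarized here (send the user away, receive them back, exchange a few tokens) can be sketched for the OAuth 2 authorization code case using only the standard library. This is a simplified illustration under stated assumptions: the endpoint URL and function names are hypothetical, the `session` is a plain dictionary standing in for the framework's session, and the final token exchange with the provider is left out.

```python
import hmac
import secrets
from typing import Dict, List
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical provider endpoint, for illustration only.
AUTHORIZE_URL = "https://provider.example/oauth/authorize"


def build_authorization_url(
    client_id: str, redirect_uri: str, scope: List[str], session: Dict[str, str]
) -> str:
    """Step 1: build the URL the user is redirected to at the provider."""
    state = secrets.token_urlsafe(16)  # random value to tie the callback to this session
    session["oauth_state"] = state
    query = urlencode(
        {
            "response_type": "code",
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "scope": " ".join(scope),
            "state": state,
        }
    )
    return f"{AUTHORIZE_URL}?{query}"


def read_callback(callback_url: str, session: Dict[str, str]) -> str:
    """Step 2: validate the redirect back and return the authorization code."""
    params = parse_qs(urlparse(callback_url).query)
    received = params.get("state", [""])[0]
    expected = session.pop("oauth_state", None)  # single use
    if expected is None or not hmac.compare_digest(
        received.encode("utf-8"), expected.encode("utf-8")
    ):
        raise ValueError("state parameter does not match this session")
    if "code" not in params:
        raise ValueError("provider did not return an authorization code")
    return params["code"][0]
```

The code returned by `read_callback` is what the application would then exchange with the provider for an access token, followed by the secondary API call for user details mentioned above.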
And as somebody who has been working in the space of federated identity for so long, I'm curious what your thoughts are on the current state of affairs in terms of the available protocols. And if you think that there are any missing elements, or if there's space for a new protocol to be defined, that will improve upon the existing set that we have or any shortcomings in the protocols that you deal with on a day to day basis.
I still have strong opinions about different protocols. I know they're there to solve different problems. I really would like to see a unified protocol that merges OpenID and OAuth, for example, which are the most common protocols around the web today. Having a single solution that provides both, since OAuth is actually used, or abused, today for authentication, something that meshes OpenID and OAuth, would be a nice solution to see around. From a developer perspective, today they actually have the option to avoid the hassle of integrating social authentication on their side, or dealing with the different protocols, or even with libraries that try to solve the differences. There are services today that will provide you social authentication as a service, which actually reduces the work you need to do. So in the end, for developers, which is the world I care about, it is becoming even simpler to add this feature to your site.
And in terms of your experience of using Python social auth, or your conversations with people who have been using it for their own purposes, what are some of the most interesting or unexpected or innovative ways that you've seen it used?
I don't usually get feedback about the ways the library is being used. I usually learn about them here and there, when something pops up on my Twitter feed. There was a particular project that wins the contest for being innovative: a user was using Python social auth to authenticate, and once authenticated, the garage door of the house opened or closed. I don't remember the details now. I don't remember if you logged in and then had control of the garage door on this website, or if automatically, by logging in, your garage door would open or close depending on its state. It was a really fun thing, but I can't find it anymore.
It's definitely an interesting way to abuse the pipeline, where you can run arbitrary functions just as a result of different stages completing. So yeah. And then in terms of your experience of building and growing the community around it, I'm wondering how that has evolved, and just some of the overall benefits and learning experiences that you've had as a result of it gaining popularity and then splitting it into its own GitHub organization and onboarding new committers and maintainers to the project.
Yeah, sadly there are no more maintainers on the project. I tried to add a few a couple of years ago, and the experience actually didn't work out. It's not that they were bad maintainers; the amount of work the project needed was more than they had counted on. So I discarded that option since then, sadly, because I really would like to delegate a lot of the work that is needed for the library. I'm at a point in my life where dedicating time to Python social auth is something I am struggling with. New maintainers will be really welcome to the project to help with the efforts and the different development that is needed. At the moment, I have a big list of different items I want to work on in the library, but I need to find the time to dedicate to it. Right now I'm in kind of a low maintenance mode, where I go and review and merge pull requests, make some changes of my own here and there, and release a new version of the library with the latest contributions.
And in terms of the primary focus, it definitely seems like, particularly given its origin of being integrated with Django, that it's primarily used in web contexts. But is it viable to use in other types of environments where it's primarily just for authenticating between back ends? Or maybe in terms of some sort of industrial automation or Internet of Things contexts, or is it primarily just focused on a web environment and anything that people might choose to use it for outside of that context, they're sort of treading their own path.
There are still plans of mine for that area. The introduction of strategies in Python social auth opens the door to implementing new ways of integrating the library in new places. I really would like a solution that is a fully console-based integration, where you can paste a token into the console or something like that, or a UI that doesn't need to be a web one. That's one of the strategies I would like to work on at some point, but the door is open to integrate it with more environments, not just the web.
And what other plans do you have for the future of the project, either in terms of the technical aspects or community aspects or just your overall ambitions of where you might take it or additional projects that you might build within its ecosystem?
For the community, I don't have any particular plans. It has its own flow as it is; I get pull requests from time to time, and that's good enough, in my opinion. On the technical aspects, yeah, I'd like to add async and await support; integration with the newer microframeworks like FastAPI or Sanic; a few WSGI or ASGI strategies that actually don't depend on any framework; non-web integrations; and improvements here and there. The code needs some love. I know the library better now, and I know I can write better code there. And of course documentation; the documentation is a long work in progress: samples, improved documentation, documenting how to implement new strategies, new backends, new integrations, etc.
And are there any other aspects of the work that you're doing with Python social auth and the library, or any of the associated topics with identification and authorization and authentication, that we didn't discuss yet that you'd like to cover before we close out the show?
No, I think that's all.
Okay. Well, for anybody who wants to follow along with the work that you're doing or get in touch or offer contributions, I'll have you add your preferred contact information to the show notes. And with that, I'll move us into the picks, and this week I'm going to choose the Joker movie, which came out recently. I think they did a very good job of portraying the character and giving a believable and meaningful origin story, with a lot of really good acting throughout. So for anybody who is at all interested in that overall story space of the Batman mythos and the DC characters, it's definitely worth a look. Not your typical superhero story, definitely much more character driven, but worth a watch. And so with that, I'll pass it to you, Matías. Do you have any picks this week?
Yeah, I have a technical one. I've been showing interest in this new microframework, Sanic, which focuses on async and await support on Python 3. I really find it fun; it's close to Flask in its API, and that makes it very comfortable to work with. On the non-technical side, I'm looking forward to watching the new Star Trek series. I'm really looking forward to that; I'm a fan of the saga, and this one is waiting for me.
All right. Well, thank you very much for taking the time today to join me and discuss your experiences of building the Python social auth framework. It's a tool that I've used pretty extensively in the projects that we do at work, and it's made our lives a lot simpler. So thank you for all of your efforts on that front. And I hope you enjoy the rest of your day.
Thank you for listening. Don't forget to check out our other show, the Data Engineering Podcast, at dataengineeringpodcast.com for the latest on modern data management, and visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@podcastinit.com with your story. To help other people find the show, please leave a review on iTunes and tell your friends and co-workers.