Faster And Safer Software Development With Feature Flags - Episode 239


Any software project that is worked on or used by multiple people will inevitably reach a point where certain capabilities need to be turned on or off. In this episode Pete Hodgson shares his experience and insight into when, how, and why to use feature flags in your projects as a way to enable that practice. In addition to the simple on and off controls for certain logic paths, feature toggles also allow for more advanced patterns such as canary releases and A/B testing. This episode has something useful for anyone who works on software in any language.

Do you want to try out some of the tools and applications that you heard about on Podcast.__init__? Do you have a side project that you want to share with the world? With Linode’s managed Kubernetes platform it’s now even easier to get started with the latest in cloud technologies. With the combined power of the leading container orchestrator and the speed and reliability of Linode’s object storage, node balancers, block storage, and dedicated CPU or GPU instances, you’ve got everything you need to scale up. Go to today and get a $100 credit to launch a new cluster, run a server, upload some data, or… And don’t forget to thank them for being a long time supporter of Podcast.__init__!


  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. Go to to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Pete Hodgson about the concept of feature flags and how they can benefit your development workflow.


  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what a feature flag is?
    • What was your first experience with feature flags and how did it affect your approach to software development?
  • What are some of the ways that feature flags are used?
    • What are some antipatterns that you have seen for teams using feature flags?
  • What are some of the alternative development practices that teams will employ to achieve the same or similar outcomes to what is possible with feature flags?
  • Can you describe some of the different approaches to implementing feature flags in an application?
    • What are some of the common pitfalls or edge cases that teams run into when building an in-house solution?
    • What are some useful considerations when making a build vs. buy decision for a feature toggling service?
  • What are some of the complexities that get introduced by feature flags for maintaining application code over the long run?
  • What have you found to be useful or effective strategies for cataloging and documenting feature toggles in an application, particularly if they are long lived or for open source applications where there is no institutional context?
  • Can you describe some of the lifecycle considerations for feature flags, and how the design, implementation, or use of them changes for short-lived vs long-lived use cases?
  • What are some cases where the overhead of implementing and maintaining a feature flag infrastructure outweighs the potential benefit?
  • What advice or references do you recommend for anyone who is interested in using feature flags for their own work?

Keep In Touch


Closing Announcements

  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at


The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Raw Transcript
Tobias Macey
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 gigabit private networking, scalable shared block storage, node balancers, and a 40 gigabit public network, all controlled by a brand new API, you've got everything you need to scale up. For your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. They also have a new object storage service to make storing data for your apps even easier. Go to pythonpodcast.com/linode, that's L-I-N-O-D-E, today to get a $20 credit and launch a new server in under a minute, and don't forget to thank them for their continued support of this show. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers, you don't want to miss out on this year's conference season. We have partnered with organizations such as O'Reilly Media, Corinium Global Intelligence, Alluxio, and Data Council. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey, and today I'm interviewing Pete Hodgson about the concept of feature flags and how they can benefit your development workflow. So Pete, can you start by introducing yourself?
Pete Hodgson
Sure. My name is Pete Hodgson, and I'm an independent software consultant. I spend quite a lot of time helping engineering teams figure out how to build software in a more effective way and get that software into production in a more effective way.
Tobias Macey
And I know that you don't use Python as your primary language and don't generally do too much work in it, but I'm wondering if you can share a bit about how it first came onto your radar and any experience that you do have with it.
Pete Hodgson
Sure. How it first came onto my radar... well, I guess I've been doing software development for 20 years, and early in my career I was trying to figure out how to script things and automate things. I think Python and Ruby were the two that I started playing with. Actually, I played with Perl first, and then rapidly realized that I'd rather do something other than Perl. So I played with Python and Ruby, and eventually got into Ruby because I happened to be working somewhere that was doing Rails development, and then didn't use Python that much. I've used it for a fair amount of system automation on and off through the years, and I've had a couple of clients that were building apps in Django.
So I've dabbled with Python there, but I certainly wouldn't consider myself an expert.
Tobias Macey
And before we get too far into the discussion about how to use feature flags and some of the advanced concepts, can you start by describing what a feature flag is? And maybe some of your first experience with them and how it affected your overall approach to software development?
Pete Hodgson
Sure. The way I think about it, fundamentally, a feature flag is a way of choosing between two code paths, normally at runtime. So you can think of it as a way to dynamically choose or adjust the business logic in your app without recompiling and redeploying that app. We can probably talk a little later on about how dynamic that decision-making process is, but I think that's the fundamentals of what a feature toggle or feature flag is. They tend to have a few different names. Feature toggle is the one I used to use; feature flag, I think, is the term that's got more acceptance and is more commonly used nowadays. I've also seen it called feature bit or feature flipper, and sometimes people conflate A/B testing and feature flags a little bit. So the naming is kind of interesting. My first experience with feature flags was years and years ago, working at a startup where we implemented them. I think they were called feature bits at that company. We used them a little bit, but I didn't really use them that much for a few years after that. Then, after that company, I worked at a consulting company called ThoughtWorks, and we were really big into practices like continuous delivery and trunk-based development, and feature flags play quite heavily into that. When we used them in that way, they pretty radically changed the way that we were able to build software, mainly because they allowed us to get rid of feature branches and just have people working directly in a shared branch in our code repo while still being able to deploy software into production very frequently. So on a lot of the teams I was on at ThoughtWorks, we'd be deploying to production on a very regular basis, you know, like once a day maybe,
which at the time definitely felt like a pretty rapid tempo. And we were able to do that without using branches and without having to slow ourselves down with that kind of stuff. Feature flags were a big part of how we were able to achieve that.
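As an illustration of "choosing between two code paths at runtime", here is a minimal Python sketch. It is not from the episode: the `FEATURE_<NAME>` environment variable scheme, the `flag_enabled` helper, and the `greeting` function are all hypothetical names chosen for the example. Both code paths ship together, and the flag decides which one runs.

```python
import os

TRUTHY = {"1", "true", "on", "yes"}


def flag_enabled(name, default=False):
    """Look up a flag each time it is asked for, so the answer can change
    without a code change.

    The FEATURE_<NAME> environment variable scheme is an assumption made
    for this sketch, not something described in the episode.
    """
    raw = os.environ.get("FEATURE_" + name.upper())
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


def greeting(user_name):
    # The flag picks between two code paths that are both deployed.
    if flag_enabled("new_greeting"):
        return "Welcome back, " + user_name + "!"  # new path, hidden by default
    return "Hello, " + user_name  # existing path
```

An environment variable is about the least dynamic option, since changing it usually means restarting the process. A flag service or a database lookup behind the same `flag_enabled` signature would allow changes while the app is running.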
Tobias Macey
And you mentioned being able to do trunk-based development because of feature gating at the code level rather than at the branch level. You also mentioned some association with A/B testing, and I wonder if you can discuss the broader scope of the ways that feature flags are used, and some of the ways that they'll manifest in somebody's code base.
Pete Hodgson
Yeah, I think feature flagging is really interesting. In my experience over time, what's really interesting to me is the number of different things that people use feature flags for, or the number of different concepts that get mixed in around feature flagging. I used to get quite annoyed when people referred to something as a feature flag that I personally didn't consider to be a feature flag. A/B testing, for example: people would sometimes say, 'Oh yeah, we're doing feature flagging, we use this A/B testing framework,' or, 'We'll add a feature flag to test whether the color of this button looks good or not.' And I would get kind of grumpy and correct people: 'Well, technically that's not a feature flag, that's something different.' What I've come to embrace over time is that these all essentially come down to that same fundamental capability of choosing code paths at runtime. And if you think about the different things you can do with that capability, the way I think about it, there are these four main categories of feature flags. The main category, which was, I guess, my gateway drug for feature flagging, is what I would call a release flag, or release toggle. That's a way for an engineer to hide half-finished code from users: the ability to deploy a half-finished feature into production without releasing that feature to your users, because it's protected behind the flag. So, I'm trying to think of an example.
If I'm implementing a new login feature where you can log in via a social login to my application, I can protect the button that says 'log in via Twitter' or MySpace or whatever. I can hide that button behind a feature flag, and that means I can be working on that feature and deploying it into my production environment without exposing it. The code that powers that feature is there, but I haven't turned it on yet by flipping that feature flag on. So I can work in a shared branch, and even deploy that stuff out to production, without worrying that half-finished work or latent code is going to be shown to a user. So that's the first use case: the release toggle. Then there's another use case around experimentation, where I might want to try out an A/B test, for example. Let's say I want to see whether a new recommendations engine is more likely to get my users to click on a recommendation. Maybe I'll have the old recommendation system and the new recommendation system both deployed into my production system, and then I can choose whether to use the new recommendation system or not. Maybe I send 50% of my users to the new recommendation system and 50% to my old recommendation system. On the one hand that feels like a really different use case, but at the end of the day, if you think about what's happening inside your software, it's the exact same thing. You've got, essentially, an if/else statement or something similar to that, and at runtime your code is deciding, should I go down this code path or this other code path? So that's the second use case: experimentation.
The third use case I've seen people using feature flags for is operational flags. That's a way to disable parts of your system, or dynamically change how your system is operating in production, without having to redeploy your software. Let's say you have a third party that calculates what the shipping costs are going to be for your e-commerce software, and this vendor is kind of flaky, and sometimes their system goes down. You want a way to just dynamically turn off that feature in production. If they're having a bad day, you basically want to turn it off rather than having it be broken on your site. You could have a feature flag protecting that feature, and if you see that the third-party system is having a tough time, you could just turn it off without having to redeploy your software or make some kind of urgent change to your system. That's the third thing I see people using: operations toggles. And then there's a fourth thing around permissioning. Basically, depending on what user you are, I'm going to let you do something different. If you're an admin user, I'm going to show the edit button, but if you're not an admin user, I'm not going to show the edit button. Or if you're a premium user, I'm going to show you the free shipping option, and if you're not a premium user, I'm not going to show you the free shipping option. That's a simplification of what you'd probably want to do, but fundamentally there's this idea of, based on who you are, I'm going to turn certain capabilities on or off. And that's one where I think a lot of people would say, 'Well, that's not a feature flag, that's something different.' And I used to be one of those people.
What I've found over time, in talking to different organizations about how they use feature flags, is that there's a benefit in embracing the fact that even though these scenarios and use cases are quite different, the fundamental capability is the same. There's a benefit in thinking about these as different types of the same thing.
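The four categories above (release, experiment, ops, permission) can be sketched side by side to show that each one reduces to a yes/no answer that selects a code path. This is a rough Python illustration, not anything prescribed in the episode: the dictionaries, flag names, and the hash-based bucketing used for the 50/50 split are all assumptions made for the example.

```python
import hashlib

# Hypothetical in-memory flag state; a real system would load this from
# configuration or a flag service.
RELEASE_FLAGS = {"social_login": False}
OPS_FLAGS = {"shipping_estimates": True}
EXPERIMENTS = {"new_recommendations": 50}  # percent of users in the new path


def show_social_login():
    """Release toggle: hide half-finished work from every user."""
    return RELEASE_FLAGS["social_login"]


def in_experiment(name, user_id):
    """Experiment toggle: place a stable share of users in the new path."""
    digest = hashlib.sha256((name + ":" + user_id).encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # same user always lands in the same bucket
    return bucket < EXPERIMENTS[name]


def shipping_estimates_enabled():
    """Ops toggle: switch off a flaky third-party integration in production."""
    return OPS_FLAGS["shipping_estimates"]


def can_edit(user):
    """Permission toggle: the answer depends on who is asking."""
    return bool(user.get("is_admin"))
```

Hashing the experiment name together with the user ID is one common way to keep a user in the same cohort across requests without storing the assignment. Other approaches, such as persisting the cohort per user, would work equally well here.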
Tobias Macey
And in my experience, a lot of developers will start to grow into using feature flags for a particular capability that they're building, either because it's going to take too long to get the whole thing out into production in one fell swoop and they don't want to have a long-lived branch that's going to be a pain to merge three months down the road, or because they have something that they want to be able to launch into a production or pre-production environment so that they can test how it actually functions outside of their local development context, and they want to be able to turn it on and off until they're sure that it's working properly. And then for some of these other scenarios, particularly A/B testing or operations toggles where you want the flag to be more long-lived, the way that you would approach implementing the toggling infrastructure is a lot different. A lot of times you might just have an environment variable or something that lives close to the logic branch that you're dealing with, and then maybe down the road you have another PR that just strips it back out. For these other use cases, you want to have something a bit more sophisticated. So I'm curious, what are some of the ways that you've seen people approach the overall introduction of feature toggles into their code base, and maybe some of the anti-patterns in how people have used them, in your experience?
Pete Hodgson
Yeah, I think that thing about how people introduce it is an interesting one, and my experience has been similar to yours. A lot of times the concept of feature flags gets added to a code base by an engineer who, like you're saying, wants to work on something without having to create a big, long feature branch that's going to be a pain to merge. That's one gateway. The other entry point I see, and it's quite different, is usually from a product manager who wants to be able to do some kind of A/B testing. Those are quite different entries into the concept of feature flagging, and that's interesting because it can change quite a lot how you get started. And now I'm forgetting what the second part of your question was, apart from the entry point.
Tobias Macey
I'm just curious about any anti-patterns that you've seen around how people either introduce and implement feature flagging or how they're actually employing it in their software.
Pete Hodgson
So there are a few things I see people struggling with. The most common problem I see with people who are using feature flags a lot is managing those flags and retiring them, keeping the number of active flags in check. As time goes on, you are motivated to add flags because you want to avoid creating a feature branch, or you want to try out a new feature or experiment against a certain number of users, that kind of thing. So you're motivated to add them, and then there tends to be this tech-debt type of thing where it's tough to clean them up over time. That's one very common challenge I see people having: figuring out how to track the flags in their system and how to actually manage them and remove them over time. Related to that is when people use the wrong implementation technique for the type of flag. I'll give you two examples. Say I was doing a release toggle, where I just temporarily want to hide this thing. So with my social login, I want to hide it for a week or so while it's under development, and then once it's done I'm going to turn it on and remove that if/else statement from my code, or whatever. If I know that this is a short-lived flag, then it's actually probably not terrible if I literally implement that with an if/else in my code base. But if it's a flag that's going to be used for a long period of time, like one of those operational flags, or an is-admin type of thing that's probably going to be in your code base kind of forever, then implementing that with an if/else is going to be really painful.
That's going to get really painful over time, particularly if you've got a lot of flags. When I talk to people about feature flags and find people who are skeptical, a lot of times they've been burned by a code base where people have just sprinkled if/else statements everywhere. So I think that's a big anti-pattern. And I think, fundamentally, part of it is people not thinking about why they're creating this flag and how long that flag is going to live in the code base. The other part is teams not having the right processes in place to clean those flags up over time: either the management and tracking of which flags are in use and which flags are on, or more of a process thing of getting the time to keep the code base cleaned up, keeping the campground clean.
Tobias Macey
Another thing, too, is that a lot of teams will introduce flagging ad hoc to begin with, and then it starts to catch on as a good idea: 'Oh, I can do this, and I don't have to maintain this long-running branch.' So another person will implement their own flag, but there isn't really any consensus as to what the common entry point is or what the common design pattern is for how we maintain these flags. Somebody will put their flag in for the block of code that they care about, the next person will put theirs into the block of code that they care about, and it can quickly devolve into everybody putting flags wherever they feel like, rather than taking a step back to think, 'Okay, now that we're starting to use this, how do we design it into a system that is easy to maintain, where we have a common entry point and we can clearly see what flags exist in the code base?' Otherwise, you might start to see basically the same type of flag, or a very similar flag, in two different places, and you don't really know which one does what or why.
Pete Hodgson
Right, right.
Yeah, I describe feature flagging as a little bit of an iceberg. When you get started with it, it's very simple, right? Particularly because normally you get started with literally just one thing you want to toggle. Sometimes people don't even realize that what they're creating is a feature flag. They'll say, 'Oh well, we've already got this configuration system, so I'll add an extra field like use-new-whatever, and we'll just check the configuration here in this part of the code base, and if it's on then we'll use it, and if it's not, we won't.' And then there's this creeping thing over time, exactly like you're saying. Someone else is like, 'Oh yeah, we could do that config thing that we did, but for this other thing.' Eventually there are three or four places where you're checking the configuration and making a decision in your code. And then someone's like, 'Oh well, we need to do this per user, depending on who the user is, so I guess we should add this extra thing.' Over time, kind of like boiling a frog, you get to a situation in six months' time where it's a mess. No one ever decided that they wanted it to be a mess, and no one consciously said, 'Let's make this messy.' It's just that the scope of the thing slowly grew over time, and the types of things that people wanted to do with it grew over time. If you're not careful, you can end up taking lots of little steps, none of which seemed bad, but when you look back at where you are in six months, you're like, 'Well, wait, what the heck are we doing here?'
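One way to push back on the slow creep described above is to declare every flag once, in a single place, with enough metadata to see what exists and what is overdue for removal. The following Python sketch is only an illustration under assumptions: the `FlagSpec` fields, the team names, and the dates are invented for the example and do not come from the episode.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class FlagKind(Enum):
    RELEASE = "release"
    EXPERIMENT = "experiment"
    OPS = "ops"
    PERMISSION = "permission"


@dataclass(frozen=True)
class FlagSpec:
    name: str
    kind: FlagKind
    owner: str
    description: str
    remove_by: Optional[date] = None  # None marks a deliberately long-lived flag


# Every flag is declared once, here. All entries are made-up examples.
REGISTRY = {
    spec.name: spec
    for spec in [
        FlagSpec("social_login", FlagKind.RELEASE, "accounts-team",
                 "Hides the unfinished social login button.", date(2020, 1, 31)),
        FlagSpec("shipping_estimates", FlagKind.OPS, "checkout-team",
                 "Kill switch for the third-party shipping cost lookup."),
    ]
}


def describe(name):
    """Fail loudly on undeclared flags so ad hoc ones cannot creep in."""
    if name not in REGISTRY:
        raise KeyError("undeclared feature flag: " + name)
    return REGISTRY[name]


def overdue_flags(today):
    """Names of flags whose planned removal date has passed."""
    return sorted(
        spec.name
        for spec in REGISTRY.values()
        if spec.remove_by is not None and spec.remove_by < today
    )
```

A check such as `overdue_flags(date.today())` could run in CI or a scheduled job to nudge the team toward cleanup. The same registry can also serve as the documentation of which flags exist and who owns them.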
Tobias Macey
And so for teams who are starting to think about feature toggling, or who already have something in place, what are some of the approaches to implementation that you've seen and that you recommend? And particularly for the different categories that you outlined, do you think it makes sense to have everything route through a single common implementation, or do the different categories of feature toggles require their own implementation logic?
Pete Hodgson
That last part is a really good question. What I have found is there's a lot of value in decoupling the place where you're making the decision from the reason that you're making the decision, so decoupling the decision point from the decision logic. What that means is, in the area of the code where I need to make that decision, like, should I show recommendation engine A or B? That's really the question I should be asking, right? That code should say, 'Hey, should I show this recommendation system or this other recommendation system?' Or, 'Which class should I use to implement this algorithm, or get the result for this algorithm?' That part of your code really shouldn't care about feature flags at all. All it knows is, 'I know there are two ways I could do this. Tell me which way to do it.' You want to abstract over the reason for the decision. Then the other side of that is somewhere in a centralized place. I think there's a lot of value in centralizing the why for why you're making a decision. That why could be: we're using this recommendation system because this is an A/B test, and this user, for the current context (the current request in a framework, for example), falls into this cohort, so we're going to use recommendation engine A. The next request comes in, and that user is in a different cohort, so we're going to use the other recommendation algorithm. So that's one thing I think is really beneficial: to centralize and abstract the reasons behind the decision being made, and not let that leak into your code. I think that's really valuable.
And again, that's something that feels like overkill if you start with just a simple check of a configuration value and then call function A or function B. But at some point you've got to realize that there's a lot of benefit in abstracting the reason for the decision from the place where you're making the decision.
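The split between decision point and decision logic might look roughly like the Python sketch below. It is a hedged illustration, not Pete's implementation: `RequestContext`, `FeatureDecisions`, the cohort names, and the two stand-in engine functions are all hypothetical. The calling code only asks which way to go, while the class alone knows that the answer comes from an A/B test cohort.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestContext:
    user_id: str
    cohort: str  # assigned elsewhere, e.g. "control" or "treatment"


class FeatureDecisions:
    """Decision logic: the one place that knows *why* a path is chosen."""

    def __init__(self, context, treatment_cohorts):
        self._context = context
        self._treatment_cohorts = treatment_cohorts  # experiment name -> set of cohorts

    def use_new_recommendation_engine(self):
        cohorts = self._treatment_cohorts.get("new_recommendations", set())
        return self._context.cohort in cohorts


def old_engine(product_id):
    return ["bestseller-for-" + product_id]


def new_engine(product_id):
    return ["personalized-for-" + product_id]


def recommendations_for(product_id, decisions):
    """Decision point: asks which way to go, never why."""
    if decisions.use_new_recommendation_engine():
        return new_engine(product_id)
    return old_engine(product_id)
```

If the experiment later becomes a permanent premium-only feature, only `use_new_recommendation_engine` changes. `recommendations_for` and any other decision points stay as they are.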
Tobias Macey
Another thing that complicates the decision of how specifically to implement feature flagging is the overall lifecycle of the logic branch. For something where you only care about enabling trunk-based development, where everybody can push to one common code branch without having to branch in source control and instead just branch by abstraction, there's a high probability that once they finish the feature, they're going to want to take out that code branch. One suggestion that I've heard before is, as soon as you introduce the feature flag, you open another pull request that removes it, but you don't merge it until you're done with the code that requires it, just so that you don't forget that it's there and that you don't want it to stay there forever. But for some of these operations toggles or A/B tests, where the flag needs to run for a longer period of time, there's a high probability that you're going to want to keep it in the code base, possibly forever. And then the other thing is how you manage documenting the existence of these different feature toggles and what they're used for, if the code that you're writing isn't going to be deployed and managed by the same team writing it and is instead intended as a piece of software that gets deployed into a customer's environment, or if it's an open source project that somebody else is going to use. So I'm curious what you have seen as far as beneficial implementation strategies and triggering strategies for some of these different types of lifecycle requirements.
Pete Hodgson
Again, the thing I've seen that burns people on feature flags is if/else statements sprinkled through your code. It makes it really, really hard to read the code, and it makes it hard to even understand, is this part of the code base ever even called? So my advice in general is this. If you're really confident that this toggle is only going to be in your code base for a very short period of time, and you're basically making this decision at one point in the code, then maybe just a good old-fashioned if/else is fine: if I should show the social login, then show it, otherwise don't show it. For anything more complicated than that, or anything that's going to be in your code base for any period of time, treat this like production code. Use the same good design patterns and software design approaches as you would use for anything else you're doing in your production code base. The strategy pattern, for example, is a classic way of doing this, where rather than having four if/else statements in different areas of a class, you have two implementations of a common interface, or two classes, one that implements recommendation engine A and one that implements recommendation engine B. Then the consumer of that functionality is just given one of those things, and it doesn't know that it's feature flagged, it doesn't know which way it's going. You're basically using polymorphism to implement that dynamism between those two code paths, rather than sprinkling if/else statements everywhere. That's a really important strategy.
And I think it comes down to embracing that this is kind of like unit testing. Just because you're writing a unit test doesn't mean you're allowed to write crappy code; you should still write good code. It's the same thing with feature flags: just because it's a feature flag doesn't mean it's not production code. Someone's got to look after it and maintain it, and it does impact your system. So you've got to treat it with respect.
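The strategy-pattern approach described above can be sketched in Python roughly as follows. This is a minimal illustration, not anything from the episode: the class names, the `build_engine` factory, and the `use_recommendation_engine_b` flag name are all hypothetical. The point is that the flag is checked in exactly one place, and the consumer only ever sees the common interface.

```python
from __future__ import annotations

from typing import Protocol


class RecommendationEngine(Protocol):
    """Common interface that both code paths implement."""

    def recommend(self, user_id: str) -> list[str]: ...


class RecommendationEngineA:
    def recommend(self, user_id: str) -> list[str]:
        return [f"popular-item-for-{user_id}"]


class RecommendationEngineB:
    def recommend(self, user_id: str) -> list[str]:
        return [f"personalised-item-for-{user_id}"]


def build_engine(flags: dict[str, bool]) -> RecommendationEngine:
    # The only place in the code base that looks at the (hypothetical) flag.
    if flags.get("use_recommendation_engine_b", False):
        return RecommendationEngineB()
    return RecommendationEngineA()


class ProductPage:
    """Consumer: it is handed an engine and never checks a flag itself."""

    def __init__(self, engine: RecommendationEngine) -> None:
        self._engine = engine

    def render(self, user_id: str) -> str:
        return ", ".join(self._engine.recommend(user_id))
```

Removing the flag later then means deleting one class and the branch in `build_engine`, rather than hunting for conditionals scattered through `ProductPage`.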
Tobias Macey
Another decision to make, particularly as you get further down the road of using feature flags and they become more of your common development practice and part of your standard operations environment, is whether to continue to support a homegrown solution or start looking at third party libraries or service providers that might have more advanced functionality than what you want to build and maintain on your own. I'm curious what you have found to be useful considerations when making that build versus buy decision for whether to continue using a homegrown service versus a paid service or a third party library.
Pete Hodgson
See, I think this goes back to that iceberg thing that I mentioned earlier. I see a lot of engineering organizations that are using some kind of home-rolled feature flagging system. In fact, if they're more than a small size, they normally have three or four home-rolled feature flagging systems. I was just talking to a company the other day, and when we started the conversation I asked, "How many of these do you have?" They said, "One." Then I asked a question and they said, "Well, I guess maybe there's two." Then I said, "Do you ever do this thing?" and they said, "Oh yeah, I guess we have three." And I said, "I bet there's someone who's doing this other thing," and they said, "Yeah, I guess we probably have." So it seems simple to get started, and then the complexity of these things grows over time. My advice on build versus buy is really build versus buy versus borrow or rent, because you can use a SaaS product, you can use an open source implementation, or you can build your own. Unless it touches on a core competency of your company, why the hell are you spending time building this thing? This is just general advice I give to clients on build versus buy: if it's a commodity, table-stakes thing that you need in order to achieve your business goals, then look to buy it, rent it, or borrow it. Only if it's going to be a differentiating feature of your product should you be building this stuff yourself. You don't have time to mess around implementing your own database, you don't have time to mess around implementing your own UI framework, and you actually don't have time to mess around building your own feature flagging system. Why are you doing that?
I'm pretty sure that it'd be cheaper, and definitely cheaper in terms of opportunity cost, to buy one off the shelf or rent it or whatever. I don't think there are that many cases where it makes sense to build your own, unless you are a really large engineering org that has very complex requirements, you need to integrate with some custom internal metrics system, or your entire business is built around showing unique things to each user, or something like that. Most of the time, most companies I speak to should just be using a SaaS product or an open source product.
Tobias Macey
One thing that's always a consideration, particularly for production software, is that when you are relying on a third party service, at some point that service ends up impacting your availability based on what their availability patterns are. So I'm curious if you have seen anything similar happening with some of the SaaS providers for feature toggles, where some sort of system outage on their end might have an unintended consequence in your running application. Or do you find that they're generally fairly good at maintaining current state when they're going through an outage, and the only thing that is impeded is your ability to actively toggle one of those features that is running in your platform?
Pete Hodgson
Yeah, so all the SaaS products that I know of have thought about this quite a lot, because it's the kind of thing engineers ask quite a lot when they're trying to decide whether they should use this thing: what will happen if I can't talk to you? It's obviously a pretty important question and definitely an important thing to think about, because if their SLA was directly tied to your entire system being able to operate, that would be pretty bad. All of the SaaS products I'm aware of have pretty good approaches to solving this. They store the state of the flags locally, and the agent or library that you're using locally basically fails gracefully when it can't phone home. I think this is actually a really good example of why you should be buying versus building, because I'm pretty sure that the homegrown thing that people build actually sucks more in terms of handling outages. It turns out that when you're running your own software, you also sometimes have outages. And it turns out that companies who are dedicated to running a feature flagging service are probably going to do a better job of running that feature flagging service than your company that's dedicated to selling pants online. So I think it's a really valid thing to ask, and a really valid thing to dig into and understand.
But normally I think the answer is the flip side of what you would think: it's actually another good reason to be using a hosted product, because you can essentially pay someone else to do the uptime and the monitoring for all of their user base, rather than you having to make sure that whatever internal system you're using is going to handle all of these kinds of outages and edge cases that could happen.
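The "store the state locally and fail gracefully" behavior can be sketched like this. It is an assumption-laden illustration, not the design of any particular SaaS client: `FlagClient`, `fetch_remote`, and the JSON cache file are hypothetical names chosen for the example. On a failed refresh the client keeps serving the last state it persisted, or its built-in defaults if it has never succeeded.

```python
import json
from pathlib import Path
from typing import Callable


class FlagClient:
    """Hypothetical flag client that survives an unreachable flag service."""

    def __init__(
        self,
        fetch_remote: Callable[[], dict],
        cache_path: Path,
        defaults: dict,
    ) -> None:
        self._fetch_remote = fetch_remote
        self._cache_path = cache_path
        self._flags = dict(defaults)  # safe values if nothing else is known

    def refresh(self) -> bool:
        """Try to phone home; return False if we fell back to local state."""
        try:
            remote = self._fetch_remote()
        except Exception:
            # Service unreachable: reuse the last state persisted to disk.
            if self._cache_path.exists():
                self._flags.update(json.loads(self._cache_path.read_text()))
            return False
        self._flags.update(remote)
        self._cache_path.write_text(json.dumps(self._flags))
        return True

    def is_enabled(self, name: str) -> bool:
        # Unknown flags are treated as off rather than raising.
        return bool(self._flags.get(name, False))
```

In this sketch an outage only stops flag changes from propagating; requests keep being served with the last known flag state.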
Tobias Macey
Another thing to consider when dealing with some sort of dynamic system that can toggle your feature flags or feature branches is the question of auditability. If it all lives alongside your code base, and all you're doing is maybe changing a value in a settings.py file or in a YAML configuration, then you can go back in time and see that this is what the value was at that time and this is what it is now. Whereas if you're just toggling something in a web UI or sending an API request, I'm wondering what you have seen as far as the auditability, or some of the strategies for auditing those changes over time in that type of context?
Pete Hodgson
Yeah, I think it's a really good question. In general, if you can make the flagging decisions static, in the sense that for any request, for this version of the code, it's going to go this way all the way through the system, that's the ideal, because then you get that audit trail via source control. If my decision as to whether to show that social login button is powered entirely by configuration that's hopefully checked into the same repository as my code itself, or maybe a kind of sister repository, then you get this awesome audit trail. You also get nice things around availability, because you don't have some external system you need to talk to, and so on. That's great, and there's kind of an argument for preferring the toggles or flags that work that way. The problem is that almost always there's some need for that flagging configuration to be more dynamic, for example for an operations toggle. The ideal, if you've got a really good continuous delivery practice, and I just talked to a company the other day that does this, is that if their hair's on fire and they need to turn off the external tax calculation vendor, or switch from recommendation system A to recommendation system B because recommendation system A is eating all of the CPU in the system, then you just update the configuration in code and you run it through your delivery pipeline. That's how you make that change in production. If you can do that, then good for you; that's amazing. But most real-life organizations need an "oh shit" capability to do it at runtime without having to make that configuration change.
So in that case, you need it to be more dynamic. And if you think about it in terms of things like A/B testing, or toggles that are used to incrementally roll out, as in let's roll out the social button to 10% of our users and make sure that we don't get any 500s, and then let's roll it out to 50% of our users, then when you're using feature flags for controlled rollout you generally need it to be more dynamic than a code change. In that case you basically need to get that auditability another way, and there are two ways to get it.
One way is to have an audit trail in whatever feature flag management system you're using. Whenever someone updates what percentage of users should be getting this feature, or whenever someone flips a flag dynamically from off to on or vice versa, you record that in some kind of audit log. That's great, and could well be useful for compliance reasons or whatever. What's probably more useful is observability around your feature flag decisions. At the point at which you're making a flagging decision, most commonly while a service is serving a web request, you are deciding to do X, Y, or Z. If you can include the state of those flags in your logging, in your metrics, in your observability systems, then you get really rich insight, not just an audit trail as to what was happening when this request had a 500. You get the ability to slice and dice and say: we're seeing latency going up for a certain percentage of requests; is there any correlation between this increase in latency and this feature flag that we flipped on five hours ago? That's a real superpower, particularly if you're using feature flags heavily: the ability to slice and dice your production system metrics, and ideally your business metrics too. You want to be able to look at a graph that says our conversion rate, or the click-through rate on our recommendation system, noticeably dropped in the last week, and ask what feature flags we changed around that time. Or even better, is there a correlation where the people with a feature flag on were behaving differently from the people with that feature flag off? That's super useful as a general capability.
It's something that you need if you're using these for A/B testing; that's kind of the point, to say what's the difference in behavior depending on the state of this flag. But if you generalize that, and I think this is a really good example of why it's worth thinking about feature flags broadly, thinking about A/B tests as being in the same conceptual bucket as a release toggle, then you start saying, why can't we do A/B testing for an operational change? Or why can't we do A/B testing for every feature? I think it's Uber that has this phrase that every feature should be an experiment, or something like that. What that gets to is that, at the end of the day, you should be able to slice and dice any change to the system and ask how the people that had this change behaved differently, whether that was more errors, or increased latency, or lower conversion in terms of people opting to put something into their cart. They're all fundamentally the same kind of question.
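As a rough illustration of the two ideas in this answer, the sketch below makes a percentage-rollout decision and attaches the resulting flag state to the request's log record so it can be sliced on later. Everything here is assumed for the example: the function names, the `social_login` flag, and the choice of hashing the flag name plus user ID into one of 100 buckets, which is one common way to keep a user's assignment stable as the percentage grows.

```python
import hashlib
import logging

logger = logging.getLogger("flags")


def in_rollout(flag_name: str, user_id: str, percent: int) -> bool:
    """Deterministically place a user in one of 100 buckets for this flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    # A user in the 10% cohort stays enabled when the rollout grows to 50%.
    return bucket < percent


def handle_request(user_id: str, rollout_percent: int) -> dict:
    social_login = in_rollout("social_login", user_id, rollout_percent)
    # Record the decision alongside the request so that latency, errors, or
    # conversion can later be grouped by flag state.
    context = {"user_id": user_id, "flag_social_login": social_login}
    logger.info("request handled", extra=context)
    return context
```

With the flag state on every log line or metric tag, the question "did latency change for requests with the flag on?" becomes a group-by rather than a guess based on timing.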
Tobias Macey
Yeah, and that definitely gets into some of the more advanced use cases, like you're saying, A/B testing and being able to dynamically route traffic through a certain code path based on whether it's a cookie or a header or a user ID. I'm wondering what you have found to be some of the challenges that organizations or teams face as far as how to implement some of those types of dynamic feature toggles, and being able to track the appropriate metrics and get a useful and effective feedback loop for when those feature toggles are causing problems, or how to measure some of the user-facing metrics or user interactions based on the feature path that they're going down, and how that factors back into their overall development workflow.
Pete Hodgson
I mean, I think the biggest challenge, and I haven't really seen people solving this, is closing that feedback loop between the state of a flag and the observed behavior: what actually happened, what was the impact of this flag. What I see at loads of places that I talk to is that the only way they can correlate the state of the flag with the observed behavior is through some kind of proxy. I was just talking to a company the other day that had this habit of rolling out new features to a single market first. Their controlled rollout was: let's turn this feature on for Denver, or let's turn this feature on for this specific cohort of users, and then use that as a proxy for the impact of that flag. So rather than asking directly how this flag has affected latency, you ask how the latency is for users who are in Denver versus users elsewhere, or whether we have an observable, statistically significant change in conversion rate for people in Denver since we rolled out this change. So people do correlation via proxies. The most obvious one is a temporal proxy: we know we turned the feature on at, say, 12:55pm, so what was the behavior before 12:55 and after 12:55? That kind of works-ish, and you can do the same thing by market and say, we turned it on in Denver, did we see anything happen in Denver? That's nice if you don't want to have it just entirely on or entirely off, if you want to do a controlled rollout of something where there's a bit of risk.
But actually having a direct correlation between your metrics, and the hardest one is your business metrics, and the state of that feature flag is something I think almost no companies I've talked to have fully solved. They've started to solve it, but it's really, really hard. I think part of what makes it hard is the organizational challenge that the people who are collecting the metrics are not the same people as the people who are building these feature flagging systems, particularly for marketing, business, and product metrics. That's very different from the operations folks who care about latencies, for example.
Tobias Macey
And then another challenge that we briefly touched on earlier is how you handle documenting the complete list of feature toggles that are present in a system, and identifying what they're for and what their current state is. The current state is something that you can potentially introspect at runtime, but more generally useful is knowing what all the feature toggles are that exist. And how do you avoid overloading their intended purpose by just saying, "Oh, that's close to the code path that I want, I'll just piggyback on that feature flag rather than adding a new one"? So I'm curious about some of the overall strategies for making sure that everybody's aware of what the toggles are, what they're for, and what the criteria are for deciding when to add a new one.
Pete Hodgson
Yeah, and that's an example of that iceberg thing. At first it seems like this is a pretty small, straightforward project that an engineer can bang out on a Friday afternoon, a fun little project, and it turns out that there are all these extra capabilities you need. The real answer for a lot of companies is that they have a spreadsheet and they track it in a spreadsheet, or they have a wiki page or something like that, which is a really crappy answer. A more mature feature flagging system has a way of adding metadata for each flag. Things like which team owns this flag are super useful, because then you can actually go and ask that person, or the tech lead of that team, or the product manager for that team: hey, are you guys and gals still using this? So being able to attach metadata is really useful there. Other metadata that's useful is a description: what does this flag actually do? It's amazing to me that there are some systems that don't let you add a textual description; it's just the name. So hopefully you're good at one of those hard computer science problems, naming things. Other things that are really useful in terms of managing flags are: when was this flag created, and when do we expect this flag to be retired? It goes back to that thing where certain flags you expect to be in your code base for only a few days or weeks, and certain flags you expect to be there for the next two years. Being able to ask the system which flags should have expired by now but are still being used is really helpful in terms of keeping in check that tech debt, that kind of thing of flags growing over time. So I think those are some really useful kinds of extra information for managing flags.
And a good feature flagging system has that capability. Even if your feature flagging system doesn't have that, if your home-rolled thing or the open source tool you're using doesn't have all of those capabilities, the next best thing is to just include that information in source control next to the place where those flags are defined. This is probably less true in a language like Python, which is more dynamic, but if you're working in a static language, a lot of times where the rubber meets the road with feature flags is somewhere you've got an enumeration or a map or something that lists all the flags that this part of the code base is aware of, and lets you ask whether a flag is on or off. You can include that extra contextual information in source code; even if it's just in a comment, that's useful, because it's there somewhere. And of course, if it's in source code, then you've also got that audit trail of when it was created, because you can use git log or whatever, and you can sometimes even infer the owner by doing a git blame. So that's the second-best thing. I'm not sure if I totally covered all of your question.
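One way to keep that metadata in source control next to the flag definitions is an enumeration whose values carry the description, owner, and dates. The sketch below is only an illustration: the flag names, teams, and dates are made up, and `overdue_flags` is a hypothetical helper for the "which flags should have expired by now" question.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


@dataclass(frozen=True)
class FlagInfo:
    description: str
    owner: str                # team to ask before removing the flag
    created: date
    expires: Optional[date]   # None for long-lived (e.g. operations) toggles


class Flag(Enum):
    # All values below are invented for illustration.
    SOCIAL_LOGIN = FlagInfo(
        description="Show the social login button on the sign-in page.",
        owner="identity-team",
        created=date(2020, 1, 15),
        expires=date(2020, 3, 1),
    )
    RECOMMENDATIONS_KILL_SWITCH = FlagInfo(
        description="Operations toggle to turn off recommendations under load.",
        owner="platform-team",
        created=date(2019, 6, 1),
        expires=None,
    )


def overdue_flags(today: date) -> list[Flag]:
    """Flags that were expected to be retired before `today`."""
    return [
        flag for flag in Flag
        if flag.value.expires is not None and flag.value.expires < today
    ]
```

Because the definitions live in the repository, `git log` and `git blame` on this file give a rough history of when each flag appeared and who added it.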
Tobias Macey
No, I think that was very useful. Another thing that is useful, along with the idea of when this toggle was introduced and when it's supposed to go out of use, is being able to mark a flag as deprecated: we don't want to support this anymore, it really needs to go away, please stop using it.
Pete Hodgson
Yeah, yeah. I've talked to teams that have these kind of extreme ideas of time bombs, where if the flag is still in use past its expiration date, then you refuse to start the application or something like that. I was talking to a company the other day about feature flagging, and they said, "Yeah, we did that, and inevitably it just meant that people kept on updating the expiration date." Which is sad, but I still think there's value in that, because you're at least raising awareness of this issue. It's like the difference between something that's going moldy in the back of the fridge where you can't even see it, versus something where you open the fridge and it stinks. At least you're getting that smell, and you know that eventually someone's going to say, "Gosh, we really gotta clean up this mess."
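A "time bomb" check of the kind described could look something like the following sketch, meant to be called once at application startup. The names are hypothetical, and the `strict` switch is an assumption added for the example so the same check can either refuse to start or merely make the smell noticeable with a warning.

```python
import warnings
from datetime import date


class ExpiredFlagError(RuntimeError):
    """Raised in strict mode when a flag has outlived its expiration date."""


def check_flag_expiry(
    expiry_dates: dict, today: date, strict: bool = False
) -> list:
    """Return the names of expired flags; warn about them, or raise if strict."""
    expired = sorted(
        name for name, expires in expiry_dates.items() if expires < today
    )
    if expired and strict:
        # The "time bomb": refuse to start until someone deals with the flags.
        raise ExpiredFlagError(f"expired feature flags: {', '.join(expired)}")
    for name in expired:
        # The gentler option: make the problem visible on every startup.
        warnings.warn(
            f"feature flag {name!r} is past its expiration date", stacklevel=2
        )
    return expired
```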
Tobias Macey
So talking about feature flags, it's very easy to get into the mindset of, "Oh, feature flags are fabulous. I'm going to use them all over the place. They're wonderful. There's no downside." So what are some of the cases where adding a feature flag is just not worth it because of the additional cognitive overhead, or possibly performance overhead, or just the overall difficulty of implementing and maintaining feature flags in a code base? I'm curious if you've run into any situations where somebody was using them and you said, "That's just a horrible idea. It's much simpler if you just have this one if statement."
Pete Hodgson
Yeah, I definitely think that with teams who have mature use of feature flags, who have been using them for a while, one of the very common themes you'll hear is, "We try to keep the number of flags in check." Literally having a WIP (work in progress) limit, where you say we're only allowed five active flags, is legit. And I think teams that are doing that would tell you there are a lot of places where you could put something behind a feature flag, but do you really need to? There's an argument to be made for saying: this is going to be really fiddly to do with a feature flag, so we are going to go in with our eyes wide open on putting a pause on production deployments for a week for this small system, or making this long-lived feature branch even though we know it's going to be a horrible merge. I'm pragmatic enough to say that even though in general those are bad practices, there are times when it's fine, when you know the rules well enough to break them. So there are definitely places where you could use a feature flag, but it's better not to. There are also a lot of places where you can be smart about where you put that flag, and you can be smart about other ways to sequence your work so that a flag is not necessary. Let's say, for example, our social login feature actually needs four different changes in the system. We've got to add a new back end gateway that goes to this authentication system, we've got to change something, we've got to add some tables to a database or something, and then we've also got to put in that piece of UI. You only need to put a feature flag around that last piece, the UI, and you can do all that other work without a feature flag. If you're confident that you're going to detect breakage before you put it into prod, or the risk of breakage is not so great that you need to be able to instantly turn it back off again, you can make all of the back end, behind-the-scenes changes directly on master if you're doing trunk-based development, or via a series of small, short-lived feature branches that are merged into master. You do all of that setting up of the back end ahead of time, and then the last thing you do is add that bit of UI that shows the social login button, let's say. A lot of times that last piece doesn't even need to be feature flagged either, because it's a very small change. You can land that in a single commit or a small short-lived feature branch. Again, if you don't feel like there's a high risk that you'd want to pull it back straight away, and if you know that you don't need to do any experimentation with this, or a controlled rollout where you only roll it out to 5% of your users or whatever else, then don't bother with a feature flag. Just do all the work behind the scenes, and then do the last piece that finally surfaces that feature to users at the very end, and you can avoid using a feature flag.
And I think those general techniques, which are sometimes called branch by abstraction techniques, are useful even if you are going to eventually put the entire feature behind a feature flag. It doesn't mean that all the back end pieces have to be checking that feature flag. They can just sit there: if they're being used, then presumably the feature flag is on, and if they're not being used, the feature flag is off.
Tobias Macey
And for anybody who wants to dig deeper into feature flags or learn more about them, what are some of the useful pieces of advice or references that you recommend?
Pete Hodgson
So the two of us already touched on it, but there are two pieces of advice I have, which sound a bit contradictory, but I don't think they are. The first piece of advice is to just start really simply. Don't start with an open source framework; don't start with a SaaS product. If you literally just want to get started, start with a simple if/else statement in your code or something like that, and get comfortable with the concept of feature flagging. But as soon as you realize that this is a useful thing and you want to start using it more broadly, do not take that series of small steps that eventually ends up with you hand-rolling the 725th half-assed feature flagging system that's inside of a company somewhere. At that point, stop, and don't build it yourself. Look at open source libraries, look at SaaS products. So many people build this thing themselves, and I think it's just because they enjoy doing it.
Tobias Macey
The not invented here problem.
Pete Hodgson
Yeah, I mean, I don't think it's even a not-invented-here thing. It feels like a fun-sized problem. It's normally something which is a little bit below the radar of product managers, because it starts off as an engineering-internal feature. So people can kind of sneak it in, and it has enough justification that you don't have to lobby your product manager or product owner or Scrum Master or whatever for permission to do it. Honestly, I think a lot of times it's a fun thing for people to build, and so they end up building it themselves when they should be honest and not do it. They also just don't realize how much work it's going to be. I worked at a company where the feature flagging system had been built by an intern, and it was death by, you know, we were a boiled frog. Six months later, we were trying to figure out how to get our production systems to be reliable when this code had its origins in someone's summer project. So it's not a good look. The other recommendation I have, which we touched on already, is to prefer static configuration where you can. If you can have a feature flag flipped on or off by a code change, that's great, because you get to leverage all of the quality checks and safety checks that you have in your delivery pipeline when you make that feature flag change, just as if you'd made a code change. You flip the flag on, and then you watch all the tests pass, the integration tests run, and if you've got performance testing, you check that your performance hasn't been impacted. That's really nice if you can do that via a code change, because you get all that stuff for free.
Versus if it's a dynamic flag, then essentially you're just banging something into production without doing any of the tests that you would do for a code change, which is kind of scary in some ways. And then my last piece of advice is to read up beyond feature flags, on trunk-based development practices and continuous delivery practices. There's a really good website, I think it's called trunk based, which has a lot of good background material. The book Continuous Delivery is a little bit old now, but it's a great book and has a load of really good stuff about continuous delivery practices broadly.
Tobias Macey
Yeah, I'll second that Continuous Delivery book recommendation. Yes, the content is a little old, and it has some references that might be a little outdated, but the core principles are definitely still completely valid and still useful to read up on.
Pete Hodgson
Yeah, I was just rereading a section of it the other day, and the one thing that's in there that I'd be interested to talk to Jez and Dave about, to see if they still agree with it, is that it advocates for using release branches as a way to orchestrate releases. I think that advice is probably a little out of date now, because the CI/CD systems that we have today have delivery pipelines as a first-class concept in the system. So some of it is just kind of, "Ah, that's all SVN," which isn't all that good, because Git wasn't around at the time, and some of it is just that the state of the art has moved on a little bit. But that's maybe 1% of the content; 99% of the content is just amazingly valuable and super useful.
Tobias Macey
Are there any other aspects of feature flagging, or your experience of using them or working with companies who have implemented them, that we didn't discuss yet that you'd like to cover before we close out the show?
Pete Hodgson
No, I mean, I think we, I think we talked about, we talked about a lot of stuff. I think the main thing is the main thing I want people to try and get their heads around is that all these different types of things are kind of fundamentally the same thing. And building it yourself is probably a fun thing to do, but not necessarily the right thing to do.
Tobias Macey
Always good advice. All right. I think the only other thing that we didn't really touch on is how feature flags manifest in your tests. But I think anybody who is doing testing can figure that out fairly well: just make sure that you have a test that sets the flag to on and one that sets it to off, and make sure that both branches have the expected functionality.
Pete Hodgson
Yeah, I mean, I think that, you know, we could probably spend another hour talking about this, actually. But the other thing that I would say is, this is another one of those iceberg things: it is worthwhile adding some kind of awareness of feature flagging into your testing system. So things that I've seen that are really useful are the ability to tag a test as saying this test should be run with the feature flag off and on, and it should behave the same, or having your feature flagging system make it easy for a test to temporarily override the state of a flag. And likewise, you talked about this already, but having the ability for a manual tester to temporarily override the state of a flag in order to verify different things. That investment in feature flagging systems supporting testability is definitely a worthwhile thing to do. Sometimes it's not something that is top of mind for engineers if they're not also doing that kind of testing, but it's a really good investment to do that kind of stuff.
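The two test-support ideas mentioned here, a temporary override and a tag that runs a test with the flag both off and on, could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: `FlagStore`, its `override` context manager, the `with_flag_on_and_off` decorator, and the `new_checkout` flag are hypothetical names, not part of any real flagging product.

```python
from contextlib import contextmanager
from functools import wraps
from typing import Callable, Dict, Iterator, List


class FlagStore:
    """Hypothetical in-memory flag store with test-friendly overrides."""

    def __init__(self, defaults: Dict[str, bool]) -> None:
        self._flags = dict(defaults)

    def is_enabled(self, name: str) -> bool:
        return self._flags[name]

    @contextmanager
    def override(self, name: str, value: bool) -> Iterator[None]:
        """Temporarily force a flag to a value, then restore it."""
        previous = self._flags[name]
        self._flags[name] = value
        try:
            yield
        finally:
            self._flags[name] = previous


flags = FlagStore({"new_checkout": False})


def with_flag_on_and_off(name: str) -> Callable:
    """Tag a test so it runs once with the flag off and once with it on."""
    def decorator(test: Callable[[], None]) -> Callable[[], None]:
        @wraps(test)
        def wrapper() -> None:
            for value in (False, True):
                with flags.override(name, value):
                    test()
        return wrapper
    return decorator


def order_total(prices: List[int]) -> int:
    """Flagged code whose two paths are expected to behave the same."""
    if flags.is_enabled("new_checkout"):
        return sum(prices)  # new implementation behind the flag
    total = 0
    for price in prices:  # old implementation
        total += price
    return total


@with_flag_on_and_off("new_checkout")
def test_total_is_same_either_way() -> None:
    assert order_total([5, 10, 20]) == 35


test_total_is_same_either_way()
```

The same `override` hook could, in principle, back a manual tester's per-session override, though how that state is stored and scoped would depend on the flagging system in use.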
Tobias Macey
Well, for anybody who wants to get in touch with you and follow along with the work that you're up to, I'll have you add your preferred contact information to the show notes. And so with that, I'll move us into the picks, and this week I'm going to choose the Circuit Playground Express from Adafruit. I started playing around with one a few weeks ago with my kids and it's been a lot of fun, and I've been able to use the CircuitPython distribution for experimenting with flashing the LEDs and doing all the fun things you can do on them. So if you're looking for a fun little hardware project that's inexpensive and easy to get started with, I definitely recommend checking those out. And with that, I'll pass it to you, Pete. Do you have any picks this week?
Pete Hodgson
Plus one on Adafruit, plus one on all of the CircuitPython stuff. It's super cool. Also, I'm just a huge fan of Adafruit in general, and they've got a load of just really awesome documentation on there. Like the learning section of their website is amazing, and it's amazing how much stuff they do for free. So yay, I love Adafruit. My pick is a book called Accelerate. This is written by Nicole Forsgren, Jez Humble, and Gene Kim. Jez Humble is also one of the co-authors of that Continuous Delivery book. And they are the people that were behind that DevOps report that has been coming out every year for the last few years. It's amazing because they're using actual science to validate what kind of engineering practices work and what don't, in terms of orgs making more money at the end of the day. They actually look at which organizations are performing better, either from a kind of capitalistic perspective, like they make more money than their peers, or from a kind of social perspective, and what they are doing better. And then they track that all the way back to what the engineering practices are that correlate, or not even correlate, that drive that kind of success, and they really dig into the details of what that means. So they talk a lot about things like continuous delivery, but also about cultural aspects of the company. And it's super useful. It's great advice, and most of the stuff they talk about is stuff I kind of generally agree with anyway. But it's extra wonderful for it to be stuff I agree with that's actually backed by real science that shows that it's true, rather than it just being like, you know, this is a well-argued case, and so I'm going to kind of hope that they're right about it. So I really recommend that book. It's really fun to read as well. And if you're a stats nerd, it has a lot of stuff at the back around how they actually did those stats. So that's one pick. And then the other pick, which is very, very self-serving, is an article that I wrote for Martin Fowler's website, which is just an article about feature toggles. So it's all the stuff we talked about, but with some more detail about some of the implementation practices. It's a little bit old now, but I think not that much has changed since I wrote it. So if people want to get into more details about feature toggles, they should read that. And I think also the last thing in the picks is, you know, I really love talking about this stuff, and I really, really love hearing about what people are doing in the real world around feature flagging. So if anyone who's listening to this has got questions or thinks I'm talking nonsense about one of the points I made, definitely reach out to me and I'd love to chat more about it.
Tobias Macey
Yeah, I'll second the article. I actually read through a bunch of that to get ready for this conversation. And I'm definitely going to have to add that Accelerate book to my reading list. So thank you very much for taking the time today to join me and share your experiences of working in this space. It's something that's definitely useful to engineers working in any language. So I appreciate your time and I hope you enjoy the rest of your day.
Pete Hodgson
Absolutely. Thanks so much for having me on.
Tobias Macey
Thank you for listening. Don't forget to check out our other show, the Data Engineering Podcast, at data engineering for the latest on modern data management, and visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email host at podcast and with your story. To help other people find the show, please leave a review on iTunes and tell your friends and co-workers.