SE Radio 492: Sam Scott on Building a Consistent and Global Authorization Service

Sam Scott, CTO of Oso, discusses authorization challenges with host Priyanka Raghavan. They discuss basics such as definitions of authorization, RBAC, and ReBAC, and how authorization differs from authentication. Sam also describes the Google Zanzibar system. The host quizzes Sam on whether to use an off-the-shelf authorization service or build a custom one. Sam takes a deep dive into Oso, the open source authorization service, and the solutions it offers to companies that want to keep their own database but run a rule engine for authorization. The show finishes with tips and advice on maintaining consistency and latency, and golden rules for building a global and consistent authorization service.

Show Notes

Transcript

Transcript brought to you by IEEE Software magazine.
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.

Priyanka Raghavan 00:00:16 Hi, this is Priyanka Raghavan for Software Engineering Radio, and my guest today is Sam Scott. He’s the CTO and co-founder of Oso, where their mission, from LinkedIn and several other blogs, says “we’re making security easier for developers.” Sam has a PhD in cryptography and he regularly talks on authorization and has also appeared on several podcasts. Today we’re going to be talking about how to build a consistent and global authorization system or service, and subsequent challenges. So welcome to the show, Sam.

Sam Scott 00:00:51 Thanks for having me. It’s good to be here.

Priyanka Raghavan 00:00:53 Is there anything else you would like listeners to know about yourself before we jump into all things authorization?

Sam Scott 00:00:59 I think you about covered it. I mean, I guess, you know, one of the things is I’ve been myself coding for about 10 years or so now? One big changing event to me was when I discovered Rust. I had actually kind of resigned myself to not be a decent programmer. And I thought I’d maybe stick in academia and research, but kind of found Rust and that like unlocked a bunch of things for me.

Priyanka Raghavan 00:01:19 We’re definitely going to ask you a lot more about that soon, but you know, let me just take you a little bit through SE Radio, right? So in SE Radio we’ve done a couple of episodes on securing your APIs. We’ve done an episode on OAuth2, we also had a show on open-end up, but we’ve never really focused a show just on authorization. So, I thought, you know, we could start right from the top with maybe the first question I’ll ask you, which maybe is often repeated, but what is the difference between, say, authentication and authorization? Can you define the terms for us?

Sam Scott 00:01:52 That’s a great question, especially because a lot of the time people just say Auth and they sort of mean maybe either one. So, it’s good to clarify. So, authentication is the process you go through identifying who somebody is. You know, so typically this means you log in with username and password, or you maybe log in with, like a social login, you know, Facebook, Google. It’s kind of that process through which, you know, the website identifies who you are. On the other hand, authorization is kind of the bit that comes next. Now that I know who you are, what are you allowed to do? Those are kind of the two pieces, and what gets really challenging here as well, you just touched on a few technologies like OAuth and LDAP, where it starts blurring the lines between the two a lot. I mean, you know, OAuth is technically primarily around authorization, but it’s sort of a different context. It’s granting consent to other people to access your data on your behalf. It’s also used sometimes for doing authentication, like, you know, can I check you have this identity, and the two terms get very, very mixed and very interchangeable.

Priyanka Raghavan 00:02:51 Okay, so maybe then you can break it down. Another terminology, which is often seen in literature is RBAC. What is that? Can you define that?

Sam Scott 00:03:00 RBAC stands for Role-Based Access Control. It’s a pretty broad term. It captures this sort of idea that, instead of just saying directly, like, what can a user do, you sort of group a bunch of users together under, you know, a role, as it’s called, and you basically define what that role can do instead. And so, you imagine you have a website with like millions of users. You maybe just can segment them into two different user types. You have members and admins, and those might be the two different roles. And then it’s like pretty easy to say, members can do XYZ, admins can do ABC. And so, it’s kind of a way of structuring the authorization logic.
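To make the members-and-admins example concrete, here is a minimal sketch of role-based access control in Python. The role names, action names, and the `USER_ROLES` table are illustrative assumptions, not anything defined in the episode.

```python
# Hypothetical role-to-permission table; role and action names are illustrative.
ROLE_PERMISSIONS = {
    "member": {"read", "comment"},
    "admin": {"read", "comment", "delete", "invite"},
}

# Each user only needs a role assignment, not a per-user permission list.
USER_ROLES = {"alice": "admin", "bob": "member"}


def role_allows(role: str, action: str) -> bool:
    """Return True if the given role grants the action."""
    return action in ROLE_PERMISSIONS.get(role, set())


def user_can(user: str, action: str) -> bool:
    """Look up the user's role, then ask what that role can do."""
    role = USER_ROLES.get(user)
    return role is not None and role_allows(role, action)
```

With millions of users, only the small role table has to encode policy; adding a user is a single role assignment.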

Priyanka Raghavan 00:03:35 You said you have a group, I guess a group of individuals who belong to a particular role, and then you can say what they can do, like, I guess, on a resource. Would that be right? So, another thing that I saw, especially when I was researching before talking to you, was another term called ReBAC, which is Relationship-Based Authorization. So, can you also define that?

Sam Scott 00:03:59 Yeah. So, in some ways you can think of relationship-based access control, ReBAC, as a more general version of role-based. So, instead of just, you know, grouping users by role, you start saying, well, like, what are the kinds of relationships that exist between things, maybe users and resources, or a resource and other resources. A simple one might be most websites, if you post a comment or something, you can normally also delete that comment, or maybe even edit that comment, you know, but most users can. And so it’s kind of saying, well, you have a relationship with that comment. You are the one who submitted the comment. And as the submitter, you’re able to make some changes, and there’s kind of different versions of that. So maybe a good example would be like in a Google Drive-like system, you have, you know, files belong to folders, folders belong to other folders, folders belong to organizations.

Sam Scott 00:04:49 Like all of those are relationships between the data. There’s kind of different forms of relationships there. It’s like, you know, maybe it’s a user to a resource, you know, you created a document. It’s a resource to another resource, you know, a document belongs in a folder. You maybe throw a role in there as well. And it’s, you know, you’re an admin of a folder and the document belongs to the folder, so you can do certain things. I think this kind of separation between the two is less absolute than you think. Like a lot of what I just described could also sort of be broadly captured in RBAC, in the role-based paradigm. Like it’ll sound similar, like, oh, I’m an admin of the folder and the document is in the folder, that sounds very rolesy. But I think kind of going to a little bit more granularity and saying, well, actually we’re describing something unique here by describing this like file system hierarchy, is what ReBAC is about.
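The three kinds of relationship mentioned here (user to resource, resource to resource, and a role on a related resource) can be sketched roughly as follows. The `Folder` and `Document` classes and the `can_edit` rule are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Folder:
    name: str
    admins: set                        # users holding the admin role on this folder
    parent: Optional["Folder"] = None  # resource-to-resource relationship


@dataclass
class Document:
    title: str
    creator: str    # user-to-resource relationship
    folder: Folder  # resource-to-resource relationship


def can_edit(user: str, doc: Document) -> bool:
    """Allow edits for the creator, or for an admin of any enclosing folder."""
    if doc.creator == user:
        return True
    folder = doc.folder
    while folder is not None:  # walk up the folder hierarchy
        if user in folder.admins:
            return True
        folder = folder.parent
    return False
```

The decision depends on how the user is connected to this particular document, not on a global role alone, which is the distinction being drawn in the discussion.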

Priyanka Raghavan 00:05:34 Great. Another term in this picture is the authorization model. Whenever I read a lot of articles, people talk about how important it is to have a good authorization model to enforce things. So, can you maybe define that? So, all of this, do you put that into an authorization model?

Sam Scott 00:05:55 Yes, absolutely. So, the way I kind of think about it is, to start with, the way I described authorization is this entirely open-ended question: who can do what? It’s like, okay, where do I begin? So, the idea of authorization models is really just sort of identifying patterns that are very, very common out there and sort of giving you a little bit more structure on how you go about answering that question. And so really what it allows you to do is sort of say like, you know, if your app fits this kind of a shape, like if you are building a B2B app, you will probably want to use a role-based system, like very, very commonly. And, you know, you should think about it as users have a role inside the organization and they can do these things. You’ve sort of taken this problem, which is entirely open-ended, completely open to someone to solve how they want to, into something which now has existing patterns, existing templates, and gives someone a little more structure to approach it. And that’s kind of what a model is about.

Priyanka Raghavan 00:06:50 So once you have a model, I guess it becomes easier to make decisions or enforce policies.

Sam Scott 00:06:58 You’ve touched on a few things. You can kind of break up how you go about implementing authorization into a few different pieces. So, the way we talk about it is there’s kind of a separation between enforcement and decisions. Enforcement is the lines of code you add inside your app that kind of ultimately decide, do you show a page to a user or do you redirect them to an error page? That’s kind of like the enforcement piece, like what do you do once you know if a user is allowed to do something or not. The decision piece is like, how do you make the decision if they should or not? So, like, how do I answer the question, is this user allowed to see this page or this document? When we’re talking about that kind of decision mechanism, that’s really where those authorization models come in. It’s like, how do I go about answering that question? The policy piece that you touched on plays into kind of the specifics of how you actually implement that decision mechanism. Like you might say, okay, I want roles. How do you want to implement that? You could use a roles library. You could use a policy engine. There’s kind of multiple different approaches.
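The split between decision and enforcement can be shown in a few lines. This is a sketch under assumed names (`is_allowed`, `handle_request`, and the string return values are all hypothetical); a real app would return framework responses rather than strings.

```python
def is_allowed(role: str, action: str) -> bool:
    """Decision: answer the question 'may this role do this?' and nothing more."""
    policy = {
        "admin": {"view_report", "edit_report"},
        "member": {"view_report"},
    }
    return action in policy.get(role, set())


def handle_request(role: str, action: str) -> str:
    """Enforcement: the lines inside the app that act on the decision."""
    if not is_allowed(role, action):
        return "redirect:/error/forbidden"
    return f"render:{action}"
```

Keeping the two apart means the decision function could later be swapped for a roles library or a policy engine without touching the request handlers.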

Priyanka Raghavan 00:07:58 I think I understood that it’s very important to make that distinction. Okay. That’s great. So, now that we’ve got these definitions out of the way, maybe we can do a little deep dive into, say, one of the main parts of this podcast, which is also about the Google Zanzibar system, which I think we’ll jump into. But before that, you know, the traditional way of doing authorization was LDAP, where you have a group of users belonging to one particular part of the org, and then they are given access to a resource. How does that differ from, say, I don’t know if the right term is to call it new age authorization, or maybe it’s been around for a long time, but new age services that focus on authorization? How is it different?

Sam Scott 00:08:41 That’s a great question. And in some ways it kind of comes back to what I was saying earlier, you know, authorization kind of does cover a lot of different topics. I think what we’ve seen change in the last several decades really has been a huge shift from a lot of software development being done in-house, entirely inside organizations, and lots of it being really an internal thing, into, you know, the explosion of like software as a service and people building software products. And that’s kind of the shift of authorization models. Like, so, you know, things like LDAP, Active Directory, those kinds of technologies, you know, it’s an interface for managing and assigning users to permissions, to groups, to roles, things like that. The focus is on like, you know, I have a set of users, how do I, you know, give them the right permissions?

Sam Scott 00:09:29 How do I group them together, assign them to resources, things like that, which makes sense for like, you know, an internal IT team inside an enterprise that needs to give the sales team access to the sales folder with the sales reports, things like that. When we talk about the kind of new authorization systems, we’re talking about a different set of people. We’re talking about, you know, developers building applications for their end users. Those are now like multiple organizations, not just one organization. And so in some ways, you know, what we’re talking about is, how do people build things like LDAP into their apps? You know, so LDAP is kind of, it’s an existing product, you know, it integrates with different systems, but now it’s like, well, you’re building your own app. How do you want authorization? How do you want users to like manage things inside your application? So, if someone needs to create a new organization and they want to have groups, like what’s their experience? And LDAP wasn’t really designed to be used in that way, where it would be like embedded in other products. And in other ways it’s not the easiest thing to use. So, it’s not like something that people have jumped to try and integrate into their applications too.

Priyanka Raghavan 00:10:34 We can now jump a little bit into the Google Zanzibar project. In the show notes of this podcast, I will add a link to the paper that came out from Google. There were several things that stood out from the paper, but maybe we can take it one by one. The first thing I wanted to ask you was if you could tell us a little bit about the Zanzibar system. Before we jump into a few things, can you just talk a little bit about that?

Sam Scott 00:10:59 Sure. So, Google Zanzibar is the name of a system developed at Google. They released a paper on this in about 2019, I think it was. It’s the central system for how Google does authorization across their entire fleet of applications. So, you know, everything from, you know, Google Drive and Google Docs to, you know, YouTube, for example, and it’s a phenomenal piece of engineering. There’s two pieces that I love to call out as like the real core innovations. I mean, one is like the mind-blowing scale that needs to happen for Google to do authorization for all those services. Right. But I think the second one touches on what we were speaking about earlier. As far as I’m aware, it’s one of the first kind of really publicly spoken about implementations of relationship-based access control. I think people had spoken about it in the context of like social graphs, like Facebook, previously, but I hadn’t seen anyone who’d written about it in quite such a sort of B2B-focused way, like they had at Google. So, there’s really like two core pieces, like the model that they spoke about and shared, and how developers at Google build apps with the system. And then there’s just like the feats of engineering to make this thing scale.

Priyanka Raghavan 00:12:15 I think they refer to something called a relationship tuple. And I think there was another thing called namespace configuration. Coming from a mathematical background, I’m sure you understood that piece, where there’s a lot of scary equations, but I think essentially it was describing the relationship tuples as well as, I think, how they group that underneath. Maybe you could just give us a little intro to that.

Sam Scott 00:12:39 Yes, I have a background in academia, but I had to read that paper numerous times until it finally kind of fell into place. It’s not a light concept. It’s a really interesting system. So, I mean, to start with the problem it’s solving: we spoke about ReBAC, relationship-based access control, earlier. And I said, you know, it’s like a file belongs to a folder. A user has a role for a folder. A document is inside a folder. You know, things like that. A lot of the important stuff there is the data: who has what role for what, and what folders this file belongs to, and things like that. And these are very varied data sources that can look completely different, like be stored and managed in multiple different ways. So, if you want to start thinking around, how do we write a consistent abstraction over all these kinds of things?

Sam Scott 00:13:25 You have a really hard problem on your hands, because you now have like 500 different teams across Google writing different applications for different purposes, using different backends and stuff, all these kinds of things. And you’re trying to say, well, you know, I want to be able to say that there is just like one single way of doing authorization. There’s one consistent authorization. And so the way that Google went about solving this is they basically first came up with a kind of unified data format, and that’s these relationship tuples. They basically said, we’re going to have a storage engine for that data. And this is going to be the format. It’s basically going to consist of three things: the subject, the predicate and the object. So, the subject will be like, you know, say the user, the predicate would be like, you know, the admin role, and the object would be a folder. Because pretty much all relationships, you can kind of think of them like a graph, you know, like sort of draw a line between the user and that folder.

Sam Scott 00:14:19 And the, like the value on the, on the edge would be like the admin role or something, you know, file to a folder. And that folder is like the parent or something. And so they kind of said, well, we just can take all that data and you just need to like, send that data to us and put it into this format. And now what we have is just like one data store that has all of these different data formats in one consistent way. Now that we have it in this format, like we can basically build an authorization system on top of it. And that’s kind of where the namespace configuration piece comes in. It’s basically a configuration format for developers to like tell this central system kind of telling them, like, what does, what does this data mean? And like, how do you combine it and like compose it together.

Sam Scott 00:15:03 So, you know, the kind of thing you might say is if you’re an admin of a folder, then you’re an admin of any file inside that folder. And so the configuration format kind of allows you to express that kind of logic, how you kind of take those data pieces and kind of combine them together. Putting all that stuff together, you now have like the core of this authorization system. It’s not all the way to authorization, but it’s like a long way there. It’s like, what relationship do I have with this piece of data? This authorization engine can take the configuration, can take all of this data, and do that expensive computation, like, you know, okay, this file belongs to this folder, this folder belongs to this folder, this folder belongs to this organization. Oh, and this user is an admin of that organization. So, tracing all that through, you know, doing that recursive query, I can figure out that, yes, you are an owner of this file. And voilà, your Google document opens and, you know, you can go and edit it.
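The subject, predicate, object format and the recursive lookup described above can be sketched as a toy in Python. This is only an illustration of the idea as explained in the conversation: the identifiers, the in-memory `TUPLES` set, and the hard-coded inheritance rule are assumptions, and Zanzibar’s actual tuple syntax, namespace configuration language, and storage are considerably more involved.

```python
# (subject, predicate, object) triples, all in one store. IDs are illustrative.
TUPLES = {
    ("user:alice", "admin", "org:acme"),
    ("folder:finance", "parent", "org:acme"),
    ("folder:q3", "parent", "folder:finance"),
    ("file:report", "parent", "folder:q3"),
    ("user:bob", "admin", "folder:other"),
}


def parents_of(resource: str) -> list:
    """Follow 'parent' edges one step up the hierarchy."""
    return [obj for (subj, pred, obj) in TUPLES
            if subj == resource and pred == "parent"]


def has_role(user: str, role: str, resource: str, _seen=None) -> bool:
    """A role on a container implies the same role on everything inside it."""
    seen = _seen if _seen is not None else set()
    if resource in seen:  # guard against cycles in the data
        return False
    seen.add(resource)
    if (user, role, resource) in TUPLES:
        return True
    return any(has_role(user, role, parent, seen)
               for parent in parents_of(resource))
```

Here the rule "admin of a container is admin of its contents" is written directly in `has_role`; in the system described, that kind of rule would live in the configuration rather than in code.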

Priyanka Raghavan 00:15:56 The other part, you know, the Google document opening quickly, is also a big thing. I think the thing that fantastically stood out in that paper was of course the scalability challenges and the numbers that they threw out there. Just to quote a few of them, I have to read it from here. I think it said something like 1,500 namespaces, which are defined by hundreds of clients, and they have two trillion relationship tuples of about a hundred terabytes, and it was able to handle 10 million client queries per second. So, in this regard, I wanted to ask your take on how important is it for a good authorization system to place importance on, say, scalability and consistency.

Sam Scott 00:16:40 I mean, those numbers kind of speak for themselves, right? I mean, if you think about it, the volume of data is not surprising when I just spoke about what it consists of: files belonging to folders, users having roles. Like a lot of that is really core application data. And so you’re saying, you know, this is a central data store that is handling a big chunk of Google’s data, you know, and that’s why there is two trillion, or whatever it was, relationships. That’s, you know, data being collected from every app at Google. And then, you know, I think around the 10 million client queries per second, authorization is something that is happening, or should be happening, once, if not multiple times, on like every single request that goes to any application. Like anytime a user reads a piece of data, you should be checking, are they allowed to read that? Anytime they try and do something, you need to be checking, are they allowed to do that? For any application.

Sam Scott 00:17:29 Authorization is always going to be this piece that is effectively kind of like a fixed cost on the scale of your application. It’s like not something that just, you know, is painful at a certain stage. It’s like, no, it’s always gonna be expensive. And it’s pretty common for any team that authorization is going to be this like hugely performance-impacting piece of their app. And I think, you know, basically what Google did by putting this all into one place and being able to measure it is you can kind of see the impact of that.

Priyanka Raghavan 00:17:56 Full disclosure here. So, at my day job, I also have been working on building a kind of unified authorization service integration tool right now. It’s tough. Trying to build something generically is challenging, to say the least. So, in your experience, should you build your own authorization service or should you repurpose something? Any advice on that?

Sam Scott 00:18:21 The challenge is there really aren’t that many existing solutions out there. I mean, that’s obviously something that we’re trying to change. We’re trying to address it by building stuff. And there’s kind of a few other people out there trying to do the same. The challenge a lot of people we’ve spoken with face is that companies will share like 80% of kind of the same logic, the same core pieces. And that’s stuff you could maybe try and like standardize, or you could build like one product for and solve, but everybody has their own like 20% of special, unique-to-that-business things they need to handle. You know, whether it’s like some billing-based logic, where if you’ve paid for a certain thing, you get access. Maybe it’s around like usage quotas, maybe it’s around, I don’t know, you have a specific like collaboration sharing model.

Sam Scott 00:19:04 Like everyone has these unique pieces. And so it can be really hard to find something like off the shelf that just works. You know, you take something like Zanzibar. And as I say, it gets you like a lot of the way there, but it doesn’t actually even like fully solve your problem. I think the key problem that Google Zanzibar solves is performant data lookups over like very relational data, very like hierarchical, nested data, especially if you want to do that across like multiple applications. It’s like one of the things that Zanzibar was needed to solve, and I might get the specifics of this wrong, but it’s like, if you’re embedding a YouTube video inside a Google document, then you might still need to do authorization on the YouTube video and the Google document, and kind of do these things together, and there’s kind of like messiness. But that’s like one big piece Zanzibar is going after, but it doesn’t kind of actually help you go do that.

Sam Scott 00:19:54 Like the final piece of what different permissions do you have inside a Google document? And like, what can an editor of a document do versus what can a viewer do? And like, what should the user interface display? Like, what pieces of data should we share with the user? Like, should we tell the user if they can do this or not? The challenge with authorization is there are like so many different parts of the problem to solve. It cuts across so many different parts of the stack that it’s sort of hard to talk about, like, do we build an authorization service or not? Like, should someone solve this or not? It’s around finding the suitable pieces to attack for the different parts of the problem you have. Zanzibar, I think, specifically is very good at solving that scaling of authorization for deeply hierarchical, nested data, at like a global scale. But it’s also not like a template to build something and solve all your problems.

Priyanka Raghavan 00:20:43 What I’m hearing is there’s just not any one magic bullet for authorization challenges.

Sam Scott 00:20:49 Unfortunately.

Priyanka Raghavan 00:20:52 Actually building the relationships, trying to work it out. I guess the modeling is super important before you jump into any implementation and enforcement. So now I guess I should move on to the meat of what I wanted to talk with you about, which is Oso. What is Oso? I read on Twitter that Oso means bear in Spanish. What’s the,

Sam Scott 00:21:20 Yep. Oso is indeed bear in Spanish. The name came about, I mean, honestly, because as you mentioned at the very beginning, right, what we’re about as a company is making security easier for developers. And when Graham, my co-founder, and I were like coming up with names, I kind of had like one requirement. I was like, I don’t want to be named like, well, there’s all the security companies that sound scary and intimidating. And they always have like some security name in them, like fortress or things like that. And they really kind of want to feel like they’re selling you on like this, you know, we’re the secure solution, you need to be scared and trust us. Like, that’s not what we’re about. We’re about like making things more accessible for developers, empowering them, giving them tools that help them, like take away the bits they don’t want to solve, and kind of make it easier for them to build. And so that was my requirement, that we had something that was kind of a little bit more friendly and accessible. It happened that Graham and I both speak Spanish. And we wanted something that was like kind of easy to remember and easy to spell, and randomly came up with Oso and just kind of fell in love with it. Okay.

Priyanka Raghavan 00:22:23 So what can Oso do?

Sam Scott 00:22:25 Yeah, so Oso is an open source framework for authorization, and the kind of core of it is helping you figure out how to build authorization into your application. There’s a few core pieces of this, which we’ve touched on a lot so far. So we have, for example, a policy language called Polar that has a lot of like kind of building blocks and built-in pieces to help you with that modeling piece. So, for things like roles and relationships that we’ve touched on, we have like specific patterns built into that to help you get started and to say, here’s the backbone of your authorization model, here is, you know, that 80% that you should just kind of do because it’s the best practice, and, you know, we’ll give you that piece. The policy language piece of it means it’s like super flexible to do all those kind of extra custom things you’re gonna need to do.

Sam Scott 00:23:08 You know, if this thing is public, you can see it. And if, you know, whatever, this was created in the last 10 days, then you can’t edit it. Well, you know, whatever the custom logic needs to be, the policy engine can help you do. And so that’s like on the modeling front. We support a number of different languages, and we have kind of a bunch of enforcement APIs that really help you with the how: how do you actually integrate authorization into your application? Like, how do you make sure you’re tying in the right pieces of data? How can you make sure you’re taking the right action at the right time, and kind of making sure you don’t get anything wrong?

Priyanka Raghavan 00:23:37 So, you’re talking about patterns. What is that? Can you just explain that a little bit more? What do you mean by that?

Sam Scott 00:23:43 So, we have this concept of a resource block. The policies you write with Oso are very resource-centric, because a lot of authorization tends to be like, can I see this thing or not? And so we have, you know, syntax built into the language to express, hey, I have a resource. These are the kinds of permissions that users can have on that resource. You know, they can read, they can write, they can delete. Here are the roles that I want to exist on this resource. And like, here are the relationships that this resource has with other ones, you know? So, you know, I have a document and it has a parent folder. So we kind of have syntax that helps you specify that, to declare: here’s everything I know about my resource. Now that you have kind of that piece in place, we make it like super easy to write your authorization logic from that. You know, a user can read this document if they are an admin of the parent folder. Like you literally just write, you know, read if admin on parent. It’s that simple to write that model. So, we kind of give you those building blocks so you can kind of get your core authorization model built out really quickly.
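The resource block idea (declare permissions, roles, and relations per resource, then write short rules such as "read if admin on parent") can be approximated in plain Python. To be clear, this is not Polar syntax or the Oso API; the `RESOURCES` structure and the `allowed` helper are a hypothetical stand-in meant only to show the shape of the declaration.

```python
# Hypothetical declarative description of each resource type.
# A rule (permission, role, relation) reads: grant `permission` if the user has
# `role` on the resource reached via `relation` (None means the resource itself).
RESOURCES = {
    "Document": {
        "permissions": ["read", "write", "delete"],
        "roles": ["admin"],
        "relations": {"parent": "Folder"},
        "rules": [("read", "admin", "parent")],  # "read if admin on parent"
    },
    "Folder": {
        "permissions": ["read"],
        "roles": ["admin"],
        "relations": {},
        "rules": [("read", "admin", None)],
    },
}


def allowed(user, permission, resource, role_facts, relation_facts):
    """Evaluate the declared rules for a (type, id) resource.

    role_facts: set of (user, role, resource) triples.
    relation_facts: dict mapping (resource, relation_name) to a related resource.
    """
    resource_type, _ = resource
    for perm, role, relation in RESOURCES[resource_type]["rules"]:
        if perm != permission:
            continue
        target = resource if relation is None else relation_facts.get((resource, relation))
        if target is not None and (user, role, target) in role_facts:
            return True
    return False
```

A short usage example under the same assumptions:

```python
doc, folder = ("Document", "spec"), ("Folder", "eng")
roles = {("alice", "admin", folder)}
relations = {(doc, "parent"): folder}
print(allowed("alice", "read", doc, roles, relations))
```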

Priyanka Raghavan 00:24:42 So there’s a lot of examples and things like that to start off?

Sam Scott 00:24:45 Lots of docs, lots of examples, lots of like sample applications using this, things like that.

Priyanka Raghavan 00:24:50 One of the things that you talked about just now is that you may need to do authorization at every level. Like it could be right from the UI to the business logic to maybe the database. So, does Oso give you like patterns for that as well? Like

Sam Scott 00:25:06 Yeah. That’s coming back to the enforcement APIs I was talking about. Just to kind of provide some context for folks listening, something I think people often overlook with authorization is just how far it can spread throughout an app. Authorization done really well should be bubbled up, like all the way to the user interface. So, you’re going through on the website, you’re trying to do something, and maybe the button is grayed out and you hover over it. And it says like, you know, only admins can delete this. That right there is like your core authorization logic that’s being presented all the way up to you, the user. I think people have seen this, as you try and do something and you get some like horrible pop-up, like you can’t do this, talk to your administrators.

Sam Scott 00:25:42 So this becomes like a really interesting challenge with authorization. It’s like, you want to apply best practices and like decouple it and keep it separate from all your business logic, but it’s so pervasive. Like how do you solve that? The way we let you do that with Oso is you express your logic in this policy language, and that’s by itself kind of separate from the app. But when you need to integrate with the application, we have like different APIs at different layers. So, one that’s useful for like the front end is you can ask for a list of all the things that a user can do, and then you can send that to the front end and use it to like make UI differences. The core one is basically just authorize: can the user do this thing or not?
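The front-end use described here, asking for everything a user can do and letting the UI gray out the rest, might look roughly like this. The function names, action names, and the role table are hypothetical and are not the Oso API.

```python
# Hypothetical policy data; action and role names are illustrative.
ACTIONS = ["read", "edit", "delete"]
ROLE_ACTIONS = {
    "admin": {"read", "edit", "delete"},
    "member": {"read", "edit"},
}


def allowed_actions(role: str) -> list:
    """List every action the role may perform, in a stable order."""
    granted = ROLE_ACTIONS.get(role, set())
    return [action for action in ACTIONS if action in granted]


def button_states(role: str) -> dict:
    """Shape the answer for the front end: one enabled/disabled flag per button."""
    allowed = set(allowed_actions(role))
    return {action: ("enabled" if action in allowed else "disabled")
            for action in ACTIONS}
```

The server would still run the single yes-or-no check on every request; the list only drives presentation.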

Sam Scott 00:26:20 But then we also have, you know, things like an authorized query API, which basically allows you to get your authorization logic, your authorization policy, expressed as like a database filter. So instead of just saying, can the user see this one thing, you say, what are all the things this user can see? I’m actually then going to go and make that a database query. I’m going to let the database handle the heavy lifting of like filtering down my, you know, what have I, 10 million records down to the five the user can see. You know, it’s something that people often overlook, but to do that well requires you to have a system that is like aware that it needs to be used in multiple different ways.

Priyanka Raghavan 00:26:55 I was actually going to ask you some stuff on filtering, so maybe I can ask that specifically now, because, you know, we talked about having authorization at every layer. I think with this, latency becomes a big problem when getting huge amounts of data. And then what do you present to the user? How can Oso help with this? Do you have any filtering-specific tips and tricks? I don’t know, do you have a filtering query language or something like that?

Sam Scott 00:27:20 So, without wanting to go too much into the weeds of this, we sort of have an API that allows you to say, instead of passing in a concrete object and asking, can the user see this, you basically just pass us the type, the resource type, say, document. You pass in the arguments being the user, maybe the string “read”, and this type, document. And basically when Oso goes and evaluates the authorization logic you wrote in the policy, instead of concretely checking, you know, what are the fields on this resource, it kind of just accumulates those as a set of filters, and it can do this across pretty complex expressions. So the expression it gets back might be: yes, the user can see all documents that belong to a folder where the user has the admin role or the member role on that folder.
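
Illustrative sketch (not from the episode): the difference between checking one concrete document and accumulating constraints for a whole resource type can be shown with a single hand-written rule, "a user can read documents in folders where they are admin or member". The `Filter` class, the `FOLDER_ROLES` table, and all function names are hypothetical. A real engine would derive the filter from the policy; here it is written by hand for one rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Filter:
    """A constraint on a field of the resource type, e.g. folder_id in {...}."""
    field: str
    op: str
    value: frozenset


# Hypothetical role data: (user, folder_id) -> role
FOLDER_ROLES = {("ana", 1): "admin", ("ana", 2): "member", ("ana", 3): "guest"}


def can_read(user: str, document: dict) -> bool:
    """Concrete check: look at the fields of one specific document."""
    return FOLDER_ROLES.get((user, document["folder_id"])) in {"admin", "member"}


def read_filters(user: str) -> list[Filter]:
    """Same rule with no concrete document: accumulate constraints instead."""
    folders = frozenset(
        folder
        for (u, folder), role in FOLDER_ROLES.items()
        if u == user and role in {"admin", "member"}
    )
    return [Filter("folder_id", "in", folders)]


def matches(document: dict, filters: list[Filter]) -> bool:
    """Apply accumulated filters to a document (a database would do this part)."""
    return all(document[f.field] in f.value for f in filters if f.op == "in")
```

The filter list is data, so it can be handed to whatever layer owns the records instead of pulling every record into the authorization code.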

Sam Scott 00:28:12 We do have an internal representation for that, which, at least in the latest version, looks quite similar to a relational algebra, like a SQL kind of thing, but it’s mostly just an internal thing. And then we basically have an API that allows you to hook that up to your own data models. That typically is very easy to do with something like an ORM, but we’re also making it possible to go direct to SQL with that. The core point is, yeah, we return this intermediate expression that kind of looks like a data filter, but we have an API to hook that up to existing data models. And the idea is that you basically get to do this rich, complex authorization, you know, what are all the things I can see, but your application data stays exactly where it is.

Sam Scott 00:28:54 And this is kind of the big conceptual difference between Oso and Zanzibar. With what I was describing for Zanzibar, as a developer you’re responsible for basically migrating and moving all your application data into the central service. It’s very convenient to do that because now it’s like, well, now your data is structured so I can do authorization for you. We kind of challenge ourselves to say, actually, we don’t think you should have to deal with migrating and synchronizing data. Data should stay where it is. That’s the easiest way to be able to make dynamic, up-to-date decisions. So, instead we’re going to do that work of figuring out how we plug in and integrate with your data.

Priyanka Raghavan 00:29:30 Okay. It’s interesting. So, Oso will actually tie in with any of your data models in your database and provide you this query processing. I guess that’s, again, something that you have. And I guess there are many examples in the documentation on

Sam Scott 00:29:48 Exactly. Yeah. What can be really nice about this as well is it sometimes lets you do your data fetching and your authorization in one shot. Typically, you get the request, you parse the parameters, and you go and fetch some data from the database. We actually make it possible to, in that one go, fetch the data from the database and apply the authorization logic while you’re doing it. And so the authorization overhead there is just a few extra query parameters. There is no separate latency piece needed.
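
Illustrative sketch (not from the episode): fetching and authorizing in one shot can be pictured as a single SQL query in which the authorization rule is just a couple of extra conditions next to the application's own filter. The schema (`documents`, `folder_roles`), the rule, and the function names are assumptions for this example, and the SQL is hand-written rather than generated from a policy.

```python
import sqlite3


def fetch_readable_documents(conn, user_id, title_prefix):
    """Fetch and authorize in one query: the policy is just extra WHERE clauses."""
    sql = """
        SELECT d.id, d.title
        FROM documents AS d
        JOIN folder_roles AS r ON r.folder_id = d.folder_id
        WHERE d.title LIKE ? || '%'          -- the application's own filter
          AND r.user_id = ?                  -- authorization filter
          AND r.role IN ('admin', 'member')  -- authorization filter
        ORDER BY d.id
    """
    return conn.execute(sql, (title_prefix, user_id)).fetchall()


def make_demo_db():
    """Build a small in-memory database with hypothetical demo data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE documents (id INTEGER PRIMARY KEY, folder_id INTEGER, title TEXT);
        CREATE TABLE folder_roles (user_id TEXT, folder_id INTEGER, role TEXT);
        INSERT INTO documents VALUES (1, 1, 'Q1 plan'), (2, 2, 'Q2 plan'), (3, 3, 'Q3 plan');
        INSERT INTO folder_roles VALUES ('ana', 1, 'admin'), ('ana', 3, 'guest'), ('bo', 2, 'member');
    """)
    return conn
```

Calling `fetch_readable_documents(make_demo_db(), "ana", "Q")` makes one round trip to the database; there is no second pass that loads rows and then checks each one.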

Priyanka Raghavan 00:30:18 Okay. Okay. Interesting. So you’re saying that, I mean, I think the latency will not be an issue if you tune the query. Okay. You’ve built this whole authorization logic using your declarative language and then you’ve got it implemented. How do you check if this is the right behavior? Do you have any testing framework that can check that it’s doing that?

Sam Scott 00:30:48 Yeah. So typically people just use the existing built-in test frameworks in whatever language and framework they’re using. The thing that’s particularly nice about it is that, because the approach decouples that logic from the application code, it’s basically very easy to unit test your logic. You can just write a test that says: create a new dummy user, make them an admin, and then check they can read a thing. And if that works, you know the application logic is going to work too, because it’s using the same methods, it’s using the same APIs. It basically makes it very easy to do that sort of unit testing of logic. Something, though, that we want to work and develop on more is to actually go beyond that and let you do kind of property-test-style things.
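
Illustrative sketch (not from the episode): the "create a dummy user, make them an admin, check they can read a thing" test could be written with Python's built-in `unittest` module as below. The `ROLE_PERMISSIONS` table and the `is_allowed` function are hypothetical stand-ins for whatever check the application itself calls.

```python
import unittest

# Hypothetical policy used by both the application and the tests.
ROLE_PERMISSIONS = {"admin": {"read", "delete"}, "member": {"read"}}


def is_allowed(roles: dict, user: str, action: str) -> bool:
    """Return True if the user's role grants the action."""
    return action in ROLE_PERMISSIONS.get(roles.get(user, ""), set())


class PolicyTest(unittest.TestCase):
    def test_admin_can_read(self):
        roles = {"dummy": "admin"}  # create a dummy user, make them an admin
        self.assertTrue(is_allowed(roles, "dummy", "read"))

    def test_member_cannot_delete(self):
        roles = {"dummy": "member"}
        self.assertFalse(is_allowed(roles, "dummy", "delete"))

    def test_unknown_user_is_denied(self):
        self.assertFalse(is_allowed({}, "stranger", "read"))
```

Saved as a module, this runs with `python -m unittest`; no web server or database is needed because the test calls the same function the application would.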

Sam Scott 00:31:27 That’s where you say, I want to check that any user must have a role, some role, to be able to access this kind of resource or something, and we can just check that over the policy itself. As a step towards that, we’ve been adding a lot of validation checks to our policies. So now we’ll tell you if you try to assign permissions to a role that doesn’t exist, or if there is a permission that doesn’t have any roles assigned to it, or you’re missing a piece of logic. We’ll give you those kind of built-in verification checks to make sure you get it right.
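
Illustrative sketch (not from the episode): two of the validation checks mentioned above, permissions assigned to a role that does not exist and a permission that no role grants, can be expressed as a small function over a policy description. The data shapes and the name `validate_policy` are assumptions for this example and do not reflect how Oso represents or validates policies.

```python
def validate_policy(declared_roles, declared_permissions, grants):
    """Return a list of human-readable problems found in a role policy.

    declared_roles and declared_permissions are sets of names;
    grants maps a role name to the set of permissions assigned to it.
    """
    problems = []

    # Check 1: permissions assigned to a role that was never declared.
    for role in sorted(grants):
        if role not in declared_roles:
            problems.append(f"role '{role}' is granted permissions but is not declared")

    # Check 2: a declared permission that no role grants.
    granted = set().union(*grants.values())
    for permission in sorted(declared_permissions - granted):
        problems.append(f"permission '{permission}' is not assigned to any role")

    return problems
```

Running a check like this when the policy is loaded surfaces a typo in a role name before any request is ever authorized against it.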

Priyanka Raghavan 00:32:01 I should’ve asked you this in the beginning, but it’s one of the things I run into at work, and I think this is a problem for a lot of people. Is there a good rule of thumb on how many roles one can have?

Sam Scott 00:32:12 This is a great question. I would say as few as possible. We spoke at a very high level about what role-based access control is. But I think the thing that’s really important with role-based access control is that it’s effectively a way to communicate to your end users what they should expect to be able to do inside an app. And that’s often pretty clear with, like, admin and member roles, where, you know, maybe admins can add other users, admins can rename the organization, they can delete the organization, and members can typically use the app and do the things they need to. Roles work really well when they communicate to an end user what their responsibilities are inside an application. I think GitLab, the version control development platform, has some really nice examples of this, where they have guests.

Sam Scott 00:33:01 They have reviewers who can review. They have moderators who can moderate and update, and maintainers who can maintain the code. You know, those names kind of tell you what you should be able to do. The mistake people often make is they just keep on adding more and more roles to try and address more and more different use cases, and you end up with these giant matrices of people trying to figure out what they need. If you’re finding that you’re trying to add more roles, I actually find the answer can often be adding more data model structure. So, for example, adding things like teams or projects, because then users can kind of get a little more control of that themselves. They can create a sales team project or something, they can add a bunch of resources into the sales team folder, and they can give, you know, the head of sales admin access in just that folder. And then that person can’t go and see legal documents in a different folder. Adding that is very intuitive. It’s very natural for people to understand, and it’s kind of an easy way to go about it, rather than trying to preempt those cases and build them as a bunch of roles yourself, basically.
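
Illustrative sketch (not from the episode): the idea of adding data model structure instead of more roles can be shown by keeping a small fixed set of role names and scoping each assignment to a folder. The folder names, documents, and the `can` function are hypothetical.

```python
# Hypothetical data: a small, fixed set of role names, each scoped to a folder.
ROLE_ACTIONS = {"admin": {"read", "edit", "delete"}, "member": {"read", "edit"}}
FOLDER_ROLES = {("head_of_sales", "sales"): "admin"}
DOCUMENT_FOLDER = {"pipeline.xlsx": "sales", "contract.pdf": "legal"}


def can(user: str, action: str, document: str) -> bool:
    """A user's role is looked up on the folder that contains the document."""
    folder = DOCUMENT_FOLDER[document]
    role = FOLDER_ROLES.get((user, folder))
    # No role on this folder means no actions are allowed.
    return action in ROLE_ACTIONS.get(role, set())
```

There are still only two role names, however many teams exist; what grows is the table of assignments, which users can manage themselves by creating folders and granting roles on them.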

Priyanka Raghavan 00:34:01 And I do know that there was this one project that I was working on where they had so many roles that when one person had to onboard a new user, they didn’t know what to do, and they just went and gave them system admin. That was the one they could understand. Now that we’ve covered a few of these things, I want to get a bit into deployment and consistency. So now that you have this authorization service, from what I understood, you can have it embedded into your application or, you know, have it outside. But what is the right thing? Should it be a separate service so that you can scale it up and down and things like that? Or

Sam Scott 00:34:44 I think it sort of depends. We sort of have a bit of a golden rule at Oso, which is you should build authorization around your application, not the other way around. So, doing something like Zanzibar, where you pull your data outside of your application, or you refactor and migrate applications to use a specific authorization system, to me is a little bit of an anti-pattern, because you’re basically pushing so much of the work onto the developers to understand how to use the system, how to synchronize with it, and things like that. You know, I think there are ways you can mitigate or lessen that, but generally speaking, I think you should let the shape of your application drive how authorization works. Practically, how we often see that playing out is it’s very, very common for people to have a central user management service where users can get assigned to organizations and roles within that, and projects and things like that.

Sam Scott 00:35:35 Because that’s just typically how the application got structured, you know. And then the individual applications that build inside there, and services and things like that, those services need to get that roles data to be able to make an authorization decision, which is kind of a form of centralization, but it’s using the existing structure that is inherent in your app. You’re saying, okay, well, we have a user management app, basically. What do we need to do to make it so that other applications can use that for authorization decisions? How do we get the roles from that service? And that basically allows you to keep the structure of your application as it already was, to have the same separation you already had, you know, the architecture you were going with, and basically figure out how to make that work.
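
Illustrative sketch (not from the episode): one way to picture "other applications get roles from the user management service" is a small interface that downstream services depend on. `RoleSource`, `InMemoryUserManagement`, and `BillingService` are hypothetical names, and the in-memory class stands in for what would be a network call to the real service.

```python
from typing import Optional, Protocol


class RoleSource(Protocol):
    """What a downstream service needs from central user management."""

    def role_for(self, user: str, organization: str) -> Optional[str]: ...


class InMemoryUserManagement:
    """Stand-in for the central user management service (hypothetical)."""

    def __init__(self, assignments: dict):
        # assignments maps (user, organization) -> role name
        self._assignments = assignments

    def role_for(self, user: str, organization: str) -> Optional[str]:
        return self._assignments.get((user, organization))


class BillingService:
    """A downstream service that keeps its own data but asks for roles centrally."""

    def __init__(self, roles: RoleSource):
        self._roles = roles

    def can_view_invoices(self, user: str, organization: str) -> bool:
        # The decision is made locally, using role data owned elsewhere.
        return self._roles.role_for(user, organization) == "admin"
```

The billing data never moves; only the role lookup crosses the service boundary, which mirrors the separation the application already had.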

Sam Scott 00:36:14 I think basically where the rubber meets the road with that is that at certain scales, that in itself does become a challenging problem. It’s not just, oh, well, we’ll just scale that one app up. It does reach a point where you do need the sophistication of something like a Zanzibar to really make that service have that low latency, to have that scale, to be able to process that data. The point is that you identify that based on the size and the scale of your application, rather than just doing it because of authorization. And clearly people reach that scale. I think there have been a few others, like Airbnb, who’ve similarly reached that scale and were like, okay, this is becoming a huge bottleneck in our application; we need to solve this problem. To me, that’s kind of the right way to approach it: you sort of see how your application grows naturally and what shape it’s taking, and once you see that that is a bottleneck, you look for the appropriate technology to address it.

Priyanka Raghavan 00:37:07 As a big

Sam Scott 00:37:09 Database, a huge part in it.

Priyanka Raghavan 00:37:11 And also these other authorization services, like I think one I already saw, which also seems to be built on Zanzibar. I see a lot more. How does Oso differ from other services like this that you see on open source platforms like GitHub?

Sam Scott 00:37:28 The conceptual difference is where does the data go? Where does the data live? With Oso, we went to great lengths to let you keep the data where it is and figure out how to do authorization around your data, basically using your data and leaving it where it is. A lot of the new solutions coming through based on Zanzibar will work as a separate service. Like Zanzibar, they’re no different: you need to put all your data into the service so that it can do some of this authorization stuff. This is kind of an interesting landscape of solutions, because there’s been a long history of existing authorization libraries that tend to be for specific frameworks. I think, you know, Rails has had a pretty rich history of good authorization libraries that integrate with Rails. They could do various pieces of this, or they were frameworks very much focused on role-based access control, but not much else.

Sam Scott 00:38:16 I think what we’re seeing from these, from Oso and others, is really trying to tackle the kind of authorization complexity, like the scaling problems, that comes with doing more complex models that give you more granular access control, things like that. So, I think that’s kind of the main trend we’re seeing from a lot of these new ones. In some ways we’ve tried to stick with the tried and tested approach of keeping it in the app, versus, I think, others which are sort of seeing Google and being like, well, maybe we should do what Google does.

Priyanka Raghavan 00:38:45 I also tried one of the libraries that you support in the docs. I’ve tried the Node.js version. And then I saw that you had no .NET support. Maybe you don’t have it yet. So, I just wanted to ask you, why is that?

Sam Scott 00:39:01 We’ve had so many requests for .NET support, actually. It’s almost certainly the next one we’ll get to. The challenge is, we did build and design Oso so that this could be easy. A lot of the core is built in Rust and is shared between the multiple different libraries, so that adding a library isn’t, you know, a linear amount of work; it’s sub-linear, a small amount of work to add each library. But I think what we’ve been finding is that we’re iterating a lot and making a ton of improvements to the individual libraries, and we’ve already started putting more effort into a few of them, in particular Node, Python, and Go. Those have gotten a lot more of our attention recently, and a lot of the newer stuff we’re trying out there first, rather than trying to do six things simultaneously. If we did .NET, we know that we wouldn’t be able to keep it up to date with the latest and greatest version. So, we’re sort of waiting for some of the new features to settle a little bit before expanding out to other languages, so that we can release with a mature feature set.

Priyanka Raghavan 00:40:01 So it’s in the pipeline.

Sam Scott 00:40:03 It is, yeah. It’s highly requested. We have a GitHub issue to track it; I think it’s the most highly voted issue. We’re definitely aware of that.

Priyanka Raghavan 00:40:13 Yeah. Finally, I guess I had to ask you about the choice of language that you’ve written it in. You did just talk a little bit about it. Why did you choose Rust for writing

Sam Scott 00:40:25 One part is I just think Rust is a real pleasure and a real joy to develop in. In particular, the tooling around it and the developer experience around it is fantastic. The error messages from the compiler are very, very useful and they guide you in a good way. It’s got a good linting system. The package management is very easy, and things like that. So, I just think it’s a joy to program in. From a more technical perspective, the advantages it offered aligned very nicely with what we needed. So, number one, we’re building something that’s on the critical path for security. Being performant was important. Protecting against memory vulnerabilities was a huge bonus that it would offer. Being able to write something that we had fine-grained control over, that was pretty low level; all of those were very compelling benefits that it would offer.

Sam Scott 00:41:13 And then the thing that sealed the deal really was that we knew we wanted this to be something we could embed in multiple languages. The most natural way for us to do that would be to have some kind of C API interface, and as soon as you need that, it really limits your options down to a few different things, and it felt like Rust was the best of the bunch for doing that. One other thing as well was that it’s really convenient, really nice, from the runtime model perspective, because it just compiles down to effectively a system library. That’s basically what it looks like under the surface. There are no background threads, there’s no runtime, there’s no extra processing or deadlocks happening. None of that stuff’s happening under the surface. Oso only executes when you’ve told it to do something, and if you’re not doing something, it does nothing. It’s not blocking or consuming resources, and there’s no garbage collection or anything like that. It’s just very, very predictable. You kind of know what it’s going to do. You’re not going to see bizarre latency spikes and things like that.

Priyanka Raghavan 00:42:13 I think that’s one on my bucket list. This has been great. Before I let you go, I just wanted to ask you one final question. So, to sum this whole conversation up, are there, like, three golden rules that you would see for building a consistent authorization system?

Sam Scott 00:42:33 The number one rule of authorization is you’re building it for your end users first and foremost. That is kind of why I love this area as well. I think the end user experience should be what drives a lot of the decision making here. Ultimately we’re building products, and that’s kind of the main thing that matters. And so that, I think, leads you to think about what is going to make sense to the user. Is it roles? Is it relationships? How would you explain this to a user? And I think models basically help you have a consistent version of that. On the back end, we spoke about, architecturally, building authorization around your application, not the other way around. To me, that means primarily leaving the application data where it is, being able to reuse or repurpose what you already have.

Sam Scott 00:43:18 So, you’re not duplicating effort. Your application is already capable of fetching things from the database. Your application already has a way to determine if somebody belongs to an organization; it has a data model that says that. How do we make sure we can reuse that inside one app or across multiple applications? You know, how are we distributing data across our architecture? That’s how authorization should be distributed. I think number three is something we haven’t really touched on yet, but such a huge part of this is the developer experience as well. You know, authorization’s an interesting one. At a bigger team, you’re going to want your security engineers helping out, helping audit your code, checking that it’s all correct. But ultimately authorization is so interwoven in the application that you’re going to want that developer experience to be good.

Sam Scott 00:44:01 It’s such a core part of it. And so that means you’re providing them something that’s easy for them to work with, that’s easy for them to update and modify. One that comes up a lot is, if a developer is adding a new feature, they add a new endpoint and need to be able to, you know, add a new permission for whether the user can do this thing or not. You want that to be instinctive and intuitive to them. You want them to be able to test that locally, to figure it out, to get that working. You don’t want it to be that they send some PR off into the void, where it gets reviewed by some person three weeks later and comes back to them, or, you know, the permissions break in production and they don’t know why. It’s a really crucial thing to get right. And it forces you a little bit to reevaluate how you think about who owns the authorization code, basically. I think that goes hand in hand with that, and in many ways it actually comes back to the first point, which is that the developers are the ones building; they have the context of the product. They’re the ones building the product features. They hopefully are keyed in on what the users want. And so empowering them to be able to make good authorization decisions kind of closes that full cycle.

Priyanka Raghavan 00:45:09 So thank you so much for coming on the show.

Sam Scott 00:45:11 Thank you for having me. This is fun.

Priyanka Raghavan 00:45:14 Signing off, this is Priyanka Raghavan for Software Engineering Radio. Thanks.

[End of Audio]

SE Radio theme: “Broken Reality” by Kevin MacLeod (incompetech.com — Licensed under Creative Commons: By Attribution 3.0)
