Matt Frisbie, author of Building Browser Extensions, speaks with host Kanchan Shringi about browser extensions, including key areas where they’ve been successful. Based on Matt’s experience as a developer working for Google, Doordash, and a startup he founded, they examine tools for building extensions, as well as APIs they have access to. The conversation presents detailed issues such as cross-browser compatibilities to keep in mind when developing extensions and mechanisms in the browser to prevent security vulnerabilities, and finally examines how emerging platforms can help developers take advantage of exciting new possibilities with web extensions.
- Matt’s book: Building Browser Extensions
- Matt’s article on ChatGPT extensions
- Plasmo Extension platform
- Chrome Developer Docs
- Converting a web extension for Safari
- Matt’s Twitter
- Matt’s LinkedIn
- Matt’s website
- Matt’s GitHub
Transcript brought to you by IEEE Software magazine.
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.
Matt Frisbie 00:01:03 I think that’s good. Yeah, I’m ready to talk about the book. Excited to be on.
Kanchan Shringi 00:01:07 So, let’s just start with what are browser extensions, and can you give us examples of popular extensions and possibly key industries where extensions are most popular?
Matt Frisbie 00:01:18 Sure. So, I say in the book that browser extensions are strange and powerful parasites, and I think that really captures the nature of them. So, they’re these pieces of software that are mostly written with standard web technology, and they sit on top of your browser and they can do a lot of things — just about anything. Really the only restriction is kind of the permissions you give it. And I describe in the book, they’re kind of a hybrid of a website and a mobile app. And so, from there you are really only limited by your imagination because I think most software developers don’t truly appreciate how powerful this software is. There are some limitations and we’ll get into that in the podcast, but I think that they’re really underrated as a platform, in general. So major industries, so according to Google, almost half of Chrome users have at least one browser extension installed.
Matt Frisbie 00:02:15 And I would bet any amount of money that the most popular extension is an ad blocker, because most people do not want to see ads when they’re browsing the web, and ad blockers are extremely effective at that. So, that’s by far the most popular format. But obviously that’s not a money maker; it does tend to be free and open source software. But there are large companies that are based primarily off of browser extensions. So, the largest one that people have heard of is probably Honey, which is an extension that does a lot of things, but the thing that it’s probably best known for is that it automatically looks up and tries coupon codes when you’re shopping online to get you the best discount. And PayPal bought Honey for 4 billion dollars just a couple years ago. So, Honey outgrew the browser extension platform, but that was still definitely its primary piece of software.
Kanchan Shringi 00:04:19 So you mentioned Honey; you said they started as a browser extension, and now they have some other mediums, but starting as a browser extension is useful because it’s right there where the user needs them. And that certainly sounds very useful for React or any other dev tools as well. So, you talked about Chrome, we’ve used the word browser extension. So, can you develop the extension for one browser and expect to be able to run it on all others?
Matt Frisbie 00:04:49 Right, so the landscape is a little bit complicated right now. So no, there’s not a way to write it once and have it work everywhere. There are certain platforms that are trying to get closer to that, but there are idiosyncrasies that are unique to each browser. You have to do it slightly differently. However, browser extensions have pretty much coalesced around the WebExtensions API, which was the successor to Mozilla’s original extension technology, XUL and XPCOM, which was a much more extensible and, some would argue, a superior platform that was able to customize almost anything about the browser. That has given way to the modern WebExtensions API, which has a smaller interface. It’s still quite powerful. And then that’s kind of the meat of what extensions use to do what they do. So, most browsers do support that API, but there are quirks that require special considerations for each browser.
Matt Frisbie 00:05:51 So, there’s kind of levels of compatibility that you can strive for. So personally, if I’m just one person working on an extension, if I publish an extension in the Chrome Web Store, which is by far the largest platform, automatically I can address like 80% of desktop browsers because you get Chrome right off the bat obviously, which is probably about two thirds of traffic. And then you also get all the browsers that are built off of the Chromium open-source browser engine. So that gives you Opera, that gives you Edge, that gives you Brave; there are some others but just right there you’re addressing 80% of traffic. And so, that generally can be a single code base. So, that gets you pretty far. Where it gets tricky: Firefox, which is still transitioning to Manifest V3, which we’ll talk about later, and then Safari, which is its own animal in and of itself. They have kind of a wacky way of deploying extensions, but it’s powerful and pretty new. So really the three large buckets if you want to get as much traffic as possible would be Safari, Firefox (those both require special deployment), and then all the Chromium browsers, where you can almost have a unified code base.
Kanchan Shringi 00:07:05 So you mentioned manifest V3 — and we certainly will dig into that — but it might make sense to just define what the manifest file is at this point so there’s some context.
Matt Frisbie 00:07:15 Sure, yeah. So, the manifest, that’s like the core piece of a browser extension. So, when your browser extension’s loaded in the browser, the manifest tells the browser kind of where everything is, what it’s supposed to be able to do, some basic details like the name and description, icon, and things like that. Yeah, it’s a pretty simple file, but it’s kind of the glue that holds everything together. So, Manifest V3 is the latest iteration of the manifest. Obviously, it comes after Manifest V2, and the transition is kind of controversial. So, the Manifest V3 push is being driven by Google pretty much exclusively; it started there. And so, the way it was initially announced was, oh, it’s going to improve security, oh, it’s going to improve performance, so we’re making all these changes, but it’s pretty obvious that one of the main intentions is to kind of start pushing people away from ad blockers.
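Editor’s note: to make the description of the manifest concrete, here is a minimal sketch of what a Manifest V3 file might contain, written as a TypeScript object that serializes to `manifest.json`. The `ManifestSketch` type, the extension name, and all file paths are hypothetical and chosen only for illustration; the keys themselves (`manifest_version`, `name`, `version`, `description`, `icons`, `action`, `background`, `content_scripts`) are standard manifest keys, but a real manifest supports many more.

```typescript
// Hypothetical type covering only the fields touched on in the conversation.
interface ManifestSketch {
  manifest_version: 2 | 3;
  name: string;
  version: string;
  description: string;
  icons: Record<string, string>;
  action?: { default_popup: string };
  background?: { service_worker: string };
  content_scripts?: { matches: string[]; js: string[] }[];
}

// All names and paths below are placeholders, not taken from the episode.
const manifest: ManifestSketch = {
  manifest_version: 3,
  name: "Example Extension",
  version: "0.1.0",
  description: "Shows where the browser should look for each piece.",
  icons: { "128": "icons/icon-128.png" },
  // Popup shown when the user clicks the toolbar icon.
  action: { default_popup: "popup.html" },
  // Background script; in Manifest V3 this is a service worker.
  background: { service_worker: "background.js" },
  // Script injected into matching pages.
  content_scripts: [{ matches: ["https://example.com/*"], js: ["content.js"] }],
};

// The manifest.json file is simply this object serialized as JSON.
const manifestJson: string = JSON.stringify(manifest, null, 2);
```

The point of the sketch is only that the manifest is declarative: it names the pieces and says what they are, and the browser wires them together.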
Matt Frisbie 00:08:10 So, wrapped into this Manifest V3 transitional period is a phasing out of the primary API that powers ad blockers, which is the blocking web request API. So, the way that all ad blockers work, more or less, is that you can give an extension permission to manage every individual network request going out on the browser, so it can see everything. And more importantly it can manage everything. So, an ad blocker, if it’s installed, you give it permissions with the blocking web request API. So, if it sees an outgoing request to DoubleClick, which is the Google ad server, it can go, oh no, I’m not going to serve that request, I’m going to block that. And then the browser is very well equipped to go, okay, network requests fail all the time, so we just won’t load that.
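Editor’s note: the blocking decision described here can be sketched as a small pure function. This is an illustration under assumptions, not how any particular ad blocker is implemented: the blocklist and the `decide` helper are hypothetical. In a Manifest V2 extension, a function like this would typically be registered with `chrome.webRequest.onBeforeRequest` using the `"blocking"` option, and returning `{ cancel: true }` tells the browser to drop the request.

```typescript
// Hypothetical blocklist; real ad blockers ship large, regularly updated filter lists.
const BLOCKED_HOSTS: readonly string[] = ["doubleclick.net"];

// Mirrors the shape a blocking web request listener returns.
interface BlockingDecision {
  cancel: boolean;
}

// Decide whether an outgoing request should be cancelled.
// A host matches if it equals a blocked host or is a subdomain of one.
function decide(
  requestUrl: string,
  blockedHosts: readonly string[] = BLOCKED_HOSTS
): BlockingDecision {
  const host = new URL(requestUrl).hostname;
  const cancel = blockedHosts.some(
    (blocked) => host === blocked || host.endsWith("." + blocked)
  );
  return { cancel };
}
```

Because the listener sees every request and answers synchronously, the extension effectively sits in the network path, which is exactly the capability the transition takes away.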
Kanchan Shringi 00:11:20 That sounds really interesting, though I can imagine from what you said that V2, and I’m hoping the V3 continues to be powerful in terms of what the extensions can do, but also comes in with security best practices — because certainly when there’s too much that the extensions can do, there’s also a fear of, if you don’t have the best practices or you don’t have the best intentions, that things could go wrong. And that could reduce the amount of trust that customers have. So maybe if we can just delve into some security best practices when developing extensions and then we’ll go on to our next topic.
Matt Frisbie 00:13:38 The amount of information that a properly permissioned extension has access to is quite profound. So, it’s not a joke when they say that if they have the proper permissions they can see all your web activity, everything you type, everything on the page, and they can send requests on your behalf. Not saying they do. All the top extensions are very trustworthy, and it’s a great place for open source to kind of instill trust in the user, but they have access to everything. And so, extensions like LastPass that are recording your passwords, presumably to keep them safe, they are accessing a lot of valuable information, and they have to be good stewards of that information. And any extensions you build that have access to this important information, it needs to be protected because it really is … You know, how much of our life is spent online? If it can sniff everything you’re doing, that’s a really big attack vector. So, really being aware of what you’re storing and where you’re putting it, that’s probably the biggest security concern. And the big LastPass breach recently just kind of underscores what can go wrong when you’re not a good steward of data.
Kanchan Shringi 00:14:44 So, what’s the consumer to do? How do we know to trust an extension?
Matt Frisbie 00:14:51 It’s a good question. So, there’s really no one single way of doing it. So personally, all the extensions that I actively maintain, they’re all open source, so you can see everything that’s getting packaged into the extension. I mean, not a lot of people will take the time to go through the code, but it just being there is kind of an “okay, I can see what you’re doing.” And so, there’s a certain level of trust that that instills. So, the way that extensions display permission messages, some people have problems with it because it’s a little bit aggressive, and it is. So, for example, the tabs permission, which is a very common permission to request. So you can open new tabs, close tabs, move them around, whatever it is. I think the Chrome permission for that will say it can view your entire browsing history and something else, something very scary sounding.
Matt Frisbie 00:15:43 So, I think one big thing is pay attention to what the permission warning messages are telling you because that’s really the last line of defense. If it’s saying it can see all your browsing history and read all your webpages, whatever the message is, that’s real. If an untrusted person gets access to this extension, they really can see everything and cause a lot of problems. And the reason I bring this up is that it’s pretty common for extensions to be purchased, or people to attempt to purchase them, just to get access to the users and permissions. So, for example, I launched a ChatGPT extension recently and it got a bunch of users right away ’cause I was pretty early out of the gate, and I had people contacting me looking to acquire the extension. I don’t know what they were going to do with it, I didn’t say yes to any of them, but I don’t think their intentions were good because I think it’s a pretty common pattern for someone to come swoop in, buy an extension, do bad things with it, and then kind of move on to the next one, because there’s this kind of asymmetric model where if you pay a couple thousand bucks to get an extension with 20,000 users and you have access to all their web browsing activity, that’s a problem.
Matt Frisbie 00:16:57 And that definitely goes on, I would hope not too often, but those people are out there. So yeah, pay attention to the warning messages, and stick to trustworthy extensions. So yeah, I mean stick to the bigger ones and maybe don’t be too adventurous.
Kanchan Shringi 00:17:12 So, later in the show I think we should spend a little bit of time, given this, on how the developer can create more trust as well. Let’s move into the architecture now for a little bit. So, it might be useful to start with just the very basics of how the browser renders the webpage and then the key elements of how extensions fit in there.
Matt Frisbie 00:19:36 So it’s kind of this constellation of different pieces, and different extensions will use whatever is the most appropriate. So, like Honey, for example, they’ll paint stuff into the page like a little widget for showing when it’s trying to inject coupon codes. Or LastPass: when you are using it, all the stuff needs to be protected. It can’t put your passwords in the page, so that lives inside the pop-up because only the extension has access to the pop-up. Or React developer tools: there’s really no user interface outside of the developer tools itself because that’s the place that kind of makes the most sense. Because when you’re building a website and using React, that’s where you’re spending a ton of time, inside the developer tools. So yeah, it’s just a bunch of these pieces that can be assembled in different ways and you kind of use what you need and don’t use what you don’t need.
Kanchan Shringi 00:20:31 So the popup is where the user has to take an explicit action to invoke the extension, is that correct?
Matt Frisbie 00:20:39 Right, so the popup is probably the interface that most people are familiar with. So right when you install an extension you’ll get, inside the extension bar (on desktop, at least), a little extra icon that is a clickable target. And so for most people who are not tech savvy or whatever, this is going to be the most comfortable experience for them because it’s a visible button: they can see it, they can click on it, they can right-click on it and do different things. But the interface of click the button, get the popup window, most people are very comfortable with doing that, and so, most extensions should at least have something there when you click that because people are definitely expecting it.
Kanchan Shringi 00:21:19 But what I understood is Honey used the content script, which automatically changed the page in some way that the user would recognize, but the user would not have necessarily to take any action to have that. Is that correct?
Kanchan Shringi 00:22:46 And then you talked about the service worker, which is a background script, so I assume there needs to be some kind of communication between the background script and content script.
Matt Frisbie 00:22:58 Right, so the communication medium for extensions is messaging. So, there are a couple different types of messaging, but it’s all basically the same concept as postMessage: it’s this asynchronous messaging and you can open a channel, or it can just be a one-off message. It’s bidirectional so you can send a response, multiple tabs can talk to the service worker at once, extensions can talk to each other. There’s also ways for extensions to talk to native software. It’s all done via message passing because the different pieces will have different exposure to the APIs. So for example, a content script can’t handle extension events, but it can send messages to the background, which can handle those events. So, the background will be able to exchange messages with the content script, exchange messages with the popup, exchange messages with the options page and the background, or sorry, excuse me, the developer tools.
Matt Frisbie 00:23:57 And so it’s kind of acting as the hub for the extension itself. And the service worker is also useful in this case because it’s guaranteed to be a singleton, so there’s only ever going to be one service worker for any given extension. And so that’s very useful because if you’re tying together a bunch of these different UI pieces that have all these different considerations, you can always kind of fall back on the fact that there’s only going to be one service worker handling messages in this way, and that kind of makes it easier to tie everything together.
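Editor’s note: the hub-and-spoke messaging pattern described above can be sketched without any browser APIs. The `BackgroundHub` class below is a hypothetical in-memory stand-in for the background service worker, not the extension API itself; in a real extension the equivalent roles are played by `chrome.runtime.sendMessage` on the sending side and `chrome.runtime.onMessage` in the background. The message type `"page-opened"` and the sender names are made up for illustration.

```typescript
type Message = { type: string; payload?: unknown };
type Handler = (msg: Message, sender: string) => unknown | Promise<unknown>;

// Hypothetical stand-in for the single background service worker.
class BackgroundHub {
  private handlers = new Map<string, Handler>();

  // Register the handler for one message type.
  on(type: string, handler: Handler): void {
    this.handlers.set(type, handler);
  }

  // One-off, asynchronous message with a response, as sent by a
  // content script, popup, or options page.
  async send(sender: string, msg: Message): Promise<unknown> {
    const handler = this.handlers.get(msg.type);
    if (!handler) {
      throw new Error(`No handler for message type "${msg.type}"`);
    }
    return handler(msg, sender);
  }
}

const hub = new BackgroundHub();

// Because there is only one hub, shared state can live in one place.
let openCount = 0;
hub.on("page-opened", (_msg, sender) => {
  openCount += 1;
  return { ack: true, from: sender, total: openCount };
});
```

A content script would then call something like `hub.send("content-script", { type: "page-opened" })` and await the reply; every other UI piece talks to the same hub in the same way.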
Kanchan Shringi 00:24:27 So we’ve talked about permissions in the context of a security mechanism as well. What are permissions, what are the different kinds of permissions, and how does the author request permissions?
Matt Frisbie 00:24:42 Permissions are tricky; I devoted a whole chapter to them. At some level they work pretty much how you’d think: if you want to do something that requires any elevated permission, you request the corresponding permission. So, for example, there’s an alarms API so you can have a piece of code run in the background every minute, for example. So that’s called an alarm. So, you’d request the alarms permission. Or if you want to talk to a certain domain. So let’s say I wanted to send requests to google.com, I can request a host permission and I would define a regex that gives me the ability to talk to google.com from the extension. And there’s, I don’t know, like a hundred different permissions that you can request, some of which will trigger a warning message and some of which will not.
Matt Frisbie 00:25:35 So, for example, the alarms API: it’s not really doing anything sensitive. When you submit to the Chrome Web Store, you will need to say, like, here’s what it’s for, and you put in like a sentence, but the user’s never going to see a popup because there’s no opportunity for abuse really. Whereas, if you’re requesting access to the tabs API, or you’re requesting the all URLs host permission, which gives you access to everything, they’re going to get a popup, either on update or when it’s installed, saying the extension is requesting all this stuff; is that okay? And so, a useful pattern, if you don’t want to scare the user, because some of the warning messages can be very scary, is optional permissions. So basically, when they initially install it, you can have the core subset of permissions that your extension requires to work and those will be applied automatically.
Matt Frisbie 00:26:32 And then if you want to have them grant additional permissions on a one-off basis, you can do that; it will still incur the warning window for permissions that are more sensitive, but they’ll be explicitly requesting them so it won’t be as scary as getting all these warning messages on install. One caveat with permissions, which is a pretty ugly aspect of extension development in my submission, is that if you add required permissions to an extension and then push that out in an update, everyone who has it installed will have to reapprove the extension, which, depending on how much they need the extension, can cause a substantial amount of attrition. So, your browser will disable it. Like in Chrome, if you have extensions installed, you’ve probably seen this before: there’s like a little yellow exclamation point in the settings menu and then you’ll have to explicitly reenable the extension that’s requested a sensitive permission. And a lot of developers will get bit by this when they’re not expecting it because it’s a really unpleasant user flow. So, if you’re trying to avoid things like that, optional permissions are your friend.
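Editor’s note: as a rough sketch of the required-versus-optional split, the snippet below models the permission-related manifest keys and flags permissions that an update newly requires. The `PermissionsSketch` type, the two example versions, and the `newlyRequired` helper are hypothetical; `permissions`, `optional_permissions`, and `host_permissions` are real Manifest V3 keys. Note that Chrome’s documentation describes host permission strings as match patterns (for example `https://www.google.com/*`) rather than regular expressions. Whether an update actually disables the extension depends on whether the added permissions produce a new warning, so this check only flags candidates.

```typescript
// Hypothetical type for the permission-related part of a manifest.
interface PermissionsSketch {
  permissions: string[];          // granted at install time
  optional_permissions: string[]; // requested later, at runtime
  host_permissions: string[];     // match patterns for hosts
}

// Version 1: a small required core, with "tabs" kept optional.
const v1: PermissionsSketch = {
  permissions: ["alarms", "storage"],
  optional_permissions: ["tabs"],
  host_permissions: ["https://www.google.com/*"],
};

// Version 2: "tabs" is moved into the required set.
const v2: PermissionsSketch = {
  permissions: ["alarms", "storage", "tabs"],
  optional_permissions: [],
  host_permissions: ["https://www.google.com/*"],
};

// List required permissions present in `next` but not in `prev`.
// These are the ones that may force existing users to re-approve.
function newlyRequired(prev: PermissionsSketch, next: PermissionsSketch): string[] {
  const before = new Set(prev.permissions.concat(prev.host_permissions));
  return next.permissions
    .concat(next.host_permissions)
    .filter((p) => !before.has(p));
}
```

Keeping `tabs` in `optional_permissions` and asking for it at the moment it is needed (in Chrome, via `chrome.permissions.request`) avoids changing the required set in an update.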
Kanchan Shringi 00:27:38 So permissions is a key mechanism, and there may be concerns in how some of these are displayed to the user, but are there other mechanisms in the browser to prevent any extension vulnerabilities?
Kanchan Shringi 00:29:24 Let’s spend some time on the extension-specific APIs. So you introduced these earlier on. Can you describe the scope of what can be done with access to these APIs?
Matt Frisbie 00:29:38 Sure. So, there’s common ones; storage is a really common one. So, you can request different types of storage that are separated from the webpage itself. So, it’s an asynchronous storage; you can request different amounts of space. So, there’s like an unlimited storage permission and you can store as much stuff as you want, which is useful for extensions where you’re recording video or stuff like that. Yeah, so there’s, I mean, APIs for authentication. So, one kind of tricky corner of extensions is how do you authenticate someone? And so, there’s this whole set of questions with OAuth, especially: how do you deal with authenticating a person with the OAuth protocol, which is particularly difficult because you need these callback URLs. And so, browser extensions have a native way of dealing with these things, but it’s kind of tricky to do because OAuth is kind of built around being used in a website format, and so yeah, there’s a whole API to deal with OAuth in an extension.
Matt Frisbie 00:30:36 Yeah, I mean I talked about the messaging. There’s a ton of APIs to deal with the browser chrome itself. So, there’s an omnibox API which allows you to show autocomplete search results from the browser bar; there’s a context menu API, so when you right-click, you can add an entire right-click menu that’s sensitive to what you’re clicking on the page. There’s a ton of APIs dealing with network requests themselves. So you can sniff what’s being loaded on the page, what is the browsing history, things like that. There’s a bookmarks API so you can manage the person’s bookmarks, and the tabs API I talked about. So, you have total control over what tabs are being opened, closed, pinned, muted, whatever it is. Yeah, the list goes on and on, but it’s pretty extensive what they can do, and it’s not really that much of an exaggeration to say that pretty much anything that you can do in the browser, an extension can do for you to a certain extent.
Kanchan Shringi 00:31:38 So can we talk about some of the key differences across browsers, that you know of, in support of these APIs?
Matt Frisbie 00:31:48 Yeah, so browsers have mostly coalesced around the WebExtensions APIs. So that core set of APIs is pretty well supported. Where there’s some fragmentation: so for example, Mozilla is planning on continuing its support for the blocking web request API even though like next week they’re about to roll out support for Manifest V3. So that fragmentation is interesting, we’ll see where it goes because now Firefox is becoming the only major platform that will be supporting the blocking web request API that all ad blockers need. Firefox also has a whole bunch of extras like themes and things like that. They have their own bag of APIs that are unique to that platform. There’s some idiosyncrasies with how the different APIs behave on platforms, but there’s not a ton. So, I mean if you’re within the core WebExtensions API, every major browser vendor has pretty nicely come in and supported it. So, development is pretty nice in that respect.
Kanchan Shringi 00:32:54 That’s good to know since that certainly reduces the amount of work you would have to do to have your extension work across browsers. Moving on to a little bit of detail now on popup pages, content scripts, and background scripts. So, starting with the content scripts and maybe popup pages, how much control does the developer have on the styling? And especially I think that’s relevant in the case of content script because you are updating the existing webpage. So, what should you keep in mind as you start to style the user interface of the extension, and any caveats there?
Matt Frisbie 00:35:22 So good job Plasmo guys on that one. Yeah, but then as I mentioned before, how the content script folds into the page is really up to you. So, there are some contexts where you’ll want to integrate more tightly. So a lot of Gmail extensions, the developers kind of know ahead of time what the DOM looks like inside Gmail and so they can go, oh, I can look for this certain pattern of pieces of the DOM and then I can inject my own button in there and then it’ll look like a piece of the native DOM, and so then I can style it however I want and I can have it trigger whatever interfaces I want, but it’s going to be stuck inside the page so it’ll look like a piece of Gmail. Other ones will pop over the page.
Matt Frisbie 00:36:02 So for example, if you’re a LastPass user, you’ll notice that LastPass will stick a little LastPass icon over the end of an input element. And if you click that, it’ll pop a widget over the page. That’s a pretty common pattern because it’s pretty cheap and easy to locate a small box widget over the page. And then there’s also extensions that will have a floating button in the bottom right of the page that will trigger something more substantial, like a popover or a modal window. That’s a pretty common content script pattern. Yeah, so content scripts have to be judiciously applied because you can totally mangle the host page or the host page will mangle your content script. So, you have to be very careful with how it’s being applied on the page, but at the same time it’s the most natural way to extend and interact with the host page and it really allows for the most powerful stuff.
Kanchan Shringi 00:36:55 So you mentioned that the mangling can happen in both directions, and one of the key user experience elements was having a floating element on top, which I assume will be helped by the CSS property z-index. So, are there any best practices around using that property?
Matt Frisbie 00:37:16 So it really depends on what the host page looks like. So, there are certain pages where it makes sense to just use the z-index to kind of force your widget on top of everything. But then you have to be sensitive about, is the host page also going to use z-index to push stuff on top, and am I going to interfere with that? So, z-index, that’s one of the few CSS properties that’s going to cause problems, right? Because you’re setting the z-index presumably on the shadow DOM host element itself. So, you could be pulling your extension on top of everything and then the user’s going to be like, what’s going on? I can’t see anything. Or vice versa, the host page is going to be covering up your stuff and make your extension unusable.
Matt Frisbie 00:38:01 So in any event, it really requires, if you’re going the content script route, you have to be very sensitive towards what is actually going on the page because that’s really going to drive how you’re integrating with the host platform. So, like for example, a server-side rendered website will be much easier to integrate with because you know that if it’s just sending you back a blob of HTML, you can modify that without fear of like a single page application blowing it up — or I mean at least it’s less likely. Whereas if you’re integrating with this like really involved React app, it could be rerendering all the time and the URLs might look the same and you have to be aware of like is the host page going to wipe out my content script entirely? So, all these things are kind of under the same umbrella of really having a good understanding of what your host page is doing and can do that will make it play nicely with whatever your extension’s trying to do.
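Editor’s note: the floating-widget pattern discussed above, a shadow DOM host element with a z-index set on the host itself, might look roughly like the sketch below. To keep it runnable outside a browser, it is written against small hypothetical interfaces (`DocLike`, `HostLike`, `ShadowRootLike`) that mirror only the DOM methods it uses (`createElement`, `attachShadow`, `appendChild`); a real content script would call the same methods on the page’s `document`. The element name, the positioning values, and the choice of 2147483647 (the largest z-index value browsers commonly accept) are assumptions for illustration.

```typescript
// Hypothetical minimal interfaces mirroring the DOM calls used below.
interface ShadowRootLike {
  innerHTML: string;
}
interface HostLike {
  style: { cssText: string };
  attachShadow(init: { mode: "open" | "closed" }): ShadowRootLike;
}
interface DocLike {
  createElement(tag: string): HostLike;
  body: { appendChild(node: HostLike): unknown };
}

// Assumed ceiling for z-index; the host page may use the same value.
const MAX_Z_INDEX = 2147483647;

// Mount widget markup inside a shadow root so page styles do not leak in.
// Only the host element's position and z-index interact with the page.
function mountWidget(doc: DocLike, widgetHtml: string, zIndex: number = MAX_Z_INDEX): HostLike {
  const host = doc.createElement("div");
  host.style.cssText =
    `all: initial; position: fixed; bottom: 16px; right: 16px; z-index: ${zIndex};`;
  const root = host.attachShadow({ mode: "open" });
  root.innerHTML = widgetHtml;
  doc.body.appendChild(host);
  return host;
}
```

Passing a lower `zIndex` is the knob to reach for when the host page has its own overlays that the widget should not cover.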
Kanchan Shringi 00:39:01 Besides the mechanism you mentioned using shadow DOM, can iFrames be used as well to isolate styling?
Matt Frisbie 00:39:07 They can. I don’t recommend it. It’s all the benefits of using an iFrame. So, the problem with iFrames is if you’re putting it on a host page, you’re subject to the same origin restrictions as the host page. So, if you’re trying to load an iFrame from whatever the host website is of the extension, if you’re sticking an iFrame in the page, the host page may zap that request and say, yeah, I’m not letting you embed that domain. And so, at that point the sandboxing abilities of an iFrame, they’re harder to work with and they’re a little bit more finicky than a shadow DOM, and shadow DOM is supported pretty much everywhere. So, I would say it’s pretty rare that the use case of an iFrame exceeds the utility and flexibility of just a vanilla shadow DOM host.
Kanchan Shringi 00:39:54 And you certainly don’t have a lot of these complexities if you use a popup page, but it’s not as well integrated with the page that you’re accessing. So, there’s some trade-offs there.
Matt Frisbie 00:40:04 Right, right. Yeah.
Kanchan Shringi 00:42:05 Thanks for that, Matt. I wanted to cover some other key topics that we’ve mentioned in our talk but not necessarily in sufficient details. So, let’s start with the dev tools pages. I think you mentioned that in the context of helping developers implementing React. So, can you just give a little bit of what the use case would be? What is an example of an extension here in this area?
Matt Frisbie 00:42:35 Sure. Well, the two that I use frequently are React developer tools and then there’s a Redux extension as well. And so, the premise is relatively simple. Basically, an extension is able to create custom panels inside the browser’s developer tools, and it’s pretty similar to an options page or a popup page in that you can render it however you like, but it also gets access to a few special browser APIs. So, it can really tightly understand what the DOM structure is like. You can inject scripts into the page to locate elements, or simple little script injections. You can sniff web traffic in a really rich way. And so, typically the way these are used is that the React developer tools extension can tell if the webpage is running a React app, and then there’s different ways to kind of make them play with each other.
Matt Frisbie 00:43:32 So you can have it be like, oh yeah, your React app is rendering this way and these components are working this way and props and so forth are filtering down. So, it’s especially useful for debugging and understanding what’s going on with single page applications because they can intimately integrate with what’s going on on the page, understand it, and then show it in developer tools in ways that you can interact with and understand what’s going on behind the scenes. And so, I would say most all major single page applications have some sort of developer tools companion extension like this since it does prove to be so useful in so many cases.
Kanchan Shringi 00:44:13 Okay, makes sense. So, we did chat about manifest v2 versus v3 to some extent. You said the key difference was it might make it harder, it likely will make it harder to implement ad blockers. What else is different, and what else is the challenge of migrating from v2 to v3?
Matt Frisbie 00:45:43 And so now that it’s a service worker, there’s no DOM anymore, right? It’s the service worker global object. And so, there’s things like jsdom that let you kind of emulate a DOM, or there’s the OffscreenCanvas API, which lets you recapture some of this behavior. But taking away the DOM has been a big headache for a lot of extensions. So, some of the workarounds are keeping a designated tab open all the time, like just have a supplementary HTML tab open all the time and then use that DOM. That’s not a solution though because the user now has this extra tab floating around all the time just so you can have access to the DOM. It’s a bad solution. So, it’s not clear what’s going to happen with those extensions. They also might be in trouble. And then one big one is the life cycle of the service worker itself.
Matt Frisbie 00:46:29 So there are some extensions that, let’s say, want to open a long-lived WebSocket or something that needs to run for an extended period of time. Chrome will aggressively shut down a service worker because it’s a service worker, and it’s designed to be this quickly destroyable, restartable thing. And so, any extension that needs a background script to run persistently no longer has the ability to do so, because Chrome will shut it down after, I think, five minutes is the typical timeout. So, there are hacks that can prolong the lifecycle of the service worker, but for any extensions that need a long-running background script, there’s no path forward for those either. And all these problems I’ve mentioned, the Chrome team has indicated they want to address them and has said that for a long time, but it seems to be progressing slowly. And at the same time, I’ll put it this way: it seems like the Chrome team cares, because they had initially set a rollout date of this month, actually. January 2023 was the hard stop date when, I think, all Manifest V2 extensions in the Chrome web store would go dark. They’ve already cut off V2 submissions, but the existing published V2 extensions could still be updated and would still be public.
Matt Frisbie 00:47:49 They have pushed that back to, I think, the middle of 2023, so June or something, because they know that they’re not ready and all these extensions that are extremely popular are going to be killed if they turn them off. So, it seems the Chrome team cares about preserving these extensions that are kind of getting crushed by Manifest V3, but I don’t really know what’s going to happen because it doesn’t seem like there’s much of a plan, and I don’t know; they haven’t rammed it through yet, but it would not surprise me if it got to that point and they just said, well, you’ll have to figure it out. So yeah, in summation, Manifest V3 is pretty controversial, and it remains to be seen what’s going to happen.
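The lifecycle problem described above can be made concrete with a small sketch. It does not keep a service worker alive and is not one of the hacks mentioned; instead it shows one common way to tolerate being shut down, by checkpointing progress to storage so a restarted worker resumes where it left off. The `AsyncStore` interface and the `nextIndex` key are hypothetical names for illustration; in an extension the store could plausibly be backed by `chrome.storage.local`, which is an assumption rather than something discussed in the conversation.

```typescript
// Hypothetical async key-value store. An extension might back this with
// chrome.storage.local; an in-memory object works for experimentation.
interface AsyncStore {
  get(key: string): Promise<number | undefined>;
  set(key: string, value: number): Promise<void>;
}

// Hypothetical storage key holding the index of the next unprocessed item.
const CHECKPOINT_KEY = "nextIndex";

// Processes items starting from the last saved position and saves progress
// after each item. If the worker is terminated mid-run, the next invocation
// skips everything that was already checkpointed.
// Returns the number of items handled during this invocation.
async function processWithCheckpoint(
  store: AsyncStore,
  items: string[],
  handle: (item: string) => void
): Promise<number> {
  const start = (await store.get(CHECKPOINT_KEY)) ?? 0;
  const pending = items.slice(start);
  let next = start;
  for (const item of pending) {
    handle(item);
    next += 1;
    await store.set(CHECKPOINT_KEY, next);
  }
  return pending.length;
}
```

This only helps work that can be split into resumable steps. A connection that must stay open continuously, like the long-lived WebSocket example, is exactly the case this approach does not solve.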
Kanchan Shringi 00:48:25 But if you are writing a new extension you should start with V3.
Matt Frisbie 00:48:29 At this point, yes, unless you’re really … People are still writing V2 extensions because they’re targeting Firefox or they really want to hang on to the old APIs. But, if you want to use the Chrome web store, if you want to have the most users, V2 is dead. Time to go for V3.
Kanchan Shringi 00:48:46 So I did read about a Safari web extension converter. Do you have any experience using that?
Matt Frisbie 00:48:52 Yeah, I made sure to cover this in the book. So, this was a really interesting development. I think it was, let’s say, two years ago that Apple rolled out extension support for Safari. And so, traditionally extensions have not been a mobile platform. So, Google obviously has never rolled out extension support for mobile Chrome. I think everyone kind of understands why they didn’t do that: they don’t want to lose the ad revenue, because it would be billions of dollars out the window if they did that. So, I think they just said, yeah, we’re not doing extensions for mobile. Too bad. So, there are ways to get extensions on mobile. Firefox is one way, and there’s the Kiwi browser for Android, so you can do it on an Android phone, but there was really no first-class support from the primary browser vendors until Safari rolled out extension support.
Matt Frisbie 00:50:33 And so, the wrapper itself is a mobile app for Safari, and you can talk to it with the native messaging API. So, there’s this entirely extra piece of software that’s running on the phone; it’s a Safari app. I’m not a Safari developer, but you can see it on the device and you can write code for it to behave as an app. So, it’s this whole extra piece that lets you kind of talk to the phone itself. So, the Safari aspect of it is interesting for that reason, for one, because it’s kind of this new domain, but it’s also the first major foray into extensions for phones, because the iPhone is by far the most popular phone, and Safari is a huge chunk of users. Being able to run an extension, admittedly inside a walled garden, is still pretty exciting.
Matt Frisbie 00:51:23 And it’s really the … Hats off to Apple for doing that because it seems like the Google team is just never going to do it for mobile. And mobile computing is more than half of web traffic these days, and it’s kind of silly that we’re not able to use extensions on mobile devices, so good job Apple for supporting that. And so, it’s still kind of a clunky interface. So, developing for it is kind of difficult and not nearly as easy as publishing to the Chrome web store, but it is certainly a promising future in the context of browser extensions.
Kanchan Shringi 00:51:55 So not as easy as a Chrome web store. So, what is the approval process? Do you have a separate one for each browser?
Matt Frisbie 00:52:02 Yeah, so each browser has its own store. So, the Chrome web store is for Chrome, although a caveat on that is that other Chromium browsers can install from the Chrome web store. But Edge has its own extension store; Opera has its own store; Mozilla has its own store. So, if you want to appear in these stores, you have to submit your bundled extension to each of them, and there’s a separate approval process for each. So, I’ll say that it’s a pain, for sure. The Plasmo guys who wrote the foreword, they have a pipeline that allows you to automatically deploy to all the stores, which is really awesome. And I suggest it to anyone who has to deploy to all the stores; it’s quite something, and they’ve put a lot of work into making it great. But you can also do it just, like, one-off.
Matt Frisbie 00:52:48 So if you just want to publish in the Chrome web store, there’s an API you can do it through, or you can just upload a zip file. The zip file is pretty low overhead. And then, I should say, it depends on what your extension has asked for. So, if it’s a low-permission extension that doesn’t ask for anything sensitive, I will typically see my extension go live in under 30 minutes. So, it seems like that’s an automated approval pipeline, and Google has some automated process that goes, okay, yeah, that’s probably not stealing anything, so we can just publish that. Go right ahead. But the book has a companion extension called ‘example Chrome extension’ that, because it has to demo all of the APIs, requests basically every API imaginable. So, whenever I submit updates to that extension, it takes days, because obviously someone has to sit down and look at it and be like, okay, why are they asking for all the permissions in the world? And then it gets published in the same way, but that takes much, much, much longer. So, I think the approval process is pretty straightforward. I think it’s just that developers need to understand that if you’re requesting certain permissions, it takes like 50 times longer for the approval to go through, which can be pretty annoying.
Kanchan Shringi 00:54:03 Let’s spend a little bit of time, maybe a couple of minutes, on testing and monitoring. Is there anything unique to testing extensions, or is it similar to how you would test any other web application?
Matt Frisbie 00:54:17 Yeah, so testing is tricky. So, a lot of it’s pretty manual. So, one interesting thing with Manifest V3 is that, for modern web developers, pretty much every build tool offers a hot module reload feature, right? So, it can quickly swap out pieces of the application that were updated; there’s no reload required and it can show it to you immediately, which is amazing when you’re writing code and don’t have to refresh the page every time. The problem is that this bumps up against what Manifest V3 allows for, depending on which piece you’re working on. So, if you’re writing a content script, for example, and let’s say you’ve written a React widget to be injected into the page, the hot module reload can’t reload just that thing. So, it’s not compatible there, and you have to do a page reload. And at the same time, you also have to be careful about when you need an extension reload.
Matt Frisbie 00:55:13 So, like when you’re going to kick out the service worker and replace it with the updated one. And so, there’s a whole bunch of pain points when you’re doing this. So for example, one thing that still bites me in the butt to this day is that if you are inspecting the service worker in development and you leave the inspector window open, it will keep the service worker alive even after you reload it. And so, all these really weird bugs come out of the window, and you’re like, what is going on? And it’s just the browser going, okay, I’ve got to keep this alive because you still have the inspector window open. And so, yeah, testing extensions is kind of hard because it’s living in this weird browser space, and then all of the traditional build tools are kind of geared towards web development.
Matt Frisbie 00:55:56 So, some things translate, but yeah, it’s still a manual process and there are still all these kinks that may or may not be worked out. As for monitoring extensions, I would actually say it’s a superior experience to monitoring web apps. And the reason is tied to the same thing that makes extensions so popular: ad blockers. On a webpage, an ad blocker eats like half of your analytics traffic. So, for example, if you stick Google Analytics on a page, the number of people using your web app, I usually pad it by like 30 or 40% because those requests are getting killed by ad blockers. You will never know that those people are viewing your webpage. However, if you are installing analytics inside an extension, other ad blockers can’t block network requests from your extension. So, if you’re sending analytics from the service worker, you’re going to get perfect fidelity for your analytics, which is great. Like, being able to see 100% of the user activity and not have to worry about ad blockers eating all your stuff, that’s great. So, monitoring I would say is actually nicer than on webpages because you’re not losing all that analytics data to ad blockers eating your lunch.
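To make the idea of sending analytics from the background script more tangible, here is a minimal sketch under stated assumptions. The endpoint URL, the event shape, and the function names are all hypothetical, and no particular analytics vendor’s protocol is implied. The sender takes a `fetch`-like function as a parameter so the example runs anywhere.

```typescript
// Hypothetical event shape; a real analytics backend defines its own schema.
interface AnalyticsEvent {
  name: string;
  timestamp: number;
  properties: Record<string, string>;
}

// The subset of a fetch-style function that this sketch relies on.
type PostFn = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<unknown>;

// Hypothetical collection endpoint, used only for illustration.
const ANALYTICS_URL = "https://analytics.example.com/collect";

// Builds an event record. The timestamp is passed in rather than read from
// the clock so the function stays deterministic.
function buildEvent(
  name: string,
  properties: Record<string, string>,
  now: number
): AnalyticsEvent {
  return { name, timestamp: now, properties };
}

// Serializes the event as JSON and POSTs it with the supplied function.
function sendEvent(post: PostFn, event: AnalyticsEvent): Promise<unknown> {
  return post(ANALYTICS_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(event),
  });
}
```

In an extension service worker, the global `fetch` could be passed as the `post` argument, and the endpoint’s origin would typically need to be permitted by the extension’s manifest.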
Kanchan Shringi 00:57:09 Trying to wrap up now. So, you did talk about the Plasmo platform. I think this was built by Stefan Aleksic and Louis Vilgo (I hope I’m pronouncing the names correctly), who wrote the foreword to your book, and you talked about a use case where the platform helps with the approval process across browsers. How else do these platforms help?
Matt Frisbie 00:57:34 Yeah, the Plasmo guys, they’ve gone a really interesting direction. So, they’ve kind of built this declarative model. So, when you’re building an extension, instead of kind of explicitly laying everything out inside a manifest file, a lot of the boilerplate stuff gets generated for you. So, they’ve figured out a good way of injecting one or multiple content scripts. They’ve figured out a good way of managing permissions and messaging and things like that. And so, they’ve built it into this kind of opinionated platform, but you get all these benefits once you use the platform. So, they have a really, really nice command line interface. As I mentioned, they’ve got the store deployment pipeline. That’s great. Yeah, they’re actively working on it. And they’re great guys. It’s a lot of fun to talk to them. And so, I think they’re really onto something. So, there aren’t a ton of platforms, and a lot of people home roll stuff, but for anyone who’s looking to find a platform to easily get started, look no further. Plasmo, those guys are killing it right now. And they’re working on some really important stuff, and they’re really advancing sophisticated extension development tools, and boy, Lord knows we need those because this space is still kind of the wild west.
Kanchan Shringi 00:58:51 It certainly is. I couldn’t find a lot of material besides the official documentation. Then I chanced on your book.
Matt Frisbie 00:58:57 And the book.
Kanchan Shringi 00:58:58 So, we’ll definitely have a link to the book in our show notes. How else can people contact you?
Matt Frisbie 00:59:06 Yeah, so Twitter’s a good way. My Twitter handle is @MattFrizz. Yeah, so buildingbrowserextensions.com is the website for the book. You’ll be able to buy the book there. There’s contact info for me. Yeah, you can find me on LinkedIn. There’s a number of ways to reach me or just Google my name. My personal site is mattfrizz.com. You can find my contact information there. So yeah, pretty reachable.
Kanchan Shringi 00:59:29 So, this has certainly been a very interesting conversation, Matt. We’ve covered a lot of topics. Is there anything else you think we should talk about today with respect to browser extensions?
Matt Frisbie 00:59:37 So, there is one thing, and that is kind of where the future is for browser extensions. So, I’m sure you’re familiar with ChatGPT, which was released in December and has taken the world by storm. And I think that there’s a really interesting pairing between AI tools, especially LLMs (large language models) like ChatGPT, and things like extensions. So, I wrote a blog post about this, but there has been an explosion of browser extensions that use tools like ChatGPT and other OpenAI APIs to do cool things inside the browser. And so, a lot of the ones that have come up can write emails for you and can, like, summarize articles, and there are hundreds of extensions now that utilize these language models in some interesting way. And the pairing really opens up some interesting possibilities because, like, if you think about what a webpage actually is, right?
Matt Frisbie 01:00:31 It’s this hypertext; it’s a pretty consistent length, at most a few thousand words usually, plus images and things like that. And the inputs to these large language models can handle that amount of text. And so, there’s this new space where these AI tools can really richly understand what you’re looking at and can unpack all these things. So, they can summarize what you’re looking at, or they can have this conversational understanding of what you’re browsing. And so, there are certainly privacy implications, like, do I want to be feeding this closed-source AI model what I’m looking at? But what I really see browser extensions as is almost a glimpse into the future of augmented reality, because they’re adding these contextually useful interfaces where you need them.
Matt Frisbie 01:01:25 So, because it can so richly understand what the page is showing, it’s able to go, oh, you could really use a little widget here that does XYZ, or we should really format the page in this interesting way because this’ll help with XYZ. So, because an extension can richly modify and understand the page, and because LLMs are able to quickly ingest the contents of the page and do useful things with them, there’s this really interesting future where browser extensions are kind of this assistant that is modulating the way that we use the web. And, obviously so much of what we do is now inside a web browser or some computing device in some form. And so, having this layer over what we’re looking at that is a smart layer and is able to modify and suggest things really opens up some interesting possibilities. So, I’m really excited about the future of this pairing between browser extensions and things like ChatGPT. Maybe it won’t necessarily be in the form of a browser extension, because they’re mostly limited to desktop browsers and that’s really only half of web browsing. But the ability to have this controlled layer that’s like this assistant and can understand what you’re looking at and what you’re doing, it gives a small glimpse into the future of computing, and it really excites me in a profound way.
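The observation that a page is usually only a few thousand words, and so fits within a language model’s input, can be sketched as a small prompt-building step. This is an illustrative sketch only: the function name, the prompt wording, and the word budget are hypothetical, and real models count tokens rather than words, so a word cap is just a rough stand-in. In an extension, the page text might come from a content script reading the document’s visible text.

```typescript
// Clips page text to a word budget and wraps it in a summarization request.
// The word budget approximates a model's input limit; it is not a token count.
function buildSummaryPrompt(pageText: string, maxWords: number): string {
  const words = pageText.split(/\s+/).filter((word) => word.length > 0);
  const clipped = words.slice(0, maxWords).join(" ");
  return "Summarize the following web page in three sentences:\n\n" + clipped;
}
```

The resulting string would then be sent to whichever model API the extension uses; that call is left out here because it depends on the provider. The privacy question raised above applies at exactly this step, since the clipped text is what leaves the browser.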
Kanchan Shringi 01:02:47 I did notice your article on LinkedIn. We’ll definitely have these links in the show notes. This has been a very interesting conversation. Matt, thanks for coming onto the show, and I’m happy we could have this conversation.
Matt Frisbie 01:02:58 It’s an unusual space and I was happy to be on to talk about it.
Kanchan Shringi 01:03:01 Thanks all for listening.
[End of Audio]