François Daoust, W3C staff member and co-chair of the Web Developer Experience Community Group, discusses the origins of the W3C, the browser standardization process, and how it relates to other organizations like TC39, WHATWG, and IETF. This episode covers a lot of ground, including funding through memberships, royalty-free patent access for implementations, why implementations are built in parallel with the specifications, why requestVideoFrameCallback doesn’t have a formal specification, balancing functionality with privacy, working group participants, and how certain organizations have more power. François explains why the W3C hasn’t specified a video or audio codec, and discusses Media Source Extensions, Encrypted Media Extensions and Digital Rights Management (DRM), closed source content decryption modules such as Widevine and PlayReady, which ship with browsers, and informing developers about which features are available in browsers.
Brought to you by IEEE Computer Society and IEEE Software magazine.
Show Notes
Related links
- W3C
- TC39
- Internet Engineering Task Force
- Web Hypertext Application Technology Working Group (WHATWG)
- Horizontal Groups
- Alliance for Open Media
- What is MPEG-DASH? | HLS vs. DASH
- Information about W3C and Encrypted Media Extensions (EME)
- Widevine
- PlayReady
- Media Source API
- Encrypted Media Extensions API
- requestVideoFrameCallback()
- Business Benefits of the W3C Patent Policy
- web.dev Baseline
- Portable Network Graphics Specification
- Internet Explorer 6
- CSS Vendor Prefix
- WebRTC
Transcript
Transcript brought to you by IEEE Software magazine.
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.
Jeremy Jung 00:00:19 Hey, this is Jeremy Jung for Software Engineering Radio, and today I’m talking to François Daoust. He’s a staff member at the W3C, and we’re going to talk about the W3C and the recommendation process, and discuss François’s experience with how these features end up in our browsers. So, François, welcome to Software Engineering Radio.
François Daoust 00:00:42 Thank you Jeremy, and many thanks for the invitation. I’m really thrilled to be a part of this podcast.
Jeremy Jung 00:00:48 I think many of our listeners will have heard about the W3C, but they may not actually know what it is. So could you start by explaining what it is?
François Daoust 00:00:59 Sure. So W3C stands for the World Wide Web Consortium. It’s a standardization organization. I guess that’s how people should think about W3C. It was created in 1994 by Tim Berners-Lee, who is the inventor of the web. Tim Berners-Lee was the director of W3C for a long, long time. He retired not long ago, a few years back. And W3C has a number of properties, let’s say. First, the goal is to produce royalty-free standards, and that’s very important. We want to make sure that the standards that get produced can be used and implemented without having to pay fees to anyone. We do web standards; as the name implies, the standards that you find in your web browsers. But not only that: there are a number of other standards that get produced at W3C, including, for example, XML and data-related standards.
François Daoust 00:01:59 W3C as an organization is a consortium; the C stands for Consortium. Legally speaking, it’s a 501(c)(3), meaning it’s a US-based legal entity, a not-for-profit. And the little (3) is important because it means it’s public interest. We’re a consortium, which means we have members, but at the same time the goal, the mission, is to serve the public. So we’re not only doing what our members want; we’re also making sure that what our members want is aligned with what end users actually need. And W3C has a small team, and I’m part of this team: worldwide, 45-55 people, depending on how you count, mostly technical people and some admin as well, overseeing the work that we do at W3C.
Jeremy Jung 00:02:51 And so you mentioned there’s 45-55 people. How is this funded? Is this from governments or commercial companies?
François Daoust 00:03:00 The main source comes from membership fees. W3C has roughly 350 members. And in order to become a member, an organization needs to pay an annual membership fee. That’s pretty common among standardization organizations. And we only have, I guess, three levels of membership fees. Well, you may find additional small levels, but three main ones. The goal is to make sure that a big player or large company will not have more rights than anyone else; we try to make sure that all members have equal rights. It’s not perfect, but that’s how things are set. So that’s the main source of income for the W3C. And then we try to diversify just a little bit. For example, we may go to governments in the EU; we may take some grants for EU research projects that allow us to study and explore topics. In the US, there used to be some funding coming from the government as well. That’s also a source. But the main one is membership.
Jeremy Jung 00:04:08 And you mentioned that a lot of the W3C’s work is related to web standards. There are other groups like TC39, which works on the JavaScript spec, and the IETF, which I believe worked with your group on WebRTC. I wonder if you could explain W3C’s connection to other groups like that.
François Daoust 00:04:32 Sure. We try to collaborate with a number of other standardization organizations. In general, everything goes well because you have a clear separation of concerns. So you mentioned TC39. Indeed, they are the ones who standardize JavaScript. The proper name of JavaScript is ECMAScript, and TC39 is the technical committee at Ecma. And so we have interaction with them, because their work directly impacts the JavaScript that you’re going to run in your web browser, and we develop a number of JavaScript APIs at W3C. So we need to make sure that the way we develop these APIs aligns with the language itself. With IETF, the boundary is clear as well: network protocols for the IETF, the application level for W3C. That’s usually how the distinction is made.
François Daoust 00:05:25 The boundaries are always a bit fuzzy, but that’s how things work, and usually things work pretty well. There’s also the WHATWG, and there the history was more complicated, because the WHATWG was created out of a fork of the HTML specification, at the time when it was developed by W3C, a long time ago. There had been some disagreement on the way things should have been done. The WHATWG got created, took the HTML spec, and went in another direction, and that other direction actually ended up being the direction. So that’s a success from there. W3C no longer owns the HTML spec, and the WHATWG has taken up a number of different core specifications for the web, doing a lot of work on interoperability and making sure that the algorithms specified by the spec were correct, which was something that, historically, we hadn’t been very good at at W3C. And the way they’ve been working has a lot of influence on the way we now develop APIs from a W3C perspective.
Jeremy Jung 00:06:37 So, just to make sure I understand correctly: you have TC39, which is focused on the JavaScript or ECMAScript language itself, and you have APIs that are going to use JavaScript and interact with JavaScript, so you need to coordinate there. The WHATWG has the specification for HTML. Then the IETF, they are, I’m not sure if the right term would be, one level lower perhaps than the W3C?
François Daoust 00:07:07 That’s how you can formulate it, yes. They’re one layer down in the ISO network stack, at the network level.
Jeremy Jung 00:07:17 And so in that case, one place I’ve heard it mentioned is WebRTC: to use it, there is an IETF specification, and then perhaps there’s a W3C recommendation.
François Daoust 00:07:30 Yes. When we created the WebRTC working group, that was in 2011, I think, it was created with a dual head. There was an RTCWEB group that got created at IETF and a WebRTC group that got created at W3C. And that was done on purpose. The goal was not to compete on the solution, but actually to have the two sides of the solution be developed in parallel: the application front and the network front. And there’s still a lot of overlap in participation between both groups, and that’s what keeps things successful in the end. It’s not a process thing or organization-to-organization coordination; it’s really the fact that you have participants that are essentially the same on both sides of the equation. That helps move things forward. Now, WebRTC is more complex than just one group at IETF. WebRTC is a very complex set of technologies, a stack of technologies. So when you pull a little protocol from IETF, suddenly you have the whole IETF that comes with it. You have the feeling that WebRTC needs all of the internet protocols that got created in order to work.
Jeremy Jung 00:08:45 And I think probably a lot of web developers, they may hear words like specification or standard, but I believe the official term, at least at the W3C, is this recommendation. And so I wonder if you can explain what that means.
François Daoust 00:09:03 It means standard, in the end. It comes from a time when it was felt that standard was not the right term, because W3C was created not to be a formal standardization organization. The IETF has the same thing: they call it RFC, Request for Comments, a name which says nothing about standards, and yet it’s a standard. So W3C was created with the same kind of thing; we needed some other terminology, and we call it a recommendation. But in the end, that’s a standard. That’s really how you should see it. And one thing I didn’t mention when I introduced W3C is that there are two types of standards in the end, two main categories: de jure standards and de facto standards, two families. The de jure standards are the ones that are imposed by some kind of regulation.
François Daoust 00:09:58 So it’s usually a standard imposed by governments, for example. When you look at your electric plug at home, there’s some regulation that says this plug needs to have these properties. That’s a standard that gets imposed; it’s a de jure standard. And then there are de facto standards, which are really specifications that are out there, and people agree to use and implement them. And by virtue of being implemented and used by everyone, they become standards. W3C is really in the second camp: it produces de facto standards. IETF is the same thing. Some of our standards are referenced in regulations now, but just a minority of them; most are de facto standards. And that’s important because, in the end, it doesn’t matter so much what the specification says, even though that’s a bit confusing.
François Daoust 00:10:51 What matters is that what the web specification says matches what implementations actually implement, and that these implementations are used, and used interoperably: across implementations, across users, across usages. So standardization is a lengthy process, and the Recommendation is the final stage in that lengthy process. More and more, we don’t really reach Recommendation anymore, if you look at groups, because we have another path: we can stop at Candidate Recommendation, which is theoretically a step before that, but then you can stay there forever and publish new Candidate Recommendations later on. What matters, again, is that you get this virtuous feedback loop with implementers and usage.
Jeremy Jung 00:11:45 So if the Candidate Recommendation ends up being implemented by all the browsers, what ends up being the distinction between a Candidate Recommendation and a normal Recommendation?
François Daoust 00:11:56 Today it’s mostly a process thing. Some groups actually decide to go to Rec; some groups decide to stay at Candidate Rec, and there’s no formal difference between the two. We’ve adjusted the process so that the important bits that applied at the Recommendation level now apply at the Candidate Recommendation level. And by important bits, I mean the patent commitments: the patent policy fully applies at the Candidate Recommendation level, so that you get your protection, the royalty-free protection that we’re aiming at. Some people do not care, you know, but most of the world still works with patents, for good or bad reasons; that’s how things work. So we’re trying to make sure that we secure the right set of patent commitments from the right set of stakeholders.
Jeremy Jung 00:12:51 Oh, so when someone implements a W3C recommendation or a candidate recommendation, the patent holders related to that recommendation, they basically agree to allow royalty free use of that patent?
François Daoust 00:13:10 They do, the ones that were involved in the working group, of course. We can’t say anything about companies out there that may have patents and are not part of this standardization process; there’s always a remaining risk. But part of the goal when we create a working group is to make sure that people understand the scope. Lawyers look into it, and the legal teams that exist at all the large companies basically give a green light, saying: yes, we are pretty confident that we know where the patents are in this particular area, and we’re fine letting go of the patents we own ourselves.
Jeremy Jung 00:13:52 And I think you had mentioned what ends up being the most important is that the browser creators implement these recommendations. So it sounds like maybe the distinction between candidate recommendation and recommendation almost doesn’t matter as long as you get the end result you want.
François Daoust 00:14:14 People will have different opinions in standardization circles, and I mentioned that W3C also works on other kinds of standards, so in some other areas the nuance may be more important. But when you look at specifications that target web browsers, we’ve switched from a model where specs were developed first and then implemented, to a model where specs and implementations are worked on in parallel. This actually relates to the evolution I was mentioning, with the WHATWG taking over HTML and focusing on the interoperability issues, because the starting point was: yes, we have an HTML 4.01 spec, but it’s not interoperable because it’s not fully specified. There are a number of gray areas that you can implement differently, and so there are interoperability issues. Back to Candidate Recommendation: that stage was created, if I remember correctly, following the IE6 problem in the CSS working group.
François Daoust 00:15:20 IE6 shipped with some version of CSS that was as specified. The spec was saying, you know, do that for the CSS box model, and IE6 was following that. Then the group decided to change the box model, and suddenly IE6 was no longer compliant. And that created a huge mess in the history of the web, in a way. So the Candidate Recommendation stage was introduced after that, to try to catch these kinds of problems. But nowadays, again, we’ve switched to another model where it’s more live. You’ll find a number of specs that are not even at Candidate Recommendation level; they are at what we call Working Draft, and they are being implemented. If all goes well, the standardization process follows the implementation, and you end up in a situation where you have your Candidate Recommendation when the spec ships.
François Daoust 00:16:17 A recent example would be WebGPU. It shipped in Chrome shortly before it transitioned to Candidate Recommendation, but the spec was already stable, and now it’s shipping in different browsers, Safari and Firefox. That’s a good example of something that follows things along pretty well. But then you have other specs, such as, in the media space, requestVideoFrameCallback, a short API that gives you a callback whenever the browser renders a video frame, essentially. That spec is implemented across browsers, but from a W3C spec perspective, it does not even exist. It’s not on the standardization track; it’s still being incubated in what we call a Community Group, which is something that usually exists before we move to the standardization process. So there are examples of things where something fell through the cracks, or the standardization process is either too early or too late.
François Daoust 00:17:16 Things that are in spec are not exactly what got implemented, or implementations are too early in the process. We’re doing a better job at not falling into a trap where someone ships an implementation and then suddenly everything is frozen. You can no longer change it because it’s too late shipped. We’ve tried different paths there and mentioned CSS, there was this kind of vendor prefixed properties that used to be the way browsers were deploying new features without taking the final name. We’re trying also to move away from it because same thing then in the end, you end up with applications that have duplicate all the property, the CSS properties in the style sheets with the vendor prefixes and nuances in what it does in the end.
Jeremy Jung 00:18:02 Yeah, I think that’s in CSS where you’ll see -moz- or things like that. On the example of requestVideoFrameCallback, I wonder if you have an opinion or know why that ended up the way it did, where the browsers all implemented it even though it was still in the incubation stage?
François Daoust 00:18:26 On this one, I don’t have particular insight on whether there was a strong reason to implement it without doing the standardization work. It’s not an IPR issue, and I don’t think the spec triggers problems that would be controversial. It was just no one’s priority, and in the end, everyone’s happy it has shipped. So doing the spec work becomes a bit of: why spend time on something that’s already shipped? But it may still come back at some point, to try to improve the situation.
Jeremy Jung 00:19:00 Yeah, that’s interesting. It’s a little counterintuitive because it sounds like you have the working group, and it sounds like perhaps the companies or organizations involved, they maybe agreed on how it should work, and maybe that agreement almost made it so that they felt like they didn’t need to move forward with the specification because they came to a consensus even before going through that.
François Daoust 00:19:26 In this particular case, it’s probably because, again, it’s a small spec; it’s just one function call. They will definitely want a working group for larger specifications. By the way, now I remember about requestVideoFrameCallback: the final goal, now that it has shipped, is to merge it into the HTML spec. So there’s an ongoing issue on the WHATWG side to integrate requestVideoFrameCallback. It’s taking some time, but you see, it caught up, and someone is doing the work to do it. I had forgotten about this one. So with larger specifications, organizations will want this kind of IPR regime. They will want commitments from others on the scope, on the process, on everything. So they will want a more formal setting, because that’s part of how you ensure that things will get done properly.
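Because requestVideoFrameCallback ships in browsers without being on the standardization track, pages typically feature-detect it before relying on it. A hedged sketch (the helper names are my own; the callback signature, a timestamp plus a metadata object, follows the incubated spec):

```javascript
// Returns true if the element exposes the (still-incubating) API.
function hasVideoFrameCallback(video) {
  return typeof video.requestVideoFrameCallback === "function";
}

// Observe each rendered video frame by re-registering on every callback.
// Returns false when the API is unavailable, so callers can fall back
// (e.g. to requestAnimationFrame polling).
function onEachVideoFrame(video, handler) {
  if (!hasVideoFrameCallback(video)) return false;
  const tick = (now, metadata) => {
    handler(now, metadata); // metadata carries mediaTime, presentedFrames, ...
    video.requestVideoFrameCallback(tick); // one-shot, so re-register
  };
  video.requestVideoFrameCallback(tick);
  return true;
}
```

In a browser this would be called with an HTMLVideoElement; the sketch relies on only the one method, so it degrades cleanly where the API is absent.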
François Daoust 00:20:19 I didn’t mention it, but something we’re really pushy on at W3C, I mentioned we have principles, we have priorities, we have several properties, and one of them is that we’re very strong on horizontal reviews of our specs. We really want them to be reviewed from an accessibility perspective, from an internationalization perspective, from a privacy and security perspective, and from a technical architecture perspective as well. These reviews are part of the formal process, so all specs need to undergo them. From time to time that creates tension; from time to time it just goes without problems. The recurring issue is that privacy and security are hard: not an easy problem, not something that can be solved easily. So there’s an ongoing tension, with no easy way to resolve it, between specifying powerful APIs and preserving privacy.
François Daoust 00:21:19 Meaning not exposing too much information to applications in the media space. You can think of the media capabilities, API, so the media space is a complicated space because of codecs. Codecs are typically not relative free. And so browsers decide which codecs they’re going to support, which audio and video codecs they’re going to support and doing that, creates additional fragmentation. Not in the sense that they’re not interoperable but in the sense that applications need to choose which codec they’re going to ship the stream to the end user. And it’s all the more complicated that some codecs are going to be hardware supported. So you will have a hardware decoder in your laptop or smartphone, and so that’s going to be efficient to decode some stream, whereas some codecs are not, are going to be software based supported, and that may consume a lot of CPU and a lot of power and a lot of energy in the end.
François Daoust 00:22:14 So you want to avoid that if you can select another thing, and it’s even more complex than codecs have different profiles, lower end profiles, and profiles with different capabilities, different features depending on whether you’re going to use these or that color space, for example, this or that resolution, whatever. And so you want to suffice that to web applications because otherwise they count, select, they can’t choose the right codec and the right stream that they’re going to send to the client devices. And so they’re not going to provide an efficient user experience first and even a sustainable one in terms of energy because they’re going to waste energy if they don’t send the right stream. So you want to surpass(?)that to application. That’s what the media capabilities APIs provides. But at the same time, if you expose that information, you end up with ways to fingerprint the end user’s device. That in turn is often used to track users across sites, which is exactly what we don’t want to have for privacy reasons or obvious privacy reasons. So you have to balance that and find ways to expose capabilities without necessarily exposing them too much.
Jeremy Jung 00:23:25 Yeah. Can you give an example of how some of those discussions went within the working group? Who are the companies or organizations arguing that we shouldn’t have this capability because of the privacy concerns?
François Daoust 00:23:44 In a way, all of the companies have a vision of privacy. You will have a hard time finding members saying, I don’t care about privacy, I just want the feature. They all have privacy in mind, but they may have different approaches to it. Apple and Google would be, I guess, the perfect examples in that space. Google will have an approach that is more open-ended, saying the user agent should check what a given site is doing, and if it goes beyond some kind of threshold, they’re going to say, well, okay, we’ll stop exposing data to that site, to that application. So: monitor and react, in a way. Apple has, you know, a stricter view on privacy, let’s say.
François Daoust 00:24:30 And they will say, no, the feature must not exist in the first place. Or, but that’s, I mean, I guess it’s not always that extreme. And from time’s the opposite, you’ll have Apple arguing in one way which is more open-ended than Google, for example. And they’re not the only ones. So in working groups, you will find usually the implementers. So when we talk about APIs that gets implemented in browsers, you want the core browsers to be involved otherwise, it’s usually not a good sign for the success of the technology. So in practice, that means Apple and Microsoft, which one did I forget? Google. I forgot, Google of course. Thank you. That’s the core list of participants you want to have in any group that develops web standards targeted at web browsers.
François Daoust 00:25:18 And then, on top of that, you want organizations and people who are directly going to use it: the content providers. In media, for example, if you look at the media working group, you’ll see browser vendors, the ones I mentioned; content providers such as the BBC and Netflix; chipset vendors, Intel and VIA, would be there as well, again because there’s hardware decoding and encoding in there, so media touches on hardware; and device manufacturers in general, I think Sony is involved in the media working group, for example. These companies are usually less active in the spec development. It depends on the group, but they’re usually less active, because the ones developing the specs are usually the browser vendors, again because, as I mentioned, we develop the specs in parallel with browsers implementing them.
François Daoust 00:26:06 So they have the feedback on how to formulate the algorithms. And so you have this collection of people who are going to discuss, first among themselves. W3C pushes for consensual decisions, so we hardly take any votes in the working groups. From time to time that’s not enough, and there may be disagreements, but let’s say there’s agreement in the group. When the spec matures, the horizontal review groups will look at it. These are the groups I mentioned: accessibility, privacy, internationalization. In these groups, the participants, it depends, can be the same companies, but usually different people from those companies, or organizations that come from a very different angle. And that’s a good thing, because it means you enlarge the perspectives on the technology. That’s when you have a discussion between groups that takes place. From time to time it goes well, and from time to time, again, it can trigger issues that are hard to solve. The W3C has an escalation process in case things degenerate, starting with the notion of a formal objection.
Jeremy Jung 00:27:22 It makes sense that you would have the browser vendors, all the different companies that would use that browser, and all the different horizontal groups like you mentioned, internationalization and accessibility. You were talking about consensus, and I would imagine there are certain groups or certain companies that maybe have more say or more sway, for example if you’re a browser manufacturer like Google. I’m kind of curious how that works out within the working group.
François Daoust 00:27:56 Yes, I guess I would be lying if I were to say that all companies are strictly equal in a group. They are from a process perspective. I mentioned the different membership fees, which were designed specifically so that no one could say, I’m putting in a lot of money, so you need to respect me and follow what I want to do. At the same time, if you take a company like Google, for example, they send hundreds of engineers to do standardization work. That’s absolutely fantastic, because it means work progresses, and they’re extremely smart people, so it’s really a pleasure to work with them. But you need to take a step back and say, well, the problem is that, de facto, that gives them more power, just by virtue of injecting more resources into it.
François Daoust 00:28:45 So having always someone who can respond to an issue, having always someone editing a spec defacto that give them more say on the directions that get forward. And on top of that, of course, they have the, it’s not surprisingly the browser that is used the most currently on the market, we try very hard to make sure that things are balanced. It’s not a perfect world, the role of the team, I mean, I didn’t talk about the role of the team, but part of it is to make sure that, again, all perspectives are represented and that there’s not such a big imbalance that something is wrong and that we really need to look into it. So making sure that anyone, if they have something to say, making sure that they are heard by the rest of the group and not dismissed, that usually goes well, there’s no problem with that. And again, the escalation process I mentioned, it doesn’t make any difference between a small player, a large player, a big player, and we have small companies raising formal objections against some of our spec that happens, or large ones. But that happens too. There’s no magical solution I guess. You can tell it by the way. I don’t know how to formulate the process more. It’s a human process, and that’s very important that it remains a human process as well.
Jeremy Jung 00:30:02 And I suppose the role of staff and someone in your position, for example, is to try and ensure that these different groups are heard, and it isn’t just one group taking control of it.
François Daoust 00:30:16 That’s part of the role, again: to make sure that the process is followed. I don’t want to give the impression that the process controls everything in the groups. The groups are bound by the process, but the process is there to catch problems when they arise. Most of the time, there are no problems; it’s just participants talking to each other, talking with the rest of the community. Most of the work happens in public nowadays in any case. The groups work in public, essentially through asynchronous discussions on GitHub repositories; there are contributions from non-group participants, and everything goes well, and so the process doesn’t kick in. It’s pretty rare that you have to say, no, you didn’t respect the process there; you closed the issue and you shouldn’t have. Things just proceed naturally, because everyone understands where they are, what they’re doing, and why they’re doing it. We still have a role, I guess, in the sense that from time to time that doesn’t work, and you have to intervene and make sure that the exception is caught and processed in the right way.
Jeremy Jung 00:31:22 And you said this process is asynchronous and in public. So is this in GitHub issues, or how would somebody go and see the results of these discussions?
François Daoust 00:31:32 Yes, there are basically a gazillion GitHub repositories under the W3C organization on GitHub. Most groups are using GitHub. It’s not mandatory, we don’t mandate any tooling, but the fact is that we’ve been transitioning to GitHub for a number of years already. So that’s where most of the work happens, through issues and pull requests. That’s where people can go and raise issues against specifications. We also, from time to time, get feedback from developers encountering a bug in a particular implementation, which we try to gently redirect to the actual bug trackers, because we’re not responsible for the implementations of the specs, unless the spec is not clear. We’re responsible for the spec itself: making sure that the spec is clear and that implementers understand how they should implement something.
Jeremy Jung 00:32:28 I can see how people would make that mistake, because they see the feature, but it’s not the responsibility of the W3C to implement any of the specifications. Something you had mentioned: there’s the issue of intellectual property rights, and how, when you have a recommendation, you require the different organizations involved to make their patents available to use freely. I wonder why there was never any kind of recommendation for audio or video codecs in browsers, since you have certain ones that are considered royalty free? But I believe that’s never been specified.
François Daoust 00:33:13 At W3C, you mean? Yes, we’ve tried. I mean, it’s not for lack of trying. We’ve had a number of discussions with various stakeholders saying, hey, we really need an audio or video codec for the web. PNG is an example of an image format which got standardized at the W3C, and it got standardized at W3C for similar reasons. There had to be a royalty free image format for the web, and there was none at the time. Of course, nowadays, JPEG and GIF, however you pronounce it, are no problem. But at the time, PNG was really meant to address this issue, and it worked for PNG. For audio and video, we haven’t managed to secure commitments by stakeholders. It’s not lack of willingness. We would’ve loved to get a royalty free audio codec, a royalty free video codec.
François Daoust 00:34:10 Again, audio and video codecs are extremely complicated, not only because of patents, but also because of the entire business ecosystem that exists around them, for good reasons. In order for a codec to be supported and deployed effectively, it needs to be supported at a hardware level, in a number of devices, capturing devices, but also, of course, players. And that takes a lot of time, and that’s why you also enter a number of business considerations, with business contracts between entities. So on a personal level, I’m pleased to see, for example, the Alliance for Open Media working on AV1, which they at least wanted to be royalty free, and they’ve been adopting the W3C patent policy to do this work. So we’re pleased to see them adopting the same process. That said, AV1 is not yet at the same support stage as other codecs in devices around the world.
François Daoust 00:35:11 It’s an open question what we’re going to do in the future with that. It’s doubtful that W3C will be able to work on a royalty free audio codec or royalty free video codec itself, because it’s probably too late now in any case. But it’s one of these angles of the web platform where we wish we had the technology available for free, and it’s not exactly how things work in practice. I mean, the way codecs are developed remains really patent oriented, and you will find more codecs being developed. And that’s where geopolitics can even enter the play, because if you go to China, you’ll find new codecs emerging that get developed within China, also because the other codecs come mostly from the US. So it’s a bit of a problem, and I’m not going to enter details, I would probably say stupid things in any case. So we continue to see emerging codecs that are not royalty free, and it’s probably going to remain the case for a number of years, unfortunately, from a web perspective, and from my perspective, of course.
Jeremy Jung 00:36:16 There’s always these new formats coming out, and the rate at which they get supported in the browser, even on a per browser basis, varies a lot. There can be a long time between, for example, WebP being released and a browser supporting it. So it seems like maybe we’re going to be in that situation for a while, where the codecs will come out and maybe the browsers will support them, maybe they won’t, but the timeline is very uncertain. Something you had mentioned, maybe this was in your email to me earlier, is that some of these specifications have business considerations, like with digital rights management and Media Source Extensions. I wonder if you could talk a little bit about what Media Source Extensions and Encrypted Media Extensions are, and what the considerations or challenges are there?
François Daoust 00:37:11 Well, I’m going to go very quickly over the history of video and audio support on the web. Initially it was supported through plugins. You were maybe too young to remember that, but we had extensions such as RealPlayer, these kinds of things, Flash as well, supporting videos in web pages, but it was not provided by the web browsers themselves. Then HTML5 changed the situation, adding these new tags, audio and video. By default, you give these tags a resource, like an audio or a video file, they’re going to download this video file or audio file, and they’re going to play it. That works well. But as soon as you want to do any kind of real streaming, files are too large to fetch in a single request.
François Daoust 00:37:58 So you really want to stream them chunk by chunk, and you want to adapt the resolution at which you send the stream based on real time conditions, the user’s network. If there’s plenty of bandwidth, you want to send the user the highest possible resolution. If there’s some kind of temporary hiccup in the network, you really want to lower the resolution, and that’s called adaptive streaming. And to get adaptive streaming on the web, well, there are a number of protocols that exist. Same thing: many of them are proprietary, and actually they remain proprietary to some extent. Some of them are over HTTP, and they are the ones that are primarily used in web contexts. So DASH comes to mind, DASH for Dynamic Adaptive Streaming over HTTP. HLS is another one, initially developed by Apple, I believe, and it stands for HTTP Live Streaming.
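[Editor’s note: the adaptive streaming idea described here can be sketched as a small piece of player logic. This is a hypothetical illustration; the renditions and the 0.8 safety factor are made-up values, and real players such as dash.js use far more sophisticated heuristics.]

```javascript
// Hypothetical adaptive bitrate picker: choose the highest rendition
// whose bitrate fits within the measured network throughput.
const RENDITIONS = [
  { height: 240,  bitrate:   700_000 },
  { height: 480,  bitrate: 1_500_000 },
  { height: 720,  bitrate: 3_000_000 },
  { height: 1080, bitrate: 6_000_000 },
];

function pickRendition(measuredBitsPerSecond) {
  // Leave some headroom so a small hiccup doesn't immediately stall playback.
  const budget = measuredBitsPerSecond * 0.8;
  let best = RENDITIONS[0]; // always fall back to the lowest rendition
  for (const r of RENDITIONS) {
    if (r.bitrate <= budget) best = r;
  }
  return best;
}
```

A player would re-run this selection as its bandwidth estimate changes, switching renditions at chunk boundaries.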
François Daoust 00:38:49 Exactly. And so there are different protocols that you can use. The goal was not to standardize these protocols, because again, there were some proprietary aspects to them, same thing as with codecs. And, well, at least people wanted to have the flexibility to tweak adaptive streaming parameters the way they wanted; for different scenarios, you may want to tweak the parameters differently. So there needed to be more flexibility, on top of the protocols not being truly available for use and implementation directly in browsers. It was also about providing applications with the flexibility they would need to tweak parameters. Media Source Extensions comes into play for exactly that. Media Source Extensions is really about letting the application fetch chunks of its audio and video stream the way it wants, with the parameters it wants, adjusting whatever it wants, and then feeding that into the video or audio tag, and the browser takes care of the rest.
François Daoust 00:39:48 So it’s really about letting applications do the adaptive streaming, and then letting the user agent, the browser, take care of the rendering itself. That’s Media Source Extensions. Initially it was pushed by Netflix. They were not the only ones, of course, but they were a major proponent of this technical solution, because they were expanding all over the world with plenty of native applications on all sorts of devices, and they wanted to have a way to stream content on the web as well. Both, I guess, to expand to a new ecosystem, the web providing new opportunities, let’s say, but at the same time, also to have a fallback, because for native support on different platforms, they sometimes had to enter business agreements with the hardware manufacturers, the service providers, whatever. And so that was a way to have a fallback that works more in the open, in case things take some time, and so on.
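[Editor’s note: the Media Source Extensions flow described here — the application fetches chunks itself and feeds them to the video tag — can be sketched as follows. The segment URLs and codec string are hypothetical; `mseMimeType` is a made-up helper, and error handling is omitted.]

```javascript
// Build the MIME type string that MSE expects (hypothetical helper).
function mseMimeType(container, codecs) {
  return `${container}; codecs="${codecs}"`;
}

// Sketch: the app, not the browser, decides which chunk to fetch next.
async function playAdaptively(videoEl, segmentUrls) {
  const mime = mseMimeType('video/mp4', 'avc1.42E01E');
  if (typeof MediaSource === 'undefined' ||
      !MediaSource.isTypeSupported(mime)) {
    return; // MSE unavailable (e.g., outside a browser)
  }
  const mediaSource = new MediaSource();
  videoEl.src = URL.createObjectURL(mediaSource);
  await new Promise((resolve) =>
    mediaSource.addEventListener('sourceopen', resolve, { once: true }));
  const sourceBuffer = mediaSource.addSourceBuffer(mime);
  for (const url of segmentUrls) {
    const chunk = await (await fetch(url)).arrayBuffer();
    sourceBuffer.appendBuffer(chunk); // hand the chunk to the <video> tag
    await new Promise((resolve) =>
      sourceBuffer.addEventListener('updateend', resolve, { once: true }));
  }
  mediaSource.endOfStream();
}
```

A real player would interleave this with bandwidth measurement, picking the next segment’s resolution before each fetch, which is exactly the flexibility MSE was designed to provide.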
François Daoust 00:40:45 They probably had other reasons. I mean, I can’t speak on behalf of Netflix or others, but they were not the only ones, of course, supporting this Media Source Extensions specification. I think the work started in 2011, and the recommendation was published in 2016, which is not too bad from a standardization perspective. It means only five years, which is a very short amount of time. At the same time, in parallel and in complement to the Media Source Extensions specification, there was work on the Encrypted Media Extensions. And here it was pushed by the same proponents in a way, because they wanted to get premium content on the web. And by premium content, you think of movies and these kinds of beasts. And the basic issue with digital assets such as movies is that they cost hundreds of millions to produce.
François Daoust 00:41:42 I mean, some cost less, of course, and yet it’s super easy to copy them if you have access to the digital file. Just copy, and that’s it. Piracy is super easy to achieve. It’s illegal, of course, but it’s super easy to do. And so that’s where the different legislations come into play with digital rights management. The fact is, most countries allow systems that can encrypt content, through what we call DRM systems. So content providers, the ones that have movies, or the studios here, and Netflix is one of the studios nowadays, but not only them, all major studios, wanted to have something that would allow them to stream encrypted content, encrypted audio and video, and mostly video, to web applications so that they could provide the movies. Otherwise they’re basically saying, sorry, but this premium content will never make it to the web, because there’s no way we’re going to send it in the clear to the end user.
François Daoust 00:42:43 So Encrypted Media Extensions is an API that allows an application to interface with what’s called the Content Decryption Module, CDM, which itself interacts with the DRM systems that the browser may or may not support. And so it provides a way for an application to receive encrypted content, pass it over, get the right license keys from whatever system, and hand that logic over to the user agent, which passes it over to the CDM, which is kind of a black box that does its magic to get the right decryption key and decrypt the content so that it can be rendered. The Encrypted Media Extensions triggered a lot of controversy, because it’s DRM, and many people think DRM systems should be banned, especially on the web, because the premise of the web is that the user trusts the user agent.
François Daoust 00:43:39 The web browser is called the user agent in all of our specifications, and that’s the trust relationship. And then they interact with a content provider, and so whatever they do with the content is, I guess, actually their problem. And DRM introduces a third party, which means the end user no longer has control over the content. They have to rely on something else that restricts what they can achieve with the content. So it’s not only a trust relationship with their user agent, it’s also with something else, which is the content provider in the end, the one that provides the license. And so that triggered a lot of discussions in the W3C that degenerated into formal objections being raised against the specification, and that escalated to, I mean, all levels. It’s a story in W3C that really divided the membership into opposed camps, in a way. It was not really 50/50, and it was not just huge fights, but it triggered a lot of discussions and a lot of formal objections at that time.
François Daoust 00:44:52 Interestingly, from a governance perspective, the W3C used to be a dictatorship. That’s not how you should formulate it, of course, but it was a benevolent dictatorship. You could see it this way, in the sense that the whole process escalated to one single person, Tim Berners-Lee, who had the final say when none of the other layers had managed to catch and resolve a conflict. And that has hardly ever happened in the history of the W3C, but that happened to be true for EME, or Encrypted Media Extensions. It had to go to the director level, who after due consideration decided to allow EME to proceed. That’s why we have an EME standard right now, but still, it remains something on the side. EME is still in the scope of the Media Working Group, for example, but if you look at the charter of the working group, we try to scope the updates we can make to the specification, to make sure that we don’t reopen a can of worms, because it’s really a topic that triggers friction, for good and bad reasons.
Jeremy Jung 00:46:00 And when you talk about the media source extensions, that is the ability to write custom code to stream video in whatever way you want. You mentioned the MPEG DASH and Http Live Streaming. So in that case, would that be the developer gets to write that code in JavaScript that’s executed by the browser?
François Daoust 00:46:22 Yep, that would be it. And then typically, I guess the approach nowadays is more and more to develop low-level APIs, in the W3C or the web in general, and to let libraries emerge that are going to make the lives of developers easier. So for MPEG-DASH, we have the dash.js library, which does a fantastic job at implementing the complexity of adaptive streaming. You just hook it into your workflow, and that’s it.
Jeremy Jung 00:46:53 And with the Encrypted Media Extensions, I’m trying to picture how those work and how they work differently.
François Daoust 00:47:01 Well, the key architectural point is about the stream that you may assemble with Media Source Extensions, for example, because typically they’re used in cooperation. When you hook it into the video tag, you also call EME, and the stream actually goes to EME. And when it goes to EME, the user agent hands the encrypted stream to the CDM, the Content Decryption Module, and that’s a black box. Well, it has some black box logic. Even if you look at the Chromium source code, for example, you won’t see the implementation of the CDM, because it’s a black box; it’s not part of the browser. Its execution is sandboxed. EME is kind of unique in this way, where the CDM is not allowed to make network requests, for example, again, for privacy reasons. So anyway, the CDM box has the logic to decrypt the content, and it hands it over.
François Daoust 00:48:01 And then it depends on the level of protection you need, or that the system supports. It can be software-based protection, in which case a highly motivated attacker could actually get access to the decoded stream. Or it can be more hardware-protected, in which case the content goes to your final screen, but it goes through the hardware, in a mode that the OS supports, a mode that even the user agent doesn’t have access to. So it can’t even see the pixels that get rendered on the screen. There are several other APIs that you could use, for example, to take a screenshot of your application, and so on, and you cannot apply them to such content, because they’re just going to return a black box again, because the user agent itself does not see the pixels, which is exactly what you want with encrypted content.
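[Editor’s note: the EME handshake described in this exchange — app receives encrypted content, the CDM generates a license request, the app relays it to a license server — can be sketched roughly like this. The Widevine key system ID is real, but `drmConfig`, `setupDrm` and the license server URL are hypothetical names for illustration.]

```javascript
// Hypothetical config builder for requestMediaKeySystemAccess.
function drmConfig(videoMime) {
  return [{
    initDataTypes: ['cenc'],
    videoCapabilities: [{ contentType: videoMime }],
  }];
}

// Sketch of the EME flow; error handling omitted.
async function setupDrm(videoEl, licenseServerUrl) {
  if (typeof navigator === 'undefined' ||
      !navigator.requestMediaKeySystemAccess) {
    return; // EME unavailable (e.g., outside a browser)
  }
  // Ask the browser which key system (CDM) it supports,
  // e.g. Widevine on Chrome.
  const access = await navigator.requestMediaKeySystemAccess(
    'com.widevine.alpha',
    drmConfig('video/mp4; codecs="avc1.42E01E"'));
  const mediaKeys = await access.createMediaKeys();
  await videoEl.setMediaKeys(mediaKeys);

  videoEl.addEventListener('encrypted', async (event) => {
    const session = mediaKeys.createSession();
    // The CDM generates a license request (opaque to the page)...
    session.addEventListener('message', async (msg) => {
      // ...which the app relays to the license server, then feeds the
      // returned license back to the CDM. The page never sees the keys.
      const license = await fetch(licenseServerUrl, {
        method: 'POST',
        body: msg.message,
      }).then((res) => res.arrayBuffer());
      await session.update(license);
    });
    await session.generateRequest(event.initDataType, event.initData);
  });
}
```

Note how the page only shuttles opaque byte buffers back and forth; the decryption itself happens inside the sandboxed CDM, as described above.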
Jeremy Jung 00:48:54 And the content decryption module, if I understand correctly, it’s something that ships with the browsers, but you were saying if you were to look at the public source code of Chromium or of Firefox, you would not see that implementation.
François Daoust 00:49:12 True. The typical example is Widevine. Interestingly, in theory, these systems could have been provided by anyone; in practice, they’ve been provided by the browser vendors themselves. So Google has Widevine, Microsoft has something called PlayReady, Apple has one whose name escapes my mind. So that’s basically what they support, and they also own that code. And Firefox, I actually don’t remember which one they support among these three, but they don’t own that code; typically, they provide a wrapper around it. Yeah, that’s exactly the crux of the issue that people have with DRMs, right? It’s the fact that suddenly you have a bit of code running there that you can sandbox, but you cannot inspect, and you don’t have access to its source code.
Jeremy Jung 00:50:07 Yeah, that’s interesting. So almost the entire browser is open source, but if you want to watch a Netflix movie, for example, then you need to run this CDM in addition to just the browser code. I think we’ve covered a lot. I wonder if there’s any other examples or anything else you thought would be important to mention in the context of the W3C?
François Daoust 00:50:34 There’s one thing which relates to activities I’m doing at W3C. Here we’ve been talking a lot about standards and implementations in browsers, but there’s also adoption of these technology standards by developers in general, and making sure that developers are aware of what exists and understand what exists. One of the key pain points that people keep raising is that the web platform is unique in the sense that there are different implementations. In other runtimes, there’s just one implementation, provided by the company that owns the system. The web platform is implemented by different organizations, and what’s in the specs is not necessarily supported. And of course, MDN tries to document what’s supported thoroughly. But for MDN to work, there’s a lot of need for data that tracks browser support.
François Daoust 00:51:33 And this data is typically in a project called Browser Compat Data, BCD, owned by MDN as well, but the Open Web Docs collective is the one maintaining that data under the hood. Anyway, all of that to say that we need to track things beyond work on technical specifications. Because if you look at it from a W3C process perspective, when the spec reaches the standard, you know, Candidate Recommendation or REC, you could just say, I’m done with my work. But that’s not how things work. You need the feedback, and you need to make sure that developers get the information and can provide the feedback that standardization can benefit from and browser vendors can benefit from. So we’ve been working on a project called Web Features, with browser vendors mainly, and a few other folks, and MDN and Can I Use and different people, to catalog the web in terms of features that speak to developers.
François Daoust 00:52:32 And that catalog is a set of feature IDs, with a feature name and a feature description, framed the way developers would understand it, instead of going too fine-grained in terms of, there’s this one function call that does this, because that’s the kind of support data you get from Browser Compat Data and MDN initially. It’s about having some kind of coarse-grained structure that says, these are the features that make sense, they talk to developers, that’s what developers talk about. So we need to have data on these particular features, because that’s how developers are going to approach the space. And from that, we’ve derived the notion of Baseline badges that are now shown on MDN, on Can I Use, and integrated in IDE tools such as Visual Studio Code, and in libraries. Some linters have started to integrate that data.
François Daoust 00:53:26 So the way it works is, we’ve been mapping these coarser-grained features to BCD’s finer-grained support data, and from there we’ve been deriving a kind of badge. It says this feature has limited availability, because it’s only implemented in one or two browsers, for example. Or it’s newly available, because it’s implemented across the main browsers that people use, but it’s recent. Or it’s widely available, for which, well, there’s been lots of discussion in the group to come up with a definition, which essentially ends up being 30 months after a feature became newly available. That accounts for the time it takes for new browser versions to propagate, because it’s not because there’s a new version of a browser that people immediately get it.
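[Editor’s note: the Baseline derivation just described can be modeled as a small function over per-browser ship dates. This is a deliberately simplified sketch; the real web-features project uses a larger browser set, including mobile browsers, and more nuanced rules.]

```javascript
// Simplified model of Baseline status derivation.
const CORE_BROWSERS = ['chrome', 'edge', 'firefox', 'safari'];
const WIDELY_AVAILABLE_MONTHS = 30;
const MS_PER_MONTH = 1000 * 60 * 60 * 24 * 30.44; // average month length

// support: { chrome: shipTimestampMs | null, ... } for each core browser.
function baselineStatus(support, nowMs) {
  const shipped = CORE_BROWSERS
    .map((b) => support[b])
    .filter((d) => d != null);
  // Missing from any core browser => limited availability.
  if (shipped.length < CORE_BROWSERS.length) return 'limited';
  // A feature becomes "newly available" when the LAST core browser ships it.
  const newlyAvailableSince = Math.max(...shipped);
  const monthsSince = (nowMs - newlyAvailableSince) / MS_PER_MONTH;
  return monthsSince >= WIDELY_AVAILABLE_MONTHS ? 'widely' : 'newly';
}
```

So a feature shipped everywhere last month would report `newly`, and the same feature checked three years later would report `widely`.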
François Daoust 00:54:26 So it takes a while to propagate across the user base. And so the goal is to have a signal that developers can rely on, saying, okay, well, it’s widely available, so I can really use that feature. And of course, if that doesn’t work, then we need to know about it. And so we’re also working with people doing developer surveys, such as State of CSS, State of HTML, State of JavaScript, those are I guess the main ones, but also we’re running MDN short surveys with MDN people, to gather feedback on these same features and to complete the loop. And this data is also used internally by browser vendors to inform their prioritization process, typically as part of the Interop project that they’re also running on the side. So I’ve mentioned, I guess, a number of different projects coming along together, but the goal is to create links across all of these ongoing projects, with a view to integrating developers more, gathering feedback as early as possible, and informing decisions we take at the standardization level that can affect the lives of developers, making sure that it affects them in a positive way.
Jeremy Jung 00:55:45 Just trying to understand, because you had mentioned that there’s the web features and the baseline and I was trying to picture where developers would actually see these things. And it sounds like from what you’re saying is W3C comes up with what stage some of these features are at, and then developers would end up seeing it on MDN or some other site.
François Daoust 00:56:07 I’m working on it, but it’s not just a W3C thing. We have different types of groups; this is a community group, the WebDX Community Group at W3C, which means it’s a community-owned thing. That’s why I’m mentioning working with representatives from the browser vendors and people from MDN, from Open Web Docs, and a few others again. So that’s the first point. The second point is, this data is now being integrated. If you look at MDN pages, on most of them, if you look at any kind of feature, you’ll see a few logos, a Baseline banner. And then Can I Use, it’s the same thing. You’re going to get a Baseline banner, it’s smaller on Can I Use, and it’s meant to capture whether the feature is widely available or whether you may need to pay attention to it.
François Daoust 00:57:00 Of course, it’s a simplification, and the way the messaging is done to developers is meant to capture the fact that they may want to look into more than just this Baseline status. Because if you take a look at web platform tests, for example, and if you were to base your assessment of whether a feature is supported on test results, you’d end up saying the web platform has no supported technology, because there is absolutely no API where browsers pass 100% of the test suite. There may be a few of them, I don’t know. So there’s a simplification in the process when a feature is said to be Baseline; there may be more things to look at nevertheless. But it’s meant to provide a signal that developers can still rely on in their day-to-day lives if they use the feature, let’s say, as reasonably intended, and not using too advanced logic.
Jeremy Jung 00:58:04 I see. Yeah. I’m looking at one of the pages on MDN right now, and I can see at the top there’s the baseline and it mentions that this feature works across many browsers and devices, and then they say how long it’s been available. And so that’s a way that people at a glance can tell which APIs they can use.
François Daoust 00:58:25 It also started out of a desire to summarize this browser compatibility table that you see at the bottom of the page on MDN. Developers were saying, well, it’s fine, but it goes too much into detail, so we don’t know, in the end, can we use that feature or can we not use that feature? So it’s meant as an informed summary of that; it relies on the same data again. And more importantly, beyond MDN, we’re working with tools providers to integrate that as well. So I mentioned Visual Studio Code is one of them. Recently they shipped a new version where, when you use a feature, you can have a contextual menu that tells you, yeah, that’s fine, this CSS property, you can use it, it’s widely available. Or, be aware, this one has limited availability, only available in Firefox or Chrome or Safari or WebKit, whatever.
Jeremy Jung 00:59:21 I think that’s a good place to wrap it up. If people want to learn more about the work you’re doing or learn more about sort of this whole recommendations process, where should they head?
François Daoust 00:59:35 Generally speaking, we’re extremely open to people contributing to the W3C, and where they should go depends on what they want. Usually, how things start for someone getting involved in W3C is that they have some kind of technology in mind, some kind of pet project, something that they care about. And so they find the right GitHub repository, in this case probably, and they start a discussion with people who are actually developing the spec. And there may be feedback, it may be negative feedback, but it may be, hey, I had this idea. There are different places where you can raise ideas, so I don’t know where to point as a single place, I guess. I mean, there’s probably a contact email, but it doesn’t make a lot of sense to start there. That’s usually how things start.
François Daoust 01:00:19 And then when people get involved, some of them will like it. And it’s always a pleasure when someone enjoys interacting on standards, and we’ll find ways to encourage their contributions. Of course, if they represent companies, there will come a time when the process is going to kick in, for patent reasons, for different things, and for membership fees, because again, that’s how the W3C works: the budget of the W3C comes from membership fees. And so things will kick in, and we’ll get to the next level. But initially, it’s really about: just come find a feature that doesn’t work for you, or where you find something that looks clunky in one of the specs. If you read the spec, it’s not always a fun, fancy read, but report a bug, not in the implementation, but in the spec. Or propose a new idea to the group that develops the specification you like, and we’ll take it from there. We’re very happy to have that feedback.
Jeremy Jung 01:01:16 So if there is an existing specification, maybe go find the GitHub repository where that’s located. And if you have something new that you want implemented, maybe find the working group that would be related and start from there?
François Daoust 01:01:33 Absolutely.
Jeremy Jung 01:01:34 Francois, thank you so much for chatting with me today. I think we covered a lot about the W3C and how it relates to all the things that we use in our browsers every day. So thank you.
François Daoust 01:01:44 Thanks a lot for the invitation. Again, been a pleasure.
Jeremy Jung 01:01:47 This has been Jeremy Jung for Software Engineering Radio. Thanks for listening.
[End of Audio]


