In this episode we talk to Galen Hunt about the Singularity research OS. Galen is the head of Microsoft's OS Research Group and, together with a team of about 30 other researchers, has built Singularity. We started our discussion by covering the basics of Singularity: why it was designed, what the goals of the project are, as well as some of the architectural foundations of Singularity: software isolated processes, contract-based channels and manifest-based programs. In this context we also looked at the role of the Spec# and Sing# programming languages and the role of static analysis tools to statically verify important properties of a Singularity application. We then looked a little bit more closely at the role of the kernel and how it is different from kernels in traditional OSes. In a second part of the discussion we looked at some of the experiments the group did based on the OS. These include compile-time reflection, using hardware protection domains, heterogeneous multiprocessing as well as the typed assembly language. We closed the conversation with a look at some of the performance characteristics of Singularity, compatibility with traditional operating systems and a brief look at how the findings from Singularity influence product development at Microsoft.

Transcript

So welcome Galen to the show. Oh thank you very much Markus. So why don’t you introduce yourself a little bit to our listeners. So, I am Galen Hunt, I run the Operating Systems Research Group here at Microsoft Research in Redmond. I’ve been at Microsoft Research about ten years; before that I was at the University of Rochester where I did my graduate degree. And my primary interest is around operating systems but I’ve done some distributed systems and a few other things. I spent most of my ten years here in Research but I did spend four years of my time at Microsoft in the Windows, no, two and a half years, sorry, in the Windows Server division.
Can you give us a one-minute overview of what Singularity is in a nutshell and maybe how it is different from traditional operating systems such as Windows? So in a nutshell, Singularity is a brand new operating system, it’s a research prototype that we’ve built from scratch using managed languages, so C# in particular, and other languages. Singularity is a microkernel-type architecture, so it’s very similar in that way, but the key idea is to see if we could build an entire operating system out of safe code from the ground up and to see how aggressively we could use tools and language and operating system architecture innovations to allow us to build a system where fundamentally we could check for errors before we ever ship the code rather than try to test them out. So the problem we want to solve is stability, reliability, dependability, or how would you phrase that? So the way I describe it is dependability. The simple way I describe it is, how do you make a system that has predictable behavior, because if you think about what users mean when they say dependable, they mean I know what it’s going to do and I know how it’s going to react, and it’s going to meet my expectations. Maybe before we go to the solution approach and the patterns and the stuff you did, why, I mean, why are traditional operating systems not dependable, what’s the problem? Is it just history, it just became like that, or are there any new key technologies that make operating systems potentially dependable these days? Well, I think if you look at what I call commodity operating systems, Windows, Linux, the systems we use very widely, if you think about them, they are really designed on architectures from the early nineteen seventies, ok, thirty-year-old languages, thirty-year-old operating system technologies, and it was a very different world than we live in right now.
It was a world in which, for example, in the nineteen seventies, if you had a computer, you either weren’t connected to a network of any kind and had never seen a network, or you were connected to a network where everybody else on the network was funded by the same funding agency. Whereas now you are connected on the internet with lots of people that not only aren’t paid by the same funding agency, but some of them would like to separate you from your funding agency for example and your funds, right. A very hostile world. Another place where I think things have changed very radically over the last thirty years: thirty years ago if you were using a computer, using an operating system, you were probably a graduate student or at least a computer science geek or something like that, right. And whereas now, you know - so then you really wanted to know about computers, and now most people who use a computer don’t have any clue how a computer works, they don’t want to know how a computer works, they just want it to work for them. So our assertion is that, hey, if the world has changed that radically, well then maybe the systems that we are using, well, they obviously don’t fit very well. And we see this with all sorts of security problems and bugs and other things, and so this is, well, let’s start over from scratch, thinking about what a computer system would look like that was designed for the modern world in its context. Ok, so let’s look at the solution, at the architectural foundations of the solution. Can you give us an overview of what these are? So Singularity, as I have mentioned, one of the probably most important things is, it is written in safe code from the ground up, so there is a whole class of errors that exist in current systems, buffer overruns, things like that, that don’t exist in this system, because, you know, they have been by design not allowed in the system.
Singularity itself, as I already mentioned, is kind of a microkernel-type architecture, so there is a closed kernel at the bottom that’s sealed off and then all of the code runs above the kernel in processes. And Singularity supports three kind of primary abstractions, I guess you could call them, so there are three core ideas in the architecture. The first one is what we call the software isolated process, the second one is that the communication between processes is through channel-based contracts, so with rich information, and the third one is an idea that we call manifest-based programs. So if I were a buzzword person then I would say this sounds like SOA, but I don’t want to do that. No, I am not going this way. But there is clearly something which people often call component-based software development, where a component has well-defined dependencies on its environment and so it’s not a surprise if you run a component; what it wants from its environment is formally specified. So it clearly isn’t that traditional, right? Yes, you know, traditionally when you talk about components, a good way I would describe it is: You have a piece of software and it has some dependencies and those dependencies are expressed in a way that a human can understand them. That’s the traditional way. That’s the traditional way. Yeah, right. And in Singularity with this idea of a manifest-based program and also the channel contracts, what we have actually done is specified the requirements in such a way that a computer can process them. And not just in the source code but in the final binary. We have this idea, a term that I use, which is a self-describing artifact. And our idea is to be able to pick up a piece of software, the binary, look at it and know exactly what its requirements are to run and how it interacts with the rest of the world.
So, if you specify what this thing is and what it expects from the rest of the world, how far down do you go there, I mean, obviously one thing is easy, you can say I want to communicate with this other thing which maybe has a type, so that is relatively easy to specify, and then you have this concept of channels so I guess your components say I want to talk to this channel, but do you also formally specify something about the real behavior of the thing? So, we aren’t doing formal specification, so the way I would answer that is we describe how you interact. Yes, black box. So, like what your, yes, it is kind of a black box, but the channel contracts describe your behavior as another party would see it through this particular communication interface. But we don’t describe what happens inside you, you know. A good way of describing it is we’ll say we know what you are going to say and what are the ways you could put language together, but we are not going to tell you how you are going to think. And what this means. Yes. So, the contract-based channels basically mean that you specify message interactions, what goes in, what goes out, how you react to a specific message? Yes, so the contracts specify the messages that are allowed to go across a channel of this particular type, and a contract also has a state machine that describes the patterns of the communications. So a good example of this, that most people would recognize pretty easily, is: when I access a file in a traditional operating system, I’ll open a file, the operating system will say ok, it’s open, and then I can do my read and write requests, and then when I am done, I can say close, and you know there is an implied state machine, I can’t do reads and writes before I do an open or after a close. So this is just a kind of formal mechanism with which you can express those types of relationships.
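The file example above can be sketched as a small state machine. This is only an illustrative Python sketch, not Sing# contract syntax; the state names (`Start`, `Opening`, `Ready`, `Closed`), the message names and the `conforms` helper are all hypothetical. It also checks a message sequence at run time, whereas the discussion is about checking code statically.

```python
# Illustrative sketch only: a channel contract modelled as a state machine.
# State and message names are hypothetical, not taken from Singularity.
FILE_CONTRACT = {
    # (current state, message) -> next state
    ("Start", "Open"): "Opening",
    ("Opening", "OpenOk"): "Ready",
    ("Ready", "Read"): "Ready",
    ("Ready", "Write"): "Ready",
    ("Ready", "Close"): "Closed",
}


def conforms(contract, messages, start="Start"):
    """Return True if the message sequence follows the contract's transitions."""
    state = start
    for message in messages:
        key = (state, message)
        if key not in contract:
            return False  # this message is not allowed in the current state
        state = contract[key]
    return True


print(conforms(FILE_CONTRACT, ["Open", "OpenOk", "Read", "Write", "Close"]))  # True
print(conforms(FILE_CONTRACT, ["Read", "Open"]))  # False: read before open
print(conforms(FILE_CONTRACT, ["Open", "OpenOk", "Close", "Write"]))  # False: write after close
```

Keeping the protocol in a table rather than spread through the code is what makes it possible to reason about it separately from the implementation.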
And it lets us do some very powerful things, so we can actually compile these contracts, and we can also take the binary for a program that implements the contract, say a client for example, and we can analyze the code. Oh cool. And we can detect if the client actually conforms to the contract. So you don’t do it with a runtime watchdog that starts shouting if you don’t comply with the contract, but rather you can do this statically. Yes, so we can actually check it on the developers’ desktops. I have, for me, a very fun story around this, which is: We got the key channel mechanism up and running and had Singularity running web server benchmarks for about a year before we had the verifier online. And when we got the verifier online, we loaded in the contract for TCP sockets, for talking to the TCP/IP stack, and we ran it across the channel contract and we found there was an error in the contract. We fixed the contract and then we ran it across the server code, our web server, and it turned out we had a case where we weren’t handling what happens if the web server went to the socket, asked for data and there was no data available. And it didn’t correctly handle this case, so it was what we call a "Heisenbug", so there had been a bug in the code for literally a year and we had never diagnosed it, probably never even realized it was there, just had server runs that didn’t run very well. We ran this verifier and literally within like two minutes of when we started the verifier on the code, the verifier came up and said you have an error in this line of the code, you aren’t handling the case when there is no data. Cool. Because the state machine basically had a state or a transition that was not followed. Yes, there was this message in this state, so there were three messages: here is the data, the other end has closed the channel, or there is no data; we weren’t handling the no-data case.
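The kind of check described in this story can be illustrated with a toy comparison between what a contract allows and what a client handles. This is a hedged Python sketch, not the Singularity verifier: the state name `AwaitingData`, the message names and the `unhandled_messages` function are hypothetical, and the real tool analyzes compiled code rather than hand-written tables.

```python
# Messages the other endpoint may send in each state (hypothetical names).
ALLOWED_REPLIES = {
    "AwaitingData": {"Data", "NoData", "ChannelClosed"},
}

# Cases the client code actually handles in each state.
CLIENT_HANDLERS = {
    "AwaitingData": {"Data", "ChannelClosed"},
}


def unhandled_messages(allowed, handlers):
    """List (state, message) pairs the contract allows but the client ignores."""
    missing = []
    for state, messages in allowed.items():
        for message in sorted(messages - handlers.get(state, set())):
            missing.append((state, message))
    return missing


print(unhandled_messages(ALLOWED_REPLIES, CLIENT_HANDLERS))
# [('AwaitingData', 'NoData')]
```

Because the contract lists every message that can arrive in a state, a missing case shows up as a simple set difference instead of an intermittent failure under load.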
And this was the day that the engineers on the project, up until then they had complained very vocally about these contracts, they were in the way, they were a lot of work, and the day that we found one of these bugs, which was a very subtle bug, the engineers stopped complaining. From then on, they have been very gung ho about that. It is interesting, we had an episode a while ago on static analysis tools and actually I forget who I did this with (Jonathan Aldrich), anyway, that guy basically explained that building these tools is quite complex because you have to understand what the code does. Now I assume, because you have the contracts formally specified in the state machine, so you do a lot of these things declaratively, the analysis is probably relatively easy, because it’s clear how the communication happens, it’s not just the side-effect of something. Yes, in fact this was one of the key insights in the team. If you look at the composition of this Singularity team, you find that a large portion of the people involved in the project are from static analysis, Jim Larus’s Software Improvement Group, so these are guys that are world experts on static analysis, and one of the insights of Singularity was to say we are going to create the operating system with the intent that we can do as much static analysis as possible. So we have contracts that are easier for the verifier; not only do we have the world’s best static analysis researchers, Who also make their lives easier. We have done everything we could to meet them in the middle, the half-way ground as well. Well, that’s actually something I talk about a lot, I work a lot in model-driven software development and code generation and I always say the more we can do declaratively as opposed to having it as a protocol somewhere in the code, the better, because all the stuff that’s buried in the code is gone. Absolutely. Ok, so that was a long tangent, anyway.
Let’s look briefly at the software isolated processes. I mean, to me it sounds, I know that is probably not the case, you could argue that sounds like overhead and sounds like it’s slow because there is all this stuff in managed languages and the context switching overhead might be worse compared to native low-level stuff. So do you have any comparison with regards to performance? In fact, it’s faster. So the key thing, so let me drill into explaining software isolated processes just a little bit more. So in a traditional operating system you think of a process as a set of pages and then the operating system uses hardware protection mechanisms to keep you from using a pointer to touch another process’s memory. In a software isolated process, what we do instead is we take the code, the code has to be written in a safe language, and we verify statically that it is incapable of generating a pointer to someone else’s memory. Then we can run the processes without any hardware protection at all, replacing hardware protection with static verification. So we actually have a research paper that’s on the website where we did a detailed analysis of what the cost of hardware protection is and guess what, this hardware protection, that you always thought was free, in fact isn’t free. It costs you, depending on the scenario and how you are using it, anywhere from 3-38 % of your performance. In the middle case, just turning on the TLB, the translation lookaside buffer mechanism for translating from virtual to physical addresses, costs you on the order of about 3 % of your performance. And so we were able to get rid of all of the overhead of the hardware. Well, also, when you talk about a context switch, we don’t have to change address spaces and reprogram the hardware when we change context, and so actually our context switches are between five and ten times faster than traditional operating systems. Cool. The last thing we didn’t talk about explicitly is the manifest-based programs.
Again, the key idea with these manifest-based programs is getting back to this idea of a self-describing artifact. And being able to analyze not only individual programs, but the entire operating system, so even the kernel in our case has a manifest. And what this allows us to do is we can take the entire system and when you come and try to install a new program, we could tell you at install time if it is going to break compatibility or interact in a bad way with any other part of the system. Version digest, so there is a notion of a version. Very rich versions, and we also have very rich dependency information so you can literally ask a question before you ever run the code, you can say: Is this program capable of running correctly on my hardware and software configuration? So, something we should probably tell our listeners is that there is a really nice paper you wrote and that’s what I read somewhere on the web and that’s the basis for this interview. So I read about the Sing# language, so what language support do you have? Probably it is built on top of C# I guess, so what’s in addition to C# in that language? There are actually three languages involved here, ok, so Sing# is built on top of a language called Spec#. It came out of Schulte’s group here in Research, and it is built on top of C#. So Spec# adds pre- and postconditions around functions, and invariants; you can put in object invariants. And then what Sing# adds to Spec# on top of that is first-class mechanisms for talking about communication. So sending and receiving messages and operations around communication are first-class operations. There is also support for what we call linear types, so in Singularity when you communicate, there is no shared memory between processes. You communicate by exchanging memory. Yes, sure. And there is a mechanism that we can use, it’s called a linear type.
What that means is, this is how I describe it: Only one person can hold a pointer to this particular piece of memory at a given time. And so when you give the memory off to someone else, we can actually look at your code and know that you are never going to try to access that memory again. So it makes it so that we can do very cheap communication. So that’s the point, because if you do message passing in traditional systems, you physically have to copy the stuff and that’s a big overhead. What you do is you avoid that; you basically exchange the pointer but you are sure that nobody can mess around once you have handed it over. That’s right. And so it allows the communication to be extremely cheap and also statically verified. Right, and because it’s all in one big process, because you isolate via software and not via machine process things, you actually can share the memory. Yes. Reminds me a little bit of Erlang, the Erlang language. There are some similarities. I recently talked to Joe Armstrong about it. What’s the process by which you work? You have probably first described your contracts and then you, I would guess, generate some kind of skeleton, against which you then implement the component, or how does this work? Well, so you specify your contracts first and you write them in, the other thing that Sing# has, the contract language is actually Sing#. So it is just like you are writing structs or something. Here is my set of messages, here is the channel contract, and then from a piece of code, in your C# code, you import that namespace that has the name of the contract and then you can just use it. And program against it. Because it is all based on the CLR and there is no code generation involved there. What is the role of the kernel? Actually, in se-radio we didn’t have an episode on operating systems yet, maybe we should. So can you briefly explain what the kernel does and how it is different from traditional kernels, if there is a difference?
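The single-owner rule for exchanged memory can be emulated in a few lines. This is a hedged Python sketch with hypothetical names (`ExchangeBuffer`, `transfer`, the owner strings); it enforces ownership with run-time checks, whereas the linear types discussed above let the compiler prove the same property before the code ever runs.

```python
class ExchangeBuffer:
    """A block of memory with exactly one owner at a time (runtime emulation)."""

    def __init__(self, data, owner):
        self._data = bytearray(data)
        self._owner = owner

    def read(self, who):
        if who != self._owner:
            raise PermissionError(f"{who} no longer owns this buffer")
        return bytes(self._data)

    def transfer(self, sender, receiver):
        """Hand the buffer over without copying; the sender loses access."""
        if sender != self._owner:
            raise PermissionError(f"{sender} does not own this buffer")
        self._owner = receiver


buf = ExchangeBuffer(b"hello", owner="producer")
buf.transfer("producer", "consumer")  # only the ownership record changes
print(buf.read("consumer"))  # b'hello'
try:
    buf.read("producer")  # use after hand-off is rejected
except PermissionError as err:
    print("rejected:", err)
```

The payload is never copied on `transfer`, which is the source of the cheap communication; the cost in this sketch is a check on every access, which a static guarantee would remove.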
Ok, well so, in a traditional kernel, a traditional operating system like Windows or let’s say Linux, the kernel does everything. Ok, the kernel knows how to talk to the disk, the kernel has the file systems, the kernel has the network stack for sending messages, the kernel hosts all of the device drivers and then the kernel does basic things like providing access to memory, scheduling threads, managing resources and things like that. Singularity is what we call a micro-kernel design and it’s an old… The idea of a micro-kernel is pretty old, well, relatively old, it dates back to the eighties with Mach and some other systems. Your European listeners should recognize L4 for example as one of the key micro-kernels. If they know operating systems. So what this means for Singularity is that the kernel in Singularity provides some very basic services: it provides the ability to create processes and shut them down, the ability to create threads of execution within processes, the ability for processes to acquire and release memory when they need it. And it also contains the ability to send and receive messages to other processes. So, it dispatches them, or is it like just another component which others can talk to? The abstraction we present is that of a channel endpoint. And then the kernel mechanism is that you can send a message on an endpoint and the kernel manages the transfer of that message. It is like an ORB in some sense. Yes, to the other process, and then you can receive a message, which is to take the message off of the channel. And that portion is actually very, very light-weight.
But, so it is a fairly minimal kernel and then all of the services that you typically think of as operating system services, so the network stack, the file system, the device drivers, run as separate processes, as SIPs (software isolated processes) up above the kernel. And one of the key benefits of that for Singularity is that if you think about it, in a traditional system, like say Windows, device drivers run in the kernel space and so, you know, they are just bits amongst the rest of the kernel bits, and so if a device driver fails, the whole kernel, The Blue Screen of Death. Well, one of the things you find is that device drivers are much more prone to failure than, say, kernel code. In Singularity the device drivers run in separate SIPs and so if a device driver fails, we can just close it and start up another one for example. So, you mentioned it before, the kernel is written in C# or Sing# and not in C or assembly language, or is there a little? Yes, so the way I describe it is, the primary language we have implemented it in is C#, so I always say C#, it is actually Sing#. Sure, plus your contract stuff. Yes, plus these contracts and things. And the way to say this is, by line count, 95 % of the kernel is written in C#. And then we still have assembly code, basically in the same places. If you looked at, say, the Linux source code, you would see that most of it is written in C, but there are places where there is assembly, so like the context switch code, for thread context switches, interrupt vectors at the bottom and things like that are still assembly. Another abbreviation that sounded to be quite central is ABI, what is that? You know, ABI, that is funny because everyone asks me “What does ABI mean?”. It’s actually a very old term, particularly from the Unix world, and it means an application binary interface. And what it means is, it’s a very formal description of the interface between a program and the kernel.
So typically in a historical Unix context it means, you know, what is the binary format of a program, what is the syscall interface between the program and the kernel, etc. And in Singularity we likewise have a very concrete kind of formal description of what the interface is between a program and the kernel. So what are the services that a program receives from the kernel? Sounds to me like channels or endpoints. This is a very good question. Here is the simple way I can answer that question. In fact, in some of our academic papers, some of the comments that come back are: why aren’t you using channels for everything? Why is there this kernel ABI? Here is the simple reason. If I send a message on a channel, there has to be some way for me to tell the kernel ok, send this message on this channel. Ok, so there are some basic primitive operations: send and receive, ok? Those are exposed as ABIs. In Singularity there is actually more than just that. I think it is one of the very unique aspects of the Singularity design that the ABI, the way I would describe it is, it gives a program a guaranteed subset of functionality around pure computation. I can create threads, I can run threads, I can synchronize threads, I can get memory and I can communicate with other processes, or SIPs. And that set of functionality is always available to any program regardless of which version of Singularity it’s running on or what other services might be there. It also means that you can always count on it, and one of the things it lets us do very easily is sandboxing. So Windows for example or Linux has a very hard time with how do I run a piece of code, like some random code that I bring off the internet, that I want to be able to, say, do a computation, but I want to know that it is not going to talk to my file system or send spam or anything. That ABI lets you give a base abstraction for everything you need for a language run-time to run.
And then we give you channels for the additional things that we want you to do. So, if we want you to be able to draw to the screen, we’ll give you a channel to the screen, if we want you to be able to talk to the file system, then we’ll give you a channel to a file system or a subset of it. And one of the important words you mentioned was ’give you’, so you cannot just grab a channel, you have to declare it probably somewhere and then you are given the channel and the giver decides whether this is maybe just a mock or whether it’s the real thing. Yes, that is exactly the case. So, the giver can decide, and it does have to be declared in your manifest, and not only does it have to be declared in your manifest, but we also take your binary and we analyze it. You know, at install time we make sure you aren’t hiding some place in your code where you are going to try to talk to, say, the TCP/IP stack to send spam for example. So there is this other thing called capabilities that I have to kind of make sense of in that context, is that something we already covered? The short answer is that channels are capabilities. If I give you a channel endpoint, I am giving you the capability to communicate with somebody. And so this allows us to use Singularity as a capability-type system, where, you know, I can say a program has these capabilities because I have given it these endpoints. One last question about this core stuff: you mentioned that SIPs have threads, now, SIPs sound to me relatively light-weight, why do you need threads, why don’t you just have hundreds of thousands or whatever of SIPs, of processes? That’s a good question. So, actually our threads are even lighter weight than traditional threads, they have a mechanism we call linked stacks for example, so they are very, very light-weight, they are even lighter weight than SIPs.
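The idea that a program only ever holds the endpoints it was given can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the `Endpoint` class, the `start_process` function and the manifest layout (a dict with a `channels` list) are hypothetical and do not reflect Singularity's actual manifest format.

```python
class Endpoint:
    """One end of a channel; holding it is the capability to use it."""

    def __init__(self, name):
        self.name = name
        self.sent = []

    def send(self, message):
        self.sent.append(message)


def start_process(manifest, available):
    """Grant only the endpoints the manifest declares; refuse unknown ones."""
    granted = {}
    for name in manifest["channels"]:
        if name not in available:
            raise LookupError(f"no provider for declared channel {name!r}")
        granted[name] = available[name]
    return granted


system = {
    "screen": Endpoint("screen"),
    "filesystem": Endpoint("filesystem"),
    "network": Endpoint("network"),
}
manifest = {"name": "viewer", "channels": ["screen"]}

caps = start_process(manifest, system)
caps["screen"].send("draw")
print(sorted(caps))       # ['screen']
print("network" in caps)  # False: never declared, never granted
```

Because the giver builds the `available` table, it could just as well pass a mock endpoint under the name `screen`; the program would have no way to tell the difference through the channel alone.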
The answer is that, well, we think that it is quite often the case that people, in order to express parallelism, need shared memory. And since we don’t allow shared memory between SIPs, the only way to do that is inside of a given SIP with threads. So the argument, that they have shared memory and therefore are faster than processes, doesn’t really count because you have shared memory, at least on the exchange heap, between processes. Yes, well, except for, I will be very clear with this, the exchange heap is not shared across, Well, from a performance perspective it’s shared, but the programming model is different. That’s right, from the performance perspective it’s like shared memory; from the correctness perspective, we always guarantee that only one SIP can access that memory at any given time. Very nice. So this was kind of the core of the system and you mentioned obviously before that this is still a research system, so you also do exploratory work, you do experiments based on that platform, and so can we look a little bit at what some of these experiments are? One thing you have mentioned in your paper is compile-time reflection. Yes, so that was one of my favorite experiments. So, one of the things that we say in Singularity is that the processes are sealed. So this means when we start up a process, before you start the process, you have to name all the code that is going to run in that process. And once you begin execution, you can’t alter the code that’s inside that, you can’t add new code for example. And the other thing we do, I think I mentioned already, is, even though we write the code in C#, we are not a JIT, just-in-time, environment like the JVM or the CLR; we are actually an ahead-of-time compiled environment.
So at install-time, when you install a program, we take the MSIL, the intermediate language from the CLR that the C# programs are compiled into, we take that MSIL and we compile it into the native instruction set of the processor or processors on your machine. And so this makes traditional dynamic reflection fairly difficult. And so we came up with this alternative mechanism called compile-time reflection. And basically the idea is, if you look at the way people use reflection, almost like 90 % of the uses of reflection are for marshaling, and inter-process communication and serialization and things like that. And that stuff doesn’t really need to be dynamic. You know, the way I think about it is, a human has expressed their intent before that code runs, and so what we have done is we have given a mechanism called compile-time reflection where the programmer can say ’hey, for example, I am generating marshaling code, here is what the marshaling templates look like’. And then we actually, when we compile, we go and generate all the code instead of doing it dynamically. So to do that, the Sing# language has this transform primitive or this transform thing. Yes, that’s right. And basically you build some templates and you say ’here is a matching template, so, you know, whenever you see a pattern that looks like, say, a class that has this serialized attribute, then the template says generate this body of code’. So it typically looks like: if you find a class that has a serialized attribute, generate the write-to-stream method, and here is what the body should look like. And because it’s compile-time, the nice thing is that if in subsequent code you actually call this automatically generated method, the compiler will not complain because at that time the method is actually really there. That’s right, the method is really there. And it’s also much faster than if you do it dynamically.
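The template idea described above (match a class that carries a marker, then generate a write-to-stream method for it once, ahead of use) has a rough analogue in Python class decorators. This is only an analogy under stated assumptions: the `serializable` decorator and `write_to_stream` name are hypothetical, the generation here happens when the class is defined rather than at compile time, and it is not the Sing# transform mechanism.

```python
import io


def serializable(cls):
    """Generate a write_to_stream method once, when the class is defined."""
    fields = list(cls.__annotations__)  # declared field names, in order

    def write_to_stream(self, stream):
        # The field list was fixed at generation time; no per-call inspection.
        for name in fields:
            stream.write(f"{name}={getattr(self, name)}\n")

    cls.write_to_stream = write_to_stream
    return cls


@serializable
class Point:
    x: int
    y: int

    def __init__(self, x, y):
        self.x = x
        self.y = y


out = io.StringIO()
Point(1, 2).write_to_stream(out)
print(out.getvalue(), end="")  # prints x=1 and y=2 on separate lines
```

The generated method is an ordinary method by the time any other code calls it, which mirrors the point made in the conversation that later code can call the generated method because it really exists.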
I did some experiments, I actually do not remember if we ever published this result. But I actually did an implementation on the CLR, did communication between two app[lication] domains in the CLR, one using the CLR’s built-in remoting and the other one using compile-time reflection. Compile-time reflection was fifty times faster. Yes, I believe that. We actually had an episode a while ago with Laurence Tratt about his Converge language, which also involved compile-time meta programming and compile-time macros, so that ties in nicely. Yes, he is very active in research. Absolutely. So, you already said that there is this possibility of adding an attribute to something and then automatically transforming that into code, so the question is how is this meta programming thing used in the platform, how do the manifest-based things, how does that, you know, go together? So, in fact the very first place that we applied it, in fact kind of the motivating scenario for doing the compile-time reflection, was for the device drivers. If you think about a device driver, if I am a video device driver, I have a set of resources that I want from the hardware, I want access to the IO-ports for that device that I am the driver for, I want access to the memory buffers, and other things like that. And traditionally the way that this is done is when your device driver starts up, it goes off and it asks for these resources and interacts with the operating system dynamically to ask for the resources. And as a side effect, if you pick up the code for a device driver, the binary in a traditional operating system, you can’t say anything about it, you have no idea what the dependencies are.
Well, in Singularity what we do is, there is an object that you build into your device driver that has a set of attributes, and you label explicitly what resources you want, as objects with labels, and then you just use that object. You never ask for your resources in your code, you just use those resources. And then what happens is, we read your binary, we create the manifest for your binary and then we use compile-time reflection to generate all the code that knows how to, given some operations from the kernel, to... Yeah, how to tie these in. To give you access to these resources. And so we can check all your dependencies, but we also give you this added benefit of generating this code for you automatically. It sounds a bit like dependency injection in some sense. I mean, you specify what you want explicitly, formally, machine-readably and then you have some external power which reads the stuff and gives you what you need. That is exactly what happens. Very nice. I have to go on a little tangent here: I have been talking about this model-driven component-based development for ages, specifying in models what a component does and all that stuff, so to me what you do here is really taking this to the right level, making this the operating system paradigm as opposed to having that somewhere in a system. So to me this is really absolutely cool, I really like it. We are trying to be cool. That is what we are trying to do. I am not sure that it’s necessarily cool to the general public, I mean it’s not a hype topic, you should call it SOA-based. Yes, it’s hard to explain it to my wife. So one of the nice things of compile-time meta programming as opposed to runtime is of course also that your static analysis tools and your dependability analysis work on the result of the meta programming transformation steps. That’s right. You can’t cheat.
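The declare-then-receive pattern for driver resources described above can be sketched roughly as follows, again in Python and purely as an illustration. None of these names come from Singularity: `IoPortRange`, `MemoryRegion`, `VideoDriverResources`, `build_manifest` and `inject` are hypothetical, and the port and memory addresses are placeholder values.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IoPortRange:
    base: int
    length: int

@dataclass(frozen=True)
class MemoryRegion:
    base: int
    length: int

class VideoDriverResources:
    """Declares what the driver needs; the driver never requests it at run time."""
    control_ports: IoPortRange
    frame_buffer: MemoryRegion

def build_manifest(resources_cls):
    """Read the declaration without running any driver code."""
    return {name: kind.__name__
            for name, kind in resources_cls.__annotations__.items()}

def inject(resources_cls, granted):
    """Stand-in for the generated glue: check the grant, then fill in the object."""
    obj = resources_cls()
    for name, kind in resources_cls.__annotations__.items():
        if name not in granted:
            raise LookupError(f"missing resource: {name}")
        value = granted[name]
        if not isinstance(value, kind):
            raise TypeError(f"{name} must be a {kind.__name__}")
        setattr(obj, name, value)
    return obj

# The dependencies are visible before the driver ever runs.
manifest = build_manifest(VideoDriverResources)

# Placeholder values; a real system would decide what to grant.
resources = inject(VideoDriverResources, {
    "control_ports": IoPortRange(0x3C0, 32),
    "frame_buffer": MemoryRegion(0xA0000, 0x20000),
})
```

Because the requirements live in a declaration rather than in start-up code, a tool can list them, and a missing or mistyped grant is rejected before the driver touches anything.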
Yes, and in fact the templates are specified in such a way that we can statically check the template and know that it will only ever generate valid code. Whereas if I used dynamic reflection, it’s a very low-level interface and there is no way to tell ahead of time that the code will be correct. Ok, let’s look at another experiment. Everybody knows that the world is going to be radically different in a couple of years because we will all have fifty-five thousand different cores on our processors, so this whole thing of multiprocessing, SMP, multi-core is probably something you couldn’t have ignored when building Singularity. No, we couldn’t have ignored it and we in fact found a few opportunities. So the base Singularity system is multi-threaded and multi-core, so, you know, give it 16 processors and it’s just happy to chug away and use all of them. But we have also done some experiments around this idea of what we call heterogeneous multi-core. So if you have 55,000 processors on your machine, just to pick a random number, we don’t think they are all going to look exactly the same. There are lots of reasons, such as power consumption or resource utilization, why you actually want to have different sets of cores if you have got that much hardware. And so one of the things we have been looking at is how do we make a programming model that lets you express your program and run it so that, if you have, you know, 15 different types of cores there, we can use them. A good example is we can let you build a program that’s able to use your GPU and your CPU and, say, a special processor on your IO cards simultaneously, and you don’t have to worry about ’how do I program a GPU’, you just write the code and we manage to generate the right code to run on the GPU for you. The other thing that we have done in this experiment is an idea that we call a subservient kernel.
So, if you look at these many-core systems, if you have 55,000 cores, just to pick a random number, they are not all going to have the same size cache we have on current machines; the caches tend to be smaller. And so that means the amount of code you want to run on that particular core, you want it to be smaller. And so we came across this idea of, well, you know, a lot of these cores don’t need to actually run very much of the operating system, in fact they might want to run just a very, very small subset of the operating system, and this thing we call the subservient kernel. And we can scale it anywhere from a peer kernel that runs side by side with the other kernel, just on separate cores, down to a very small kernel that’s about 12k in size that doesn’t do anything other than forward requests to a master kernel on another processor. The other thing I read about was hardware protection domains. So although you mentioned that you do all the isolation in software, there is still some hardware thing in there. So why, and what does it do? Well, so one of the questions we got asked when we were going around and giving academic talks on Singularity is ’is hardware protection really that expensive’ or ’what if I run out of address space because you are sharing it with all the processes’. So we added this idea of a hardware protection domain, and it is a hardware-based protection domain, it’s an address space, so the way you think of it is almost as the traditional heavy-weight hardware-type process. And we can create them as a runtime decision: you could say ’oh, I want to take this SIP or this set of five SIPs and place them into a hardware protection domain’.
So for example if you have a SIP that came from the outside world, you know, from some hostile entity, and you are not sure what it is really going to do, but you really need to run it, you can put it in one of these hardware protection domains and get an extra level of protection. Or another example is something like SQL Server, which wants to have lots of virtual memory, and so you can give it its own virtual address space. So we can have hardware protection domains as separate address spaces, and they can be either privileged, so they run at the same protection level as the kernel, or user mode. So you can either look at it as belts and suspenders or as a way to expose richer capabilities of the hardware, but, you know, the key thing is, you can use it when you want it, but if you don’t need it, you don’t have to pay the cost. You also have a typed assembly language, which to me almost sounds like an oxymoron, so what is that about? So, typed assembly language is about this: remember we were saying we were relying on static verification to provide protection. So we take MSIL code and we convert it to x86 code, and in Singularity as originally conceived, without typed assembly language, what that means is essentially that the compiler is part of the trusted computing base. If the compiler messes up and lets the programmer, you know, manipulate memory incorrectly, the security of the whole system is compromised. What typed assembly language does is, the compiler, in addition to emitting the assembly code, you know, the machine operations, also emits a proof that the code that it has generated is type-safe. And then when you load in that binary, there is a very small checker, you know, it is on the order of a few thousand lines of code, that checks the typed assembly language to make sure that it is in fact type-safe.
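A toy version of such a checker might look like the following Python sketch. It is nothing like a real typed assembly language: the four-instruction register machine, the `INT` and `PTR` types and the annotation format are all invented for illustration. What it does show is the division of labour: the untrusted producer ships code together with type claims, and a small trusted checker accepts the code only if every claim holds.

```python
INT, PTR = "int", "ptr"

def check(program):
    """Return True only if every instruction agrees with its type annotation.

    Each entry is (instruction, claimed_types_after), where claimed_types_after
    maps register names to INT or PTR. The checker never trusts a claim: it
    works out the effect of the instruction itself and compares.
    """
    types = {}
    for instr, claimed in program:
        op, dst, *args = instr
        if op == "const":            # load an integer literal
            result = INT
        elif op == "alloc":          # obtain a fresh pointer
            result = PTR
        elif op == "add":            # both operands must be integers
            if any(types.get(r) != INT for r in args):
                return False
            result = INT
        elif op == "load":           # only pointers may be dereferenced
            if types.get(args[0]) != PTR:
                return False
            result = INT
        else:
            return False             # unknown instructions are rejected
        types = {**types, dst: result}
        if claimed != types:         # the producer's claim must match exactly
            return False
    return True

good = [
    (("const", "r1", 8), {"r1": INT}),
    (("alloc", "r2"), {"r1": INT, "r2": PTR}),
    (("load", "r3", "r2"), {"r1": INT, "r2": PTR, "r3": INT}),
    (("add", "r4", "r1", "r3"), {"r1": INT, "r2": PTR, "r3": INT, "r4": INT}),
]

bad = [
    (("const", "r1", 8), {"r1": INT}),
    (("load", "r2", "r1"), {"r1": INT, "r2": INT}),  # dereferences an integer
]

lying = [
    (("const", "r1", 8), {"r1": PTR}),  # claims an integer literal is a pointer
]
```

In this toy the annotations are simple enough that the checker could recompute them all, so they are redundant; in a realistic setting the emitted type information carries facts the checker could not cheaply rediscover, which is what lets the checker stay small.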
And that allows you to take the compiler out of the trusted computing base, so anybody can build a compiler, they can build a compiler for whatever instruction set they want, but we can still verify and still have safety. Assuming that other compiler also outputs this meta data that you use for proving things. Yes. Ok, so we are coming up towards the end, slowly but surely, so let’s think about a couple of concluding thoughts. We already talked about the performance of this thing a little bit. The interesting thing with the performance is, when we designed the system and when we were originally planning it, we said, you know, dependability is far more important than performance. Absolutely. You know, I think there are people out there who would be happy to have a computer that was 20 % slower if they knew that they were never going to get another virus. Especially since next year it is going to be as fast as it is now. Yes, exactly. And so we set off saying, we are not going to care about performance, it needs to be fast enough that people will use it, but it doesn’t need to be the fastest operating system on the planet. And we were very surprised when we got the full thing up and running that in fact it’s got very good performance, because of the software isolation for example. So this dependability actually buys us performance in some cases. So it is a pleasant surprise, but, you know, even if it wasn’t the case, I think the system would still be a success. Absolutely, it sounds like it. But compatibility with current operating systems probably doesn’t really exist, I can’t run my Windows thing. There is no compatibility whatsoever. Well, that is maybe a good thing for a change. I mean, we have to start at some point again. Well, I guess we do have one form of compatibility, which is if you have C# code that runs on the CLR, you can run it on Singularity.
It’s kind of like porting between different versions of Unix - at the high level you just run the code, but down at the low levels it is not compatible. You know, I guess the compatibility question really gets back to the core idea: this is a research prototype. My job as an operating systems researcher at Microsoft is to try radical new ideas as far out in the future as I can and see which of those ideas work. The ones that work, I come back to the product guys and I say hey, you should come and try to incorporate these ideas into your product. They can figure out how to do compatibility. And the other thing is, sometimes we produce negative results. In fact, Dan Ling, our former vice president, used to say if we don’t fail half the time, we are not beating grass. So when we fail, we go to the product guys and we say hey, we tried this really aggressive thing, it is not going to give you the desired results, so don’t go down this path at all. So is there any interest from the product crew, is it going to be in Windows 2012 or something? Well, so the way we do tech transfer most of the time at Microsoft is, we don’t pick up, say, the entire code base and say, ok, Singularity is going to be the next version of Windows for example, but what we do is we transfer as many ideas as we can. And there are already ideas filtering out into the product groups, into the CLR team and into the Windows team, and, you know, some of the ideas will get out there sooner and some of them will get out later. I think we have had a much stronger impact than I anticipated when I started the project. So, to wrap this up, since we are talking about ideas and concepts, I think one other conclusion from this project might also be that the combination of a good architecture with good concepts, a language that has those concepts as first-class citizens, and useful tools for it, for example analysis tools, that these three building blocks are probably a very good set of ingredients for building really good systems.
That’s exactly the message. And of course I didn’t come up with that, it’s in the paper, but, I mean, I really like this message. And it is not limited to operating systems. Yes, I think that is very true, I think the ideas can be applied in lots of places. You know, when we started the project, one of the key ideas was particularly around the static analysis and the verification tools, and the tools people in the research and the product groups have made tremendous progress over the last decade, and it is great to see those tools getting applied. Ok, is there anything else I should have asked or some words of wisdom that you want to leave for our listeners? I have got no wisdom whatsoever, however, I did want to say one thing, you know. So we have been talking here with me, one person, but Singularity was actually the product of a large number of people. We had about 30 researchers in Microsoft Research involved across our Redmond lab, our Silicon Valley lab and our Cambridge lab as well. It’s been a fantastic experience just working with some of the best researchers in the entire world and I am just thrilled to death that they let me work with them as well. Absolutely, so thank you and all the others for sharing this with us. For me this was a really inspiring paper and episode, thank you. Thank you. |