

Ivan Krstić of OLPC

Ivan Krstić, Chief Security Architect, One Laptop Per Child, presented on OLPC technology at Google Tech Talks on April 12, 2007. He went into amazing detail about the Children's Machine XO structure and purpose.

Due to its length, the transcript of his speech was divided into two parts. Below is Part 2 of Ivan Krstić's speech. Please continue to Part 1 for the complete transcript.


[Audience question] What happens when you put a fat SD card into the machine? The answer is you can grow your object store onto it. But was there something particular you were asking about? [Person says something] I don't think there is a super quick answer; the best answer I can give is it'll just work. [Audience laughs] The way this works is that you talk to the object store service and you say, I would like to open a file. The operating system draws a little dialogue box for the user to choose a file, and the user can choose the file based on search or anything else. Then what happens is that the object store will make that file appear in your application's directory space, OK? So it doesn't matter where it's actually backed; the file will always appear in your application's directory space.

[Audience question] Yes, the question is whether we will be able to read a file which is already on record, and the answer is yes, OK.

One of the things that we are looking at is that we have limited storage on the machine, and we'd like to do something which we call the dropoff. The idea is that we will let you star things that you really care about, like you would do in Gmail, to say that this is something that is really important to me. But then if you think about it, if you're writing a paper, you'll do 200 revisions in an hour or two hours or something, and while you are writing, you want to keep all those revisions for things like undo and to be able to see what you actually did; but then a month from then you probably don't care about all 300 revisions, and 6 months down the line you almost certainly don't care about more than one revision for every hour you spent working on something.

So what you'd like is for the system to be smart enough to figure out, basically, that the things you haven't starred you can drop off when you start getting low on space, and if you've backed them up somewhere, you particularly don't care, because if you want it back, you can get it back. But if you don't have a backup, you want to be clever about this. For instance, if the oldest hundred documents are little text files but then there is a six-month-old video that is five times bigger than the oldest 100 text files, you kind of want to drop the video rather than the text files, right?

But it is not clear what the heuristics should be for that, so that's still somewhat of an open problem, if you have ideas on how to do this, come talk to me, I'd like to hear them.
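
The dropoff idea can be sketched in a few lines of Python. The talk leaves the heuristic open, so the policy below is only an assumption for illustration: starred entries are never dropped, entries with a backup go first, and after that the largest "size times age" score goes first. The names `Entry` and `pick_dropoffs` and all the fields are hypothetical, not part of any OLPC API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One stored object; every field name here is hypothetical."""
    name: str
    size_bytes: int
    age_days: int
    starred: bool = False
    backed_up: bool = False


def pick_dropoffs(entries, bytes_needed):
    """Return the entries to drop, in order, until bytes_needed is covered.

    Assumed policy (not from the talk): never drop starred entries,
    prefer entries that have a backup, then prefer large, old entries.
    """
    candidates = [e for e in entries if not e.starred]
    # Backed-up entries sort first; within each group the biggest
    # size-times-age score goes first, so one old video is dropped
    # before a pile of small, older text files.
    candidates.sort(key=lambda e: (not e.backed_up, -(e.size_bytes * e.age_days)))
    dropped, freed = [], 0
    for entry in candidates:
        if freed >= bytes_needed:
            break
        dropped.append(entry)
        freed += entry.size_bytes
    return dropped
```

With a hundred small notes and one video five times their combined size, this scoring drops the video alone, which matches the intuition in the example above; a real heuristic would likely need more signals than size and age.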
Our GUI is called Sugar. It's something you can play with today; it's running on the laptops. You can download it, all the sources. It's based on technologies that are somewhat familiar: the toolkit is GTK+, but we're not using it to draw widgets; we built our own canvas, called HippoCanvas, that lets us draw the widgets. We're using D-Bus for IPC. And Sugar is a different approach to GUIs.

We're ditching overlapping windows; we're going with full-screen windows. We want to provide much more context as you are actually doing things and using programs, and we're building collaboration into the operating system. What I mean by this is that network and presence are things that we consider fundamental to how OLPC is supposed to work as an educational program. So there is a presence service that we expose to you as an operating-system-level service, and that means that if you're an application developer and you want to write programs where people can work together, say a drawing program where people can work together, it's going to be fairly easy to do this, because we're going to give you the APIs where the OS essentially keeps your buddy lists, and applications have access to this and can depend on there being a standard for what presence means and be able to use it for collaboration etc. And because all this is based on completely open stuff like XMPP and Jabber, you can do neat things like mesh extrusion.

It means that if you are a part of a mesh, say, your village, but then maybe you go away on a field trip somewhere, or maybe someone comes and visits your village for a couple of months and goes back to their country, they can actually join the mesh sort of remotely, which has some really nice implications.

This brings us to what I've been spending much of my time on, which is security. Security for OLPC is a very different ball game from most security that people who do security are even used to. I'll tell you in a second the things that constitute security for us, but one thing that should be obvious to you guys is that if we succeed, we're putting 10 million laptops out there this year alone and possibly up to 50 million next year. We are creating one of the largest computing monocultures in the history of computing, and there are several doomsday scenarios that you really don't want to see come to pass. Certainly you don't want someone who dislikes OLPC, and there are plenty of people who dislike OLPC, being able to write a worm and kill 10 million laptops for kids that are using them to further their education.

You guys in particular don't want to see someone writing a worm that DDoSes Gmail. I mean, once you have 50 million machines connected from all over the world, well, maybe that's a real threat. Maybe it's not going to be for Google, but it's going to be for everyone else, pretty much certainly, right? So we're taking security very seriously. We're probably the only project that I know of where a vendor trying to make a mainstream, mass-produced computer in large volume is actually willing to prioritize security and say "this has to be done securely from the start", so I was involved as a security person essentially from the get-go of the project, which doesn't happen very often.

So let me tell you what security is for us. We have parts of this machine that are hardware, but you can damage them from software to the point of requiring hardware replacement or repair. Two chips: the NAND flash, the primary storage, has a limited number of write-erase cycles; if you run it down, that machine is not going to be doing anything until we replace the NAND flash. The BIOS chip is rewritable, because we want to be able to do BIOS updates, but if someone overwrites your BIOS chip with a string of zeros, you're not going to be getting very far booting.

So that's the first thing we want to be able to do: we want to provide you recoverability and openness. We're going to this effort of saying we'll have a view source button on the keyboard because we want the kids to get curious and be able to change things etc., but it's very easy to say that and laminate that, and what will happen is, the first time someone screws up their machine because they've edited a piece of source and deleted a bunch of lines, and their machine now either doesn't work or doesn't work as well or doesn't boot, if the kid now has to take their computer to an authority figure of some kind and say, well, pa, I screwed up, please, please, do something so it works again, and they get yelled at for damaging their expensive laptop, kids are pretty quickly going to not really feel free to experiment with their machines any more.

So, you have to be open, you want the machines to be open, but you have to be recoverable in a way that doesn't really depend on other people. You should be able to screw up your machine, for any value of screw up that doesn't involve a soldering iron essentially, and be able to get back to a running state. We want to prevent permanent data loss, which goes hand in hand with this, because if all we are going to tell you is that we've built a mechanism in so that if you screw up your machine, you can press some buttons and the operating system will get restored, but all your data is gone, it flew out the window, then that's some of the same kind of disincentive for experimentation. We're trying to fix this, and this is one part where Google comes into the picture, because the rumor is that you guys are pretty good at storing big amounts of data.

Huge privacy issues: kids as young as 5 or 6 in front of machines with microphones and video cameras. You can sense any number of potential disaster scenarios here, and it's absolutely critical that we go out of our way to make sure that privacy is protected. We want to make sure that it is difficult to use these laptops as a platform for attack. As long as you are providing this much openness to the users you can't really prevent them from being used as an attack platform, but you can do things that make it so that they are not very attractive if you're trying to launch massive DDoS attacks, or if you want to inform all the kind Google people of the deposed dictator that left a large cash deposit with his general etc., which is, mind you, something which gets brought up quite a lot since Nigeria is one of our launch countries.

And one of the most controversial things is that we want to keep the laptop under control of its owner, meaning both that we'd like to make sure software can't do things that the user doesn't want to happen, but it also means that these laptops should not be an attractive target for theft. You really don't want tens of thousands of these stolen and put on eBay.

So this, to start with, is really hard. I was moderately depressed after putting together that list for the first time, and then it got worse. So our goal while doing all this, security-wise, is: we can't have passwords. They're 5- and 6-year-old kids, they don't remember passwords, it doesn't work. You can't expect them to read, (a) because they might not be able to read yet and (b) because if all the security system does is put a dialogue box on the screen that expects you to know the internals of TCP/IP and how your machine works to be able to make an educated decision so as to allow or deny something, that's not actually protecting anyone; it's maybe protecting the vendor legally, but it's not actually getting anything done.

You can't depend on massive amounts of updates being pushed to machines on a frequent schedule. You might not have connectivity, when you have connectivity, you might not have a lot of bandwidth. Right now my favorite Linux distribution which is the same one as you guys use, if you install their latest long term support release from CD and you connect to the internet, it will happily tell you, that it has to download 316 megs of security updates. That's not something we really want to do for these 10 million machines and 50 million next year.

You can't, well you can, but we're not going to depend on secrets in hardware and in software to do any of this. We're not going to do security by obscurity; it's simply not an option. And finally we're not going to lock the machines down. There are a lot of machines that are managed and embedded machines; they only let you run software that their vendor signs cryptographically, and you've heard my position on the educational value of this project and what we are trying to do, so clearly it is not an option. We're not going to be locking these machines down.

So, we built something new. It's called Bitfrost; you can find its complete specification on our wiki, and there is a Wikipedia article with links to the specification. There is a big shift here, in that instead of trying to make sure that untrusted code never runs, which is what antivirus software or anti-spyware software tries to do, we actually try to protect the system under the assumption that any code that is running could be untrusted and malicious. There is 48 years of capability systems research on how this works, and the way we're actually implementing it is real virtualization in the OS. So each application is running in its own VM and only has the permissions it needs to get the job done.

The interesting thing is that if you have this approach, the definitions of what virus and spyware mean no longer really apply, because if all you can see as an application is your little application space, then if you're a virus, what are you going to infect? There isn't anything else in your application space other than you. I'm necessarily glossing over some technical details because I'm trying to move quickly; I'm happy to talk about them more, or see the spec. But what this comes down to: we are protecting against hardware damage by doing some pretty clever token bucketing on the NAND flash, we have crypto protection for the BIOS, and we can restore the entire factory system just by an operation that is essentially regenerating some links on the file system. We can do that restoration without keeping a separate partition that contains the OS image, so we are not actually paying a price to be able to do it.
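
The talk does not spell out how the token bucketing on the NAND flash works, so the following is only a generic token bucket, sketched under assumptions: the class name `WriteBucket`, the rate and burst values, and the choice of charging one token per byte written are all illustrative and are not Bitfrost's actual parameters. The caller passes the current time in explicitly so the behaviour is easy to follow.

```python
class WriteBucket:
    """Generic token bucket; one token pays for one byte written."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec   # tokens added per second
        self.capacity = burst      # most tokens the bucket can hold
        self.tokens = burst        # start full so short bursts succeed
        self.last = 0.0            # timestamp of the last refill

    def allow(self, nbytes, now):
        """Return True and spend tokens if nbytes may be written at time now."""
        elapsed = max(0.0, now - self.last)
        # Refill for the time that has passed, but never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False
```

The point of the design is that an application can write in short bursts up to the bucket's capacity, but its sustained write rate is capped at the refill rate, so a runaway or malicious program cannot burn through the flash's write-erase cycles quickly.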

Because all the data lives in a centralized store and is revisioned, we can do trivial backups that actually work. We can get to the point where as soon as you come in range of a server, one that you don't even necessarily have to trust, all that you care about is that it says, I provide backup service, you can start dumping encrypted backups to it and then get them back later, and the user doesn't even have to be involved in the process. We're putting in LEDs that are actually wired in series in the hardware with the microphone and the camera, so even if the kernel is compromised and the firmware is compromised, you will actually be able to see a soft glow telling you that the microphone and camera are on, and if that happens when you are not expecting them to be on, you'll know something's up.

The way this works, very briefly, is there is a big kernel patch. It's 40 thousand lines of code, but that covers every architecture that Linux builds on, so it's quite a lot smaller for just x86. It's something called VServer. It is an existing patch built for ISPs that want to offer virtual Linux servers to customers, and I'm twisting it to do something it has never been used to do before, that is, desktop security. And so we are adding in a bunch of code to it to let us do some of the things that we care about. The interesting thing about this, by the way, is that people are terrified of how you are going to do virtualization on a 466 MHz CPU. With the Linux VServer, the overhead you pay is 32k per task struct, but there is 0% measurable CPU overhead with up to 65,000 virtual machines running.

I'll let that sink in for a few seconds. It lets us do full network-stack isolation, it lets us completely isolate the filesystem, and it lets us do this copy-on-write mode with just a twist on what immutable links do, so we can actually do this at no overhead on the file system. It provides various hooks which we can use; we can add scheduler bias for system services etc. directly in the kernel. There are no policies with this, so the mental model is simple. We tell our application developers, essentially, the mental model is that you are the only application executing on the machine, and you can use a number of the interfaces that we provide to interface with the rest of the system, but essentially, you are the only application running on the machine.

One of the things we do is we have these immutable application bundles, meaning that installation is essentially dropping a file into the directory tree and suddenly you have installed the entire program. The application, when you do this, only gets three writable directories: a temporary directory, a configuration directory and a data directory. So the application cannot change itself, it cannot change its own binary, and you get some nice security benefits out of this.
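
The three-writable-directories rule can be illustrated with a small path check. This is a toy model only: in the system described in the talk the restriction is enforced by the kernel-level isolation, not by application code, and the directory names `tmp`, `conf` and `data` as well as the function `may_write` are assumptions made up for this sketch. It also ignores symbolic links.

```python
import posixpath

# Hypothetical layout: the talk names three writable directories per
# application (temporary, configuration, data) but not their paths.
WRITABLE_SUBDIRS = ("tmp", "conf", "data")


def may_write(app_root, target):
    """Return True if target falls inside one of the app's writable dirs."""
    root = posixpath.normpath(app_root)
    # Join, then normalize, so "data/../bin/app" cannot escape via "..".
    path = posixpath.normpath(posixpath.join(root, target))
    for sub in WRITABLE_SUBDIRS:
        allowed = posixpath.join(root, sub)
        if path == allowed or path.startswith(allowed + "/"):
            return True
    return False
```

Under these assumptions, `may_write("/apps/paint", "data/drawing.png")` is allowed, while `may_write("/apps/paint", "bin/paint")` is refused, which is the property that keeps a bundle from modifying its own binary.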

We have a system-wide update service and we can do atomic filesystem updates when we want to update the core operating system. We want to have that same system service doing updates for the individual applications; we don't know exactly how that should work. Still somewhat of an open question. So I want to very briefly tell you, I've told you about a lot of cool technology, I want to tell you where things are right now. Most everything of the hardware you have heard about is done and works, certainly it works, and B2 laptops, B3 laptops are entering production, I think, today. And the B3 laptops have the hardware improvements that I was telling you about today.

Sugar, which is the GUI piece, works, it runs. There is a lot of work left here to implement our entire human interface guidelines, which are available on the wiki, and the guidelines keep evolving, which makes it much of a moving target.

Our bulletin boards, which are sort of our shared spaces, the idea that this entire room could open up their laptops and connect on the mesh and have a single space where we can put objects and work on them etc.: we have some preliminary design; it hasn't been implemented yet. On the security side, the kernel patch is ready to be merged but we haven't merged it yet, because the power management work has been dominating the kernel work. The user space work hasn't been started because, basically, it's blocking on the kernel patch being merged.

The tech preview of Yellow, which is the centralized object store, is just about ready. It'll probably go live in a matter of days. There are parts of it that are really hard; Yellow is something that I have also been implementing in addition to security, and there are parts of this, like the synchronization, like the smart tuning that I had mentioned in the dropoff, like being able to do smart caching, that are still very much open questions. I keep sort of turning out designs and trying things, but I'm sure we can get it a lot better. In terms of the kernel, suspend/resume works today; it can take about 233 milliseconds, and we want to halve that.

Firmware work: there is plenty of it remaining. We need to integrate some of the crypto stuff that we need, and there is a lot of power management work happening in the firmware because there is no ACPI. Tons of profiling and optimization to do; it gets started pretty frequently, but we don't have people who are just performance people on the team. So every time some other fire pops up, the people get sidetracked into helping out with that, so that gets interrupted a lot. The server side of things, our school servers, just now started. There are a bunch of software needs we have here, something that is fast and sane, multicast as a web cache, and there are some people who are maybe going to work on that.

That hasn't really been started yet. Lots of work with Python. Python is memory hungry. In the Python community there are a lot of people who care about optimizing Python, but they are people who generally care about optimizing CPU, meaning run-time performance, not memory footprint, and it turns out that we are not alone in really wanting a memory-leaner Python. Nokia has this issue because they are actually trying to put Python as the plugin language on their cell phones, and the Symbian kernel doesn't seem to do copy-on-write for memory pages.

They can't even do the trick that we can do, which is run a Python interpreter in the beginning and then fork it off when we need to start activities, you know, applications, which for us on Linux of course gets copy-on-write going, and a bunch of pages get shared and so it's not as bad a memory hit. They can't even do that, so this is something we would really like to see happen. More Python: we'd love to be able to do some of the more sort of Smalltalk-ish and Lisp-ish things, like being able to interrupt a running program when it hits an exception, change code and continue from where it stopped. This is something I understand we can't really do with Python right now because the stack gets unwound completely. It's something to look at; maybe it is doable.
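
The fork trick can be shown with the standard `os.fork` call. This is a minimal sketch for POSIX systems such as Linux, not the actual Sugar launcher: the helper names `launch` and `wait_for` are hypothetical, and `json` merely stands in for whatever heavy modules a real activity would import before the fork.

```python
import os
import json  # stands in for heavy modules an activity would import


def launch(activity):
    """Fork the already-warm interpreter; the child runs activity and exits."""
    pid = os.fork()
    if pid == 0:
        # Child: modules imported by the parent are already in memory and
        # are shared with the parent through copy-on-write pages.
        try:
            activity()
            os._exit(0)
        except BaseException:
            os._exit(1)
    return pid


def wait_for(pid):
    """Block until the child exits and return its exit code."""
    _, raw_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(raw_status)
```

The parent pays the import cost once; each forked child starts with those pages shared and only copies a page when it writes to it. One caveat worth knowing is that reference counting in CPython writes to object headers, so pages get unshared over time more than one might expect.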

Canned modules: the idea that you would be able to load a module and look at the objects that the module created at initialization, and then you can basically serialize all of that so that when you are loading the module the next time you can just load the memory images instead of rerunning the initialization. It's something that we are also looking at. I think the Nokia guys are actually working on implementing this.
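
A rough approximation of the canned-module idea can be built with the standard `pickle` module. This is a stand-in and an assumption: real canned modules as described would snapshot interpreter memory images, whereas this sketch only serializes the state an initialization step produced. The function name `load_canned` and the cache file are hypothetical.

```python
import os
import pickle


def load_canned(cache_path, build):
    """Return (state, from_cache).

    If a cached copy exists, load it; otherwise run build() once,
    save its result to cache_path, and return it.
    """
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f), True
    state = build()
    with open(cache_path, "wb") as f:
        pickle.dump(state, f, protocol=pickle.HIGHEST_PROTOCOL)
    return state, False
```

The first call pays for the initialization and writes the cache; later calls skip `build` entirely. Note that unpickling executes code chosen by whoever wrote the file, so a cache like this should only ever be read from a location the application itself controls.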

Plugins are something that we deeply care about. We think that they are one of the only promising ways at this point to manage software growth, and there are no good standards on how plugins should be designed and how software should be designed to accommodate plugins, and it's something which would be really nice to have for the wider community, not just for OLPC. Lots of questions about how plugins should work with security. I think that we are working with a set of pretty fantastic technologies; there is a ton of work that still needs to be done here, and we are shipping in 6 months.

OLPC, I think, is proof that even in 2007 there are people who deeply care about the fact that there are other people who care about writing beautiful and clean code. Really, this will matter to 10 million kids this year and 50 million the next year, and all the technology aside, we think that what is going to happen if OLPC succeeds is that there is going to be a generation of kids globally that are going to grow up with access to knowledge and learning in a way that never really has been the case in the history of the planet. You know, this is a really dangerous idea. I don't think I have to explain why, although I'm happy to talk about it, but it's also an idea which we think has to happen.

So, in closing, one of my favorite quotes is from Thomas Edison, who once said, "Hell, there are no rules here – we're trying to accomplish something", and I think that's the best summary of OLPC that you'll find anywhere. So, if all this strikes a chord, please come help out! We have a development live CD available now with Sugar and development tools, which you can just download and run either in Parallels or QEMU or on a separate little machine, and you can help out that way. We can get you laptops if you want to be helping out with any of the hardware work or the kernel work. All of our code is free software, it's public, it's on dev.laptop.org, along with our entire list of bugs. There are no private bugs, there are no backroom water-cooler bugs and secret discussions in OLPC.

We're completely transparent, and … you know, one thing that… we have very few people from Google helping out, and the reason I find this troubling is because I see two explanations for this. One explanation is that you guys simply don't know that we are doing all of this, or have found it too hard of a barrier to entry. Or, collectively, you guys, some of the smartest people… anywhere in the world, have decided that what we're doing is so completely wrong that you don't want to get anywhere close to it. And if either is the case, I would really like to hear about it.

So again, please do help us, I think we can get you hooked up with really exciting things to work on and that will also make you feel good about what you are working on and we have lunch afterwards, come up, talk to me or use any other resources that I mentioned, our wiki, our dev site, or find us on IRC and help out. So thank you, and I'm happy to, I think we have a little bit more time, I'm happy to answer questions about anything including why the sky is blue, but OLPC questions are preferred. So thanks, yes!

[Crowd applauds]

[Audience asks the questions]
So, the question is what happened when we handed it to kids? The first thing that happened was that the adults who handed them over got really frustrated, because we have not found a single adult who can open up one of these laptops in less than one minute and we haven't found any kid who can't open them in less than thirty seconds. Past that, you'd better give five hours to the kid and go away. The kids dive right in; the GUI, which is unfamiliar to all of us who grew up with normal GUIs, takes them five minutes to figure out.

We have a music program that lets you play with all sorts of samples, the piano and whatnot. You'd better have a soundproof room when you give the laptop to them, because kids will just bang away with this for four hours. So far, we've been exceedingly pleased with what we have seen when we actually handed the laptop to the kids. And there was a story just yesterday on CNET about actual classrooms in Nigeria getting the laptops for the entire classroom. You can see the picture online at CNET, and you can see, I mean, the kids are ecstatic.
Yes in the back…

[Audience asks question]
So the question is how adults come into play. How do we keep parents from just taking the laptops away from the kids? There are a couple of answers to this. The first: the unofficial subtitle to One Laptop per Child is one laptop per teacher as well, so teachers will also definitely have laptops, and we want to get them involved and have them see that this is something that's important. We want to exercise a lot of community and social pressure on parents to not take away the laptop.

We've seen that this works, not just with OLPC. I mean, there are places like Extremadura in Spain where admittedly it wasn't laptops that were used, these were desktops, but… there was a lot of resistance initially to sort of the disruption that the introduction of many computers in schools created, and after a while, because of the social pressure and the community pressure exerted throughout the state for parents to really understand that this is something that is important to their kids, the story came around completely.

It's now gone from principals and parents and everyone sort of being really afraid and not wanting it to come anywhere near their kids to now being sort of a matter of pride for principals and schools and parents that their kids are using computers etc., so we know this works and we're hoping to do the same.

[Audience question]
The question is how do you prevent the laptops from being sold. There is an anti-theft system in the laptop. If the country chooses to use the anti-theft system, these laptops are going to be pretty hard to sell. I'm happy to explain more about how the anti-theft system works, maybe over lunch.
Yes!

[Audience question]
The question is whether we sell to anyone, or just the countries. Right now, just the countries. You know, one thing that surprises people when I talk to them about OLPC is that the core team at OLPC, the people who actually designed most of what you've heard about, used to be a twelve-person team, now I think up to 14 people, but that's fine. We were essentially reinventing a computer with fourteen people, and part of the reason we have been able to do it is because we've done it with so few people; we've been moving very quickly.

But it also means that if you want to handle logistics, it's a lot easier handling logistics for five crates of a million laptops each than it is to handle logistics for five million crates of one laptop each. So right now it's just a matter of not having the time and energy and people to be able to do logistics for smaller orders, and it is something that hopefully will change in the future.
Yes!

[Audience question]
The question is why not, you know, because we want to do more Lispy and Smalltalky things, why not just use Lisp or Smalltalk? And what about Lisp machines and things like this? So if you guys have paid attention to what I was saying, we really want to implement as much of everything as possible in Python, and that comes sort of very close to the general Lisp machine idea. Now, we're not running Python on bare metal as Lisp machines did with Lisp, but we're coming pretty close to it. I mean, if you take away the kernel and you look at everything else, you're essentially given a complete Python… you know… everything is Python.

We are also shipping Squeak on these machines, so there is a Smalltalk implementation going to be available, but I think right now, for any number of reasons such as size of the community, momentum of the community… you know… I certainly think Lisp has a much higher learning curve, a steeper learning curve, than Python does for kids… you know… a lot of the same reasons apply to Smalltalk… I mean, I really think that right now, the best thing we can do is run something like Python.
Yes!

I'm… paying no attention to the man behind the curtain who is informing me that we're done with questions and so, join me for lunch if you'd like to talk more.
Thank you!

[Ivan's Speech Ends]


Comments

Wonderful talk.

Transcription correction: every place it says "list" should be "Lisp". It's a legendary programming language.