05 September 2008

Google Chrome: The Story of an Open Source Browser

When web browsers were first created, a web page was definitely only a page. There was no video, chat, or games. Things are different nowadays: web pages have become 'applications'. But I think browsers need to improve further. What do we need from a browser? First, a browser needs to be more stable, faster (a lot faster for web apps and JavaScript), and more secure. A browser should also find the sweet spot between too many features and too few, with a clear, simple, and efficient user interface. One problem with browsers is that they’re inherently single-threaded. For example, once JavaScript is executing, it keeps going, and the browser can’t do anything else until the JavaScript returns control to the browser. So developers write APIs that are asynchronous, yet the browser can still lock up when the JavaScript is hung up on something.
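To make the single-threaded problem concrete, here is a minimal sketch of a run-to-completion task queue with a simulated clock. It is not how any real browser is implemented; the function name `run_single_threaded` and the task names are made up for illustration.

```python
from collections import deque

def run_single_threaded(tasks):
    """Run (name, cost_ms) tasks one at a time on a single simulated thread.

    Returns a list of (name, start_ms, end_ms) entries.
    """
    queue = deque(tasks)
    clock_ms = 0
    timeline = []
    while queue:
        name, cost_ms = queue.popleft()
        start_ms = clock_ms
        clock_ms += cost_ms  # nothing else can run until this task returns
        timeline.append((name, start_ms, clock_ms))
    return timeline

# Hypothetical workload: a script that takes 5 seconds, then a user click.
timeline = run_single_threaded([
    ("slow_script", 5000),
    ("click_handler", 1),
])
for name, start_ms, end_ms in timeline:
    print(f"{name}: starts at {start_ms} ms, ends at {end_ms} ms")
```

In this model the click handler cannot start until the slow script has returned, which is the "locked up" feeling described above.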

Multiple threads would be much better than one, but what if we had multiple processes, each with its own memory and its own copy of the global data structures? This is the same process isolation you find in modern operating systems. With a separate process rendering each tab, the JavaScript threads are separated as well. One tab can be busy while we keep using all the others. And if there’s a browser bug in the renderer, we still only lose that one tab. When a tab goes down we get a 'sad tab', but it doesn’t crash the whole browser. A multi-process design means using a bit more memory up front, since each process has a fixed additional cost, but over time it also means less memory bloat.
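The crash isolation idea can be sketched with ordinary OS processes. The example below is only an illustration under my own assumptions: each "page" is a tiny Python script run in its own child process, and the names `PAGES`, `render_in_own_process`, and `render_all` are hypothetical, not anything from Chrome.

```python
import subprocess
import sys

# Hypothetical pages: each script stands in for the work of one tab's renderer.
PAGES = {
    "news": "print('rendered news')",
    "buggy": "raise RuntimeError('renderer bug')",  # simulates a renderer crash
    "mail": "print('rendered mail')",
}

def render_in_own_process(script):
    """Run one page's script in a separate OS process and report the outcome."""
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return "sad tab"  # only this page is lost
    return result.stdout.strip()

def render_all(pages):
    """Render every page, each in its own process."""
    return {name: render_in_own_process(script) for name, script in pages.items()}

if __name__ == "__main__":
    for name, outcome in render_all(PAGES).items():
        print(f"{name}: {outcome}")
```

The buggy page fails inside its own process, so the parent (playing the role of the browser) keeps running and the other pages still render.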

In a traditional browser we have only one process and one address space that we keep loading web pages into. When we have too many tabs open, we can close some to free up memory. When we open another tab, we reuse the memory that was previously used. But as time goes on, fragmentation sets in: little bits of memory stay in use even after a tab is closed. Either we have memory that nothing can refer to again, or there’s a piece of allocated memory we still hold a pointer to. So when the browser wants to open a new tab, the page can’t fit in the existing space and the OS has to grow the browser’s address space. This problem keeps growing all day, for as long as the browser stays open. But when a tab is closed in Google Chrome, the whole process ends and all of its memory is reclaimed. When we open a new tab afterwards, we start from scratch. So as we browse, Chrome is creating and destroying processes all the time. If there’s a crazy memory leak, it won’t affect us for long, because we’ll probably close the tab at some point and get the memory back.
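The fragmentation argument can be illustrated with a toy first-fit allocator. This is a simplified model of my own, not Chrome's or any real allocator: the heap is a list of (owner, size) blocks, sizes are arbitrary units, and all names (`first_fit_alloc`, `free_owner`, the tab names, the "leftover" block) are hypothetical.

```python
def first_fit_alloc(heap, owner, size):
    """Place `size` units in the first free hole that fits, else grow the heap."""
    for i, (blk_owner, blk_size) in enumerate(heap):
        if blk_owner is None and blk_size >= size:
            heap[i] = (owner, size)
            if blk_size > size:
                heap.insert(i + 1, (None, blk_size - size))
            return
    heap.append((owner, size))  # no hole fits: the address space grows

def free_owner(heap, owner):
    """Mark every block of `owner` as free and merge adjacent free holes."""
    merged = []
    for blk_owner, blk_size in heap:
        if blk_owner == owner:
            blk_owner = None
        if blk_owner is None and merged and merged[-1][0] is None:
            merged[-1] = (None, merged[-1][1] + blk_size)
        else:
            merged.append((blk_owner, blk_size))
    heap[:] = merged

def heap_size(heap):
    return sum(size for _, size in heap)

def shared_heap_session():
    """Traditional browser: every tab allocates from one address space."""
    heap = []
    first_fit_alloc(heap, "tab_a", 50)
    first_fit_alloc(heap, "leftover", 2)  # small block that outlives its tab
    first_fit_alloc(heap, "tab_b", 50)
    free_owner(heap, "tab_a")
    free_owner(heap, "tab_b")
    first_fit_alloc(heap, "tab_c", 80)    # neither 50-unit hole fits
    return heap_size(heap)

def per_process_session():
    """Process per tab: closing a tab discards its whole heap."""
    heaps = {"tab_a": [], "tab_b": []}
    first_fit_alloc(heaps["tab_a"], "tab_a", 50)
    first_fit_alloc(heaps["tab_a"], "leftover", 2)
    first_fit_alloc(heaps["tab_b"], "tab_b", 50)
    del heaps["tab_a"], heaps["tab_b"]    # everything is reclaimed, leftover included
    heaps["tab_c"] = []
    first_fit_alloc(heaps["tab_c"], "tab_c", 80)
    return sum(heap_size(h) for h in heaps.values())

print("shared heap:", shared_heap_session())
print("process per tab:", per_process_session())
```

In the shared heap, 100 units are free after both tabs close, but the 2-unit leftover splits them into two holes of 50, so the 80-unit tab forces the heap to grow to 182 units. With a process per tab, the footprint after the same sequence is just the 80 units of the new tab.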

And Chrome takes it one step further. Suppose we navigate from domain A to domain B. There’s no need for any relationship between the two sites, so Chrome can throw away the old rendering engine, the old data structures, and the old process. So even within a tab, we can be collecting and tossing out the garbage by recycling the whole process. And just like with our OS, we can look under the hood with Google Chrome’s task manager to see which sites are using the most memory, downloading the most bytes, and abusing our CPU. We can even see plug-ins within a tab, since they appear in Chrome’s task manager as separate processes. So when things start freaking out, we’ll finally have some insight into who’s misbehaving and can eliminate them.
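The "new domain, new process" rule can be sketched as follows. This is a rough model under my own assumptions: it compares plain hostnames, which is a simplification of whatever rule Chrome really applies, and the `Tab` class and the counter standing in for process IDs are hypothetical.

```python
from itertools import count
from urllib.parse import urlsplit

_process_ids = count(1)  # stand-in for real OS process IDs

class Tab:
    """A tab that swaps its renderer process when the hostname changes."""

    def __init__(self, url):
        self.url = url
        self.process_id = next(_process_ids)

    def navigate(self, url):
        # Simplifying assumption: a different hostname means an unrelated site.
        if urlsplit(url).hostname != urlsplit(self.url).hostname:
            # The old process, with all of its memory, is thrown away.
            self.process_id = next(_process_ids)
        self.url = url

tab = Tab("http://a.example/page1")
first_pid = tab.process_id
tab.navigate("http://a.example/page2")  # same site: keep the process
same_site_pid = tab.process_id
tab.navigate("http://b.example/")       # different site: fresh process
cross_site_pid = tab.process_id
```

Staying on the same host keeps the process, while moving to a different host replaces it, so anything the old page leaked goes away with the old process.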

Google Chrome is a massive, complicated product that will need to load billions of different web pages, so testing is critical. Fortunately, Google has an equally massive infrastructure for crawling web pages. Within 20-30 minutes of each new browser build, they can test it on tens of thousands of different web pages. Each week, the 'Chrome bot' tests millions of pages, giving the developers early results they’d otherwise have to wait until an external beta for. The key is catching problems as early as possible. It is less expensive and easier to fix them right away; after a few days they are harder to track down. And catching them early helps engineers write better code.

There are several ways they test each check-in: from unit tests of individual pieces of code, to automated UI testing of scripted user actions like ‘clicked back button…went to page…’, to fuzz testing, which means sending the application random input. For layout testing, the WebKit project found that producing a schematic of what the browser thinks it’s displaying is a more precise way to compare layouts than taking screenshots and creating a cryptographic hash. When they started, Chrome was passing 23% of WebKit’s layout tests. Moving from there to 99% has been a fun challenge and an interesting example of test-driven design. There are limits to what automated testing can do. It can’t test websites that require a password, for example. And it’s not the same as a human being walking around and misusing things: automated tests only use the browser in the way it was designed to be used.
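Here is a small sketch of why a text schematic is more precise than a screenshot hash. The dump format and the pixel bytes below are invented for illustration and are not WebKit's actual layout test format; `hash_matches` and `layout_diff` are hypothetical helper names.

```python
import difflib
import hashlib

def hash_matches(expected_pixels, actual_pixels):
    """Screenshot-style check: tells us only whether the images are identical."""
    return (hashlib.sha256(expected_pixels).digest()
            == hashlib.sha256(actual_pixels).digest())

def layout_diff(expected_dump, actual_dump):
    """Schematic-style check: returns the exact lines that differ."""
    diff = difflib.ndiff(expected_dump.splitlines(), actual_dump.splitlines())
    return [line for line in diff if line[:2] in ("- ", "+ ")]

# Made-up layout dumps: the text box has moved down by two pixels.
EXPECTED = """\
block at (0,0) size 800x600
  text at (8,8) width 120: "Hello"
"""
ACTUAL = """\
block at (0,0) size 800x600
  text at (8,10) width 120: "Hello"
"""

if __name__ == "__main__":
    # Stand-in bytes for two screenshots that differ somewhere.
    print("hashes match:", hash_matches(b"\x00\x01\x02", b"\x00\x01\x03"))
    for line in layout_diff(EXPECTED, ACTUAL):
        print(line)
```

The hash comparison can only say that something changed, while the diff of the two dumps points directly at the element whose position moved.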

Source: http://www.google.com/googlebooks/chrome/

