Firefox/Projects/Startup Time Improvements Notes
Notes
Emails From Vlad on Startup
- Importance of Cold Start
Hey guys,

Here are the details on getting a cold start (no FS cache) for perf analysis:

MacOS X:
    sync
    purge

Linux:
    sync
    echo 3 > /proc/sys/vm/drop_caches

Windows: Much trickier. First, if you're on Vista/7, you need to delete the preload cache data. In your windows dir, ... argh, can't find it atm, but there's a precache or preload or something, and there should be a firefox.exe-like dir inside it that you need to delete. Otherwise vista/7 will start doing its app preload acceleration stuff which will screw over the data you're trying to collect.

Then, grab something like flushmem: http://aegisknight.org/2009/04/flushing-disk-cache/

There's another tool that's more flexible and might be faster, but that one should get the job done.

Then run the app. Gotta do these steps before each start to simulate cold start.

- Vlad
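The purge command on Linux, echo 3 > /proc/sys/vm/drop_caches, requires root privileges. Note the space after the 3: written as "3>", the shell treats it as a redirect of file descriptor 3 and nothing is written to drop_caches. [ddahl via adw]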
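The per-platform steps from Vlad's email can be collected into a small helper. This is only a sketch: the function names and the platform keys are assumptions made for illustration, and the Linux command is written with a space before the redirect so that the shell passes 3 to echo as an argument. Windows is left out because the email gives no single command for it.

```python
import subprocess
import sys

# Commands taken from the email above. The dictionary keys are prefixes of
# sys.platform; the helper names are hypothetical, not part of any tool.
_DROP_CACHE_COMMANDS = {
    "darwin": [["sync"], ["purge"]],
    "linux": [["sync"], ["sh", "-c", "echo 3 > /proc/sys/vm/drop_caches"]],
}


def cold_start_commands(platform=None):
    """Return the list of commands that drop the FS cache on this platform."""
    platform = sys.platform if platform is None else platform
    for prefix, commands in _DROP_CACHE_COMMANDS.items():
        if platform.startswith(prefix):
            # Return copies so callers cannot mutate the table.
            return [list(command) for command in commands]
    raise NotImplementedError("no cache-drop recipe for %r" % platform)


def drop_fs_cache(platform=None):
    """Run the commands in order; on Linux this needs root privileges."""
    for command in cold_start_commands(platform):
        subprocess.run(command, check=True)
```

Calling drop_fs_cache() before every launch matches the advice to repeat the steps before each start; cold_start_commands() alone is useful for checking what would be run.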
Startup post on dev-apps-firefox
Shortly before our office move, we kicked off an effort to take a hard look at our startup time, to both understand what we all do, and to figure out how to improve it. zpao (Paul O'Shannessy), ddahl (David Dahl), and I have been working towards a few goals:

- Document how to reproducibly get a cold and warm startup on Windows (XP/Vista/7), MacOS X, and Linux
- Create tools to capture both JS execution during startup, as well as file IO
- Add instrumentation to firefox to identify "big blocks" of startup for timing
- Create tools to visualize the captured data in a way that's easy to analyze

One thing that's fairly obvious with playing with startup is that "warm" startup is significantly faster than "cold" startup; that is, when you've launched Firefox before, the OS caches a bunch of the data off the disk, and it doesn't have to hit the disk again. This directly points to IO being a major component of our startup time, which is why IO is part of the capture above. This is a pretty big problem even on desktop systems; on my fairly beefy Windows 7 box, a cold startup takes upwards of 12 seconds (!); warm startup is also fairly slow if the system is under load.

We've fixed some bugs in our dtrace javascript provider along the way (bug 403345), so dtrace will actually give correct (and sane) data now. Also, I've been doing a lot of work with Microsoft's xperf (part of the Windows Performance Toolkit), which can capture much the same data. (In theory we should be able to create JS providers for xperf as well, but that's out of scope for this particular project.)

One example of the type of data we're capturing and tools that we're building is http://people.mozilla.com/~vladimir/misc/startviz/startviz.html -- this is just a quick io capture with xperf, with the data dumped into a Timeline widget from the SIMILE project. (The time scales are a bit off; the raw data is in microseconds, but SIMILE only handles milliseconds... so all times need to be divided by 1000, which becomes a problem when you go over 60 seconds -- which is actually just 60 ms! Something that we'll fix.)

Another example is the result of a startup trace; zpao is still working on the visualization and data capture, but you can see an early version at http://playground.zpao.com/dtrace_treemaps2/ -- the "Exclusive function elapsed times" view will provide the most accurate data, basically telling you "how long did we spend in a given function, ignoring all descendants". In this view, the "null" filename dominates, generally indicating native code. And within that, calls to "getService" also dominate, which indicates that much of the time is spent within getService, presumably initializing whatever the requested service is. In the future, we hope to have hierarchy correctly represented in the inclusive view, as well as adding IO operations as part of that hierarchy. Also, these tools aren't really limited to analyzing startup; they will hopefully form the basis of a set of javascript performance analysis tools that we can apply to any browser operation.

Besides IO and JS, Taras Glek found in earlier examinations of startup that loading CSS/XBL/etc. was taking a significant amount of time. We're working on instrumenting those parts of the code as well, so that we can capture it along with the raw js/io/etc. portions.

Is there any other data that we should be capturing? Let us know, and we'll see if we can figure out how to add it in. I'll keep posting updated data as we have it, and will probably create a web page to collect it all -- at that point it'll be open season on any issues that can be identified.

- Vlad
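To make the "exclusive elapsed time" idea from the post concrete, here is a minimal sketch of how exclusive times can be derived from inclusive ones. It is not the dtrace tooling itself: the record layout (call id, parent id, function name, inclusive microseconds) and the sample numbers are invented for illustration.

```python
from collections import defaultdict


def exclusive_times(calls):
    """Sum exclusive time per function name.

    calls is a list of (call_id, parent_id, function_name, inclusive_us)
    tuples; parent_id is None for a top-level call. This layout is a
    hypothetical one, not the format of any real trace.
    """
    # Total inclusive time of the direct children of each call.
    child_total = defaultdict(int)
    for call_id, parent_id, name, inclusive_us in calls:
        if parent_id is not None:
            child_total[parent_id] += inclusive_us

    # Exclusive time = inclusive time minus time spent in direct children.
    totals = defaultdict(int)
    for call_id, parent_id, name, inclusive_us in calls:
        totals[name] += inclusive_us - child_total[call_id]
    return dict(totals)


# Made-up sample: main calls getService twice, and the first getService
# call spends most of its time initializing a service.
sample = [
    (1, None, "main", 1000),
    (2, 1, "getService", 600),
    (3, 2, "initService", 450),
    (4, 1, "getService", 100),
]
print(exclusive_times(sample))
# {'main': 300, 'getService': 250, 'initService': 450}
```

The exclusive totals add up to the inclusive time of the top-level call (1000 microseconds here), which is a quick sanity check on any such breakdown.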
Rob Arnold notes on simulated cold startup on Windows
[14:00] <robarnold> for the disk, you open the disk and flush it...
[14:00] <robarnold> for the cache, there's a sysinternals utility
[14:01] <robarnold> taras: see http://technet.microsoft.com/en-us/sysinternals/bb897561.aspx
[14:02] <robarnold> there's also http://twpol.dyndns.org/weblog/2009/07/29/01 which someone on vlad's blog found
[14:05] <robarnold> ok. your results will probably be tainted by the windows feature that predicts file io for a process based on past runs (don't remember the name)
[14:07] <robarnold> ah, I think it's in \Windows\Prefetch (at least it seems to be so on windows 7)
[14:07] <robarnold> note that you might not have access to that folder since the systems, not the administrator, owns it
[14:07] <robarnold> *system
[14:17] <sid0> taras, robarnold: suggest you disable the prefetcher altogether
[14:17] <sid0> HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\Prefetcher
[14:18] <sid0> set EnablePrefetcher to 0 and (on Vista and above) EnableSuperfetch to 0
[14:18] <sid0> this might also need a reboot + clearing out of windows\prefetch
On XP at least the final fragment of the regkey mentioned by sid0 is slightly different. Not sure whether it's different from Vista/7 or he was just remembering wrong: [adw]
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\PrefetchParameters
(sorry, I remembered wrong. -- sid0)
The default value of the EnablePrefetcher key is 3 (on XP at least). [adw]
(it's the same on Vista/7 -- sid0)
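One way to apply the settings discussed above is to generate a .reg file and import it with regedit. The sketch below only builds the file text; the function name and its parameters are hypothetical, and it assumes the PrefetchParameters key path confirmed above. Importing the result needs administrator rights and, per sid0, may also need a reboot and clearing of the Prefetch folder.

```python
# Key path as corrected in the notes above.
PREFETCH_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control"
    r"\Session Manager\Memory Management\PrefetchParameters"
)


def prefetch_reg_file(enable_prefetcher=0, enable_superfetch=None):
    """Build the text of a .reg file that sets the prefetch values.

    enable_superfetch is only written when given, since the notes say
    EnableSuperfetch applies to Vista and above.
    """
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[" + PREFETCH_KEY + "]",
        '"EnablePrefetcher"=dword:%08x' % enable_prefetcher,
    ]
    if enable_superfetch is not None:
        lines.append('"EnableSuperfetch"=dword:%08x' % enable_superfetch)
    lines.append("")
    return "\r\n".join(lines)


# Disable both for a measurement run on Vista/7.
print(prefetch_reg_file(enable_prefetcher=0, enable_superfetch=0))
```

Passing enable_prefetcher=3 produces a file that restores the default EnablePrefetcher value noted above once measurements are done.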
adw's Windows XP experience
The notes above point to three tools for purging disk cache on Windows:
- CacheSet from Microsoft
- Uses a system call to request that the working set of the system's cache be cleared.
- purge.exe from Silver
- This appears to be equivalent to CacheSet.
- flushmem.exe from Chad Austin
- Allocates memory in 64 KiB chunks until it can't anymore, and then writes to each page, forcing older pages out to the page file.
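The flushmem approach described in the last bullet can be sketched as follows. This is not Chad Austin's code: it is a simplified illustration in which limit_bytes is a hypothetical safety cap (the real tool keeps going until allocation fails) and the 4096-byte page size is an assumption.

```python
CHUNK_SIZE = 64 * 1024  # 64 KiB chunks, as in the description above
PAGE_SIZE = 4096        # assumed page size; varies by system


def flush_memory(limit_bytes):
    """Allocate up to limit_bytes in 64 KiB chunks, then touch every page.

    Returns the number of chunks allocated. limit_bytes is a safety cap
    added for this sketch so the loop cannot exhaust the machine.
    """
    chunks = []
    allocated = 0
    try:
        while allocated + CHUNK_SIZE <= limit_bytes:
            chunks.append(bytearray(CHUNK_SIZE))
            allocated += CHUNK_SIZE
    except MemoryError:
        # Out of memory before reaching the cap: stop allocating.
        pass

    # Write one byte in each page so the OS has to back it with real
    # memory, which pushes older pages (including cached file data) out.
    for chunk in chunks:
        for offset in range(0, len(chunk), PAGE_SIZE):
            chunk[offset] = 1
    return len(chunks)


print(flush_memory(1024 * 1024))  # 16 chunks of 64 KiB fit in 1 MiB
```

To have any effect on the file cache the cap would have to be close to the amount of physical memory, which is also why this technique makes the system unresponsive while it runs.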
I noticed no difference between starting Firefox warm and starting it after using both CacheSet and purge.exe. Whatever they may do, combined with disabled prefetch they are not sufficient to simulate cold startup.
After using flushmem.exe, Firefox starts up in about the same time it takes to start up cold, but flushmem.exe ground my system to a halt for nearly ten minutes.
These were Vlad's experiences as well:
[12:06pm] dietrich: vlad: what's the recommended way to force cold-start on windows?
[12:08pm] vlad: all the stuff that people suggested doesn't work
[12:08pm] vlad: with cacheset/purge/etc.
[12:08pm] vlad: so I have no idea, other than reboot
[12:09pm] vlad: reboot and turning off all the prefetching
[12:09pm] adw: vlad: is https://wiki.mozilla.org/Firefox/Projects/Startup_Time_Improvements_Notes not correct?
[12:13pm] vlad: -that- sadly is
[12:13pm] vlad: flushmem would work
[12:13pm] vlad: but it's faster to reboot
[12:13pm] vlad: since flushmem causes your system to grind to a halt for a few minutes, even more if you have lots of memory
Quoting a Microsoft software tester on this MSDN forum thread:
This is actually a very complicated thing to do, and to do it correctly, these are some of the things you need to worry about:
- Invalidating the CPU caches
- Invalidating the cache on the storage media
- Invalidating the OS's read cache (note that this is not the same as the write cache which can be flushed with Sync.exe via FlushFileBuffers)
- Removing items from the OS's KnownDLL cache (Larry, KB)
- Removing items from the CLR JIT compilation cache (.NET apps only)