CrashKill/Analysis
From MozillaWiki
A Time Before Automated Crash Reporting.
When Automated Crash Reporting Began
Figuring out the Who, What, When, Where, Why, and How of Crashes
Mostly Focused on The Top Crashes
But there is a long tail too!
Detective Work To Get to the Bottom of Crash Problems
- getting to a reproducible environment/test case
- identifying a common config
- OS version, hardware & driver correlations
- addon/plugin correlations
- graphics cards and driver correlations
- urls
- protecting users' privacy
- a concentration of urls helps to find crash problems fast, but it's also rare.
- a wide range of urls points to general browsing and other things that might be going on: garbage collection, software update, UI interaction...
- absence of urls and connection to start time
- domains and locales/regions of sites visited: connection to specific fonts, regional malware outbreaks
- common user actions in comments
- e-mailing some users?... (new automated system coming on-line)
- Product Releases where the crash is seen.
- calibrating volume on product releases
- identifying which mozilla-central releases have the crash...
- Connecting To Events Time & Regression Ranges
- looking at the various forms of time data:
- And Finding Spikes and Volume Regressions
- time of crash (breakpad/socorro dup reports) https://bugzilla.mozilla.org/show_bug.cgi?id=579136
- time since last crash
- time since start up (uptime)
- time of the build
- install time
- Connecting to External and Other Events
- Plugin/Addon Releases
- Web Site Changes
- Firefox/Gecko Source Code Changes
- connecting changes on the stack to recent check-ins
- Build Environment Changes... new compilers, build configs.
- With a reproducible test case...
- reduced test case
- crash recorded on VM...
- developer investigations can happen in parallel
- finding problems through source inspection
- using regression ranges to examine change sets
- Running minidumps in the debugger.
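The "finding spikes and volume regressions" step above can be illustrated with a small sketch. It assumes we already have a list of daily crash counts for one signature, oldest day first; the function name, the 7-day window, and the 3-sigma threshold are illustrative choices, not anything taken from the actual crash-stats tooling.

```python
from statistics import mean, stdev


def find_spikes(daily_counts, window=7, threshold=3.0):
    """Return the indexes of days whose count is far above the trailing baseline.

    daily_counts: crash counts per day for one signature, oldest first.
    window: number of preceding days used as the baseline.
    threshold: number of standard deviations above the baseline mean
               that counts as a spike.
    """
    spikes = []
    for i in range(window, len(daily_counts)):
        baseline = daily_counts[i - window:i]
        mu = mean(baseline)
        # A perfectly flat baseline has sigma == 0; use a floor of 1.0 so
        # a change of a single crash is not reported as a spike.
        sigma = max(stdev(baseline), 1.0)
        if daily_counts[i] > mu + threshold * sigma:
            spikes.append(i)
    return spikes


# Hypothetical counts: steady at about 100 per day, then a jump.
counts = [100, 104, 98, 101, 97, 103, 99, 102, 250, 240]
print(find_spikes(counts))  # [8]
```

Only the first day of the jump is flagged here, because that day then inflates the baseline used for the following day. Raw counts also rise with the number of users on a release, so counts would probably need to be calibrated against release volume before being compared across releases.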
What data do we have?
Individual Reports
- Stack and Links to Source Code
- Modules Loaded
- Mini-dumps and beyond
1 signature
2 url
3 uuid_url
4 client_crash_date
5 date_processed
6 last_crash
7 product
8 version
9 build
10 branch
11 os_name
12 os_version
13 cpu_info
14 address
15 bug_list
16 user_comments
17 uptime_seconds
18 email
19 adu_count
20 topmost_filenames
21 addons_checked
22 flash_version
23 hangid
24 reason
25 process_type
26 app_notes
and more:
Beyond http://crash-stats.mozilla.com/
http://people.mozilla.com/crash_analysis/20101213/
- 20101213-pub-crashdata.csv.gz: summary of all the reports for a given day
- 20101213_Firefox_3.6.9-core-counts.txt: for finding threading issues
- 20101213_Firefox_3.6.9-interesting-addons-with-versions.txt
- 20101213_Firefox_3.6.9-interesting-modules-with-versions.txt.gz
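As a rough sketch of the kind of correlation these files support, the code below compares how often each value of one field (for example cpu_info) occurs for a single signature versus across all reports. It assumes the daily dump is tab-delimited with a header row whose column names match the field list above; that layout, and the function names, are assumptions rather than a documented format.

```python
import csv
import gzip
from collections import Counter


def read_reports(path):
    """Yield one dict per crash report from a gzipped daily dump.

    Assumption: tab-delimited text with a header row (signature, cpu_info, ...).
    """
    with gzip.open(path, "rt", encoding="utf-8", errors="replace", newline="") as f:
        yield from csv.DictReader(f, delimiter="\t")


def field_shares(reports, signature, field):
    """Return {value: (share_within_signature, share_overall)} for `field`.

    A value whose share within the signature is much higher than its
    overall share is a candidate correlation worth a closer look.
    """
    overall = Counter()
    in_signature = Counter()
    for report in reports:
        value = report.get(field, "")
        overall[value] += 1
        if report.get("signature") == signature:
            in_signature[value] += 1
    total = sum(overall.values())
    sig_total = sum(in_signature.values())
    if sig_total == 0:
        return {}
    return {
        value: (in_signature[value] / sig_total, overall[value] / total)
        for value in in_signature
    }
```

With a real file this might be called as `field_shares(read_reports("20101213-pub-crashdata.csv.gz"), some_signature, "cpu_info")`. A large gap between the two shares is only a hint; small sample sizes can produce the same gap by chance.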
http://people.mozilla.com/~jst/new-crashes/Firefox/latest/
What's new? Finding crashes when they are introduced makes them easier and faster to fix.
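One simple way to spot newly introduced crashes is to record the earliest build in which each signature was reported. The sketch below assumes build IDs are timestamp-like strings that sort chronologically; the function names and sample data are hypothetical and do not describe how the new-crashes report above is actually generated.

```python
def first_seen_builds(reports):
    """Map each signature to the earliest build ID in which it was reported.

    reports: iterable of (signature, build_id) pairs.
    Assumption: build IDs are strings such as "20101213030000" that sort
    in chronological order.
    """
    first = {}
    for signature, build_id in reports:
        if signature not in first or build_id < first[signature]:
            first[signature] = build_id
    return first


def new_signatures(reports, since_build):
    """Return signatures whose first report is from `since_build` or later."""
    first = first_seen_builds(reports)
    return sorted(sig for sig, build in first.items() if build >= since_build)
```

The first build that shows a signature, together with the build just before it, also gives a starting regression range for examining change sets.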
http://people.mozilla.com/~chofmann/crash-stats/20101213/
- top-4.0b8pre.html (just the Firefox crashes...)
- compare-rank-40b8pre-40b7.txt (simple version of what's new?)
- compare-rank-40b6-40b7-40b8pre.txt (what's fixed?)
- mozilla-central-crash-trend.csv (which mozilla-central builds are most crashy?)
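A rank comparison of this kind can be sketched as follows, assuming we have a signature-to-count mapping for each of two releases. The function names and output shape are illustrative and are not the format of the compare-rank files listed above.

```python
def rank(counts):
    """Return {signature: rank}, with rank 1 for the most frequent signature."""
    ordered = sorted(counts, key=lambda sig: (-counts[sig], sig))
    return {sig: position for position, sig in enumerate(ordered, start=1)}


def compare_ranks(old_counts, new_counts):
    """Compare crash rankings between an older and a newer release.

    Returns (new, gone, moved):
      new   - signatures seen only in the newer release, by new rank
      gone  - signatures seen only in the older release, by old rank
      moved - {signature: old_rank - new_rank} for signatures in both;
              a positive number means the crash climbed the list.
    """
    old_rank, new_rank = rank(old_counts), rank(new_counts)
    new = sorted(set(new_rank) - set(old_rank), key=new_rank.get)
    gone = sorted(set(old_rank) - set(new_rank), key=old_rank.get)
    moved = {sig: old_rank[sig] - new_rank[sig] for sig in new_rank if sig in old_rank}
    return new, gone, moved
```

A signature that drops out of the newer release is only a candidate for "fixed"; it may also have been renamed by a stack change or simply not have had enough users yet.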
http://people.mozilla.com/crash_stacks/
- beyond just a signature
- a single signature might represent several bugs
- many signatures might represent the same bug
- searching for common code patterns with different signatures
- searching the long tail
- Bug 480503: can't search for stack frames except for the top frame https://bugzilla.mozilla.org/show_bug.cgi?id=480503
http://people.mozilla.com/crash_stacks/stack-summary-4.0b8pre.txt
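Searching below the top frame amounts to building an inverted index from every frame to the signatures whose stacks contain it. The sketch below assumes the stacks have already been parsed into (signature, list of frames) pairs; that input shape and the frame names in the usage example are hypothetical, not the format of the stack-summary file.

```python
from collections import defaultdict


def build_frame_index(stacks):
    """Index every frame of every stack, not just the top one.

    stacks: iterable of (signature, frames) pairs, where frames is a list
    of frame names with the top frame first.
    Returns {frame: set of signatures whose stacks contain that frame}.
    """
    index = defaultdict(set)
    for signature, frames in stacks:
        for frame in frames:
            index[frame].add(signature)
    return index


def signatures_with_frame(index, frame):
    """Return the sorted signatures that have `frame` anywhere on the stack."""
    return sorted(index.get(frame, ()))


# Hypothetical stacks: two different signatures share a deeper frame.
stacks = [
    ("sig_one", ["frame_a", "frame_shared", "frame_main"]),
    ("sig_two", ["frame_b", "frame_shared", "frame_main"]),
    ("sig_three", ["frame_c", "frame_main"]),
]
index = build_frame_index(stacks)
print(signatures_with_frame(index, "frame_shared"))  # ['sig_one', 'sig_two']
```

Signatures that share a deeper frame are candidates for being the same underlying bug, which is the "many signatures, one bug" case noted above.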
Bclary's Automated Crash Hunter
What if we could test all the automated crash URLs Automatically?
We can and do: [Bug 532972] crashes found using sisyphus crash automation
https://bugzilla.mozilla.org/showdependencytree.cgi?id=532972&hide_resolved=0
- 137 bugs found
- 72 bugs resolved
- 65 bugs open