CrashKill/Analysis


A Time Before Automated Crash Reporting.

When Automated Crash Reporting Began

decade-of-fixing-crash-bugs.png

Figuring out the Who, What, When, Where, Why, and How of Crashes

Mostly Focused on The Top Crashes

But there is a long tail too!

Detective Work To Get to the Bottom of Crash Problems

  • getting to a reproducible environment/test case
    • identifying a common config
      • OS version, hardware & driver correlations
      • addon/plugin correlations
      • graphics cards and driver correlations
      • urls
        • protecting users' privacy
        • a concentration of urls helps to find crash problems fast, but it's also rare.
        • a wide range of urls points to general browsing and to other things that might be going on: garbage collection, software update, UI interaction...
        • absence of urls and connection to start time
        • domains and locales/regions of sites visited; connection to specific fonts, regional malware outbreaks
      • common user actions in comments
        • e-mailing some users?... (new automated system coming on-line)
      • Product Releases where the crash is seen.
        • calibrating volume on product releases
        • identifying which mozilla-central releases have the crash...
  • Connecting To Events Time & Regression Ranges
    • looking at the various forms of time data
  • And Finding Spikes and Volume Regressions
  • Connecting to External and Other Events
    • Plugin/Addon Releases
    • Web Site Changes
    • Firefox/Gecko Source Code Changes
      • connecting changes on the stack to recent check-ins
    • Build Environment Changes... new compilers, build configs.
  • With a reproducible test case...
    • reduced test case
    • crash recorded on VM...
  • developer investigations can happen in parallel
    • finding problems through source inspection
      • using regression ranges to examine change sets
    • Running minidumps in the debugger.
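
The correlation steps in the list above (OS version, addon/plugin, graphics driver) can be sketched roughly as follows. This is only an illustration, not the tooling actually used: it assumes each crash report is already available as a dict keyed by field name, and the function name `config_correlations` is made up for this example. It compares how often a configuration value shows up in the crashes for one signature against how often it shows up in all crashes.

```python
from collections import Counter


def config_correlations(reports, signature, field):
    """Rank values of `field` by how over-represented they are for `signature`.

    `reports` is an iterable of dicts (an assumed in-memory shape).
    Returns tuples of (value, share_in_signature, share_overall, ratio),
    sorted with the most over-represented value first.
    """
    in_signature = Counter()
    overall = Counter()
    for report in reports:
        value = report.get(field)
        if value is None:
            continue  # report does not carry this field
        overall[value] += 1
        if report.get("signature") == signature:
            in_signature[value] += 1

    signature_total = sum(in_signature.values())
    overall_total = sum(overall.values())
    rows = []
    for value, count in in_signature.items():
        signature_share = count / signature_total
        overall_share = overall[value] / overall_total
        rows.append((value, signature_share, overall_share,
                     signature_share / overall_share))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows
```

A ratio well above 1 only suggests a common config worth investigating; with small report counts it can easily be noise.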

What data do we have?

Individual Reports


 Stack and Links to Source Code
 Modules Loaded
 Mini-dumps
 and beyond

1 signature
2 url
3 uuid_url
4 client_crash_date
5 date_processed
6 last_crash
7 product
8 version
9 build
10 branch
11 os_name
12 os_version
13 cpu_info
14 address
15 bug_list
16 user_comments
17 uptime_seconds
18 email
19 adu_count
20 topmost_filenames
21 addons_checked
22 flash_version
23 hangid
24 reason
25 process_type
26 app_notes
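
As a rough sketch of how the numbered fields above might be used, the snippet below maps one delimited row onto those 26 names. The tab delimiter and the helper name `read_reports` are assumptions for illustration; check the actual file format before relying on it.

```python
import csv

# Field names in the order listed above.
FIELDS = [
    "signature", "url", "uuid_url", "client_crash_date", "date_processed",
    "last_crash", "product", "version", "build", "branch", "os_name",
    "os_version", "cpu_info", "address", "bug_list", "user_comments",
    "uptime_seconds", "email", "adu_count", "topmost_filenames",
    "addons_checked", "flash_version", "hangid", "reason", "process_type",
    "app_notes",
]


def read_reports(lines, delimiter="\t"):
    """Yield one dict per well-formed row; the delimiter is an assumption."""
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != len(FIELDS):
            continue  # skip rows that do not match the expected column count
        if row == FIELDS:
            continue  # skip a header row, if the file has one
        yield dict(zip(FIELDS, row))
```

Any iterable of text lines works as input, for example a file opened in text mode.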

and more:

Beyond http://crash-stats.mozilla.com/

http://people.mozilla.com/crash_analysis/20101213/

20101213-pub-crashdata.csv.gz summary of all the reports for a given day
20101213_Firefox_3.6.9-core-counts.txt  -- for finding threading issues
20101213_Firefox_3.6.9-interesting-addons-with-versions.txt  
20101213_Firefox_3.6.9-interesting-modules-with-versions.txt.gz

http://people.mozilla.com/~jst/new-crashes/Firefox/latest/

What's new? Finding crashes when they are introduced makes them easier and faster to fix.

http://people.mozilla.com/~chofmann/crash-stats/20101213/

top-4.0b8pre.html  ( just the Firefox crashes... )
compare-rank-40b8pre-40b7.txt  ( simple version of what's new? )
compare-rank-40b6-40b7-40b8pre.txt (what's fixed? )
mozilla-central-crash-trend.csv (which mozilla-central builds are most crashy?)
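
The idea behind the compare-rank files can be sketched as below. This is a guess at the general technique, not the script that produces those files: it assumes per-signature crash counts for two releases are already available as dicts, and both function names are hypothetical.

```python
def rank_signatures(counts):
    """Map each signature to its rank (1 = most crashes); ties break by name."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {signature: rank
            for rank, (signature, _) in enumerate(ordered, start=1)}


def compare_ranks(old_counts, new_counts):
    """List (signature, old_rank, new_rank) in order of rank in the new release.

    old_rank is None when the signature was not seen in the old release,
    which marks it as a candidate for "what's new?".
    """
    old_ranks = rank_signatures(old_counts)
    new_ranks = rank_signatures(new_counts)
    ordered = sorted(new_ranks.items(), key=lambda item: item[1])
    return [(signature, old_ranks.get(signature), new_rank)
            for signature, new_rank in ordered]


def dropped_signatures(old_counts, new_counts):
    """Signatures seen in the old release but not the new one ("what's fixed?")."""
    return sorted(set(old_counts) - set(new_counts))
```

A signature that disappears may be fixed, or it may simply have moved to a different signature, so the output is a starting point rather than a verdict.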

http://people.mozilla.com/crash_stacks/

beyond just a signature
a single signature might represent several bugs
many signatures might represent the same bug
searching for common code patterns across different signatures
searching the long tail
Bug 480503: can't search for stack frames except for the top frame
https://bugzilla.mozilla.org/show_bug.cgi?id=480503
http://people.mozilla.com/crash_stacks/stack-summary-4.0b8pre.txt
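
Searching below the top frame could look something like the sketch below. It assumes stacks have already been parsed into (signature, list of frame strings) pairs; the layout of the stack-summary file is not described here, so the parsing step is left out, and `signatures_containing_frame` is a hypothetical name.

```python
from collections import defaultdict


def signatures_containing_frame(stacks, needle, max_depth=10):
    """Count reports per signature whose first `max_depth` frames mention `needle`.

    `stacks` is an iterable of (signature, frames) pairs, where frames is a
    list of strings ordered from the top of the stack down (an assumed shape).
    """
    hits = defaultdict(int)
    for signature, frames in stacks:
        if any(needle in frame for frame in frames[:max_depth]):
            hits[signature] += 1
    return dict(hits)
```

Several signatures sharing the same deeper frame is one hint that many signatures might represent the same bug.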

Bclary's Automated Crash Hunter

What if we could test all the reported crash URLs automatically?

We can and do: Bug 532972, crashes found using Sisyphus crash automation

https://bugzilla.mozilla.org/showdependencytree.cgi?id=532972&hide_resolved=0

137 bugs found
72 bugs resolved
65 bugs open