CrashKill/2009-10-12

Misc

  • (Dolske) Got confused about exactly how throttling works, so looked into it...
    1. For release builds, we currently throttle at 15% on the server, and 10% on Windows clients. The client is unthrottled on OS X / Linux.
    2. For nightly and beta builds, the throttling is disabled on both the client and server.
    3. The client throttling actually just sets the default value of the "Submit Report" checkbox, for the very first time a user sees the crash reporter. The user can check (or uncheck) the checkbox, and we save that as the default for the next time they crash.
    4. Reports that were throttled by the server will be processed if explicitly requested by the client (e.g., by clicking an about:crashes link). Also, reports with comments are always processed. Both are probably a small number in practice.
    5. So, ignoring user changes, only 15% x 10% = 1.5% of Windows release-build crashes (and 15% on OS X / Linux) end up processed by Socorro. That implies the actual number of release-build crashes seen by users is ~67x the number reported by Socorro for Windows, and ~6.7x for OS X / Linux. But we don't know how many users change the checkbox, so these are rough estimates at best.
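The estimate in item 5 can be reproduced with a few lines of arithmetic. This is only a sketch: the function and constant names are made up for illustration, and it assumes the client checkbox default and the server throttle act as independent sampling rates, with no users changing the checkbox.

```python
def underreport_factor(server_rate, client_rate=1.0):
    """Estimate how many real crashes each processed report stands for.

    Assumes the two throttles are independent sampling steps, so the
    fraction of crashes that gets processed is their product.
    """
    return 1.0 / (server_rate * client_rate)


# Rates quoted in the notes above (release builds only).
SERVER_RATE = 0.15          # server-side throttle
WINDOWS_CLIENT_RATE = 0.10  # default checkbox rate on Windows clients

windows = underreport_factor(SERVER_RATE, WINDOWS_CLIENT_RATE)
mac_linux = underreport_factor(SERVER_RATE)  # client is unthrottled there

print(f"Windows: ~{windows:.1f}x")
print(f"OS X / Linux: ~{mac_linux:.1f}x")
```

Because users can flip the checkbox and commented or explicitly requested reports bypass the server throttle, the real multipliers are probably somewhat different from these values.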

Breakpad & Socorro

Bugs

  1. Tomcat
    1. Now know how to generate the URL lists; working on bug 519755, bug 519344, and bug 519752 with these lists and my automated VMs. Will update the bugs as soon as I have results.
    2. Some unreproducible bugs: bug 519729, bug 519344
  2. _PR_MD_SEND - bug 489533 - Jonas/jimm. Potentially bad LSP issues. jimm will be looking at it this week.
  3. nsWindow::GetParentWindow(int) - there is one crash on 3.6b1pre for this, bug id is bug 470487, should be bug 506108 (destroy widget vs. create widget).
  4. nsCycleCollectingAutoRefCnt::decr(nsISupports*) - dbaron bug 500879 - One of the thread-safety crashes; need to figure out contacts for each of those extensions and get them to change their ways. Today: There are four bugs and really 15-20 top crashes with this. There are two things we can do about this: 1) contact them; 2) make this crash a lot less often. We have a patch for that, which may fix 90-99% of the crashes. We are going to see how expensive the patch is; Talos will show us. Will know in the next few days. I got a little stuck on outreach. I talked to McAfee, and they will ship soon, if they haven't already. Need a BitDefender contact. There are four others: Relevant Knowledge (spyware), Move Media Player. bugs: 521745-8-52-53.
  5. nsGlobalWindow::cycleCollection::UnmarkPurple(nsISupports*) - dbaron bug 504392 - Same as above.
  6. nsEventListenerManager::Release() - jst bug 513334 - Same as above.
  7. UserCallWinProcCheckWow - bug 501429 - jst - No progress there. Supposedly caused by the Google Talk plugin; unable to reproduce.
  8. _PR_MD_SEND - bug 489533 - jimm. Status: Nothing new. Timeless seems to know what is going on. Spyware and anti-spyware both hook into a library, which causes network connectivity to not work properly. Trying to reproduce with F-Secure. I'm worried that it's going to come down to blocking spyware. We might consider adding a message on restart after a crash that says the crash was caused by a particular piece of malware/spyware. Jonas is still trying to reproduce this one. See also bug 467167.
  9. RtlpWaitForCriticalSection - JST - Flash - bug 511757 - Still investigating.
  10. RtlpWaitOnCriticalSection - JST - Not Flash, something else. bug 511759 - ADR toolbar. We need to reach out to them.
  11. @0x0 - bug 519616 - jrmuizel - Need to get the stack unwinder done first.
  12. nsStyleSet::FileRules(int (*)(nsIStyleRuleProcessor*, void*), RuleProcessorData*) - bug 492675 - Landed, needs 1.9.1 approval.
  13. _woutput_l - bug 511756 - dolske - Haven't had a lot of time to look at this in-depth. This seems to indicate a smiley malware. This one is correlated with an extension, will update the bug with the info.
  14. KiFastSystemCallRet bug 514589 - Jonas - Code is written, just needs to be staged.
  15. NPSWF32.dll@0x77bd0 - Farmtown flash - JST - Need to know when Adobe will ship fix.
  16. GraphWalker::DoWalk(nsDeque&) bug 500105 - 10/12: peterv + dbaron went over minidumps for this last week.
  17. nsWindow::GetParentWindow(int) - bug 470487 - jst
  18. NPFFAddOn.dll@0x11867 bug 519343 tomcat will get in contact with AV vendors
  19. RtlpCoalesceFreeBlocks bug 519340 - dolske will file a new bug. - This is our number one top crash right now. Worked with Lars to extract this from the database to get a handle on this problem: This looks like it's caused by an older version of AVG. The extension that they installed is just called 8.5 (I guess it never changes?), so we can't blocklist this version.
  20. nsBaseWidget::Destroy() bug 470487, bug 507928, bug 503196 - The first bug is the getParentWindow, which we talked about above. 503296 is fixed in 1.9.4. 507928 is the same as the jimm issue described above.
  21. GoogleDesktopNetwork3.dll@0x3dfb bug 519344 - Tomcat - Working this and the next one to find steps to reproduce.
  22. @radhslib.dll@0x3b6f bug 519348 - Tomcat - not reproducible, we will block this since the product is unsupported since 2006
  23. js_Interpret - bug 519363 - dmandelin, see also 517077, 514593, 519129 - I filled in a bunch of details in the bug. I've figured out a lot about what's causing the crashing, but it's still kinda mysterious, and no one seems to have any idea how that could happen. Now sifting through logged crash reports to get more precise answers on when it came in. Also, might do a patch to record what's happening in a 3.5 release. This was not in 3.5b4, but is in 3.5b99. Need two things: 1) Need URLs (jst will help here). 2) I suspect that I might need to create a patch that would help me catch this; there are 7 different cases where this problem could emerge.
  24. PL_DHashTableOperate - 516113, 503638, 303511 - Need to get this added to the filter list as this is rarely the source. - This is likely not a top crash but a lot of smaller crashes. - Ted needs to add skiplist items. Damon: Need to follow up with Ted here. Some of these are strongly correlated with extensions (per dbaron).
  25. Flash Player@0x92160 - bug 520058 Module data would be useful here (i.e., this is flash version X). - Josh Damon: follow-up.
  26. nsPresContext::Release() - Need to create another bug here, dbaron - Same as cycle collector bugs. bc: Flash Player@0x92160 showed up 08/01. bc: probably 10.0.32.18, but it could have just been a different address and a different version.
  27. arena_dalloc_small | arena_dalloc | free | XPT_DestroyArena - bug 519356 - Clint - He spidered 5k pages over the weekend in compat mode, no luck. Any suggestions here? dbaron: thinks these are startup crashes.
    1. Spidered 5000 pages over the weekend while running in compatibility mode. Unable to reproduce crash. :(
    2. 10/12: Can we run these in windows 7 in compat mode? Asked in bug.
  28. arena_chunk_init - bug 515211 - dmandelin - 10/12: Update in bug.
  29. wcslen bug 519355 and bug 519353 - dolske - 519353 appears to be caused by DivX. The next step here is to contact them. dolske will call. bug 508292 - This crashes with the same signature but a different stack; it's strongly correlated with Turkish sites. This could be malware triggered. Next step: Keep trying to repro.
  30. objc_msgSend | CanonIJPDE@0x1531e bug 519451 - Tomcat - This seems to be a printer driver crash. Need to track down this driver. 10/12: Seems to be fixed by a driver update, but we need to test to see if new cocoa native dialogs fixes this if at all possible.
  31. libobjc.A.dylib@0x15688 | IdleTimerVector] bug 519718 - Tomcat - Steven thinks this is a dupe, this could be the divx issue.
  32. nsHttpsHandler::GetProtocolFlags(unsigned int*) bug 519729 - Tomcat - Need to get the URLs -- Looks like a startup crash.
  33. DTToolbarFF.dll@0x4bc19 and related crashes on version 1.0.8.552 bug 512040 - Tomcat - trying to reproduce
  34. nsPluginHostImpl::TrySetUpPluginInstance(char const*, nsIURI*, nsIPluginInstanceOwner*) bug 519752 - Tomcat - url testing running
  35. nsGlobalChromeWindow::Release() bug 519755 - Tomcat - url testing running
  36. nsXULDocument::ResumeWalk() bug 519767 - Tomcat - will set up a vm to test
  37. memmove | nsTArray_base::ShiftData(unsigned int, unsigned int, unsigned int, unsigned int) bug 519771 - Tomcat - ShiftData needs to be added to the signature ignore list. Jonas will file a bug to do so.
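Several items above (PL_DHashTableOperate, ShiftData) ask for frames to be added to the signature ignore list. A rough sketch of why that helps is below. It is not Socorro's actual code: it assumes the list works as a prefix list, which the " | "-joined signatures in these notes suggest, and the function names, the list contents, and the caller frame nsFoo::Bar() are all hypothetical.

```python
def function_name(frame):
    """Drop the argument list so 'f(int)' matches the list entry 'f'."""
    return frame.split("(", 1)[0]


def make_signature(frames, ignore_list):
    """Join ignored leading frames with the first frame not on the list.

    Frames are ordered from the crashing frame outward. Generic helpers
    on the ignore list are kept as a prefix, and the signature continues
    until it reaches a frame that is more likely to identify the bug.
    """
    parts = []
    for frame in frames:
        parts.append(frame)
        if function_name(frame) not in ignore_list:
            break
    return " | ".join(parts)


# Hypothetical stack; nsFoo::Bar() stands in for the real caller.
stack = [
    "memmove",
    "nsTArray_base::ShiftData(unsigned int, unsigned int)",
    "nsFoo::Bar()",
]

# Illustrative list contents only, not the real configuration.
before = make_signature(stack, {"memmove"})
after = make_signature(stack, {"memmove", "nsTArray_base::ShiftData"})

print(before)
print(after)
```

With only memmove on the list, every crash that passes through ShiftData collapses into one signature. Adding ShiftData pushes the signature out to the caller, so unrelated bugs are split into separate buckets.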