JavaScript:ActionMonkey:Stage 0 Whiteboard

Stage 0: Replace SpiderMonkey's GC (jsgc) with Tamarin's GC (MMgc).

bug 387012 – ActionMonkey stage 0 tracking bug

The plan is to hollow out jsgc.c and replace it with a new implementation based on MMgc. The API in jsgc.h will be almost entirely preserved; but see "Features we will lose" below.

We'll use MMgc in non-incremental mode to start with. (It's easier. Avoid premature optimization.)

We'll build SpiderMonkey as C++, and some key types (JSObject, JSString) will become classes. This is unavoidable because MMgc's API for finalization is to subclass MMgc::GCFinalizedObject.
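As a rough illustration of why finalization pushes these types toward C++ classes, the sketch below uses a stand-in base class in place of MMgc::GCFinalizedObject. The names FinalizedBase, JSStringSketch, sweep, and runFinalizationDemo are hypothetical and are not MMgc or SpiderMonkey code. It only shows that a collector holding base-class pointers can run a type-specific finalizer through a virtual destructor.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for MMgc::GCFinalizedObject: the collector only
// knows this base type and finalizes through its virtual destructor.
class FinalizedBase {
public:
    virtual ~FinalizedBase() {}
};

// Hypothetical GC-managed type; its destructor plays the role of a finalizer.
class JSStringSketch : public FinalizedBase {
public:
    JSStringSketch(std::vector<std::string>& log, const std::string& name)
        : log_(log), name_(name) {}
    ~JSStringSketch() override { log_.push_back("finalized " + name_); }
private:
    std::vector<std::string>& log_;
    std::string name_;
};

// Toy "sweep": destroys every object through the base-class pointer.
inline void sweep(std::vector<FinalizedBase*>& heap) {
    for (FinalizedBase* thing : heap)
        delete thing;
    heap.clear();
}

// Allocates two objects, sweeps them, and returns the finalization log.
inline std::vector<std::string> runFinalizationDemo() {
    std::vector<std::string> log;
    std::vector<FinalizedBase*> heap;
    heap.push_back(new JSStringSketch(log, "a"));
    heap.push_back(new JSStringSketch(log, "b"));
    sweep(heap);
    return log;
}
```

A plain C struct has no destructor for the collector to call, which is the gap the class conversion is meant to close.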


Features we will lose

JS_THREADSAFE - We will lose the threading model where a single JSRuntime has multiple threads, each with its own JSContext. (Firefox compiles with JS_THREADSAFE but doesn't use this threading model.)

We'll also lose optimizations like JSGC's per-thread freelists, which avoid locking the GC on every allocation. Something equivalent will ultimately have to be added to Tamarin.

.plan

  • bug 387938 – Modify MMgc to support ActionMonkey stage 0. (done)
  • bug 387034 – Get the build system to build Tamarin and link JS against MMgc. (done, for Mac anyway)
  • Get at least sketchy answers to most of the questions below. (done)
  • bug 389420 – Harmonize SpiderMonkey and MMgc finalization (by adding virtual destructors to JS types). (done)
  • bug 389713 – Harmonize JSGC and MMgc marking (by adding support for dynamic CustomMark() dispatch in MMgc). (done)
  • Debug and grow wise.

General issues

Destructors don't take arguments; therefore in JSObject::~JSObject() and other finalizers, we don't have a pointer to a JSContext. At the moment I'm working around this using magic. Suggestions are welcome. -jto
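One possible shape for such a workaround, offered purely as a suggestion and not as a description of what the current patch does, is for the collector to publish the active context in a global for the duration of finalization. Everything in the sketch below (ContextSketch, gFinalizingContext, AutoFinalizingContext, ObjectSketch) is a hypothetical name, and it assumes a single-threaded runtime, in line with the loss of the JS_THREADSAFE threading model.

```cpp
#include <cassert>

// Hypothetical stand-in for JSContext.
struct ContextSketch {
    int finalizedCount = 0;
};

// Hypothetical global: non-null only while finalizers are running.
static ContextSketch* gFinalizingContext = nullptr;

// RAII guard: publishes the context on entry and restores the previous
// value on exit, so nested use stays balanced.
class AutoFinalizingContext {
public:
    explicit AutoFinalizingContext(ContextSketch* cx)
        : saved_(gFinalizingContext) { gFinalizingContext = cx; }
    ~AutoFinalizingContext() { gFinalizingContext = saved_; }
private:
    ContextSketch* saved_;
};

// Hypothetical GC thing whose destructor needs a context.
class ObjectSketch {
public:
    ~ObjectSketch() {
        // No context argument is available, so consult the published one.
        if (gFinalizingContext)
            gFinalizingContext->finalizedCount++;
    }
};

// Finalizes two objects under a published context; returns the count seen.
inline int finalizeTwoObjects() {
    ContextSketch cx;
    ObjectSketch* a = new ObjectSketch;
    ObjectSketch* b = new ObjectSketch;
    {
        AutoFinalizingContext guard(&cx);
        delete a;
        delete b;
    }
    return cx.finalizedCount;
}
```

The obvious cost is hidden global state, so this only makes sense if finalizers never run outside a collection that has a context to publish.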

We aren't using MMgc's conservative marking much. Instead we use the existing exact marking/tracing code, which can do things like traverse non-GC-managed memory.

XPConnect garbage-collects XPCWrappedNative objects and its own data structures. jorendorff doesn't know anything about this. (Brendan thinks this won't affect stage 0 work.)

Specific jsgc.h APIs

What follows is a dump of everything exposed through jsgc.h. Each item is rated * (one star, an easy exercise); ** (two stars, fun little puzzle); *** (three stars, hmmmm, that's interesting). The ratings are jorendorff's guesses.

  • GCX_OBJECT ... GCF_MUTABLE, js_GetGCThingFlags - **
    • JSGC sets aside space for 8 bits of flags per allocation. These are called GCThingFlags.
    • The low 4 bits of the GCThingFlags are a type tag. The constants GCX_OBJECT ... GCX_EXTERNAL_STRING are values for these 4 bits. Since almost all of the GC-allocated types are becoming vptr-bearing C++ classes anyway, the C++ vptr will suffice for this purpose. (In the long run we will likely store the type tag more efficiently, but that is not a stage 0 goal.)
    • GCF_MUTABLE lives in those low 4 bits. It will move into JSString (the only type that uses it).
    • GCF_LOCK is already gone (bug 389832).
    • GCF_MARK and GCF_FINAL are used only by the GC itself; MMgc has its own.
    • That leaves GCF_SYSTEM, which applies only to JSObjects. It can be moved into a JSObject field.
    • So long-term we want to get rid of all the GCThingFlags. This is a goal for Stage 1. In the meantime, we will store them within each allocation itself. The allocation types, like struct JSObject and struct JSString, will all derive from a common base class with a uint8 gcThingFlags field... except for jsdoubles, which will store the flags right after the 64-bit floating-point value (at offset 8). js_GetGCThingFlags() will temporarily contain some illicit knowledge of MMgc internals, in order to tell whether the given allocation is a jsdouble or not. But in the end it'll go away entirely.
    • (jsdouble is special because the layout is visible right through the jsgc.h API. The 64-bit floating point number has to be at the start of the allocation. This precludes wrapping jsdoubles in a vptr-bearing C++ class. Tried it. It crashed. -jto)
  • js_GetGCStringRuntime - *
    • GC::GetGC(const void *), then (a) decrement by some offset; or (b) just give the GC a pointer back to the runtime (that is, make a subclass of MMgc::GC that contains a pointer back to the JSRuntime, and use that).
      • There's only one JSRuntime in Firefox and other Gecko apps (used by XPConnect), so just make a singleton pointer associated with the GC instance. /be
  • GC_POKE - *
    • No-op for now. Its effect is currently read only by js_GC, so since that function is being gutted, you can gut this use of rt->gcPoke too. /be
    • The MMgc equivalent is DWB() and its ilk. In incremental mode, these are required. In non-incremental mode, they're only necessary if a finalizer might "resurrect" an object (that is, cause an unmarked object to become reachable). jsgc has never allowed this, so it's OK. -jto
  • js_ChangeExternalStringFinalizer - *
    • External strings can be implemented by a JSExternalString class with a destructor that consults the table of string finalizers.
    • Emulate this on top of MMgc using a virtual method on JSString, which inherits from GCFinalizedObject. /be
  • js_InitGC, js_FinishGC - *
    • Straightforward.
  • js_AddRoot, ... js_MapGCRoots - *
    • Rooting API. Instead of using MMgc::GCRoot objects for this, we'll keep the JSGC hash table of roots (JSRuntime::gcRootsHash). They'll be marked because the JSRuntime is a GCRoot. This allows us to keep features like named roots and js_MapGCRoots without modifying MMgc to support them.
  • JSPtrTable, js_RegisterCloseableIterator, JSGCCloseState, js_RegisterGenerator, js_RunCloseHooks - *
    • These will become no-ops. All this has to do with iterator and generator cleanup, but these hooks are going away. See bug 380469. (Related bug: bug 349272.)
    • No-op'ing should not cause leaks since (AFAIK) no chrome JS uses generators. /be
  • JSGCThing - **
    • This can probably go away. It's mentioned outside jsgc.c in two places: (a) in the context of weakRoots.newborn, which I assume we'll keep, since the newborn guarantee is JSAPI-visible; and (b) in the declaration of JSContext, where they won't be needed anymore.
    • Both JSGCThing and cx->weakRoots should be removed. The latter should not be necessary given conservative stack scanning. /be
  • GC_NBYTES_MAX, GC_NUM_FREELISTS, GC_FREELIST_NBYTES, GC_FREELIST_INDEX - *
    • These can just go away.
  • js_NewGCThing - *
    • The only thing to worry about is the flags.
  • js_LockGCThing, js_LockGCThingRT, js_UnlockGCThingRT - **
    • This is the pinning API. Can be reimplemented on top of MMgc rooting. /be
    • "make one big root for everything you can keep track of - one GCRoot for each runtime." Reuse the "constants/pinning" GCRoot for anything else that needs locking?
    • See also: JS_LockGCThing API doc.
  • js_IsAboutToBeFinalized - *
    • It's legal to call this only in a few specific contexts: JSObject finalizers and the JSGC_MARK_END callback. (This restriction is not documented in general; it is documented only for those contexts.) In those contexts, !MMgc::GC::GetMark(p) gives the right answer.
  • IS_GC_MARKING_TRACER - *
    • Unchanged. This is an undocumented feature of the trace API. js_GC will do an exact trace followed by a call to MMgc::GC::Collect(). During that trace, this macro returns true.
  • JSTRACE_FUNCTION ... JSTRACE_XML - *
    • Unaffected. The tracing API uses these.
  • JS_IS_VALID_TRACE_KIND - *
    • Unaffected.
  • js_CallValueTracerIfGCThing - *
    • Likely unaffected.
  • js_TraceStackFrame, js_TraceRuntime, js_TraceContext - *
    • Probably unaffected.
  • JSGCInvocationKind, GC_NORMAL, GC_LAST_CONTEXT, GC_LAST_DITCH, js_GC - **
    • The GC "invocation kinds" need to be maintained somehow. That will require some study.
    • If MMgc runs only a global mark and sweep in this stage 0 of ActionMonkey, then we can run out of memory (perhaps only after paging to death), and we do need to GC everything on last context destruction. So these should be kept as arguments to js_GC, and possibly even used in its new MMgc-based implementation. /be
    • Mode GC_LAST_DITCH is part of a mechanism to lock other threads out while collection is happening (possibly repeatedly); since we are losing the threading model, I don't think this does anything anymore.
    • The existing js_GC() can restart GC for any of three reasons:
      • (easy) In GC_LAST_CONTEXT mode, js_GC() simply collects repeatedly until no more garbage is collected. We will retain this behavior.
      • (subtler) js_GC takes the hint if a finalizer or GCCallback calls js_GC recursively. The recursive call just sets a flag, because GC is already underway; but after finalization, js_GC checks the flag and, if it's set, restarts GC almost from the beginning. We can probably easily retain this.
      • (subtler) js_GC also restarts in the same way if the gcPoke flag is set. (This flag indicates that a finalizer or callback released a root, unpinned an object, or hit any of several other gcPoke triggers.) This makes JSGC a little more aggressive all the time, especially by seeking out the extra memory released by finalizers. We will drop this feature for now. If/when we go to incremental MMgc, we can reimplement it, using DWB to give us information equivalent to gcPoke.
  • js_UpdateMallocCounter - *
    • Unaffected.
  • JS_GCMETER, JSGCStats, js_DumpGCStats - *
    • Gone.
  • JSGCArenaList - *
    • Gone.
  • JSWeakRoots, JS_CLEAR_WEAK_ROOTS - *
    • These should be removed. /be
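
The interim GCThingFlags arrangement described in the first item of this list can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: the type and function names are hypothetical, the tag and flag values are made up for the example (only the "low 4 bits are the type tag" rule comes from the text), and the isDoubleCell argument stands in for whatever js_GetGCThingFlags() would learn from MMgc internals.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const uint8_t kTypeMask   = 0x0F;  // low 4 bits hold the type tag
const uint8_t kTagObject  = 0;     // illustrative tag value
const uint8_t kTagDouble  = 2;     // illustrative tag value
const uint8_t kSystemFlag = 0x40;  // illustrative flag bit

// Hypothetical common base class: vptr-bearing, with the flags byte inside
// the allocation itself.
class GCThingSketch {
public:
    explicit GCThingSketch(uint8_t flags) : gcThingFlags(flags) {}
    virtual ~GCThingSketch() {}
    uint8_t gcThingFlags;
};

class JSObjectSketch : public GCThingSketch {
public:
    explicit JSObjectSketch(uint8_t flags) : GCThingSketch(flags) {}
};

// Doubles cannot carry a vptr: the 64-bit value must sit at offset 0,
// so the flags byte follows it at offset 8.
struct DoubleCellSketch {
    double  value;
    uint8_t gcThingFlags;
};

static_assert(offsetof(DoubleCellSketch, value) == 0, "value must come first");
static_assert(offsetof(DoubleCellSketch, gcThingFlags) == 8, "flags at offset 8");

// Reads the flags byte. For non-doubles the pointer must be the
// GCThingSketch base pointer.
inline uint8_t getFlagsSketch(const void* thing, bool isDoubleCell) {
    if (isDoubleCell)
        return static_cast<const DoubleCellSketch*>(thing)->gcThingFlags;
    return static_cast<const GCThingSketch*>(thing)->gcThingFlags;
}

// Extracts the type tag from a flags byte.
inline uint8_t typeTag(uint8_t flags) { return flags & kTypeMask; }
```

The two layouts disagree about where the flags byte lives, which is why the lookup has to know which kind of allocation it was handed.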

Other JSAPI support

These JSAPI functions involve GC features that aren't encapsulated behind jsgc.h.

  • JS_SetGCCallback(), JS_SetGCThingCallback() - * - MMgc support was added, bug 388011.
  • JS_SetExtraGCRoots(), JS_MarkGCThing() - ** - MMgc support was added, bug 388970. These APIs are no longer documented.
  • JS_CallTracer(), JS_TraceChildren, JS_TraceRuntime - * - These will continue to use the exact mark() methods that are built into every JSClass. However, when the tracer is the GC marking tracer (that is, when IS_GC_MARKING_TRACER is true), marking an object will now use MMgc::GC::SetMark() rather than the old GCF_MARK bit.
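
The division of labor described above (an exact trace that sets the collector's mark bits, followed by a collection) might look something like this toy model. GCSketch, NodeSketch, TracerSketch, and the function names are all hypothetical stand-ins; in particular setMark, getMark, and collect only imitate the role of MMgc::GC::SetMark(), GetMark(), and Collect() and do not reproduce their real signatures or behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical GC thing with outgoing edges.
struct NodeSketch {
    std::vector<NodeSketch*> children;
};

// Hypothetical stand-in for the collector: owns the heap and the mark set.
class GCSketch {
public:
    ~GCSketch() { for (NodeSketch* n : heap_) delete n; }
    NodeSketch* alloc() {
        NodeSketch* n = new NodeSketch;
        heap_.insert(n);
        return n;
    }
    void setMark(const NodeSketch* n) { marked_.insert(n); }
    bool getMark(const NodeSketch* n) const { return marked_.count(n) != 0; }
    // Frees every unmarked thing, clears all marks, returns the number freed.
    std::size_t collect() {
        std::size_t freed = 0;
        for (auto it = heap_.begin(); it != heap_.end();) {
            if (!getMark(*it)) { delete *it; it = heap_.erase(it); ++freed; }
            else ++it;
        }
        marked_.clear();
        return freed;
    }
    std::size_t liveCount() const { return heap_.size(); }
private:
    std::set<NodeSketch*> heap_;
    std::set<const NodeSketch*> marked_;
};

struct TracerSketch {
    GCSketch* gc;
    bool isMarkingTracer;  // plays the role of IS_GC_MARKING_TRACER
};

// Exact trace of one thing and everything reachable from it.
inline void traceThing(TracerSketch& trc, NodeSketch* n) {
    if (!trc.isMarkingTracer)
        return;  // other tracers would run their own callback; omitted here
    if (trc.gc->getMark(n))
        return;  // already marked, which also terminates cycles
    trc.gc->setMark(n);
    for (NodeSketch* child : n->children)
        traceThing(trc, child);
}

// Toy js_GC: exact trace from the roots, then let the collector sweep.
inline std::size_t runGCSketch(GCSketch& gc, const std::vector<NodeSketch*>& roots) {
    TracerSketch trc = { &gc, true };
    for (NodeSketch* root : roots)
        traceThing(trc, root);
    return gc.collect();
}
```

In this model, a js_IsAboutToBeFinalized-style query between the trace and the sweep would simply be !gc.getMark(p).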