Javascript:SpiderMonkey:PropertyElementStorage
Objective
Summary: Store and represent properties of objects differently, depending on whether the property is a string containing an unsigned 32-bit number (indexed: an "element") or is not such a string (non-indexed: a "property").
Currently we represent all properties of an object through a single mechanism. Every property is represented either through the shape_ field of the object, or through class/objectops hooks that are passed a jsid. There's no API separation between properties that are indexed by a uint32_t ("0", "17", "4294967295") and properties that are named ("baz", "-1", "-0", "4294967296"). And for objects which represent properties using shape_, all properties of any kind (excepting low-valued indexes, in certain circumstances -- but not uniformly) are intermingled.
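The examples above can be turned into a small classifier. This is a minimal sketch, not SpiderMonkey code: parseIndex is a hypothetical helper, and it assumes that only the canonical decimal spelling of a uint32_t counts as an index (so "01" is treated as a name, an assumption the text above does not spell out).

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical helper (not a SpiderMonkey API): returns the index if `s` is
// the canonical decimal spelling of a uint32_t, and std::nullopt otherwise.
std::optional<uint32_t> parseIndex(std::string_view s) {
    if (s.empty() || s.size() > 10)      // UINT32_MAX has 10 digits
        return std::nullopt;
    if (s.size() > 1 && s[0] == '0')     // assumption: "01" is a name
        return std::nullopt;
    uint64_t value = 0;
    for (char c : s) {
        if (c < '0' || c > '9')          // rejects "-1", "-0", "baz"
            return std::nullopt;
        value = value * 10 + static_cast<uint64_t>(c - '0');
    }
    if (value > UINT32_MAX)              // rejects "4294967296"
        return std::nullopt;
    return static_cast<uint32_t>(value);
}
```

With this helper, "0", "17" and "4294967295" classify as elements, while "baz", "-1", "-0" and "4294967296" stay named properties.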
But the distinction between indexed and non-indexed properties exists in various APIs implemented in JavaScript. It's present in WebIDL, implicitly in Array itself (if one excludes UINT32_MAX), implicitly in typed arrays, and elsewhere. (Not to mention that most people write code that respects this distinction -- object literals use non-indexed properties, dotted property access is inherently non-indexed, most objects get used non-indexed-ly unless they're array-likes [in which case accesses are pretty much always obviously indexed or obviously non-indexed], and so on.)
One result of this lack of separation is increased overhead for accessing all properties. Any object whose properties split this way, that uses class/objectops hooks, incurs extra overhead to differentiate the two cases. A jsid might be either an index or not, and that distinction must be checked before indexed or non-indexed behavior can occur. (Unless certain hacks are done, but those are pretty tricky to do, and they're one-off each time.)
Another result is that for objects where we've manually worked around this extra overhead -- for example, typed arrays -- we usually have to give up on representing all properties. (This is currently the case for typed arrays, which cannot have extra properties added to them. It was historically the case for dense arrays, too, before recent changes to make an initial sequence of indexed properties be stored more compactly, in some cases.)
A last nicety is that this all makes implementing a non-writable length property on arrays much easier.
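One way to see why the non-writable length case gets easier: once element definitions go through their own entry point, the length check sits in exactly one place. The following is a rough sketch under that assumption. ArraySketch and defineElement are hypothetical names, values are plain doubles, and holes are filled with 0.0 rather than modeled properly.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical array model: elements live in their own storage, apart from
// named properties such as "length".
struct ArraySketch {
    std::vector<double> elements;   // dense element storage
    bool lengthWritable = true;     // writability of the "length" property

    uint32_t length() const { return static_cast<uint32_t>(elements.size()); }

    // Element-only entry point: the key is already known to be an index, so
    // the non-writable-length check is a single comparison.
    bool defineElement(uint32_t index, double value) {
        if (index >= length()) {
            if (!lengthWritable)
                return false;       // growing would have to change length
            // Simplification: holes are filled with 0.0.
            elements.resize(static_cast<std::size_t>(index) + 1, 0.0);
        }
        elements[index] = value;
        return true;
    }
};
```

Writes to existing indexes still succeed after length becomes non-writable; only definitions that would grow the array are rejected.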
People
Accountable: Naveed
Responsible: Waldo
Consulted:
Informed: Product Marketing
Steps
Remove resolve flags
Resolve flags stand in the way of making our property-access APIs close enough to the ECMAScript internal operations to cleanly implement the split. The only remaining flag -- many have been removed since mid-2012 -- is JSRESOLVE_ASSIGNING.
Time: M weeks
- Remove all tests of flags & JSRESOLVE_ASSIGNING in all files - ???
  - dom/base/nsDOMClassInfo.cpp
    - Fix the test in resolving readonly-replaceable properties - Waldo, 1 day?
      - A moderate hackaround might be easy. Not sure.
    - Fix the test to optimize fast expandos on window - bug 823227, bz, 1 day
    - Fix the test in nsNamedArraySH::NewResolve - a day?
      - Suss out exactly how this code gets called, write a test for new behavior if there's a change, done. Likely simple.
    - Fix the test in nsHTMLDocumentSH::NewResolve - Waldo, 1 day?
      - This might just be totally removable -- unclear.
    - Fix the test in nsHTMLFormElementSH::NewResolve - Waldo, 1 day?
      - This might just be totally removable. Unclear.
  - dom/bindings/Codegen.py
    - This probably requires work from DOM bindings people, to implement set hooks rather than relying on getPropertyDescriptor to implement assignish behavior. Not sure how much time it'll take.
  - js/src/shell/js.cpp - Waldo, ~no time
  - js/src/jswrapper.cpp - bug 836301, bholley - DONE
  - js/xpconnect/wrappers/XrayWrapper.cpp - bug 836301, bholley - DONE
- Remove JSRESOLVE_ASSIGNING completely - Waldo, ~no time
When all flags have been removed, we can either pass 0 everywhere or remove all flags arguments. We should certainly do the latter at some point (and remove all the code that existed only so that flags could be used), but it's irrelevant to forward progress on property/element work. Probably a good mind-is-mush-need-a-break task.
Implement the property key stuff in ES6
ES6 property names aren't strings; they're property keys made by the ToPropertyKey spec operation -- either strings or ES6 symbols. It'll be a very clean split to implement property keys but have indexes as a third kind of property key. This provides nice typing at the underlying API boundary level, and it enables a high-level type for undistinguished property accesses that straightforwardly decomposes into the more specific types.
ES6 symbols aren't specified yet. We can carve out the API space for them underneath property keys, as currently-dead code -- maybe add a super-stupid symbol-like creator function to the shell if we want to exercise them.
Fully implementing property keys requires removing E4X SpecialIds. Thus landing of some parts of this work depends on E4X removal.
Time: M weeks
- Implement PropertyKey as a class containing a Value, with is* and as*, with index/name/symbol subclasses and accessors for PropertyName*/uint32_t/symbol - bug 837773, 2 days DONE pretty much
  - This should be pretty simple to do in terms of Value's existing interface.
- Implement JS::ToPropertyKey and JSAPI entry points that take PropertyKey - bug 837773 and ???, 2 days
  - This provides a clean, long-lived way (modulo ES6 changes, but we'll roll with them) for embedders to access spec functionality.
- Switch the ObjectOps method signatures to take handles to the relevant PropertyKey subclasses, rather than what they take now - 2 days
  - This is straightforward enough (but depends on E4X removal), but there are a lot of implementations of these methods spread across many files (not easily searched for).
- Make shapes use PropertyKey instead of jsid - 1 week?
  - This involves changing the underlying field types, the methods used to expose the id, and so on.
  - Fallout from this in other code may sweep fairly wide.
- Make the baseops::* methods take Property handles - 1 week?
  - Perhaps the ugliest part of all this, because of the significant complexity amongst all these methods in their current forms.
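To make the three-way split concrete, here is a rough sketch of a key type with is* and as* accessors. It is not the planned implementation: the plan above wraps a Value and uses subclasses, whereas this sketch substitutes std::variant to stay self-contained. PropertyKeySketch and SymbolSketch are hypothetical names, and the symbol type is a placeholder since ES6 symbols aren't specified yet.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <variant>

// Placeholder for an ES6 symbol (hypothetical; symbols are not specified yet).
struct SymbolSketch { uint32_t id; };

// Hypothetical key type: exactly one of index, name, or symbol.
class PropertyKeySketch {
    std::variant<uint32_t, std::string, SymbolSketch> v_;

  public:
    explicit PropertyKeySketch(uint32_t index) : v_(index) {}
    explicit PropertyKeySketch(std::string name) : v_(std::move(name)) {}
    explicit PropertyKeySketch(SymbolSketch sym) : v_(sym) {}

    // is*: query the kind of key without extracting it.
    bool isIndex() const { return std::holds_alternative<uint32_t>(v_); }
    bool isName() const { return std::holds_alternative<std::string>(v_); }
    bool isSymbol() const { return std::holds_alternative<SymbolSketch>(v_); }

    // as*: extract the payload; the caller must have checked the kind first.
    uint32_t asIndex() const { assert(isIndex()); return std::get<uint32_t>(v_); }
    const std::string& asName() const { assert(isName()); return std::get<std::string>(v_); }
    SymbolSketch asSymbol() const { assert(isSymbol()); return std::get<SymbolSketch>(v_); }
};
```

Code that doesn't care about the kind can pass the key around as a whole, and code that does care decomposes it once at the boundary.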
Meta-object protocol changes
Our internal meta-object protocol, as represented in ObjectOps, is quite dissimilar to the ECMAScript one. That one is formulated in terms of own properties throughout, and in terms of property descriptor objects. Ours is formulated in terms of property lookups, property values, and attributes accessed through attribute-accessing methods, and it lacks descriptors entirely. Our MOP also requires reimplementation of the property lookup process (in the start object, along the prototype chain, etc.) in several places.
Changing underlying structure, and doing it in an obviously correct way, requires converting our MOP to one more like ES6. Almost certainly a superset of it in specific areas -- property descriptors must be able to represent PropertyOp and StrictPropertyOp, for now -- but the idioms should be obviously parallel.
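As an illustration of the own-property, descriptor-based style described above, the sketch below has each object expose only an own-property lookup, with the prototype walk written once in a generic function. Everything here is hypothetical (DescriptorSketch, ObjectSketch, getSketch). Accessor properties and the PropertyOp/StrictPropertyOp cases are left out, and values are plain doubles.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical, simplified data descriptor (accessors omitted for brevity).
struct DescriptorSketch {
    double value;
    bool writable;
    bool enumerable;
    bool configurable;
};

struct ObjectSketch {
    std::unordered_map<std::string, DescriptorSketch> own;
    const ObjectSketch* proto = nullptr;

    // Own-property meta-op: the only lookup an object kind has to implement.
    std::optional<DescriptorSketch> getOwnProperty(const std::string& key) const {
        auto it = own.find(key);
        if (it == own.end())
            return std::nullopt;
        return it->second;
    }
};

// Generic get: the prototype-chain walk is written once, in terms of
// getOwnProperty, instead of being reimplemented by each object kind.
std::optional<double> getSketch(const ObjectSketch& start, const std::string& key) {
    for (const ObjectSketch* obj = &start; obj; obj = obj->proto) {
        if (auto desc = obj->getOwnProperty(key))
            return desc->value;
    }
    return std::nullopt;   // stands in for `undefined`
}
```

An object with unusual storage would only override the own-property operation; the chain walk stays shared.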
This also has benefits for the DOM bindings people, who have implemented WebIDL bindings using our current setup and have ignored the issues our current MOP doesn't let them address.
Time: ???
- Remove lookup*
- define* meta-op
- get* meta-op
- Remove getElementIfPresent
- set* meta-op
- Remove get*Attributes and set*Attributes
- delete* meta-op
- Adding more ops if necessary
Sparse elements
Properties already have a storage representation. Elements, once split out, will have one for when they're dense, but they also need one for when they're sparse.
v8 uses the exact same representation for sparse elements as for properties -- just a difference in a template parameter. Possibly we could also do this. Unfortunately our shape representation is quite complex, and its internals are intricately tied to the rest of the object representation, to type inference, and elsewhere. Possibly this could be disentangled. I'm not sure how long it would take. If we didn't disentangle, and perhaps instead just used (say) a HashMap, it would still take a bit of time. It might take less time that way.
I don't have any good answers here, nor do I have much idea how long this should take, either way.
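For a feel of the hash-map option, here is a minimal sketch of an element store with a dense prefix and a map for everything else. It is only one possible shape, not a proposal. ElementStoreSketch is a hypothetical name, std::unordered_map stands in for whatever HashMap would actually be used, and values are plain doubles with no attributes.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <vector>

// Hypothetical element store: a dense prefix plus a hash map for the rest.
class ElementStoreSketch {
    std::vector<double> dense_;                     // indexes 0 .. dense_.size()-1
    std::unordered_map<uint32_t, double> sparse_;   // indexes above the prefix

  public:
    void set(uint32_t index, double value) {
        if (index < dense_.size()) {
            dense_[index] = value;
        } else if (index == dense_.size()) {
            dense_.push_back(value);                // extend the dense prefix
            // Migrate sparse entries that have become contiguous.
            auto it = sparse_.find(static_cast<uint32_t>(dense_.size()));
            while (it != sparse_.end()) {
                dense_.push_back(it->second);
                sparse_.erase(it);
                it = sparse_.find(static_cast<uint32_t>(dense_.size()));
            }
        } else {
            sparse_[index] = value;                 // far from the prefix
        }
    }

    std::optional<double> get(uint32_t index) const {
        if (index < dense_.size())
            return dense_[index];
        auto it = sparse_.find(index);
        if (it == sparse_.end())
            return std::nullopt;
        return it->second;
    }

    std::size_t denseCount() const { return dense_.size(); }
    std::size_t sparseCount() const { return sparse_.size(); }
};
```

The sketch ignores deletion, property attributes, and any interaction with shapes or type inference, which is where the real cost discussed above would lie.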
Time: unknown
- step 1 - time
- ...
- step 2 - time
- ...
Split all baseops into property/symbol and element variants
Basically this is propagating the property/symbol and element distinction further downward, so that the element methods are clearly distinguished and ready to be rewritten. This has been somewhat ongoing for a while, but the lack of PropertyKey and the mismatch of jsid have somewhat hindered this. So this depends on the PropertyKey work being complete.
There is some overlap in code touched between this and the meta-object protocol changes, but the two are separate enough to proceed somewhat in parallel, with some merging/rebasing pain for the parties involved.
Time: unknown
- Split the baseops methods (which, given the PropertyKey work, now take a key) to take either property/symbol or element - ???
  - define meta-ops
  - get meta-ops
  - set meta-ops
  - delete meta-ops
  - other meta-ops
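The shape of such a split, for a single define-style op, might look like the sketch below: one generic entry point inspects the key kind once and hands off to an element variant or a property variant. All names here are hypothetical (KeySketch, SplitObjectSketch, defineGeneric and friends). Symbols are omitted, and std::map stands in for the real storage.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <variant>

// Hypothetical key: an index or a name (symbols omitted to keep this short).
using KeySketch = std::variant<uint32_t, std::string>;

struct SplitObjectSketch {
    std::map<uint32_t, double> elements;       // indexed storage
    std::map<std::string, double> properties;  // named storage
};

// Element variant: only ever sees indexes.
void defineElement(SplitObjectSketch& obj, uint32_t index, double value) {
    obj.elements[index] = value;
}

// Property variant: only ever sees names.
void defineProperty(SplitObjectSketch& obj, const std::string& name, double value) {
    obj.properties[name] = value;
}

// Generic entry point: check the key kind once, then hand off.
void defineGeneric(SplitObjectSketch& obj, const KeySketch& key, double value) {
    if (const uint32_t* index = std::get_if<uint32_t>(&key))
        defineElement(obj, *index, value);
    else
        defineProperty(obj, std::get<std::string>(key), value);
}
```

Callers that already know which kind of key they hold can call the specific variant directly and skip the check.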
Type inference changes
Type inference currently associates type information with things through jsid, and it does so for (almost) all properties. It attempts to perform its own property/element splitting already: non-negative number properties (this is not the same thing as an index as referred to in this document!) are grouped together under JSID_VOID. This distinction admits more than just unsigned 32-bit integers.
The existing type inference algorithm must be changed so that it doesn't track information for elements. Tracking for elements needs to be moved into a separate location, consulted only for element access. It's also possible it'll need to be updated for whatever structure is used to represent sparse elements. There may be some applicability of the current code to sparse elements, if the property tree stuff is used to represent sparse elements, but it seems likely to be an awkward fit. Whatever happens here will much depend on sparse elements' representation.
Time: unknown
- ...
- ...
Issues
- thing to consider
- other thing to consider
Risks
- risk 1
- mitigating idea 1
- mitigating idea 2
- risk 2
- mitigating idea 1