This guide is intended for SpiderMonkey hackers! Users of the SpiderMonkey API will want to read the GC Rooting Guide instead.
Introduction
This guide explains the basics of interacting with the GC from SpiderMonkey. Since the GC may move GCThings, it very important that SpiderMonkey know about each and every GCPointer in the system. The method you should use to keep the GC up-to-date with this information will vary depending on where the GCPointer is being stored. GCPointers can be stored as a property of a GCThing, on the CHeap, or in the CStack. These three storage regions have different lifetime and overhead characteristics and thus require different management strategies for efficient space and CPU utilization. Since C++ does not reliably distinguish between storage classes, we have to put this burden on the user.
Storing a GCPointer in a GCThing
Storing your GCPointer inside of a GCThing that is already in the LiveSet is the easiest way to keep a GCThing to the LiveSet. GCPointers that reside in GCThings fall into one of these cases: storage in a normal property, storage in a reserved slot, or storage with JS_{Get|Set}Private. Do not use JS_SetPrivate. If you see a GCPointer stored as a private, please file a bug. The private field is frequently used to store things that are not GCPointers, so the GC cannot automatically handle this slot. This means it must be manually traced by the object's owner: this is both fragile and more expensive than using an extra reserved slot, or even just putting a new property on the object.
Class FooClass = { "FooPrototype", /* name */JSCLASS_HAS_PRIVATE |JSCLASS_HAS_RESERVED_SLOTS(1), /* flags */ JS_PropertyStub, /* addProperty */ JS_PropertyStub, /* delProperty */ ... etc ... }; RootedObject obj(cx, NewObject(cx, &FooClass, NullPtr(), NullPtr(), &obj)); RootedValue v(cx, OBJECT_TO_VALUE(otherGCThing)); JSObject::SetReservedSlot(obj, 0, v);
Storing a GCPointer on the CHeap
TODO: Tracing and HeapPtr
Storing a GCPointer on the CStack
GCPointers stored on the CStack are special. Unlike other GCPointers, they are not traced as part of the GC. Instead, every pointer on the CStack should be considered part of the RootSet. To that end, we have a special "Rooting" mechanism for stack pointers that is very efficiently able to add and remove GCPointers to the RootedSet. The downside of this efficiency is that GCPointers must be added to and removed from this RootedSet tracking structure in LIFO order. For this reason it is highly recommended that this rooting mechanism only be used on the CStack.
js::Rooted<T>
typedef js::Rooted<T> RootedT; -> RootedObject, RootedString, RootedScript, etc.
Because of the LIFO restriction, using this interface manually is extremely cumbersome. Instead, SpiderMonkey has a convenient suite of C++ RAII classes to do this for you, called js::RootedT
:
RootedString str1(cx, JS_ValueToString(cx, val));
RootedString str2(rt, JS_ValueToString(cx, val));
Note 1: C++ insists that an initializing assignment (e.g., the default constructor followed by operator=) must have a copy constructor available, even if it is not used. Since it is never correct to implicitly copy a js::RootedT
(more on this in a second) we have deleted the copy constructor from these classes. Therefore, we are forced to force you to initialize with constructor syntax. Sorry.
Note 2: In C++, destructors are required to run in declaration order, nicely maintaining our LIFO property. However, this does not hold true for temporaries: they are undeclared, after all. Some compilers maintain LIFO order, some do not. Incorrect use of js::RootedT
as a temporary will crash at run time.
There is also a static constructor method fromMarkedLocation()
that creates a Handle
from an arbitrary location. This is used to make Handles for things like scope chain objects, which aren't explicitly rooted themselves, but are always reachable from the implicit stack roots. Every use of these should be commented to explain why they are guaranteed to be rooted.
js::Handle<T>
typedef JS::Handle<T> js::HandleT; -> HandleObject, HandleString, HandleScript, etc.
Any single GCThing is likely to get passed through a fairly deep call-stack before getting used. Even though creating a new js::RootedT
is cheap, it is silly to repeat this work for every new call frame. To solve this, we use the standard "handle" approach. A js::HandleT
is a pointer to a js::RootedT
. A js::RootedT
will automatically convert to a js::HandleT
. By using js::HandleT
in the interface instead of direct GCPointers, we ensure that the GCThing is already rooted on some previous stack frame, freeing us from having to worry about rooting the GCThing for the duration of the call.
bool Hello(JSContext *cx, HandleObject foo) { return Hello(cx, foo); } ..... void OtherFunction(JSContext *cx) { ... RootedObject obj(cx, ...); Hello(cx, obj); ... }
A secondary benefit: since all JS functions take js::HandleT
s, code that does not root CStack GCPointers correctly will be much less likely to compile.
Warning: js::HandleT
does not root objects itself! That is the job of js::RootedT
.
Warning: Never give a function a js::HandleT
return type: this can easily lead to a js::HandleT
outliving its js::RootedT
's lifetime.
js::MutableHandle<T>
typedef JS::MutableHandle<T> js::MutableHandleT; -> MutableHandleObject, MutableHandleString, etc.
The purpose of js::MutableHandleT
is to allow out-parameters for rooted GCThings. This class solves two problems:
- C++ only allows one level of user-defined implicit type conversion. When passed to an out-param,
&handle
would convert correctly tojs::HandleT
*
, but&local
is a compile error. - Even though
js::HandleT
is indirect, it implicitly behaves like the thing it is pointing to. This makesjs::HandleT
andjs::RootedT
behave the same, allowing us to seamlessly weavejs::HandleT
s into the JS API. The problem is that this requires us to overloadoperator->
. If we have a method takingHandle<Value>*
, thenout->setObject(foo)
will not work as expected.
To solve these problems we have the js::MutableHandleT
class. js::MutableHandleT
is a js::HandleT
with a set
method.
bool ReturnFoo(JSContext *cx, MutableHandleString out) { out.set(JS_NewStringCopyZ(cx, "Foo")); return bool(out); } size_t GetLengthFoo(JSContext *cx) { RootedString s(cx); if (ReturnFoo(cx, &s)) return JS_GetStringLength(s); return size_t(-1); }
All methods in the JS-API that return GCPointers have been changed to this out-param style. This prevents an entire class of bugs where return values are not rooted properly before the next JS-API call.
JS::NullPtr
NULL (or nullptr) does not implicitly convert to a js::HandleT
. For places where you want to pass a NULL
value into a function taking a js::HandleT
, you can construct and pass a new NullPtr
instance. You are free to use these as temporaries.
RootedObject obj(cx, JS_NewObject(cx, clasp, NullPtr(), NullPtr());
Rooting verification mode
If SpiderMonkey is built with --enable-root-analysis, then the engine will be able to run in a special mode that verifies that no stack roots are missing. The mode can be enabled by running with GC zeal mode "6." The root analysis mode requires a special build because it incurs both exact and conservative stack rooting overhead and is thus somewhat slower than a normal build, even when the rooting analysis is not being actively run.
Example:
# JS_GC_ZEAL=6 ./tests/jstests.py ./build/js
When a test fails in rooting verification, it will crash at JS_ASSERT(IsPoisonedPtr(p)) or because of a segfault at an address that looks like 0xdaxxxxxx or 0x7fffdaxxxxxx. 0xDA is the poisoned byte. These failures are easy to debug by walking up the stack and finding where the poisoned pointer was stored on the stack without proper rooting.
Common Pitfalls
The C++ type system allows us to eliminate the possibility of most common errors; however, there are still a few things that you can get wrong that the compiler cannot help you with. There is basically never a good reason to do any of these. If you think you do need to do one of these, ask on one of SpiderMonkey's support forums: maybe we've already solved your problem using a different mechanism.
- Storing a
js::RootedT
on the heap. It would be very easy to violate the LIFO constraint if you did this. Use a different rooting mechanism if you store a GCPointer on the heap. There are MOZ_STACK_CLASS annotations that will cause a separate static analysis to fail if these are detected. - Storing a
js::HandleT
on the heap. It is very easy for the handle to outlive its root if you do this. Again, there are annotations with MOZ_STACK_CLASS. - Returning a
js::HandleT
from a function. If you do this, a handle may outlive its root.
Definitions
- GC - Acronym for Garbage Collection: specifically SpiderMonkey's method of automatically managing JavaScript program memory.
- GCThing - Storage allocated by SpiderMonkey's allocator.
- GCPointer - A raw pointer type or tagged pointer type that may refer to a GCThing (JSObject *, JSString *, Value, jsid, etc.)
- GCHeap - The graph of GCThings allocated by SpiderMonkey's allocator. The goal of SpiderMonkey's GC is to partition this heap into regions reachable from the RootSet and not.
- Root - A GCPointer held live because it is a member of the RootSet.
- RootSet - This is a set of GCPointers that are de-facto alive, regardless of the GCThing's reachability. GCPointers must be explicitly added to and removed from the RootSet.
- LiveSet - The set of all GCThings that are reachable from Roots.
- DeadSet - The set of all GCThings that are not reachable from Roots.
- CHeap - The C/C++ program heap: e.g., memory allocated by malloc/calloc/realloc.
- CStack - The C/C++ program stack.
- rooted - A GCThing in the LiveSet is said to be "Rooted" even if it is not a Root as such, but is only reachable from Roots. This traditional usage is ambiguous and this guide will use live or alive instead of rooted to avoid confusion.
Caveats
Exact Rooting Transition Period
Exact stack rooting is not currently enabled by default: we are still using conservative scanning. Until the system described above has 100% browser coverage, it is unsafe to enable it. In the meantime, the js::RootedT
class holds an object live by storing its pointer on the stack so that the conservative scanner will find it. This distinction should not be noticable to the end user.
Performance Tweaking
In aggregate, exact and conservate stack rooting should have similar performance. The performance tradeoffs that these techniques make, however, are quite different. If the extra overhead of exact rooting does end up adding an unacceptable cost to a specific code path that is not mitigated by a faster GC, there are some tricks you can use to get better performance at the cost of more complex code.
- Move
js::RootedT
declarations above loops. Modern C++ compilers are not smart enough to do LICM onjs::RootedT
, so forward declaring a singlejs::RootedT
above the loop and re-using it on every iteration can save some cycles. - Raw pointers. If you are 100% sure that there is no way for SpiderMonkey to GC while the pointer is on the stack, this is an option. Note: SpiderMonkey can GC because of any error, GC because of timers, GC because we are low on memory, GC because of environment variables, GC because of cosmic rays, etc. This is not a terribly safe option for embedder code, so only consider this as a very last resort.