Igalia has been working in collaboration with Salesforce on advancing the ShadowRealm proposal through the TC39 process, and part of that work is actually getting the feature implemented in the JavaScript engines and their embedders (browsers, Node.js, Deno, etc.).
Since joining the compilers group at Igalia, I’ve been working (with some wonderful peers) to advance the implementation of ShadowRealms in JavaScriptCore (the JavaScript engine used by WebKit, hereafter ‘JSC’) and to integrate this functionality with WebKit proper.
You can read about some of the work done so far in the blog post Hanging in the Shadow Realm with JavaScriptCore by Phillip Mates, who implemented ShadowRealm support in JSC.
what is a ShadowRealm, anyways?
To explain what a ShadowRealm is, let’s start by explaining what a realm is more broadly:
“Realm” is a term from the JavaScript spec that describes part of the environment a script executes in. For instance, different windows, frames, iframes, workers, and more all get their own realm to run code in.
Each realm also comes with an associated “global object”, which is where top-level identifiers are stored as properties. For example, on a typical webpage, your JavaScript runs in a realm whose global object exposes window, Promise, Event, and more. (The global object is always accessible as the name globalThis.)
The usual isolation between these is informed by the mother of all browser security design principles, the same-origin policy: briefly, resources loaded from one domain (“origin”) shouldn’t normally be able to access resources from another; in the context of realms, this usually means that code running in one realm shouldn’t be able to directly access the objects associated with code running in another.
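One way to see that each realm really does carry its own global object and built-ins is to create a second realm by hand. The sketch below is not how a browser does it: it assumes Node.js and uses the node:vm module as a stand-in for a second window or frame.

```javascript
// Assumption: Node.js, where node:vm contexts stand in for a separate browser realm.
const vm = require("node:vm");

// Each new context is a fresh realm with its own global object and built-ins.
const OtherArray = vm.runInNewContext("Array");
const otherList = vm.runInNewContext("[1, 2, 3]");

// Same name, different object: the other realm has its own Array constructor.
const sameConstructor = OtherArray === Array;

// An array made in the other realm is not an instance of *our* Array ...
const isLocalInstance = otherList instanceof Array;

// ... even though it is still a genuine array.
const isRealArray = Array.isArray(otherList);

console.log({ sameConstructor, isLocalInstance, isRealArray });
```

Note that node:vm hands objects across the boundary freely, which is exactly the kind of direct access that browsers normally deny between origins.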
ShadowRealms are a new sandboxing primitive being added to the JavaScript language, which allows JavaScript code to create new realms that have similar isolation properties; these script-created realms are unique and disconnected from other realms the browser (or another host, like node or deno) creates.
Any realm may create a new ShadowRealm:
const r = new ShadowRealm();
It’s useful to have a name for a realm that does this; we’ll steal from the proposal and call such a realm the “incubating realm” of its respective shadow realm.
cross-realm boundary enforcement
Part of the design of ShadowRealms is to provide a level of isolation around a ShadowRealm similar to what is provided today between other realms in the browser, as required by the same-origin policy. That is, code in the incubating realm (and, by extension, all other realms) should not be able to affect the content of the global object of a ShadowRealm, and vice versa, except by using the ShadowRealm object directly (by calling myShadowRealm.evaluate or myShadowRealm.importValue).
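To make the boundary concrete, here is a small sketch of evaluate as the proposal describes it. It assumes a host that already ships ShadowRealm and simply returns null elsewhere; importValue is left out because it needs a real module to load, and the proposal's details may still change.

```javascript
// Returns null on hosts without ShadowRealm, so the sketch is safe to run anywhere.
function shadowRealmDemo() {
  if (typeof ShadowRealm !== "function") return null;

  const realm = new ShadowRealm();

  // Primitives cross the boundary by value.
  const sum = realm.evaluate("1 + 2");

  // Functions cross as wrapped functions; their arguments and results are
  // subject to the same restrictions.
  const add = realm.evaluate("(a, b) => a + b");
  const wrappedResult = add(2, 3);

  // Writes to the ShadowRealm's global object stay inside the ShadowRealm.
  realm.evaluate("globalThis.leaked = 'only in the shadow realm'");
  const visibleOutside = "leaked" in globalThis;

  // Ordinary objects are refused at the boundary with a TypeError.
  let objectRejected = false;
  try {
    realm.evaluate("({ secret: 42 })");
  } catch (error) {
    objectRejected = error.name === "TypeError";
  }

  return { sum, wrappedResult, visibleOutside, objectRejected };
}

const demoResult = shadowRealmDemo();
console.log(demoResult);
```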
This requires fairly careful scrutiny of what we allow to pass between a ShadowRealm and its incubating realm. For instance, we basically cannot allow objects to pass between them at all, since if you obtain an object o from another realm, you can play nasty tricks by abusing prototype objects to get the Function constructor from the other realm.
const o = /* obtained via magical means */;
// we can play games to obtain the constructor from o's prototype, which is bad enough on its own ...
const P = Object.getPrototypeOf(o).constructor;
// we can get the constructor _of that constructor_ which will be Function, but from the wrong realm!
const Funktion = Object.getPrototypeOf(P).constructor;
// now we can do basically whatever we please to o's global object, since
// constructing a new Function with a string gives us a function with that source text.
const farawayGlobalObject = (new Funktion("return globalThis;"))();
farawayGlobalObject.Array.prototype.slice = /* insert evil here */;
Note that this game (getting at the Function constructor from the other realm via any prototype chain) works in either direction, too! (It can be used to access the ShadowRealm from the incubating realm as well as to access the incubating realm from the ShadowRealm.)
We want to prevent leaks of this nature, since they allow action-at-a-distance not controlled by the normal ShadowRealm interface. This is important if we want modifications to the global objects of either realm to be only performed by code deliberately loaded into that realm.
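The trick is easy to reproduce outside a browser. The following sketch assumes Node.js and uses a node:vm context as the "other realm", since vm (unlike ShadowRealm) happily lets objects cross; the tampered property is a harmless marker invented for the demonstration.

```javascript
// Assumption: Node.js; a node:vm context plays the role of the other realm.
const vm = require("node:vm");

const context = vm.createContext({});

// "Magical means": node:vm returns objects from the other realm directly.
const o = vm.runInContext("({})", context);

// Walk the prototype chain to the other realm's Object, then to its Function.
const P = Object.getPrototypeOf(o).constructor;
const Funktion = Object.getPrototypeOf(P).constructor;

// Functions built by that constructor run against the other realm's global.
const farawayGlobalObject = new Funktion("return globalThis;")();

// Harmless stand-in for "insert evil here".
farawayGlobalObject.Array.prototype.tampered = true;

const seenInside = vm.runInContext("[].tampered === true", context);
const seenLocally = [].tampered === true;

console.log({ foreignFunction: Funktion !== Function, seenInside, seenLocally });
```

Arrays created inside the context now see the marker while local arrays do not, which is the action-at-a-distance that the ShadowRealm boundary is meant to rule out.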
venturing into the WebCore weeds
As he describes in the post linked above, Phillip implemented ShadowRealms and the ShadowRealm interface in JSC, but we left a host hook [1] in to handle module loads and to allow the host to customize the ShadowRealm global object:
ShadowRealmObject* ShadowRealmObject::create(VM& vm, Structure* structure, JSGlobalObject* globalObject)
{
    ShadowRealmObject* object = /* ... */;
    /* ... */
    object->m_globalObject.set(
        vm, object,
        globalObject->globalObjectMethodTable()
            ->deriveShadowRealmGlobalObject(globalObject)
        //    \__________________________________________/
        //            provided by the engine host
    );
    /* ... */
    return object;
}
When using JSC alone, deriveShadowRealmGlobalObject does little more than make a plain old new JSC::JSGlobalObject for the ShadowRealm to use. However, in WebKit, we needed to make sure it could create a JSGlobalObject that can perform module loads for the web page and is otherwise customized to WebKit’s requirements; that’s what we’ll describe here.
detour: wrappers for free
Central to WebKit’s use of JSC is that certain objects associated with a webpage all get associated “wrapper objects”: these are instances of the type JSC::JSObject whose job is to forward JavaScript calls on a method of the wrapper object to the C++ method that implements the object.
For example, in WebCore, we have an Element class which is responsible for modelling an HTML element in your web page. However, it cannot be used directly by JavaScript code: that interaction is controlled by its wrapper object, which is an instance of JSElement (which is, ultimately, a subclass of JSObject).
Most wrapper classes in WebKit are, in fact, generated! Most web standards specify what JavaScript objects should be available using a special language just for this purpose called WebIDL (IDL = interface description language). For example, the WebIDL for TextEncoder looks like:
[
Exposed=*,
] interface TextEncoder {
constructor();
readonly attribute DOMString encoding;
[NewObject] Uint8Array encode(optional USVString input = "");
TextEncoderEncodeIntoResult encodeInto(USVString source, [AllowShared] Uint8Array destination);
};
This is used during the WebKit build to produce the wrapper class, JSTextEncoder, which looks something like this (though I am omitting a lot of boilerplate):
class JSTextEncoder : public JSDOMWrapper<TextEncoder> {
public:
using Base = JSDOMWrapper<TextEncoder>;
/* snip */
static TextEncoder* toWrapped(JSC::VM&, JSC::JSValue);
/* snip */
};
Here, the class JSDOMWrapper<TextEncoder> provides the most basic possible kind of wrapper object: the wrapper holds a reference to a TextEncoder, and generated code in JSTextEncoder.cpp instructs the JS engine how to dispatch to it:
/* Hash table for prototype */
static const HashTableValue JSTextEncoderPrototypeTableValues[] = {
    { "constructor",
      static_cast<unsigned>(JSC::PropertyAttribute::DontEnum),
      NoIntrinsic,
      { (intptr_t)static_cast<PropertySlot::GetValueFunc>(jsTextEncoderConstructor),
        (intptr_t) static_cast<PutPropertySlot::PutValueFunc>(0) } },
    { "encoding", /* snip */ },
    { "encode", /* snip */ },
    { "encodeInto", /* snip */ },
};
JSC_DEFINE_CUSTOM_GETTER(jsTextEncoderConstructor, (JSGlobalObject* lexicalGlobalObject,
                                                    EncodedJSValue thisValue,
                                                    PropertyName))
{ /* dispatch code goes here */ }
/* much more generated code goes here, using the above */
Usually, we don’t care much about the details here; that’s why the code is generated! The relevant information is typically that calling e.g. encoder.encode from JavaScript should result in a call, in C++, to the encode method on TextEncoder.
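The shape of that dispatch can be modelled in a few lines of JavaScript. This is only a toy sketch of the idea, not WebKit’s generated code: nativeTextEncoder, prototypeTable and makeWrapper are all hypothetical names, and the "native" encoder only handles ASCII to keep things short.

```javascript
// Hypothetical stand-in for the C++ TextEncoder implementation (ASCII only).
const nativeTextEncoder = {
  encoding: "utf-8",
  encode(input) {
    return Uint8Array.from(input, (ch) => ch.charCodeAt(0));
  },
};

// Loosely mirrors the generated hash table: one entry per exposed member,
// each saying how to reach the implementation.
const prototypeTable = {
  encoding: { get: (impl) => impl.encoding },
  encode: { invoke: (impl, args) => impl.encode(...args) },
};

// The wrapper holds on to the implementation and forwards every access to it.
function makeWrapper(impl, table) {
  const wrapper = {};
  for (const [name, entry] of Object.entries(table)) {
    if (entry.get) {
      Object.defineProperty(wrapper, name, {
        get: () => entry.get(impl),
        enumerable: true,
      });
    } else {
      wrapper[name] = (...args) => entry.invoke(impl, args);
    }
  }
  return wrapper;
}

const encoder = makeWrapper(nativeTextEncoder, prototypeTable);
const bytes = encoder.encode("hi");
console.log(encoder.encoding, Array.from(bytes));
```

Script code only ever touches encoder; the implementation object stays behind the wrapper, which is the property the rest of this post relies on.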
There’s also a variety of attributes we can put on IDL declarations. Some of these change the meaning of the declarations (for instance, by specifying which kinds of realms they should be available in), and others affect WebKit-specific aspects of the declaration; notably, they give us more control over the code generation we just described.
ShadowRealm global objects
To make sure that ShadowRealms behave appropriately in WebKit, we need to make sure that we can create a JSGlobalObject that also cooperates with the wrapping machinery in WebCore. The typical way to do this is to make the wrapper object for the realm global object an instance of WebCore::JSDOMGlobalObject: this provides functionality to ensure both that the wrappers used in that realm can be tracked and that they are distinct from wrappers used in other realms.
For ShadowRealms, we need to make sure that our new ShadowRealm global object is wrapped as a subclass of JSDOMGlobalObject; we can do this pretty directly with WebKit IDL attributes:
[
Exposed=ShadowRealm,
JSLegacyParent=JSShadowRealmGlobalScopeBase,
Global=ShadowRealm,
LegacyNoInterfaceObject,
] interface ShadowRealmGlobalScope {
/* snip */
};
These have the following meanings:
Exposed=ShadowRealm + LegacyNoInterfaceObject: these two together don’t make much difference. Exposed=ShadowRealm tells us that the interface should be available in ShadowRealms; LegacyNoInterfaceObject tells us that there shouldn’t actually be a globalThis.ShadowRealmGlobalScope available anywhere, so there is, in fact, nothing really to expose. However, because this is the global object for ShadowRealms, any members on it will be available on globalThis.
JSLegacyParent=JSShadowRealmGlobalScopeBase tells WebKit’s code generation that the wrapper class for this interface should use our custom JSShadowRealmGlobalScopeBase (which we have yet to write) as the base class.
Global=ShadowRealm tells other people reading this IDL file that this interface is the global object for ShadowRealms.
Now we just need two more things: the implementation of the unwrapped ShadowRealmGlobalScope, and the implementation of our wrapper class, JSShadowRealmGlobalScopeBase.
the unwrapped global scope
We can start with the unwrapped global object, since it ends up being simpler. The main thing we need from a ShadowRealm global object is just to be able to find our way back to the incubating realm; it turns out a convenient way to do this is to make a new type and have it keep a pointer to its incubating realm around:
class ShadowRealmGlobalScope : public RefCounted<ShadowRealmGlobalScope> {
    /* ... snip ... */
private:
    // a (weak) pointer to the JSDOMGlobalObject that created this ShadowRealm
    JSC::Weak<JSDOMGlobalObject> m_incubatingWrapper;
    // the module loader from our incubating realm
    ScriptModuleLoader* m_parentLoader { nullptr };
    // a pointer to the JSDOMGlobalObject that wraps this realm (it's unique!)
    JSC::Weak<JSShadowRealmGlobalScopeBase> m_wrapper;
    // a separate module loader for this realm to use
    std::unique_ptr<ScriptModuleLoader> m_moduleLoader;
};
Astute readers will note that the ShadowRealmGlobalScope does not, in fact, hold a strong reference to its incubating realm; this is because it is retained by the ShadowRealmObject from above! Having the ShadowRealm global scope retain its incubating realm would form a loop of retaining pointers and therefore leak memory! Since these are WTF::RefCounted<...>, there’s no garbage collector to help out, either; we really need to avoid the reference cycle.
We can, however, get away with a weak pointer: if the incubating global object became unreachable, the only ways to get back into the shadow realm would be through code running in the incubating realm or through its event loop, neither of which should be possible. So, the weak pointer will always be valid when we need it.
the wrapper global object
Let’s go ahead and add the wrapper class now:
class JSShadowRealmGlobalScopeBase : public JSDOMGlobalObject { /* snip */ };
… and, since we get to pick our base class, we can pick JSDOMGlobalObject instead of JSObject. How convenient! This has the effect of implicitly making other parts of the engine treat our new global object as a separate realm that requires its own wrapper objects. This doesn’t come for free, though: we have several virtual methods on JSDOMGlobalObject that we are obliged to implement. Thankfully, we have another JSDOMGlobalObject around that we can happily delegate to! For example:
// a shared utility to retrieve the incubating realm's global object
const JSDOMGlobalObject* JSShadowRealmGlobalScopeBase::incubatingRealm() const
{
    auto incubatingWrapper = m_wrapped->m_incubatingWrapper.get();
    ASSERT(incubatingWrapper);
    return incubatingWrapper;
}
// discharge one of our obligations by delegating to `incubatingRealm()`
//
// (this method is static; we get `this` as JSGlobalObject*, annoyingly, but
// the downcast should always succeed)
bool JSShadowRealmGlobalScopeBase::supportsRichSourceInfo(const JSGlobalObject* object)
{
    auto incubating = jsCast<const JSShadowRealmGlobalScopeBase*>(object)->incubatingRealm();
    return incubating->globalObjectMethodTable()->supportsRichSourceInfo(incubating);
}
Finally, we need only to add branches in some (admittedly awkward [2]) parts of JSDOMGlobalObject for our new ShadowRealm global object, for example:
static ScriptModuleLoader* scriptModuleLoader(JSDOMGlobalObject* globalObject)
{
/* snip */
if (globalObject->inherits<JSShadowRealmGlobalScopeBase>(vm))
return &jsCast<const JSShadowRealmGlobalScopeBase*>(globalObject)->wrapped().moduleLoader();
/* snip */
}
the grand finale … almost
Now we can actually implement deriveShadowRealmGlobalObject, right? Well, not quite. It turns out <iframe> acts rather differently when the page it contains has the same origin as the parent page: in that case, their global objects are actually reachable from one another! (This came as an unpleasant surprise to me at the time …)
This won’t do for us, since it breaks the invariant we described above. There’s nothing to prevent a child <iframe> from creating a new ShadowRealm and allowing it to escape to the parent frame; then the ShadowRealm can outlive its incubating realm’s global object :(
We can solve the problem by walking up the hierarchy of frames until we either hit the top or find one with a different origin, and then using the top-most global object with the same origin. This re-establishes our invariant, since there now really should be no way for the ShadowRealm object to escape :)
JSC::JSGlobalObject* JSDOMGlobalObject::deriveShadowRealmGlobalObject(JSC::JSGlobalObject* globalObject)
{
    auto& vm = globalObject->vm();
    auto domGlobalObject = jsCast<JSDOMGlobalObject*>(globalObject);
    auto context = domGlobalObject->scriptExecutionContext();
    if (is<Document>(context)) {
        // Same-origin iframes present a difficult circumstance because the
        // ShadowRealm global object cannot retain the incubating realm's
        // global object (that would be a refcount loop); but, same-origin
        // iframes can create objects that outlive their global object.
        //
        // Our solution is to walk up the parent tree of documents as far as
        // possible while still staying in the same origin to ensure we don't
        // allow the ShadowRealm to fetch modules masquerading as the wrong
        // origin while avoiding any lifetime issues (since the topmost document
        // with a given wrapper world should outlive other objects in that
        // world)
        auto document = &downcast<Document>(*context);
        auto const& originalOrigin = document->securityOrigin();
        auto& originalWorld = domGlobalObject->world();
        while (!document->isTopDocument()) {
            auto candidateDocument = document->parentDocument();
            if (!candidateDocument->securityOrigin().isSameOriginDomain(originalOrigin))
                break;
            document = candidateDocument;
            domGlobalObject = candidateDocument->frame()->script().globalObject(originalWorld);
        }
    }
    /* snip */
    auto scope = ShadowRealmGlobalScope::create(domGlobalObject, scriptModuleLoader(domGlobalObject));
    /* snip */
}
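Stripped of WebKit types, the walk in that loop is short. Here is a sketch of the same idea over a toy frame tree; the origin and parent fields and the topmostSameOriginAncestor helper are made up for illustration and are not WebKit API.

```javascript
// Hypothetical frame model: each frame knows its origin and its parent frame.
function topmostSameOriginAncestor(frame) {
  let current = frame;
  // Climb while a parent exists and shares the starting frame's origin;
  // stop at the top or at the first cross-origin parent.
  while (current.parent !== null && current.parent.origin === frame.origin) {
    current = current.parent;
  }
  return current;
}

const topFrame = { name: "top", origin: "https://a.example", parent: null };
const sameOriginChild = { name: "child", origin: "https://a.example", parent: topFrame };
const adFrame = { name: "ad", origin: "https://ads.example", parent: topFrame };
const adWidget = { name: "widget", origin: "https://ads.example", parent: adFrame };

// A same-origin child resolves to the top frame ...
const fromChild = topmostSameOriginAncestor(sameOriginChild).name;
// ... while a frame nested under a cross-origin ad stops at the ad frame.
const fromWidget = topmostSameOriginAncestor(adWidget).name;

console.log({ fromChild, fromWidget });
```

Like the C++ above, the walk stops at the first parent with a different origin rather than skipping over it, so a ShadowRealm is never tied to a frame on the far side of a cross-origin boundary.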
a brief note on debugging
Of course, none of the above went as smoothly as I make it sound; I ended up encountering many crashes and inscrutable error messages as I fumbled my way around WebKit internals. After printf debugging, a classic technique to interactively explore program state when in unfamiliar territory is the iconic ASSERT(false). WebKit even provides a marginally more convenient macro for this purpose, CRASH(), which proved invaluable.
Simply run your test case in a debugger and set a breakpoint on WTFCrash and you will have a convenient gdb prompt; I find it to be a fun, slightly more powerful flavor of printf-debugging :)
the road ahead
Now, we have a working ShadowRealm available in the browser!
If you’re interested in trying them out for yourself, you can find them in the latest Safari Technology Preview release!
However, this is only part of the work for this project: there are also plans to add certain Web interfaces to ShadowRealm contexts, and more testing coverage is needed.
exposing web interfaces
ShadowRealms are actually part of the JavaScript standard and not a Web standard, so we need to be careful about this work; ShadowRealms are supposed to be a sandbox, so it wouldn’t do much good if scripts you load into a shadow realm start mucking around with the markup on your web site!
So the interfaces that are planned to be exposed are strictly those that expose some extra computational facility to JavaScript, but do not really have an effect outside of the script where they are invoked. For example, TextEncoder is quite likely to be exposed; Document is not.
A patch adding several of these APIs to ShadowRealm contexts has already landed, but probably won’t appear in Safari until after ShadowRealms do.
never enough testing
ShadowRealms are already unit tested in both the existing WebKit implementation and in test262, the test suite accompanying the JavaScript standard. However, more tests are needed in WPT, the web test suite, for the correctness of the module loading support and newly exposed interfaces; some work here is underway and should be finished in the coming few weeks.
notes
[1] “host” here refers to whatever piece of software is running JavaScript with JSC; usually the host is a web browser, but it doesn’t have to be. For our purposes, a “host hook” is a function that the JavaScript engine cannot provide: it requires the host to cooperate in some way.
[2] The awkwardness here is that scriptModuleLoader is not actually part of the interface of JSDOMGlobalObject, but probably should be; however, we have now arrived at the delicate argument over whether patches like this should minimize the changes or clean up ugliness everywhere they find it. You can even see this in the code review of this patch if you look closely.