Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using FinalizationRegistry for ThreadBoundData #74

Open
eqrion opened this issue Jul 11, 2024 · 4 comments
Open

Using FinalizationRegistry for ThreadBoundData #74

eqrion opened this issue Jul 11, 2024 · 4 comments

Comments

@eqrion
Copy link
Contributor

eqrion commented Jul 11, 2024

One idea that came up in verbal discussions of ThreadBoundData was the alternative of allowing shared objects as keys in a FinalizationRegistry, but not as keys in WeakMap. Opening this issue to continue this discussion.

With this, a shared object could hold an index to an unshared object stored in a thread specific table (instead of holding a ref to a ThreadBoundData). Then the toolchain could add the shared object into a finalization registry that would deallocate the unshared objects from the table when the shared object expires.

This would give languages 'strong' lifetimes for their shared-to-unshared references, while avoiding the need for engines to support marking all heaps together.

This would work because the unshared finalization registry only keeps alive an unshared callback, so there's no shared-to-unshared references for the engine to manage. The lifetime of the callback doesn't depend on the lifetime of the shared key. There is some coordination in the GC to notify finalization registries that a shared reference they contain has expired, but this could be done asynchronously with a notification or message from the shared GC.

WeakMap and ThreadBoundData (with strong behavior) are different because the lifetime of the unshared values depend on the lifetime of the shared keys. Liveness is transitive, so we'd have to be able to trace through all heaps to support this.

The downside of using finalization registry for ThreadBoundData would be that it could not collect arbitrary shared/unshared cycles. A cycle passing through shared and unshared objects using indices/finalization registry would not be visible to the GC and therefore not able to be broken.

I personally think that would be an acceptable tradeoff to deliver enough value with an MVP.

The only relevant note on this all that I found in the overview was:

An intermediate behavior would be the strong behavior but without cross-heap cycle collection. This is as expressive as the weak behavior if we were to hypothetically augment it by allowing shared objects to be keys in FinalizationRegistry but not WeakMap (which would be too inconsistent for us to actually ship). The FinalizationRegistry would be able to automatically manage the lifetimes of rooted unshared objects as long as they do not cyclically keep the shared objects that refer to them alive.

I would argue this is not necessarily inconsistent behavior. As noted above, FR and WM have different implementation characteristics, it seems reasonable for them to have different restrictions. I be curious to hear why they must accept the same set of values as keys.

cc @lukewagner who originally brought up this idea.

@tlively
Copy link
Member

tlively commented Jul 11, 2024

cc @syg, who is the source on the "too inconsistent for us to actually ship" note.

What would you think of providing the ThreadBoundData API, but having the implementation leak cycles? This would be more ergonomic for users, and as the note mentions, would require the same expressiveness from the GC implementation.

@eqrion
Copy link
Contributor Author

eqrion commented Jul 12, 2024

What would you think of providing the ThreadBoundData API, but having the implementation leak cycles? This would be more ergonomic for users, and as the note mentions, would require the same expressiveness from the GC implementation.

I think you can polyfill something very close to the ThreadBoundData interface if you have finalization registry. The major difference is that you need to explicitly convert between the ThreadBoundData and index representation when reading/writing shared fields. But considering that we don't even have a way to directly read/write a wasm-GC field from JS, I don't think that's going to be an ergonomic issue.

I would prefer to not have a native ThreadBoundData interface if we're going to just have all implementations leak cycles across shared/unshared. It seems like a foot gun to users to have something that looks like a strong reference, and usually acts like one, but not always. If we're just providing FinalizationRegistry, then it's clear that there is an aspect of manual memory management here and that they should be careful to use the feature in a limited way (possibly adding debugging utilities to detect issues).

In a future where engines develop the ability to do global marking across all threads JS and C++ heaps, then I think a native interface would make sense. Adding the interface would be a good way for languages to be able to feature detect this too.

Polyfill class Slot { constructor(value) { this.value = value; this.refCount = 0; } };

class ThreadBoundData {
constructor(value, index_) {
if (index_ !== undefined) {
assert(value === undefined);
this.index_ = index_;
} else {
this.index_ = newSlot_(value, this);
}
ThreadBoundData.addRefSlot_(this.index_);
ThreadBoundData.finalizers_.register(this, this.index_);
}
static fromShared(index) {
return new ThreadBoundData(undefined, index);
}

get value() {
return ThreadBoundData.slots_[this.index_].value;
}
set value(v) {
ThreadBoundData.slots_[this.index_].value = v;
}

shareWith(sharedValue) {
ThreadBoundData.addRefSlot_(this.index_);
ThreadBoundData.finalizers_.register(sharedValue, this.index_);
return this.index_;
}
unshareWith(sharedValue) {
ThreadBoundData.releaseSlot_(this.index_);
ThreadBoundData.finalizers_.unregister(sharedValue);
return -1;
}

// Implementation details

static slots_ = [];
static finalizer_(index) {
releaseIndex(index);
}
static finalizers
= new FinalizationRegistry(ThreadBoundData.finalizer_);

static newSlot_(value) {
let index = ThreadBoundData.findFreeIndexOrGrowTable_();
ThreadBoundData.slots_[index] = new Slot(value);
}
static deleteSlot_(index) {
ThreadBoundData.freeIndex_(index);
ThreadBoundData.slots_[index] = null;
}
static addRefSlot_(index) {
ThreadBoundData.slots_[index].refCount += 1;
}
static releaseSlot_(index) {
ThreadBoundData.slots_[index].refCount -= 1;
if (ThreadBoundData.slots_[index].refCount == 0) {
deleteSlot_(index);
}
}
static findFreeIndexOrGrowTable_() {
// Some index allocation scheme ...
}
static freeIndex_() {
// Some index allocation scheme ...
}
}

let { sharedStruct, getSharedStructField, setSharedStructField } = wasmExports;

let element = getElementById("#id");
let tbd = new ThreadBoundData(element);

// sharedStruct.field = element
setSharedStructField(sharedStruct, tbd.shareWith(sharedStruct));

// element = sharedStruct.field
let element2 = ThreadBoundData.fromShared(getSharedStructField(sharedStruct)).value;
assert(element === element2);

@syg
Copy link

syg commented Jul 21, 2024

It is very inconsistent and strange for a JS-exposed object to be able to be a weak target only for FinalizationRegistry and WeakRefs but not weak collections. This is a complexity that hurts the JS language, especially in a space that the developer already doesn't understand very well.

I still think we should take an holistic, application view on this. The upshot of the engine not supporting collecting cycles is that they will leak, because cycle collection is non-polyfillable expressivity. All discussion downstream of that is about how how to make this lack of collectiong palatable -- but palatable to whom? I favor being palatable to the application developer and tooling author over to the engine implementer here. I think the FinalizationRegistry and WeakRef-only proposal favors the engine implementer because it exposes the lack of cycles to the developer, in the hopes that they... rearchitect their apps or something? You say "then it's clear there is an aspect of manual management here", but what's the fallout from that? We ask applications to use one-way weakrefs (like how we used to do it when designing web APIs because of the lack of cycle collection) to break the cycle instead of the natural thing? That doesn't seem better to me. On the other hand, supporting ephemerons is more future proof, and get you most of the way there in terms of what's collectible even without cycle collection. Also if an app really wants you to break cycles with one-way weakrefs, they still can.

The feature detection aspect of this is a good one, though there's no good way to do that in general for GC capabilities. Like, I can imagine for performance I also care if an young generation weak refs are collectible, before I use weak refs. But that's arguably an implementation detail and I am wary to expose that.

Finally, the most optimistic thing is that in internal V8 discussions we're still trying to figure out if there's a way to do cycle collection without a full global marking phase. Smarter people than I have a sketch, but we still need to think about it more before sharing.

@eqrion
Copy link
Contributor Author

eqrion commented Jul 23, 2024

It is very inconsistent and strange for a JS-exposed object to be able to be a weak target only for FinalizationRegistry and WeakRefs but not weak collections. This is a complexity that hurts the JS language, especially in a space that the developer already doesn't understand very well.

I forgot to include WeakRef in the original post, but my intention was to disallow shared wasm objects from being used in those too. Essentially, only allow them in FinalizationRegistry because that has the simpler implementation characteristics compared to WeakRef/WeakMap.

I still think we should take an holistic, application view on this. The upshot of the engine not supporting collecting cycles is that they will leak, because cycle collection is non-polyfillable expressivity. All discussion downstream of that is about how how to make this lack of collectiong palatable -- but palatable to whom? I favor being palatable to the application developer and tooling author over to the engine implementer here. I think the FinalizationRegistry and WeakRef-only proposal favors the engine implementer because it exposes the lack of cycles to the developer, in the hopes that they... rearchitect their apps or something? You say "then it's clear there is an aspect of manual management here", but what's the fallout from that? We ask applications to use one-way weakrefs (like how we used to do it when designing web APIs because of the lack of cycle collection) to break the cycle instead of the natural thing? That doesn't seem better to me. On the other hand, supporting ephemerons is more future proof, and get you most of the way there in terms of what's collectible even without cycle collection. Also if an app really wants you to break cycles with one-way weakrefs, they still can.

Agree, this would make things a bit harder from downstream users. What I'm proposing here is a compromise here between making things easy for just the engine and or just for downstream users.

Simply implementing shared wasm GC objects and shared wasm functions in a production web engine is already a huge task. If we also allow shared and unshared cycles, it becomes even harder. Whenever I've brought this aspect (ThreadBoundData) of the design up with GC/CC folks at Mozilla, they've made it clear it will not be easy to add and have tricky performance implications. (I usually have a hard time getting them even on board with just the shared part of this proposal)

It's for this reason that I think that starting with just FinalizationRegistry is a better tradeoff for an MVP. It gets you most of what you need, for much less complexity. And nothing about it would preclude moving to a full ThreadBoundData later.

Regarding the concern about manual memory management, it's not ideal but this is not new in the wasm world. I think if we're expecting users to be able to write correct multi-threaded code targeting the web, we can also ask them to do some limited manual memory management to avoid cycles between shared and unshared objects.

As an aside, why are you referring to ThreadBoundData as an ephemeron? It seems to just be a plain strong reference from a shared to unshared value that has a dynamic thread check upon access.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants