Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: spec: comparable as a case of a generalized definition of basic interfaces #53734

Open
atdiar opened this issue Jul 7, 2022 · 17 comments

Comments

@atdiar
Copy link

atdiar commented Jul 7, 2022

Abstract

By reviewing a few concepts, we can define type sets, constraint interface satisfaction, interface implementation, in a way that makes the comparable interface obey a general rule instead of being an exception. In doing so, constraint interfaces would be strictly equivalent to interface types. This would future-proof their use as regular types.

Basically, a Go type defines structure and behaviors i.e. (a set of constraints) on raw data.
Some set of constraints are first-class citizens in Go and are represented by interface types.

The types that satisfy a given set of constraints form a type set.

  • Interface satisfaction is then simply the satisfaction of the set of constraints denoted by that interface
  • Interface implementation ensures that a type is at least as constraining as a given interface

e.g. any is less constraining than comparable: it does not implement comparable because its type set holds types that do not satisfy comparable
intimplements comparable: every type in the type set of int satisfies comparable

- constraint satisfaction is type set membership of a type
- interface implementation is type set inclusion of the type set of a type.

Now comparable can just simply be one of these interface types that describe a set of constraints: {"has the comparison operators == and !=} just as defined in the spec.
All current Go interface types satisfy this constraint even though they rarely implement it. That's why they can instantiate type parameters constrained by comparable.
comparable could be used as a regular type in the future: assignment requiring interface implementation, it would ensure the absence of panic on equality testing for types satisfying it.

Remains to model type sets in terms of sets of constraints.
While type sets are often infinite, we know that the constraints of the language are of finite quantity.
The rest is set operations on sets of sets of constraints (this is not a stutter, this is accounting for unions)

Proposal

(Re)define type set, interface satisfaction, interface implementation so that comparable follows a general rule of interface types. And in doing so, lay the ground work for a potential use of all interfaces outside of generic code ( interfaces as sum/union types).

Background

What is an interface?

Types specify structural and behavioral constraints on raw data.
The Go language has a special construct in its type system that can be used to describe some of these constraints: interface types.
(note that, by virtue of being types themselves, these interface types also satisfy their own set of constraints that may be completely unrelated)

An interface type can describe constraints on types such as:

  • a set of methods
  • the underlying element ( ~T)
  • the availability of the comparison operators ( comparable)

An interface type may also describe a set of constraints as a combination of the above, using:

  • union interfaces
  • interface embedding (which provides intersection)

What is a type set?

Since a type specify a set of constraints that can be potentially satisfied by other types as well, we can gather these constraint-satisfying types into what is called a type set.
Type sets are fully described by a set of constraints. In mathematics, this would be set comprehension notation. A constraint would be a set intension i.e. a proposition.
As such, we do not need to compute type sets specifically (theoretically, a lot of type sets are infinite anyway).
We can just describe them using the finite number of constraints that can be modeled in Go.

  • type set of an interface type I :
    • if I is a basic interface, it is composed of every type that has the methods of I. By definition, that includes I itself
    • if type I = interface {~T} the typeset of I is the set of types whose underlying type is T
    • if I is defined as a union of type terms, the type set of I is the union of the type set of each term
  • type set of a concrete type T:
    • if T is a composite type (struct, array, slice, channel, ...), the type set of T includes all possible combinations obtained by substitution of the type U of an element with a restricted version of U where the type set is now a singleton composed of any one of the members of the initial type set of U. If T is not a composite of interface types, type set of T is therefore the singleton {T}
    • if T is not a composite type, then the type set of T is the the singleton {T}

It's interesting to note that for composites of interface types, the type set includes types that may have not been defined in a Go program.
In theory, these are subtypes. Since variance is restricted in Go, they do not necessarily appear as runtime types. This is not an issue since we do not propose to compute type sets explicitly but to rather operate on the set of constraints that define them.

Constraint satisfaction and interface implementation

While a type may satisfy a set of contraints, if that type is an interface type, that interface type may be used to describe a totally different set of constraints.

For example,

  • interface {int | string | bool} satisfies the constraints shared by all interface types. But it describes a whole other set of constraints. Therefore, it does not satisfy the set of constraints it denotes.

The aforementioned example shows that constraint satisfaction and interface implementation are different notions altogether, where interface implementation is taken to indicate constraint enforcement (or entailment).
If types may satisfy some constraints, they do not necessarily enforce them. (e.g. any satisfies comparable just as any interface currently, but a type satisfying any does not mean it satisfies comparable)

Said otherwise, interface implementation tests that the constraints described by an interface can be found in the set of constraint defined by another type. This is a comparison of constraintive power.

A type T implements an interface I

  • iff typeset(T) ⊆ typeset(I)
  • iff constraint_set_described_by I ⊆ constraint_set_described_by T

i.e. every type that satisfies the constraints of T also satisfies I.

Interface implementation is constraint satisfaction over the type set of a type. It is not constraint satisfaction of the type itself.
As showed above, in general, interface implementation does not imply interface satisfaction and vice versa.

For example:

  • interface{int | string | bool} implements itself but does not satisfy itself. (this is good, this is what we'd expect)
  • interface{int | string} implements interface{int | string | bool} but does not satisfy it or itself (the interface is not in the type set)
  • any satisfies comparable but does not implement it

With the introduction of interface types and their semantics, there is a distinction between a type and its type set which is the reason for such difference.

type set of a type parameter

The type set of a type parameter is somewhat unknown. We just know that the type parameter itself satisfies a constraint interface.
In general, it can only be known to be assignable to itself.
It can be understood as a mostly opaque type variable but not necessarily an interface type.

Note If a type parameter is constrained by an interface C and must also satisfy another interface S(e.g. in the body of a function, the type parameter is used where a comparable type is expected), then S must implement C.
This is easy to see as any type that satisfies S (i.e. in the type set) has to satisfy C too (type set inclusion).

Comparable type parameter

The spec states that: "In any comparison, the first operand must be assignable to the type of the second operand, or vice versa."
Assignability is based on interface implementation. The issue here is that a type parameter is not known to implement any specific interface more often than not.
Usually, we don't know a type parameter's type set before instantiation in the general case. We just know that every type implements itself.

The general rule is that a type parameter satisfies an interface if its constraint implements it.

Examples

Let's borrow the examples from #56548 (spoiler: we arrive at the same result)

type mymap[K comparable, V any] map[K]V

func _[P any, Q comparable]() {
   var (
   	_ map[func()]string        // invalid with current and new rules: func() is not comparable
   	_ map[struct{ f P }]string // invalid with current and new rules: P is not assured to be comparable
   	_ map[struct{ f Q }]string // valid: key type is comparable

   	_ mymap[any, string]                // currently invalid, valid with new rules: any supports ==
   	_ mymap[struct{ f any }, string]    // currently invalid, valid with new rules: type of f supports ==
   	_ mymap[struct{ f func() }, string] // invalid with current and new rules: f is not comparable
   	_ mymap[struct{ f int }, string]    // valid: key type satisfies comparable
   	_ mymap[struct{ f P }, string]      // invalid with current and new rules: P may not be comparable because any does not implement comparable
   	_ mymap[struct{ f Q }, string]      // valid: key type satisfies comparable
   )
}

To be comparable means to satisfy comparable which itself means to satisfy the set of constraint described by comparable.
It does not mean to implement comparable, the latter implying some sort of substitutability (which appears in Go during value creation such as struct{any}{int(2)} or type assertions)
If we want to be able to require from a type parameter that it implements comparable, we will need something else. In the meantime, there should be no difference between what can be expressed in generic Go vs regular Go.

Observations

  • All types could be part of type sets, including interface types. This should not be an issue since type sets are not first-class types themselves. No Russell's paradox: the set of all type sets is not a type nor a type set.
  • type safety is guaranteed for variable assignment since it requires interface implementation
  • In terms of implementation, it could require a little bit more work around type sets (to be redefined in terms of constraint sets). However, I believe that this is the way to make it work as evidenced by the theory
  • This proposal ignores the current restrictions on constraint interfaces. They can remain and be lifted progressively. Especially since, for union types, it is expected that dealing with terms whose type set intersection is not empty may require some attention.
  • some constraints are inter-related (notably methods imply restrictions in terms of signature for a same method name) and should be accounted for when checking set inclusion of constraint sets.
  • a union of sets of constraints is not the set of the union of constraints
  • since variance is restricted (no subtyping), a variable's type is fixed. Constraints sets are not mutating. It should simplify things and make constraint satisfaction checking tractable.

spec (AFAICT)

Unchanged:

Would need modifications:

Appendix

A similar treatment can be found in the type theory literature under the name semantic subtyping and perhaps CSP.
For most of the constraints expressed in Go, we would rarely (if ever) need to prove the satisfiability of interface types (which is just proving that they can be inhabited, i.e. proving that their type set is not empty).
Notably because the type checker forbids a lot of unsatisfiable interfaces already, and also because the expressivity around creating new interfaces is limited. Union interface changes that a little but not that much.

In any case, it should let us establish the semantics of an hypothetical FutureGo in one fell swoop, hopefully.

@atdiar atdiar added the Proposal label Jul 7, 2022
@gopherbot gopherbot added this to the Proposal milestone Jul 7, 2022
@atdiar
Copy link
Author

atdiar commented Jul 7, 2022

cc @ianlancetaylor @griesemer
Although I hesitated before creating a new issue, but just in case it can generate some ideas as I believe it is one way of looking at the problem.

@atdiar atdiar changed the title proposal: spec: comparable as a case of a generalized basic interface proposal: spec: comparable as a case of a generalized definition of basic interfaces Jul 7, 2022
@rsc rsc moved this to Incoming in Proposals Aug 10, 2022
@rsc rsc added this to Proposals Aug 10, 2022
@ianlancetaylor
Copy link
Member

We have proposed #56548 which I think covers the same area as this proposal. If #56548 is accepted then I don't think there is anything to do here. Putting this proposal on hold pending a decision on #56548.

@ianlancetaylor ianlancetaylor moved this from Incoming to Hold in Proposals Nov 4, 2022
@atdiar
Copy link
Author

atdiar commented Nov 4, 2022

@ianlancetaylor thanks for the heads up. From a cursory look there are still a few differences that we can discuss in #56548

I will comment there in a few.

@griesemer
Copy link
Contributor

griesemer commented Nov 5, 2022

This issue is on hold but because of a lengthy discussion in #56548 related to this issue, I reread this proposal. As is, this proposal is very vague. It crucially depends on how type sets are defined and interface and constraint satisfaction differ. (By contrast, #56548 leaves type sets as is and simply introduces an exception to the rule for constraint satisfaction.)

This proposal suggests that constraint satisfaction is based on type membership. In order to understand which programs are permitted and which are not, it is crucial to be precise what the type sets are.

Without a precise definition of how type sets are constructed per this proposal, we cannot further consider this. The prose talks about 3 possible ways, based on propositions, or a list of types, or a singleton type - none of it laid out in detail. The proposal should define exactly what is the type set of a non-interface type, of an interface type (which is not a type parameter), and a type parameter.

Per this proposal, comparable is the set of all types for which == is defined. Therefore, the type struct{any} is in the type set of comparable.

Per the "type set membership rule", struct{any} satisfies the constraint comparable (it is a member). Based on this, the following function is valid which is what one would expect:

func f[P comparable]() {
	f[struct{ any }]()
}

Now the question, is the following function valid?

func g[Q struct{ any }]() {
	f[Q]()  // calling f from the example above: does Q satisfy comparable?
}

That is, is the type Q (a type parameter) a member of the type set of comparable? If not, why not? (How are the type sets of P and Q computed?)

If the program is valid, why? (I note that per #56548 this function is not valid because Q is not strictly comparable.)

Without clear answers to all these questions, we cannot further consider this proposal - there's not enough here. Thanks.

@atdiar
Copy link
Author

atdiar commented Nov 5, 2022

@griesemer Thanks for the feedback! . I'm on it. I haven't modified this proposal yet.

Just to address one of your questions quickly:

func g[Q struct{ any }]() {
	f[Q]()  // calling f from the example above: does Q satisfy comparable?
}

Just as in #56548 it should not be valid.
f[Q] must be valid for every possible Q which means that Q should be constrained by comparable AND struct{ any} .
Here it's constrained by comparable OR struct{ any} which is insufficient.

Since we are comparing the constraint themselves, this is in fact testing whether struct{any} implements comparable which it does not.
The outer constraint interface should implement the inner constraint interface.

I will get to making clarifications shortly especially re. type sets. As you correctly remarked, just as in mathematics, I am thinking of determining type sets by sets of intensions/constraints/propositions.

We don't have to prove that those set of constraints that define a type set are satisfiable for the most part. That would turn the problem into a boolean satisfiability problem just to make sure that all interfaces we define can be inhabited i.e. type sets are not empty.

We just need to test whether a type can be a solution which is the application of set operations on constraint sets.

Someone may want to double-check on that still.

Anyway, I will ping back with clarifications.

Thanks again.

@griesemer
Copy link
Contributor

How can struct{any} not implement comparable? struct{any} is a comparable type as == is permitted. struct{any} is an element of the type set of comparable. The type set consisting of the one type struct{any} is a subset of comparable.

@atdiar
Copy link
Author

atdiar commented Nov 5, 2022

struct{any} satisfies comparable since each of its fields does but for a struct type to implement comparable, all its fields should implement comparable which is not the case here.
That is how we recover "strict comparability". If we had comparable interface and var v comparable , a value of type struct{any} could not be assigned to it.

edit: admittedly this is a bit handwavy of an answer. This is something that may require further thoughts in terms of type sets. What is the type set of struct{any} would be a fair question. Looks like a kind of row polymorphism type of semantic.
Type sets of composites of interfaces may not necessarily be singletons after all.
I will also try and address that point.

@atdiar
Copy link
Author

atdiar commented Nov 6, 2022

So just a few notes (in the meanwhile).

I.

There was indeed an inaccuracy but it's resolvable.
Basically, I had overlooked the distinction between concrete types such as int,bool, string... (I'll call them atoms) and product types.

  • For atoms, the type set is a singleton.

  • For product types, it is not necessarily a singleton, especially if terms of the product are interface types.
    The type set holds every possible combinations in fact. (but we don't need to compute them explicitly)

Where it might be a bit involving is to spec this in Go, perhaps.
Theoretically it's not an issue. It stems from a semantic subtyping relationship (instead of a syntactic one).
Some of the elements of those type sets are semantic subtypes only. Meaning, they do not exactly appear as full-fledged subtypes in Go, especially since variance is not allowed.

We can have a feel for them when creating a value such as struct{any}{int(2)}, It works because int is a subtype of any so somehow we have struct{int} as a semantic-only subtype of struct{any}.

A value of type struct{int} is not assignable to a variable of type struct{any} however since variance restricts it. And in the general case of defined types, the subtypes could not really exist anyway because of the name constraint even if a type with identical structure already exists.

As you rightly said, it is a bit more of a mathematical idea of type sets. It should still be tractable however.

II.

Another important note to answer why we can lift the restriction on type set membership of interface types.

I think it was the right move to make that restriction out of caution initially.
The issue, I think, was how to avoid a circularity problem of type definitions, i.e. avoiding Russel's paradox.
But since interfaces are not type sets, they simply define them and more importantly type sets are not first class types within the language, there is no issue: the set of all typesets is not a type nor a typeset.

So it's fine for any to belong to its own type set for instance. It shouldn't create problems.

Thankfully you challenged, product types had escaped me!

@atdiar
Copy link
Author

atdiar commented Nov 11, 2022

@griesemer @ianlancetaylor updated the proposal. Hope it helps. Many thanks.

@griesemer
Copy link
Contributor

Apologies for being blunt, but AFAICT this proposal remains as vague as it's been before.

(Re)define type set, interface satisfaction, interface implementation so that comparable follows a general rule of interface types. And in doing so, lay the ground work for a potential use of all interfaces outside of generic code ( interfaces as sum/union types).

We really don't want to redefine all these things if we can avoid it. It would take a long time to understand the implications. Nor is it clear how all these redefinitions work.

"And in doing so" is meaningless w/o actual explaining what it is that's being done in precise detail.

There is also no need to "lay the ground work for a potential use of all interfaces outside of generic code": that possibility exists already. The existing definitions in the spec suffice, we simply have to remove the restrictions in place.

Some comments:

  1. Why is is important that "if T is a composite type (struct, array, slice, channel, ...), the type set of T includes all possible combinations obtained by substitution of the type U of an element with members of U's type set."? Why do we need to look inside such types? Basically, if a have a struct{any} the type set would include struct{int}, struct{float32}, struct{...}, etc. according to this description. This seems extremely complicated. There's no motivation given. It also seems incorrect.

  2. "interface {int | string | bool} satisfies the constraints shared by all interface types. But it describes a whole other set of constraints. Therefore, it does not satisfy the set of constraints it denotes." - These three sentences make no sense to me.

  3. There are various rules w/o actually defining the terms precisely. For instance, there are constraint sets and constraint set inclusion, but for that to make any sense, we need to know exactly what a constraint set looks like, i.e., what a constraint looks like. The proposal text says "a type specifies a set of constraints" and all the types that satisfy the constraints make up a type set. Some constraints are types themselves, another constraint is the ability to perform ==, or so it seems like. So constraints seem to have different forms. How can they be compared? How can we see if they are the same? W/o knowing this, how can we see if they are included in another set? This things are all unclear.

Note that in the current Go spec, we use the term "constraint" simply to denote an interface in a "constraint position". It's still just an interface. A constraint is not a new thing. Also, in the current Go spec, we have types (well defined), type sets (which are also well defined, namely sets of these types, but excluding interface types), and that's it. We have a clear understanding of when two types are the same (identical is defined for types). We can therefore tell if a type is included in a type set. We can therefore tell if a type set if a subset of another type set. And so forth. All these things are pinned down in exact details. Otherwise we don't know how to implement this.

The examples, which are the ones from #56548, arrive at the same result. There's no explanation as to why that is the case. But more importantly, if one arrives at the same result, what is the point of this proposal? It all sounds much more complicated.

In summary, it is not clear to me why there is anything here that is not already achieved with the much simpler #56548. There's also no need to lay any groundwork for the generalized use of interfaces. We have laid that groundwork with the introduction of type sets.

I believe this proposal should be declined.

@atdiar
Copy link
Author

atdiar commented Nov 11, 2022

Sorry, I believe it's only vague if one only reads the proposal paragraph which just signify the scope of the change. Everything is explained in the later parts (and the abstract). How can I make this paragraph better?

Also, let me address your specific comments to hopefully make things clearer:

  1. As explained above, the set theoretic interpretation of types is based on a subtyping relationship. This is this relationship that is used during interface implementation (the subsuming rule). I haven't perhaps explained this in detail because Go does not introduce these terms in its spec. In practice we do not have to exactly identify all these (sub)types. I mentioned them because otherwise, I don't see how one can explain why for struct and arrays, every field needs to satisfy comparable or implement comparable. As explained, these subtypes do not exist anyway because Go's type constructors are invariant.
    It's perhaps easier to see with another example: type S struct{any} has subtypes type S struct{int} and type S struct{[]int} [edit] type S struct{interface{int}} and type S struct{interface{[]int}} actually.
    (note the same name which should make it obvious they cannot really exist).
    One of these subtypes does not implement comparable.
    Again, it is just meant to justify why in practice implementing comparable relies on checking that each field's type implementscomparable for struct and array types. It's probably better explained in the theory.

  2. This would probably be more relevant if we considered a relaxation of the rules around constraint interfaces to be usable as regular interface types and perhaps that a better example would be more explicit as well.
    interface{[]byte | string} would satisfy comparable but does not represent such constraint. It however represents the constraint of being either []byte or string. That's what was meant.

  3. I have not modeled constraints precisely but I don't think this is very important at this stage. I have however indicated that these constraints would be propositions. It would suffice to compare sets of propositions/predicates by checking for their presence or absence.

I'm not proposing to add anything to Go that is not here already.
What is being comparable or implementing a given method sets if not just some constraints of types?

While we arrive at the same result as #56548 I think it is objectively much easier to reason about:
We eschew two definitions of comparability and potentially two kinds of type sets in the future.
It's just about the clear distinction between interface satisfaction and interface implementation.
For instance, the rules in #56548 cannot handle an interface such as : interface{ int | string}. It should implement itself but not satisfy itself. The rules would have to change anyway. So I don't really think #56548 is much simpler.

@griesemer
Copy link
Contributor

Just to be clear, I did read the entire rewritten section.

  1. struct{int} is not a subtype of struct{any}: I cannot use a value of struct{int} type in place of a struct{any}. I can also not use the type struct{int} to instantiate a type parameter constrained by interface{struct{any}}. Not that that matters.
  2. interface{[]byte | string} does satisfy comparable, agreed. interface{[]byte | string} also satisfies itself. So I'm not sure what your point is.
  3. Propositions - at least the way you have described them before - may describe overlapping type sets. So one cannot simple check whether certain propositions are present or absent. It's more complicated than that. Which is why I was asking for precise details. "It would suffice to compare sets of propositions/predicates by checking for their presence or absence." is handwaving.

The rules in #56548 certainly can handle the interface{ int | string} case: first of all, the #56548 rules are an extension of the existing rules, and the existing rules already cover this case just fine (#56548 is not even needed for this). interface{ int | string} implements itself, and satisfies itself. And it even happens to satisfy (and implement) comparable (you can even try it out in a 1.19 playground). So no rules need to change for this case at all.

Anyway, thanks for trying to clarify this proposal. Still, I don't think there's enough here to proceed in any meaningful way. I have tried to understand what you're saying and have failed. I will refrain from further comments. Thanks.

@atdiar
Copy link
Author

atdiar commented Nov 14, 2022

Thank you for reviewing this. Just to clarify some more.

  1. Yes, depth subtyping is affected by the invariance. That's why I made a distinction between a type and a semantic subtype. The type struct{int} is different from the subtype of struct{any} where any can only be assigned values of type int. That's my fault for trying to explain via the shortcut notation struct{int}. It stems from types representing sets of values and subtypes representing subsets.

  2. Here I think that my understanding diverge. If interface{string | []byte} satisfies itself, doesn't that mean that this interface type would be able to instantiate a constraint expecting either a string or byteslice type? We have discussed it before and I think it's problematic and unexpected. However, that interface should implement itself.
    The difference is that interface implementation and interface satisfaction should only be the same perhaps for basic interfaces. For instance, I don't understand how an interface type interface{int | string} can satisfy the constraints of being an int or string... How would that work with operators such as + once the restrictions are lifted?

  3. Re. Comparing sets of constraints and overlapping type sets, I am not sure I understand. Do you (or anyone) have an example where it would not work so that I can ponder it some more?
    If the worry is about unions, that's why these sets of set of constraints cannot be collapsed into a single set of constraints. Each set corresponding to a termof the union should be checked separately.
    Perhaps that I'm overlooking something however.

Thanks again.

@atdiar
Copy link
Author

atdiar commented Nov 14, 2022

And in fact, for the first point, now that I think of it, I should probably have written struct{interface{int}} instead of struct{int}.

I will make the amendments and try to think about how to deal with sets of sets of constraints in more details.

Thanks again.

@atdiar
Copy link
Author

atdiar commented Nov 20, 2022

@griesemer @ianlancetaylor here is the high level idea regarding checking interface implementation (as checking the presence of sets of constraints) in a bit more details. Hope you or anyone else can take a look whenever you have time and critique it. Thanks again.

Currently, the constraints on types that can be represented by interfaces in Go are (afaict) :

  • to be a given type
  • to have a given core type
  • to have a given set of methods
  • to be comparable
  • any combination of the above (by conjunction or disjunction i.e. intersection or union)

Taking the abstract example of :

type C interface{
  a | b | c | d
  e | f
  g | h
}

Implementing C is merely implementing either aeg or aeh or afg or afh or beg or bfg ...and so on. (each case is a constraint branch)
Where each letter corresponds to a combination of the constraints listed above.

In theory, checking for C may seem exponential in the numbers of constraints being combined (o(kN), EXPSPACE in the WORST case only).
This seems to be a case of expansion of a boolean formula into DNF form.

However, in practice, the restriction on the domains of many constraints (which makes them incompatible with each other) simplify things. (e.g. if a is int and e string, aeg and aeh can be eliminated).
The checking should also be parallelizable.

So it shouldn't be a big worry, I think.

Generalized Interface implementation checking

There are a few steps needed to check that a type T implements an interface I:

  • retrieve the relevant constraints that exist for T (possible ones are listed in the section above)
  • represent I as a tree of constraints (Multi-valued Decision Diagram) (and store it in a constraint store i.e. map[*types.Type]*ConstraintNode or similar...)
  • Find out if any branch of I hold a set of constraints that can be found in T's list of constraints (if yes, T implements I)

Tranforming a type constraint into a Tree Representation of a Boolean Formula

The idea is that for a type T to implement an interface I, it means that it is at least as constraining. (it implements the constraints described by I)
Meaning that, at a minimum, the constraints that are represented by I can be found on T. In the case of I defined by unions, it is sufficient that any constraint from a branch created by a union term appears on T.

One way to represent such constraints is to use a tree datastructure such as a Multi-valued Decision Diagram (MDD) that would represent the different constraint satisfaction branches of a boolean formula.
This is just a tree representation of the different constraint components of an interface.

The idea of checking the presence or absence of sets of constraints is basically just that same idea implemented as an adhoc MDD.

// ConstraintNode is what the decision tree is made of.
// Each node accumulates the constraint of its ancestor nodes in a Constraint object.
type ConstraintNode{
  Accumulator Constraint // representation of the overall set of constraints that need to be present when checking that node
  Next []*ConstraintNode // holds the list of constraint added by conjunction (e.g. if the current node represents `a`, then Next hold `e` and `f` representations)
}

type Constraint struct{
  // Kind can be a combination of:
  // Type (binary: 00001) 
  // CoreType (00010)
  // MethodSet (00100)
  // Comparable (01000)
  // OtherConstraint (10000)
  // Invalid (00000) 
  Kind int 
  
  Type *types.Type
  CoreType *types.Type
  MethodSet map[types2.Signature]bool
  Comparable bool
  Other []*ConstraintNode // if a term of the union is another constraint that holds a union, we just store a ref to its MDD root
}

Checking interface implementation

  • first, we create the tree corresponding to the constraints of T (if T is an interface that has union terms, we will need to check every branch for constraint implementation)
  • then we create the tree corresponding to the constraint of I
  • last, we check that on each branch of the MDD of T, we can find the constraints defined by any one branch of the MDD of I. (at least one is necessary and sufficient)

Reminder re. constraint satisfaction

constraint satisfaction is not constraint implementation.
A type may satisfy a constraint such as comparable. Doesn't mean that it implements it. A simple example is any.
In general, interface types describe constraints that they do not satisfy themselves. (basic interfaces and comparable are special cases)
Non-interface types satisfy and implement themselves.
The main point is that when checking that an interface type T satisfies or implements another interface I, the set of constraints of T to be taken in consideration might be different, depending on the operation.

type sets ?

We wouldn't need to compute them exactly anymore. The type set would be informally constructed by the types which satisfy the tree representation (MDD) of the constraint set.

is it backtracking? Dynamic programming?

Yes but it's not really a problem because once the MDD has been established and branches impossible to satisfy have been cleaned out, checking whether a type satisfies the corresponding constraint is merely checking whether it is compatible with one of the leaf nodes set of constraints. No need to retraverse the whole tree.
The nodes can be cached.

@ianlancetaylor
Copy link
Member

I also think that it's very likely that we can convert constraint satisfaction into a boolean formula. But determine whether a boolean formula is satisfiable is itself an NP complete problem. You suggest that this shouldn't be a big worry, and I agree. But the type set approach isn't a big worry either in practice. It's only in edge cases that we run into problems, and edge cases will also be a problem with the boolean formula approach.

@atdiar
Copy link
Author

atdiar commented Nov 24, 2022

Yes, I understand. Actually, if we can remain with the current implementation choices that's fine with me by all means.
For the current use-cases it works.
I'm just a bit concerned about what it entails in terms of complexity on the spec and over the long-term.
This proposal is just an attempt to have a single general definition of

  • interface
  • satisfaction
  • implementation

without introducing special cases (simpler to explain) for now and later.

It's a bit unfortunate that the easiest way to make these changes require a bit more implementation work, afaict.

Then again, you probably are also aware of some edge cases that I'm not thinking about?

Note I'm not sure that we actually need to solve SAT exactly. Unless we want to show that any interface definition in a given Go program can be satisfied. We probably just need to be able to check whether a given type satisfies a constraint or not: a case-by-case computation which is a much easier problem.
In theory, it does allow for the existence of unsatisfiable interface and type definitions that could remain unnoticed unless they are being used ("used" meaning in code, so before instantiation still).

In practice, within the context of the current Go constraints, I think even this is not a problem. There are only so many options for an interface to be unsatisfiable (conflicting method sets, conflicting type/coretype) anyway. That can probably be checked easily.

Thanks to invariance and given the limits on the constraints expressible in Go, the reducibility of MDD (by elimination of redundant nodes/branches), I think it is tractable and gets rid of edge-cases related to constraint cycles in union interfaces (if union terms are ever relaxed to accept general interfaces).

Thanks again :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Hold
Development

No branches or pull requests

4 participants