Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Typing of outputs of the standard library functions #512

Closed
Gvaihir opened this issue Aug 17, 2022 · 5 comments
Closed

Typing of outputs of the standard library functions #512

Gvaihir opened this issue Aug 17, 2022 · 5 comments

Comments

@Gvaihir
Copy link

Gvaihir commented Aug 17, 2022

I tried to revive #373 but it might not be bumped. So here's a new issue.
I'd like to have some clearance on type coercion of the inputs/outputs from the standard library functions.
On the one hand - all inputs and outputs must be typed.
But I do see some use cases where the following syntax is used:

scatter (pair in zip(foo, bar)) {}

I could not find any elaboration on such cases in the spec (mostly I see it in the users' code), and I think this is a bit contradicting the static typing requirement. In general though seems these examples compile and run, but we start to see coercion problems when nesting such functions and applying them to optional types. Like this:

Array[Int] foo
Array[Int] bar
Array[String?] lol

scatter (nested_pair in zip(zip(foo, bar), lol)) {}

In this case we see the problems of coercion for values of lol which were evaluated to null, throwing exceptions at runtime. If we declare and type the outputs - all fine, no exceptions:

Array[Pair[Pair[Int, Int], String?]] arr = zip(zip(foo, bar), lol)
scatter (nested_pair in arr) {}

We can solve this in dxCompiler and allow such scenarios, but isn't this in the realm of automatic type coercion and contradictory to strong/static typing of the language?

Could you provide more explicit guidance on typing the inputs/outputs for the standard library functions in the language spec? Should typing of inputs/outputs of the standard library functions be enforced?

@patmagee
Copy link
Member

@Gvaihir this is an interesting edge case that I personally have not come across. Type coercion can lead to a bit of a mess IMO as @jdidion (and @orodeh before him) has mentioned in the past. Allowing for these automatic type coercions is at once handy to the user in most cases, and extremely frustrating in all other situations.

In the first case, do you know what the type of lol was evaluated to? A zip function should simply take two arrays and appropriately typed pairs, so I would imagine the type should actually be Pair[Pair[Int, Int], String?]. Is the null value being introduced from an implicit conversion to a String? If so, I feel like that behaviour is either poorly defined in the spec, or an unintended consequence in the code.

In general though, you do raise an interesting point about type coercion in situations where there is no explicit LHS type (as in a scatter). This is very much so the "Best Guess" approach of the engine to basically return the compatible types. I wonder if we could allow or even enforce explicit typing in the scatter block

scatter (Pair[Pair[Int, Int], String?] nested_pair in zip(zip(foo, bar), lol)) {
  String? opt = nested_pair.right
  String nonOpt = select_first(nested_pair.right,"")
}

@Gvaihir
Copy link
Author

Gvaihir commented Aug 26, 2022

Thanks for the reply @patmagee! lol was evaluated to String instead of String? and it was a problem of the engine (dxCompiler) and we fixed it, as I pointed in the original question. But I was asking about this solely with regards to strong/static typing which is enforced everywhere except StdLib functions (and maybe some other edge cases).

As to your suggestion - I would propose to consider it. Maybe not in scatter itself - this may make the code less readable, but outside the scatter definition? Or simply put a single common denominator as all inputs and outputs have to be declared and typed - regardless whether the custom tasks or StdLib is in question.

@patmagee
Copy link
Member

@Gvaihir given that this was a bug in the implementation, do you think it is worth keeping this issue open or should we potentially start another discussion around explicitly typing outputs in operations like scatter?

@Gvaihir
Copy link
Author

Gvaihir commented Sep 21, 2022

@patmagee - I started the issue exactly to discuss explicit typing of stdlib functions. My OP attempts to give an example where this may be relevant. Indeed - the implementation was fixed, but that bug was not at the center of my question. I have the following suggestions for moving forward:

  1. Keep this thread for discussion and just add a "discussion" label (or whatever you see appropriate).
  2. Close it with the note that explicit typing of the Stdlib functions is not enforced so that will be left up to the compiler/executor implementations. Adding such note to the language spec would be the most helpful too.
    I'll leave it up to you

@jdidion
Copy link
Collaborator

jdidion commented Feb 6, 2024

The parameter and return types of the standard library functions are given as explicitly as possible. Some functions are generic and thus the return type depends on the input type(s). In this case, the types are given using generic parameters, but the intention is for the engine to enforce the data types.

In the following example, the return type of zip(a, b) is expected to be Array[Pair[Int, String?]] whether or not it is specified explicitly by the user.

Array[Int] a = [1,2,3]
Array[String?] b = ["a", None, "c"]
zip(a, b)

If you have specific examples of standard library functions where a parameter or return type is not explicit (or explicit enough) please re-open this issue and add that information. Thanks

@jdidion jdidion closed this as completed Feb 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants