Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tail return proposal #1761

Open
josh11b opened this issue Jul 27, 2022 · 21 comments
Open

tail return proposal #1761

josh11b opened this issue Jul 27, 2022 · 21 comments
Labels
long term Issues expected to take over 90 days to resolve.

Comments

@josh11b
Copy link
Contributor

josh11b commented Jul 27, 2022

I think the Carbon project would welcome a proposal to add an explicit syntax that forces a tail call. This is aligned with our goals of predictable performance and giving the user explicit control.

My syntax suggestion: tail return F(...);.

Background:

@josh11b josh11b added the good first issue Possibly a good first issue for newcomers label Jul 27, 2022
@pache-Ak
Copy link

I agree, we need something to help complier. But I don't want it is a keyword. In my opinion, attribute is a good choice.
The program will run the same whether we use this flag or not. But if we use it, the compiler might give us better performance. This matches the meaning of the C++ attribute. I think keywords should have an effect on program semantics, but it tail doesn't.

@OlaFosheimGrostad
Copy link

Try to avoid "stealing" names that are commonly used in algorithms: head, tail, next, prev, iter, etc…

@dialupmars
Copy link

Could it work if we made it so if you used t.return or something similar, it would then function like a normal tail return? Or could be similar to print and printf, the extra letter changes what the argument is.

@emlai
Copy link

emlai commented Jul 27, 2022

I agree with using an attribute, to avoid stealing common variable names, for example: @tailcall return F(...);.

Also I think it's too early for this proposal. There are much more important things that would be better focused on right now.

@geoffromer
Copy link
Contributor

The "Tail Calls" section of this C++ proposal might also be a useful reference, if you'll pardon the self-citation.

I personally agree that tail return is probably not the best syntax, but syntax is easy to change, so I think we shouldn't focus too much on that. The more interesting questions to consider are things like:

  • Does tail return (or whatever the syntax is) require the implementation to perform tail call elimination? True, it's "just an optimization", but it's a very powerful one: it reduces consumption of a very scarce resource (stack space) from O(N) to O(1), so a lot of code that would benefit from this optimization simply won't work if the optimization isn't applied.
  • What requirements does tail return impose on the caller and the callee? Tail call elimination only works if no code at all is executed after the tail call, including things like destructors. So do we disallow local variables with nontrivial destructors, or do we run destructors before the tail call is entered? This feeds back into the previous point, because running the destructors early could observably change the behavior of the program, which makes it harder to argue this is merely an optimization.

@josh11b
Copy link
Contributor Author

josh11b commented Jul 27, 2022

  • I think tail return should require the implementation to perform tail-call elimination. It makes it easier to reason about, and making it an explicit in the source allows it to have different semantics than ordinary calls. I think the "guarantees not optimizations" policy matches our focus on predictability and control over performance. It matches the explicit and straightforward philosophy of Carbon.
  • I think we should generate an error if any code would execute after the tail call. The error would say which variables would need to be destroyed early using ~x to make the call legal. This would make the early destruction of variables visible in the source, so there are no surprises, e.g. if you are trying to do RAII or something.

@OlaFosheimGrostad
Copy link

  • What requirements does tail return impose on the caller and the callee? Tail call elimination only works if no code at all is executed after the tail call, including things like destructors. So do we disallow local variables with nontrivial destructors, or do we run destructors before the tail call is entered? This feeds back into the previous point, because running the destructors early could observably change the behavior of the program, which makes it harder to argue this is merely an optimization.

Is the idea that you can write a tail return statement anywhere in the function and require that the compiler attempts to do a program transform that makes tail call optimization possible, while preserving externally observable effects? Basically some sort of program synthesis with verification of effects?

That is quite heavy duty… and you would need full access to all source code used. But, having better functional programming support than C++ would be a great selling point.

@OlaFosheimGrostad
Copy link

OlaFosheimGrostad commented Jul 28, 2022

  • I think we should generate an error if any code would execute after the tail call.

If you can move the constructor/destructor pair out of the resulting loop and get the same behaviour, why not? If the only observable effect is that you get fewer calls to malloc()/free() because reuse is possible, why not?

Added: You could define a concept of «internal side effects» as a change in behaviour that is contained to the runtime/standard library. Then allow all of those «internal side effects» in functions that request tail return and other 'high level optimizations'. That could also include doing things like reserving a predicted buffer size for std::vector before the loop and use shrink_to_fit after loop in order to prevent many increasing reallocations inside the loop.

@dilijev
Copy link

dilijev commented Jul 28, 2022

In the ES spec (ECMA Script / javascript) proposals for tail call/return (and then revising the spec in that area for practical reasons) there were a few big issues for discussion and debated in terms of implementation:

  1. Should tail call behavior be the default? (Largely agreed no, except by the functional programming community)
    • My answer: not default behavior, opt-in.
  2. Should there be a syntax to opt-in? (or, if (1) is yes, a syntax to opt-out)
    • My answer: yes, syntax to opt-in.
  3. Should the opt-in be (a) per-call-site, or (b) per function (attribute or syntax at point of declaration)
    • My answer: opt-in per call-site.
    • This is a question worth debating, IMO. At least the ECMAScript community had a fair amount of debate over it, as I recall.
    • Per function at declaration means any caller of a library API doesn't need to know the function is tail-callable and gets a performance benefit. This seems broadly a bad idea because it would be surprising.
    • Per call site means the programmer calling the function is aware of the tail call semantics and will not be surprised.
  4. Should there be an error if a tail-call-function [marked at declaration as tail call [i.e. 3b]] is called but not in tail position (probably not -- unless the programming paradigm truly expects all tail call functions to be optimized even if it's clearly not tail position)
    • My answer: if 3(b), I think such a function should still be allowed in non-tail position, because the programmer will be aware of the semantic difference if they understand tail calls - IOW it will not be a surprise that it works, and it will not block the programmer from doing what they want (using a function to perform its function in a reasonable context regardless of tail position)
  5. Must a tail call be part of a return statement? What if the function's return is void? Can we / should we spec tail-call of functions in tail call position when there is no return?
    • Having tail return be the only type of tail call simplifies things because there's no implicit function in tail position with suddenly confusing behavior (potentially confusing in the event of a crash or debugging missing frames you'd expect to be there, if you aren't thinking about tail call when you write the code)
    • In most languages functions that have a void return type can issue a return; statement to exit early - if we allow a function that returns void to return void; as a trivial case, that allows return <expr-evaluating-to-void>; then, syntactically, we can also tail return <void>; - that's an elegant solution to requiring tail call to be a return in the context of void functions.
  6. Debugging and crashes (backtraces)! If you tail call do you lose all the debugging context? What context can be kept? (e.g. Number of tail call frames overwritten, the number of mutually recursive functions involved if it's not directly self-recursive, etc.) Obviously you'd want to collapse it somehow for the same reason that tail call wants to get rid of O(n) recursive frames.
  7. Are there to be any restrictions on the complexity or feature set of a function that can be tail-called? For example, functional programming generally likes immutable parameters, will pointer arguments be allowed?

@nigeltao
Copy link

I think tail return should require (emphasis added) the implementation to perform tail-call elimination.

There's the Carbon explorer (interpreter) and a production Carbon compiler is (or will be) based on LLVM. It's not obvious to me whether it will be feasible to transpile Carbon to C++ the way the original CFront transpiled C++ to C code (not to object code). Feasible meaning not just a proof-of-concept, but something you could use in production.

C++ is Turing complete, but still, requiring TCE (Tail Call Elimination) could make transpilation harder.

There's no definitive right answer and it's all trade-offs. I was just wondering whether it was an explicit goal or non-goal to rule in or out a CFront-like thing. (This question could possibly spin out as a separate issue).

@geoffromer
Copy link
Contributor

I was just wondering whether it was an explicit goal or non-goal to rule in or out a CFront-like thing.

See the interoperability goals proposal:

The Carbon toolchain will support, as an option, generating a C++ header and object file from a Carbon library, with an ABI that's suitable for use with non-Carbon toolchains.

But see also the following paragraph, about how C++ code generation will not necessarily support all Carbon features with full fidelity. This is only intended as an interoperability tool; we definitely don't expect or intend to go through a phase where all Carbon code is transpiled to C++, the way all C++ code was transpiled to C in the CFront era.

@josh11b
Copy link
Contributor Author

josh11b commented Jul 29, 2022

@chandlerc Would you like to weigh in on the CFront-style transpiler is something we want to support?

@OlaFosheimGrostad
Copy link

Just for reference: ATS is using fn for regular functions and fnx for mutually recursive functions that should be tail call optimized:
http://ats-lang.sourceforge.net/DOCUMENT/INT2PROGINATS/HTML/x715.html

@chandlerc
Copy link
Contributor

@chandlerc Would you like to weigh in on the CFront-style transpiler is something we want to support?

I'm not opposed to some folks working on this if they want, but I think there are good reasons why neither the explorer nor the main toolchain should take this direction.

For the explorer, I think its primary value is in simplicity and very clearly and unambiguously matching the intended semantics of the language. I would worry about trying to generate code via mapping into C/C++ source code here being very difficult to avoid accidental semantic effects from the target language. The most effective way I know to do this is to map it into a very low level of the target language, which works but is also somewhat difficult and labor intensive. It's not clear this provides enough value on top of the fairly simple and effective high level interpreter in the explorer. But this is just my impression, and I don't have the deepest understanding of the explorer so I may be missing things.

For the toolchain, our goal is to allow this to become a production quality toolchain as the experiment matures and undertakes increasingly significant real-world evaluations. I suspect that compiling down to C/C++ source code would significantly hamper this because it would be hard to realize some of the specific value propositions of Carbon with this design. For example, I would expect such a toolchain to be somewhat brittle, have significantly worse compile times, and potentially generate poor performing code because of being forced to indirect through a source language rather than directly working with the underlying compiler infrastructure.

The generics model of Carbon would also have to be entirely handled before emitting the low-level code, and I suspect a very large fraction of the complexity of any toolchain will be consumed in the generics and type checking infrastructure that won't be saved by lowering to C/C++.

But again, while these points may argue against using this strategy for these two efforts, they don't necessarily argue against having such an implementation for other reasons.

@nigeltao
Copy link

while these points may argue against using this strategy for these two efforts, they don't necessarily argue against having such an implementation for other reasons.

I don't think it's controversial that neither the explorer nor the main toolchain uses transpliation. I'm more concerned about other toolchains.

My question is, if it comes down to "(1) e.g. Mandatory TCE (Tail Call Elimination); (2) Transpiling Carbon to C++ is viable; choose one", is there early guidance on Carbon's choice?

@chandlerc
Copy link
Contributor

while these points may argue against using this strategy for these two efforts, they don't necessarily argue against having such an implementation for other reasons.

I don't think it's controversial that neither the explorer nor the main toolchain uses transpliation. I'm more concerned about other toolchains.

My question is, if it comes down to "(1) e.g. Mandatory TCE (Tail Call Elimination); (2) Transpiling Carbon to C++ is viable; choose one", is there early guidance on Carbon's choice?

This is a useful distillation and gets to my underlying hesitancy with transpilation.

I do not think Carbon's semantics should be constrained by what we can transpile to C/C++.

So I think that means if the only tradeoff is the above, it's option (1) all the way.

There are lots of fancy, kludgy ways to transpile to get very precise functionality like this. If Carbon benefits from guaranteed TCE, then those will be necessary, and that's the cost of trying to transpile. I really want carbon to be free to innovate semantically here.

@nigeltao
Copy link

It's tangential from tail return (and Carbon) but, speaking of transpilation and CFront, https://github.com/hsutter/cppfront was recently mentioned at CppCon.

@github-actions
Copy link

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please comment or remove the inactive label. The long term label can also be added for issues which are expected to take time.
This issue is labeled inactive because the last activity was over 90 days ago.

@github-actions github-actions bot added the inactive Issues and PRs which have been inactive for at least 90 days. label Dec 17, 2022
@chandlerc chandlerc added long term Issues expected to take over 90 days to resolve. and removed inactive Issues and PRs which have been inactive for at least 90 days. labels Dec 21, 2022
@Akashgola123
Copy link

I want to work on this issue please assign this issue to me

@jonmeow
Copy link
Contributor

jonmeow commented Mar 10, 2023

@Akashgola123 We typically won't assign issues. I've added some documentation for why. Please feel to work on it if you have time, but please through the proposal process to get a better understanding of what to expect.

@jonmeow jonmeow removed the good first issue Possibly a good first issue for newcomers label Mar 10, 2023
@jonmeow
Copy link
Contributor

jonmeow commented Mar 10, 2023

Also, just to note -- I've checked with josh11b, and as proposals can be a little more complex, I've unmarked this as a "good first issue".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
long term Issues expected to take over 90 days to resolve.
Projects
None yet
Development

No branches or pull requests