Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix plan unparsing for filters with subqueries involved type coercion #26

Merged

Conversation

sgrebnov
Copy link
Collaborator

@sgrebnov sgrebnov commented Aug 1, 2024

DataFusion optimization uses aliases to preserve original expression names during optimizations and type coercions:
apache/datafusion#3794

Datafusion contains special logic to prevent aliasing for filter/join plans, but I still see the alias can be produced for some complex queries/cases
https://github.com/apache/datafusion/blob/f044bc8371d5b4e1e51a9026f3eccac16a6d4648/datafusion/optimizer/src/utils.rs#L316

Self {
            use_alias: !matches!(plan, LogicalPlan::Filter(_) | LogicalPlan::Join(_)),
        }

If there is a filter with additional alias information, we must use the expression only as original expression name is unnecessary and the alias can't be part of the WHERE clause. Note: we don't currently support Expr::Alias as general unparsing mechanism, put fix directly to where_expr so we are sure that the Alias is part of filter / where clause.

Example query

select c_custkey
from customer
where c_custkey > (select 1 from customer where c_acctbal > 0.00)

There will be the alias with name "customer.c_acctbal > Decimal128(Some(0),38,15)" produced due to type coercion Decimal128(Some(0),38,2) and Decimal128(Some(0),38,15)

/// Ensure `expr` has the name as `original_name` by adding an
    /// alias if necessary.
    pub fn alias_if_changed(self, original_name: String) -> Result<Expr> {
        let new_name = self.name_for_alias()?;

        if new_name == original_name { # customer.c_acctbal > Decimal128(Some(0),38,15) != customer.c_acctbal > Decimal128(Some(0),38, 2)
            return Ok(self);
        }

        Ok(self.alias(original_name))
    }

alias:Alias { expr: BinaryExpr(BinaryExpr { left: Column(Column { relation: None, name: "c_acctbal" }), op: Gt, right: Literal(Decimal128(Some(0),38,2)) }), relation: None, name: "customer.c_acctbal > Decimal128(Some(0),38,15)" }

@phillipleblanc phillipleblanc merged commit c49db80 into datafusion-contrib:main Aug 2, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants