Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

isdistinct doesn't work for null #2879

Open
floitsch opened this issue Jul 27, 2023 · 8 comments
Open

isdistinct doesn't work for null #2879

floitsch opened this issue Jul 27, 2023 · 8 comments
Labels
docs Only related to documentation

Comments

@floitsch
Copy link

Environment

  • PostgreSQL version: (if using docker, specify the image)
    public.ecr.aws/supabase/postgres:15.1.0.90

  • PostgREST version: (if using docker, specify the image)
    public.ecr.aws/supabase/postgrest:v11.1.0

  • Operating system:
    Linux

Description of issue

isdistinct doesn't work for null values. It is not possible to get all rows that do not have null in it.

Repro:

CREATE TABLE bug (
  id SERIAL PRIMARY KEY,
  data TEXT,
  b BOOLEAN
);

INSERT INTO bug (data, b)
  VALUES
    ('null', null),
    (null, false),
    ('foobar', null);
ᐅ curl 'http://localhost:49321/rest/v1/bug?data=isdistinct.null'
[{"id":2,"data":null,"b":false}, 
 {"id":3,"data":"foobar","b":null}]%                                                                                                                                                  

ᐅ curl 'http://localhost:49321/rest/v1/bug?b=isdistinct.null'
{"code":"22P02","details":null,"hint":null,"message":"invalid input syntax for type boolean: \"null\""}%                                                                              

As can be seen in the example: the value that is passed to isdistinct must be the type of the column, and if the column is of type string, then the isdistinct treats it like the string 'null' and not the null value.

Expected behavior:
isdistinct behaves similar to IS DISTINCT FROM (as the documentation states) and thus:

  • isdistinct.null compares to the null value.
  • isdistinct."null" looks for the 'null' string.
@laurenceisla
Copy link
Member

laurenceisla commented Jul 28, 2023

Treating the null value in the query string as a text string is a current limitation for isdistinct. It also happens for eq, like and similar comparison operators. Only is treats the value as null IIRC.

I'm not so sure about treating only the quoted "null" as a string, because other strings don't behave like that currently, e.g. isdistinct="value" is the same as IS DISTINCT FROM '"value"' (it keeps the double quote).

We would need to get to an agreement on the syntax or if it should be implemented, because using not.is.null as an alternative is available.

@steve-chavez
Copy link
Member

curl 'http://localhost:49321/rest/v1/bug?b=isdistinct.null'

A boolean looks more fit for is.unknown (three-valued logic):

curl 'http://localhost:49321/rest/v1/bug?b=is.unknown'

@floitsch
Copy link
Author

I'm fine either way, as long as the documentation is updated.

Note that quotes already dropped inside an or (or similar construct): or=(data.eq."foo") is the same as or=(data.eq.foo)

(btw: is there a good place to find all these rules and maybe some test-cases, so that library writers can make sure they get all these cases correct?)

@laurenceisla
Copy link
Member

Note that quotes already dropped inside an or (or similar construct): or=(data.eq."foo") is the same as or=(data.eq.foo)

That's interesting, then maybe we aren't being completely consistent in parsing operators, or maybe there's something special that I'm missing for logical operators.

is there a good place to find all these rules and maybe some test-cases, so that library writers can make sure they get all these cases correct?

For now, this is documented in the Horizontal Filtering and the Working With PG Types sections in the docs. The tests we use for Filters could also be useful:

describe "Filtering response" $ do
it "matches with equality" $
get "/items?id=eq.5"
`shouldRespondWith` [json| [{"id":5}] |]
{ matchHeaders = ["Content-Range" <:> "0-0/*"] }
it "matches with equality using not operator" $
get "/items?id=not.eq.5&order=id"
`shouldRespondWith` [json| [{"id":1},{"id":2},{"id":3},{"id":4},{"id":6},{"id":7},{"id":8},{"id":9},{"id":10},{"id":11},{"id":12},{"id":13},{"id":14},{"id":15}] |]
{ matchHeaders = ["Content-Range" <:> "0-13/*"] }
it "matches with more than one condition using not operator" $
get "/simple_pk?k=like.*yx&extra=not.eq.u" `shouldRespondWith` "[]"
it "matches with inequality using not operator" $ do
get "/items?id=not.lt.14&order=id.asc"
`shouldRespondWith` [json| [{"id":14},{"id":15}] |]
{ matchHeaders = ["Content-Range" <:> "0-1/*"] }
get "/items?id=not.gt.2&order=id.asc"
`shouldRespondWith` [json| [{"id":1},{"id":2}] |]
{ matchHeaders = ["Content-Range" <:> "0-1/*"] }
it "matches items IN" $
get "/items?id=in.(1,3,5)"
`shouldRespondWith` [json| [{"id":1},{"id":3},{"id":5}] |]
{ matchHeaders = ["Content-Range" <:> "0-2/*"] }
it "matches items NOT IN using not operator" $
get "/items?id=not.in.(2,4,6,7,8,9,10,11,12,13,14,15)"
`shouldRespondWith` [json| [{"id":1},{"id":3},{"id":5}] |]
{ matchHeaders = ["Content-Range" <:> "0-2/*"] }
it "matches nulls using not operator" $
get "/no_pk?a=not.is.null" `shouldRespondWith`
[json| [{"a":"1","b":"0"},{"a":"2","b":"0"}] |]
{ matchHeaders = [matchContentTypeJson] }
it "matches nulls in varchar and numeric fields alike" $ do
get "/no_pk?a=is.null" `shouldRespondWith`
[json| [{"a": null, "b": null}] |]
{ matchHeaders = [matchContentTypeJson] }
get "/nullable_integer?a=is.null" `shouldRespondWith` [json|[{"a":null}]|]
it "matches with trilean values" $ do
get "/chores?done=is.true" `shouldRespondWith`
[json| [{"id": 1, "name": "take out the garbage", "done": true }] |]
{ matchHeaders = [matchContentTypeJson] }
get "/chores?done=is.false" `shouldRespondWith`
[json| [{"id": 2, "name": "do the laundry", "done": false }] |]
{ matchHeaders = [matchContentTypeJson] }
get "/chores?done=is.unknown" `shouldRespondWith`
[json| [{"id": 3, "name": "wash the dishes", "done": null }] |]
{ matchHeaders = [matchContentTypeJson] }
it "matches with trilean values in upper or mixed case" $ do
get "/chores?done=is.NULL" `shouldRespondWith`
[json| [{"id": 3, "name": "wash the dishes", "done": null }] |]
{ matchHeaders = [matchContentTypeJson] }
get "/chores?done=is.TRUE" `shouldRespondWith`
[json| [{"id": 1, "name": "take out the garbage", "done": true }] |]
{ matchHeaders = [matchContentTypeJson] }
get "/chores?done=is.FAlSe" `shouldRespondWith`
[json| [{"id": 2, "name": "do the laundry", "done": false }] |]
{ matchHeaders = [matchContentTypeJson] }
get "/chores?done=is.UnKnOwN" `shouldRespondWith`
[json| [{"id": 3, "name": "wash the dishes", "done": null }] |]
{ matchHeaders = [matchContentTypeJson] }
it "fails if 'is' used and there's no null or trilean value" $ do
get "/chores?done=is.nil" `shouldRespondWith` 400
get "/chores?done=is.ok" `shouldRespondWith` 400
it "matches with like" $ do
get "/simple_pk?k=like.*yx" `shouldRespondWith`
[json|[{"k":"xyyx","extra":"u"}]|]
get "/simple_pk?k=like.xy*" `shouldRespondWith`
[json|[{"k":"xyyx","extra":"u"}]|]
get "/simple_pk?k=like.*YY*" `shouldRespondWith`
[json|[{"k":"xYYx","extra":"v"}]|]
it "matches with like using not operator" $
get "/simple_pk?k=not.like.*yx" `shouldRespondWith`
[json|[{"k":"xYYx","extra":"v"}]|]
it "matches with ilike" $ do
get "/simple_pk?k=ilike.xy*&order=extra.asc" `shouldRespondWith`
[json|[{"k":"xyyx","extra":"u"},{"k":"xYYx","extra":"v"}]|]
{ matchHeaders = [matchContentTypeJson] }
get "/simple_pk?k=ilike.*YY*&order=extra.asc" `shouldRespondWith`
[json|[{"k":"xyyx","extra":"u"},{"k":"xYYx","extra":"v"}]|]
{ matchHeaders = [matchContentTypeJson] }
it "matches with ilike using not operator" $
get "/simple_pk?k=not.ilike.xy*&order=extra.asc" `shouldRespondWith` "[]"
it "matches with ~" $ do
get "/simple_pk?k=match.yx$" `shouldRespondWith`
[json|[{"k":"xyyx","extra":"u"}]|]
get "/simple_pk?k=match.^xy" `shouldRespondWith`
[json|[{"k":"xyyx","extra":"u"}]|]
get "/simple_pk?k=match.YY" `shouldRespondWith`
[json|[{"k":"xYYx","extra":"v"}]|]
it "matches with ~ using not operator" $
get "/simple_pk?k=not.match.yx$" `shouldRespondWith`
[json|[{"k":"xYYx","extra":"v"}]|]
it "matches with ~*" $ do
get "/simple_pk?k=imatch.^xy&order=extra.asc" `shouldRespondWith`
[json|[{"k":"xyyx","extra":"u"},{"k":"xYYx","extra":"v"}]|]
{ matchHeaders = [matchContentTypeJson] }
get "/simple_pk?k=imatch..*YY.*&order=extra.asc" `shouldRespondWith`
[json|[{"k":"xyyx","extra":"u"},{"k":"xYYx","extra":"v"}]|]
{ matchHeaders = [matchContentTypeJson] }
it "matches with ~* using not operator" $
get "/simple_pk?k=not.imatch.^xy&order=extra.asc" `shouldRespondWith` "[]"
describe "Full text search operator" $ do
it "finds matches with to_tsquery" $
get "/tsearch?text_search_vector=fts.impossible" `shouldRespondWith`
[json| [{"text_search_vector": "'fun':5 'imposs':9 'kind':3" }] |]
{ matchHeaders = [matchContentTypeJson] }
it "can use lexeme boolean operators(&=%26, |=%7C, !) in to_tsquery" $ do
get "/tsearch?text_search_vector=fts.fun%26possible" `shouldRespondWith`
[json| [ {"text_search_vector": "'also':2 'fun':3 'possibl':8"}] |]
{ matchHeaders = [matchContentTypeJson] }
get "/tsearch?text_search_vector=fts.impossible%7Cpossible" `shouldRespondWith`
[json| [
{"text_search_vector": "'fun':5 'imposs':9 'kind':3"},
{"text_search_vector": "'also':2 'fun':3 'possibl':8"}] |]
{ matchHeaders = [matchContentTypeJson] }
get "/tsearch?text_search_vector=fts.fun%26!possible" `shouldRespondWith`
[json| [ {"text_search_vector": "'fun':5 'imposs':9 'kind':3"}] |]
{ matchHeaders = [matchContentTypeJson] }
it "finds matches with plainto_tsquery" $
get "/tsearch?text_search_vector=plfts.The%20Fat%20Rats" `shouldRespondWith`
[json| [ {"text_search_vector": "'ate':3 'cat':2 'fat':1 'rat':4" }] |]
{ matchHeaders = [matchContentTypeJson] }
when (actualPgVersion >= pgVersion112) $ do
it "finds matches with websearch_to_tsquery" $
get "/tsearch?text_search_vector=wfts.The%20Fat%20Rats" `shouldRespondWith`
[json| [ {"text_search_vector": "'ate':3 'cat':2 'fat':1 'rat':4" }] |]
{ matchHeaders = [matchContentTypeJson] }
it "can use boolean operators(and, or, -) in websearch_to_tsquery" $ do
get "/tsearch?text_search_vector=wfts.fun%20and%20possible"
`shouldRespondWith`
[json| [ {"text_search_vector": "'also':2 'fun':3 'possibl':8"}] |]
{ matchHeaders = [matchContentTypeJson] }
get "/tsearch?text_search_vector=wfts.impossible%20or%20possible"
`shouldRespondWith`
[json| [
{"text_search_vector": "'fun':5 'imposs':9 'kind':3"},
{"text_search_vector": "'also':2 'fun':3 'possibl':8"}]
|]
{ matchHeaders = [matchContentTypeJson] }
get "/tsearch?text_search_vector=wfts.fun%20and%20-possible"
`shouldRespondWith`
[json| [ {"text_search_vector": "'fun':5 'imposs':9 'kind':3"}] |]
{ matchHeaders = [matchContentTypeJson] }
it "finds matches with different dictionaries" $ do
get "/tsearch?text_search_vector=fts(french).amusant" `shouldRespondWith`
[json| [{"text_search_vector": "'amus':5 'fair':7 'impossibl':9 'peu':4" }] |]
{ matchHeaders = [matchContentTypeJson] }
get "/tsearch?text_search_vector=plfts(french).amusant%20impossible" `shouldRespondWith`
[json| [{"text_search_vector": "'amus':5 'fair':7 'impossibl':9 'peu':4" }] |]
{ matchHeaders = [matchContentTypeJson] }
when (actualPgVersion >= pgVersion112) $
get "/tsearch?text_search_vector=wfts(french).amusant%20impossible"
`shouldRespondWith`
[json| [{"text_search_vector": "'amus':5 'fair':7 'impossibl':9 'peu':4" }] |]
{ matchHeaders = [matchContentTypeJson] }
it "can be negated with not operator" $ do
get "/tsearch?text_search_vector=not.fts.impossible%7Cfat%7Cfun" `shouldRespondWith`
[json| [
{"text_search_vector": "'amus':5 'fair':7 'impossibl':9 'peu':4"},
{"text_search_vector": "'art':4 'spass':5 'unmog':7"}]|]
{ matchHeaders = [matchContentTypeJson] }
get "/tsearch?text_search_vector=not.fts(english).impossible%7Cfat%7Cfun" `shouldRespondWith`
[json| [
{"text_search_vector": "'amus':5 'fair':7 'impossibl':9 'peu':4"},
{"text_search_vector": "'art':4 'spass':5 'unmog':7"}]|]
{ matchHeaders = [matchContentTypeJson] }
get "/tsearch?text_search_vector=not.plfts.The%20Fat%20Rats" `shouldRespondWith`
[json| [
{"text_search_vector": "'fun':5 'imposs':9 'kind':3"},
{"text_search_vector": "'also':2 'fun':3 'possibl':8"},
{"text_search_vector": "'amus':5 'fair':7 'impossibl':9 'peu':4"},
{"text_search_vector": "'art':4 'spass':5 'unmog':7"}]|]
{ matchHeaders = [matchContentTypeJson] }
when (actualPgVersion >= pgVersion112) $
get "/tsearch?text_search_vector=not.wfts(english).impossible%20or%20fat%20or%20fun"
`shouldRespondWith`
[json| [
{"text_search_vector": "'amus':5 'fair':7 'impossibl':9 'peu':4"},
{"text_search_vector": "'art':4 'spass':5 'unmog':7"}]|]
{ matchHeaders = [matchContentTypeJson] }
context "Use of the phraseto_tsquery function" $ do
it "finds matches" $
get "/tsearch?text_search_vector=phfts.The%20Fat%20Cats" `shouldRespondWith`
[json| [{"text_search_vector": "'ate':3 'cat':2 'fat':1 'rat':4" }] |]
{ matchHeaders = [matchContentTypeJson] }
it "finds matches with different dictionaries" $
get "/tsearch?text_search_vector=phfts(german).Art%20Spass" `shouldRespondWith`
[json| [{"text_search_vector": "'art':4 'spass':5 'unmog':7" }] |]
{ matchHeaders = [matchContentTypeJson] }
it "can be negated with not operator" $
get "/tsearch?text_search_vector=not.phfts(english).The%20Fat%20Cats" `shouldRespondWith`
[json| [
{"text_search_vector": "'fun':5 'imposs':9 'kind':3"},
{"text_search_vector": "'also':2 'fun':3 'possibl':8"},
{"text_search_vector": "'amus':5 'fair':7 'impossibl':9 'peu':4"},
{"text_search_vector": "'art':4 'spass':5 'unmog':7"}]|]
{ matchHeaders = [matchContentTypeJson] }
it "can be used with or query param" $
get "/tsearch?or=(text_search_vector.phfts(german).Art%20Spass, text_search_vector.phfts(french).amusant, text_search_vector.fts(english).impossible)" `shouldRespondWith`
[json|[
{"text_search_vector": "'fun':5 'imposs':9 'kind':3" },
{"text_search_vector": "'amus':5 'fair':7 'impossibl':9 'peu':4" },
{"text_search_vector": "'art':4 'spass':5 'unmog':7"}
]|] { matchHeaders = [matchContentTypeJson] }
it "matches with computed column" $
get "/items?always_true=eq.true&order=id.asc" `shouldRespondWith`
[json| [{"id":1},{"id":2},{"id":3},{"id":4},{"id":5},{"id":6},{"id":7},{"id":8},{"id":9},{"id":10},{"id":11},{"id":12},{"id":13},{"id":14},{"id":15}] |]
{ matchHeaders = [matchContentTypeJson] }
it "order by computed column" $
get "/items?order=anti_id.desc" `shouldRespondWith`
[json| [{"id":1},{"id":2},{"id":3},{"id":4},{"id":5},{"id":6},{"id":7},{"id":8},{"id":9},{"id":10},{"id":11},{"id":12},{"id":13},{"id":14},{"id":15}] |]
{ matchHeaders = [matchContentTypeJson] }
it "cannot access a computed column that is outside of the config schema" $
get "/items?always_false=is.false" `shouldRespondWith` 400
it "matches filtering nested items" $
get "/clients?select=id,projects(id,tasks(id,name))&projects.tasks.name=like.Design*" `shouldRespondWith`
[json|[{"id":1,"projects":[{"id":1,"tasks":[{"id":1,"name":"Design w7"}]},{"id":2,"tasks":[{"id":3,"name":"Design w10"}]}]},{"id":2,"projects":[{"id":3,"tasks":[{"id":5,"name":"Design IOS"}]},{"id":4,"tasks":[{"id":7,"name":"Design OSX"}]}]}]|]
{ matchHeaders = [matchContentTypeJson] }
it "errs when the embedded resource doesn't exist and an embedded filter is applied to it" $ do
get "/clients?select=*&non_existent_projects.name=like.*NonExistent*" `shouldRespondWith`
[json|
{"hint":"Verify that 'non_existent_projects' is included in the 'select' query parameter.",
"details":null,
"code":"PGRST108",
"message":"'non_existent_projects' is not an embedded resource in this request"}|]
{ matchStatus = 400
, matchHeaders = [matchContentTypeJson]
}
get "/clients?select=*,amiga_projects:projects(*)&amiga_projectsss.name=ilike.*Amiga*" `shouldRespondWith`
[json|
{"hint":"Verify that 'amiga_projectsss' is included in the 'select' query parameter.",
"details":null,
"code":"PGRST108",
"message":"'amiga_projectsss' is not an embedded resource in this request"}|]
{ matchStatus = 400
, matchHeaders = [matchContentTypeJson]
}
get "/clients?select=id,projects(id,tasks(id,name))&projects.tasks2.name=like.Design*" `shouldRespondWith`
[json|
{"hint":"Verify that 'tasks2' is included in the 'select' query parameter.",
"details":null,
"code":"PGRST108",
"message":"'tasks2' is not an embedded resource in this request"}|]
{ matchStatus = 400
, matchHeaders = [matchContentTypeJson]
}
it "matches with cs operator" $
get "/complex_items?select=id&arr_data=cs.{2}" `shouldRespondWith`
[json|[{"id":2},{"id":3}]|]
{ matchHeaders = [matchContentTypeJson] }
it "matches with cd operator" $
get "/complex_items?select=id&arr_data=cd.{1,2,4}" `shouldRespondWith`
[json|[{"id":1},{"id":2}]|]
{ matchHeaders = [matchContentTypeJson] }
it "matches with IS DISTINCT FROM" $
get "/no_pk?select=a&a=isdistinct.2" `shouldRespondWith`
[json|[{"a":null},{"a":"1"}]|]
{ matchHeaders = [matchContentTypeJson] }
it "matches with IS DISTINCT FROM using not operator" $
get "/no_pk?select=a&a=not.isdistinct.2" `shouldRespondWith`
[json|[{"a":"2"}]|]
{ matchHeaders = [matchContentTypeJson] }

@floitsch
Copy link
Author

floitsch commented Aug 2, 2023

For now, this is documented in the Horizontal Filtering and the Working With PG Types sections in the docs. The tests we use for Filters could also be useful[.]

Thanks.
Maybe I missed it, but I was mostly looking at how to escape input data correctly. Experimentally, I found that I'm allowed (and sometimes required) to quote strings if they are nested. But I'm not allowed to do that if they aren't.

When quoted, then I need to escape any \ and ", but I don't escape any other character.

I also need to quote column names that are equal to "or" or "and", and I do quote some other column names as well. However, I'm not completely sure what the rules for column names are. I wasn't able to access a column named ". (To be fair, I didn't manage that with pure SQL either, but Postgresql does apparently allow it).

Here is my encoding function: https://github.com/toitware/toit-supabase/blob/06f434e8e35705063965320639c0d8299a95c995/src/filter.toit#L305
(and the encode_column just below).

Here are my test-cases for some string values:
https://github.com/toitware/toit-supabase/blob/06f434e8e35705063965320639c0d8299a95c995/tests/supabase_filters_test.toit#L49

@steve-chavez
Copy link
Member

Finally got to understand this issue. So the problem is that:

col=isdistinct.null

Translates to:

col IS DISTINCT FROM 'null'

And it's expected to translate to

col IS DISTINCT FROM NULL

Which is reasonable. Though as Laurence mentions, it's possible to get the same behavior with not.is.null.

Expected behavior:
isdistinct behaves similar to IS DISTINCT FROM (as the documentation states) and thus:
isdistinct.null compares to the null value.

Seems possible to get that behavior by doing:

col is distinct from nullif('null', 'null')::col_type;

But not sure if we should do it or we should just document the limitation.

@wolfgangwalther
Copy link
Member

I don't think we should change anything (except maybe for better documentation). The rules are simple. If you want to compare the column data with some value passed via query string, you will need to:

  • use is / not.is when you want to compare against value NULL.
  • use eq / not.eq when you want to compare against a non-null value, but never want to match when data IS NULL.
  • use isdistinct when you want to compare against a non-null value and always want to match when data IS NULL.

not.isdistinct is only useful when you compare two columns, but we don't allow that. As soon as you compare against a constant, which we always do, you can choose the operator accordingly.

@wolfgangwalther
Copy link
Member

wolfgangwalther commented Jan 20, 2024

Expected behavior:
isdistinct behaves similar to IS DISTINCT FROM (as the documentation states) and thus:

isdistinct.null compares to the null value.
isdistinct."null" looks for the 'null' string.

Before we could do anything like that, we would need to clean up the whole quoting mess, as you already wrote in the other comments. See also: #1943

@wolfgangwalther wolfgangwalther added docs Only related to documentation and removed question labels May 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Only related to documentation
Development

No branches or pull requests

4 participants