Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PG14: Custom parameter names must be of the form "identifier.identifier" and no dash(-) can be inside them #1857

Closed
congstr opened this issue May 28, 2021 · 34 comments · Fixed by #1968
Assignees
Labels
breaking change A bug fix or enhancement that would cause a breaking change bug

Comments

@congstr
Copy link

congstr commented May 28, 2021

Environment

  • PostgreSQL version: 14beta1
  • PostgREST version: 7.0.1
  • Operating system: CentOS 8

Description of issue

I've been testing my current setup with the latest PostgreSQL v14 beta1. There is a new limitation which does not allow to use the latest PostgreSQL as is.

The root cause can be explained with this snippet:

postgres=# set local "request.jwt.claim.role" = 'webuser';
WARNING:  SET LOCAL can only be used in transaction blocks
ERROR:  invalid configuration parameter name "request.jwt.claim.role"
DETAIL:  Custom parameter names must be of the form "identifier.identifier".

A workaround would be reverting the code which introduced this limitation:

git clone https://github.com/postgres/postgres.git
git revert 3db826bd55cd1df0dd8c3d811f8e5b936d7ba1e4

More details:
postgres/postgres@3db826b

I'm posting this here so people know that this will be an issue soon when the upcoming releases will have to support PostgreSQL v14.

@steve-chavez
Copy link
Member

Thanks a lot for the detailed report @congstr. This looks like trouble.

I wonder if it's intentional to ban GUCs with many dots, I don't see test cases for this here. If it's a bug, could we revert it?

Worst case scenario, we could change the request.jwt.claim.role to request.jwt_claim_role , same for our other GUCs. Having the dot as a namespace looks nicer though.

@congstr
Copy link
Author

congstr commented May 28, 2021

it's intentional to ban GUCs with many dots

not a bug, done intentionally to have only the form "identifier.identifier"

request.jwt_claim_role

I've just tested this on pg14:

postgres=# set local "request.jwt_a_asdf" = 'webuser';
WARNING:  SET LOCAL can only be used in transaction blocks
SET

The good news is that underscore is still valid. How fast would you like to see fix for this done? I'd like to get into knowing the internals of PostgREST much better at one point since I'm using it in production. Any recommendations how to start coding for this project with no prior Haskell experience?

@congstr
Copy link
Author

congstr commented May 28, 2021

"identifier"
should start with one of ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
and then may continue also with these: 0123456789_$

I found out that these are the only characters allowed in pg14beta1.

@steve-chavez
Copy link
Member

I'd like to get into knowing the internals of PostgREST much better at one point since I'm using it in production.

@congstr Awesome! We'd really appreciate your help here :)

The GUCs names are isolated in this function(make sure to scroll to bottom)

runPgLocals :: AppConfig -> M.HashMap Text JSON.Value ->
(ApiRequest -> ExceptT Error H.Transaction Wai.Response) ->
ApiRequest -> ExceptT Error H.Transaction Wai.Response
runPgLocals conf claims app req = do
lift $ H.statement mempty $ H.dynamicallyParameterized
("select " <> intercalateSnippet ", " (searchPathSql : roleSql ++ claimsSql ++ [methodSql, pathSql] ++ headersSql ++ cookiesSql ++ appSettingsSql))
HD.noResult (configDbPreparedStatements conf)
lift $ traverse_ H.sql preReqSql
app req
where
methodSql = setConfigLocal mempty ("request.method", toS $ iMethod req)
pathSql = setConfigLocal mempty ("request.path", toS $ iPath req)
headersSql = setConfigLocal "request.header." <$> iHeaders req
cookiesSql = setConfigLocal "request.cookie." <$> iCookies req
claimsWithRole =
let anon = JSON.String . toS $ configDbAnonRole conf in -- role claim defaults to anon if not specified in jwt
M.union claims (M.singleton "role" anon)
claimsSql = setConfigLocal "request.jwt.claim." <$> [(c,unquoted v) | (c,v) <- M.toList claimsWithRole]

Changing the naming convention would be a breaking change, but since we're heading towards a new major version, it should be fine. You'll likely have to fix some failing tests as well.

How fast would you like to see fix for this done?
Any recommendations how to start coding for this project with no prior Haskell experience?

No rush :). I think LYAH is a good start for Haskell, then running the project with our Nix tooling.

@steve-chavez steve-chavez added breaking change A bug fix or enhancement that would cause a breaking change bug labels May 28, 2021
@wolfgangwalther
Copy link
Member

Hm. That just adds another point to my "rant against GUCs" list: #1794 (comment)

We do have a bit of time left until pg14 is released. I think we should release v8 without changing all the request gucs and then plan ahead to make another release that will be compatible with pg14.

Then, I'd like to bring up my proposal in #1710 (comment) again, where we'd be replacing all the GUCs entirely. The major advantage here: With this proposal, people on older PG versions can run a custom pre-request that sets the GUCs old-style. People on PG14 can set them new-style - or don't use them at all, and use only the function-argument-style. Or use schema variables, once they arrive - hopefully in PG15.

@robertsosinski
Copy link

Would it make sense to perhaps file a bug with the postgresql to say this is a breaking change?

Looking at the conversations here:

  1. https://www.postgresql-archive.org/Tightening-up-allowed-custom-GUC-names-td6178392.html
  2. https://www.postgresql.org/message-id/flat/20210209144059.GA21360%40depesz.com

It seems the contributors did not think this change would break anything, and that there were more worried about special characters such as = and -. Saying that a production project needs to allow . as part of GUC identifiers could nip this in the bud while postgresql 14 is still in beta stage (which is when you want to identify and fix breaking changes such as this).

Just switching . to _ would clobber 'namespacing' for GUC values. Example: request.app.value_x -> request.app_value_x makes it much harder to organize parameters.

Thanks!

@steve-chavez
Copy link
Member

Would it make sense to perhaps file a bug with the postgresql to say this is a breaking change?

@robertsosinski Yes, I absolutely agree. Could you file the bug 🙏?

I also believe the namespacing is a good thing to have. I don't see why it should be restricted.

@robertsosinski
Copy link

Just filed a bug report with PostgreSQL (#17045). FYI, you can also submit bugs to the PostgreSQL team via this webform here: https://www.postgresql.org/account/submitbug/ which allows you to signup/login via GitHub. You should be able to see this bug report at https://www.postgresql.org/list/pgsql-bugs/2021-06/ once it clears moderation queue.

Copying the bug report below, and also attaching screen snip for history.

PostgreSQL version: 14beta1

Operating system: All

Short description: 14 Beta Tighten up allowed names for custom GUC parameters breaks PostgREST

Details:

Recently (April 7th 2020) a new change was made to the PostgreSQL 14 beta that changes allowed names for custom GUC keys. Please see the following:

  1. The code change: postgres/postgres@3db826b
  2. Initial question about valid GUC keys: https://www.postgresql.org/message-id/flat/20210209144059.GA21360%40depesz.com
  3. More conversation about what valid GUC keys should be: https://www.postgresql-archive.org/Tightening-up-allowed-custom-GUC-names-td6178392.html

This change allows there to only be one . in the GUC key. The problem is that PostgreSQL uses multiple . in GUC keys to nest parameters, such as request.jwt.claim.role. As such this would be a change that significantly breaks PostgREST.

You can see the conversation at PostgREST here: #1857

It seems this change was mainly targeting special characters such as - and =, could this be updated to allow multiple . characters for GUC keys as well? There does not seem there is any issues with multiple . characters in GUC keys.

Thanks!

postgresql bug 2021-06-02 143439

@robertsosinski
Copy link

Hey Everyone,

FYI, bug report is in, you can see the message thread here: https://www.postgresql.org/message-id/17045-6a4a9f0d1513f72b%40postgresql.org

Already, Tom Lane has responded: https://www.postgresql.org/message-id/165910.1622665167%40sss.pgh.pa.us

Looks like one concern is succinctly describing what a valid GUC parameter name would be if multiple "." are included, but that they must include at least one ".". I gave some ideas (see below) but if y'all have better ones, please let me know.

I responded with the following, should show up on the message list soon.

Hi Tom,

Thanks for the quick and thoughtful reply. Really appreciate your support!

I would revise my statement to say this change to PostgreSQL 14 would significantly break existing PostgREST applications, to the point that they would be unable to upgrade to 14. The way PostgREST works, these GUC parameters are used extensively in user defined functions and SQL queries, with the "." used as the standard way to namespace parameter elements. Syntax-wise, it would be similar as if PostgreSQL 14 changed the schema/table qualifier from . to _. Developers could migrate all their functions and queries before upgrading to 14, but that would be a significant and required change for applications to continue working.

Regarding the error detail, here are some ideas for adding multiple "." characters. Happy for any feedback:

  1. Custom parameter names must include at least one namespace, such as "namespace.custom".
  2. Custom parameter names must include at least one namespace character ".", such as "namespace.parameter" or " namespace.parameter.one".
  3. Custom parameter names must include at least one ".", such as "custom.parameter".
  4. Custom parameter names must include at least one ".", such as "custom.one" or "custom.one.two".
  5. Custom parameter names must include at least one ".", such as "one.two" or "one.two.three".
  6. Custom parameter names must include two or more components, such as "one.two" or "one.two.three".

With these examples, I'm hoping to show that custom parameters must have at least one "namespace" or "." character. Some give an extra example with two dots to show that you can have multiple "namespaces". Again, happy to hear your thoughts!

Also, thanks for all your work on PostgreSQL! This database is Amazing (been using it for 20 years now) and excited to get production workloads on 14!

@robertsosinski
Copy link

Hey again,

So Tom has responded: https://www.postgresql.org/message-id/232375.1622674697%40sss.pgh.pa.us

Looks like going forward, one or more "." characters will be allowed. You can see the updated PostgreSQL 14 commit here: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=2955c2be79b35fa369c83fa3b5f44661cb88afa9

Looks like this should be good to go! Shout out to the PostgreSQL team for solid support (as always)!

@wolfgangwalther
Copy link
Member

Note, that this will not be the only breaking change regarding our usage of GUCs. We map headers, cookies and other claims to GUCs. And certainly, some of that is going to break with the limited set of characters allowed:

should start with one of ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
and then may continue also with these: 0123456789_$

Most prominently all the headers with -, won't work out of the box.

A short term fix could be to replace all other chars with _, so User-Agent would become request.header.user_agent. Although that would be a breaking change for all users, even those who are not using PG14.

If we instead went the way to do #1857 (comment), users with older PG versions could just update their pre-request functions to add the same GUCs as before - essentially preventing any breaking change for their application code. Users on PG14 could either map those special characters to _ and otherwise do the same - or do something entirely different.

@robertsosinski
Copy link

@wolfgangwalther dang, that is correct. I wonder though if trying to 'hunt down' every potential format that these GUCs can take key-wise, if instead cookies/headers would only be obtainable GUC style request.cookie or request.header, and then return a json object, so then the usage could look more like this: current_setting('request.header')::json ->> 'User-Agent'? I think all GUC values must be a string, but let me know if they can natively be json (thus removing the need of ::json in the query.

I could see, potentially, users stuffing all kinds of things in these GUCS, so anything say "user defined" goes into a JSON object; while anything that is more strictly defined would be available via GUC keys?

@wolfgangwalther
Copy link
Member

I thought about that, too. That would match up nicely with how we can set response headers. But it would be the same case with any custom claims in the jwt, which are currently set as request.jwt.claim.xxx. Now, if we make that request.jwt.claims with all of them... we're back to where we started, because we have a breaking change for request.jwt.claim.role. And this is really the one change that breaks most of the existing applications, I think.

I think all GUC values must be a string, but let me know if they can natively be json (thus removing the need of ::json in the query.

Nope, they have to be string.

@robertsosinski
Copy link

robertsosinski commented Jun 3, 2021

Yeah, I'm thinking about this more. I think getting the . was important: 1. as we want to clearly namespace GUC parameters, and 2. it would have been a significant breaking change to how we set every parameter currently (chiefly request.jwt.claim, but many others as well). But going forward I could see unstructured data such as claims, headers, cookies, etc... being in a json format.

Also, moving more "user defined" values into one json instead of multiple GUC values could speed performance, correct? if a claim has 5 parts, that would be 1 get/set operation, instead of 5, right? we could sell this change as a performance benefit ;p

I could also see this being an easier transition for request.jwt.claim.role (the most important imho) and anything within claims. JWT claims are naturally very short (role, aud, exp), so we wont run into the key constraint problem. We could do "both" for a new release, eg:

  1. Old style: current_setting('request.jwt.claim.role') -> 'user' + Deprecation warning
  2. New style: current_setting('request.jwt.claims')::json ->> 'role' -> 'user'

The old style of accessing could be claim, header, or cookie; new style could be claims, headers, or cookies (plural).

Super interested in everyone's thoughts. Thanks for all the great work!

@robertsosinski
Copy link

Also, been catching up on #1710 and came up with another idea.

If the "New style" of using json is somewhat lengthy, I could see an extension or namespaced pgrst function that could be optionally added, whereas you can get something like pgrst.request('jwt.claims') ->> 'role' -> 'user'. Some convenience extension could add many of these functions for easier access.

@steve-chavez
Copy link
Member

and then return a json object, so then the usage could look more like this: current_setting('request.header')::json ->> 'User-Agent'

@robertsosinski For performance reasons, I also made the same proposal here. I think that's the way forward 👍.

Old style: current_setting('request.jwt.claim.role') -> 'user' + Deprecation warning
New style: current_setting('request.jwt.claims')::json ->> 'role' -> 'user'

Deprecating instead of breaking it's definitely better. But more GUCs incur in perf loss, even with just 3 more. Perhaps we can include a config for disabling old gucs? Like db-guc-style = 'string|json'.

@robertsosinski
Copy link

One question, is there a size constraint on GUC values? Just wondering if putting too much data into one particular GUC could yield a problem.

Outside of that, I can see this being a balancing act, of too much GUC: e.g. current_setting('request.headers.User-Agent') -> 'Firefox' vs not enough GUC: e.g. current_setting('pgrst_everything') -> {huge 8mb json object}.

Based on https://postgrest.org/en/stable/api.html#accessing-request-headers-cookies-and-jwt-claims (and also looking at code in previous PostgREST projects), it seems headers are already downcased, but still keep a -. One such example is:

user_agent text := current_setting('request.header.user-agent', true);

It's pretty common to 'snake case' headers in other frameworks, so a way to blend this in would be to just update the downcase logic to also snakecase the -, thus 'request.header.user_agent'. Again, would be a breaking change, but only for header and cookie usage (which are used less often then multi . syntax or jwt.claims, atleast in my experience). That could be enough of a break to squeak by for v8 (to ensure it works with PostgreSQL 14)?

Setting responses for headers seem to already use a json style:

set_config('response.headers', '[{"Cache-Control": "public"}, {"Cache-Control": "max-age=259200"}]', true);

So would be no breaking changes there.

If v8 were to just snakecase headers and cookies, and assume jwt's are using standard JSON key names (a json key should not use the -, only _; as obj.dash-case is not valid JavaScript) then pgrst v8 would be psql v14 compatible. Then, changing some GUCs to use json could be up next?

I do think json is the future, just not sure how much could be up to change for a v8 release (which I hope could be psql v14 compliant). That would be up to y'all, and hope these ideas help.

@wolfgangwalther
Copy link
Member

I don't think we should try to squeeze in pg14 support into v8. We should release v8 asap - the only real blocker is the missing docs right now.

We already have more than a year worth of changes waiting to be released officially and there is so much value to releasing that - we don't need to support pg14 right now.

And then we can immediately work on adding pg14 to the latest nightly after v8 - but this will give us time to think about that more, maybe try different approaches or revert stuff, before making another v9 release ready for pg14.

And given that pg14 is still in beta, we don't even know where it will end up exactly. I don't think we should base our pg14 support on the beta.

@robertsosinski
Copy link

@wolfgangwalther that makes sense, especially as v8 looks to be soon out the door (which I am eagerly waiting for). Definitely defer to you and the PostgREST team on what makes the most sense here.

Really appreciate all the great work too, thanks so much!

@steve-chavez
Copy link
Member

One question, is there a size constraint on GUC values? Just wondering if putting too much data into one particular GUC could yield a problem.

I don't think there is a size constraint. We got the request.spec(this is an experimental feat) to be 279 MB big; and there was no error.

@wolfgangwalther wolfgangwalther changed the title Custom parameter names must be of the form "identifier.identifier" PG14: Custom parameter names must be of the form "identifier.identifier" Jun 4, 2021
@steve-chavez steve-chavez changed the title PG14: Custom parameter names must be of the form "identifier.identifier" PG14: Custom parameter names must be of the form "identifier.identifier" and no dash(-) can be inside them Jul 21, 2021
@glaroc
Copy link

glaroc commented Sep 27, 2021

PostgreSQL 14 is coming out this week. It might be important to clarify in the docs that PostgREST v8 doesn't work with v14 at the moment. We tested PostgREST 8 wtih PostgreSQL 14 RC1 and still got the error "Custom parameter names must be two or more simple identifiers separated by dots". "invalid configuration parameter name "request.header.x-forwarded-host"".

@steve-chavez
Copy link
Member

We'll make a release soon with this fix. For now, you can try the latest binaries here(at the bottom).

@wolfgangwalther
Copy link
Member

We'll make a release soon with this fix. For now, you can try the latest binaries here(at the bottom).

I'm missing the old nightlies... those were available from docker hub, too. Right now, this is blocking my upgrade to PG 14 on one of my dev branches :(

@steve-chavez
Copy link
Member

@wolfgangwalther Hm, certainly it was easier with the nightlies, you just pushed a tag and it was done(I remember you did it when you needed it as well).

Now you'd need to do a PR to update the cabal file(example) and then push the tag. Besides the extra step, do you have a blocker for doing a pre-release? The pre-release process should also push an image to docker hub(example from the previous pre-release).

@wolfgangwalther
Copy link
Member

Besides the extra step, do you have a blocker for doing a pre-release?

Mostly, because I am not sure / confused about two things:

Considering both, I think we should actually end up with a 8.1.0 release next. Not sure how we'd treat the accidental version bump to v9, though.

All-in-all, I don't see a reason why we should use pre-releases at all. We already have all the artifacts in GitHub Actions - the only thing we'd need to do is upload the docker image, that is already built, to docker hub every time an action is run on main.

@wolfgangwalther
Copy link
Member

Not sure how we'd treat the accidental version bump to v9, though.

I missed the fact, that this was corrected in #1957 already.

@monacoremo
Copy link
Member

Now you'd need to do a PR to update the cabal file(example) and then push the tag

The extra step with the PR should not be necessary, you can update the cabal version with a local commit and just push the tag directly! (Leaving main, which is a protected branch and would require a PR, untouched)

After the discussion in CI fixes #1957, I am not sure what the next version number for a pre-release should be.

Just the current version with the current date appended as a fourth element (given that the Haskell PVP removed tags, which just worked accidentially with the previous pre-releases)

the only thing we'd need to do is upload the docker image

We would need to upload to a separate repository to make sure that we don't compromise our dockerhub credentials. Pushing a tag for a pre-release is a reasonable middle ground I think (and there is no extra step as noted above)

@homberghp
Copy link

Multiple dots are allowed

It appears that pg14 accepts multiple dots in customer parameters.
I infer that from the changed error message which states that a custom parameter should be two 'simple names' separated
with dots, where simple name is something matching c-words, or in a regex: [a-zA-Z][a-zA-Z0-9_$].
This allows more freedom and even nested name-spaces like top.middle.bottom and should provide enough flexibility, combined with the fact that the values are not restricted, other than be strings.

@wolfgangwalther
Copy link
Member

It appears that pg14 accepts multiple dots in customer parameters.

Yes. We are making use of that already, as you can tell from the docs for the new-style GUC settings:

You can access request headers, cookies and JWT claims by reading GUC variables set by PostgREST per request. They are named request.headers, request.cookies and request.jwt.claims.

Those changes are still unreleased as discussed above.

However, this will still not allow us to use the old style GUCs, where we used other characters, most importantly -, e.g. when passing in header names. That's why we opted for the new style for now.

@steve-chavez
Copy link
Member

If we get Allow overloaded functions if one has a single unnamed JSON param #1996 done, I think we don't have any breaking changes anymore, so far, right?

@wolfgangwalther I dropped support for a deprecated syntax on bf91187, this syntax was never documented, still I think it merits a major version, just to be really clear.

The extra step with the PR should not be necessary, you can update the cabal version with a local commit and just push the tag directly!

@monacoremo Did that one on a15f988 and it worked nice! We have a new pre-release and docker image.

@wolfgangwalther
Copy link
Member

@wolfgangwalther I dropped support for a deprecated syntax on bf91187, this syntax was never documented, still I think it merits a major version, just to be really clear.

Got it.

@steve-chavez
Copy link
Member

With v9, this is now fixed on docker latest as well

@elimisteve
Copy link
Contributor

Dear future readers,

With v9, this is now fixed on docker latest as well

The final resolution is ^there^! The relevant excerpt:

Getting the value for a header GUC on PostgreSQL 14 is done using current_setting('request.headers')::json->>'name-of-header' and in a similar way for request.cookies and request.jwt.claims

This is also now reflected in the docs: https://postgrest.org/en/latest/references/transactions.html#guc-req-headers-cookies-claims .

@elimisteve
Copy link
Contributor

elimisteve commented May 20, 2023

It is also worth noting that, due to the aforementioned change, request headers are effectively case-sensitive now, whereas before they were essentially case-insensitive. What I mean is, before you could do

SELECT current_setting('request.header.X-My-Header');

and you would get the value you're looking for, whereas now, if you do

SELECT current_setting('request.headers')::json->>'X-My-Header';

it will not work (and it will be confusing to debug!). You can now only access the headers by their lower-cased version:

SELECT current_setting('request.headers')::json->>'x-my-header';

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change A bug fix or enhancement that would cause a breaking change bug
Development

Successfully merging a pull request may close this issue.

9 participants