Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inefficient query plans when using @context #3140

Merged
merged 17 commits into from
Oct 15, 2024
Merged

inefficient query plans when using @context #3140

merged 17 commits into from
Oct 15, 2024

Conversation

clenfest
Copy link
Contributor

@clenfest clenfest commented Sep 9, 2024

Fixes edge case where contextual arguments can lead to inefficient query plans. Also fixes naming of query plan arguments which can be a problem when using contextual variables in multiple subraphs.

…nt in some circumstances. Fix this by having all conditions go through the handleRequires() code path
@clenfest clenfest requested a review from a team as a code owner September 9, 2024 16:53
Copy link

changeset-bot bot commented Sep 9, 2024

🦋 Changeset detected

Latest commit: a04979f

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 7 packages
Name Type
@apollo/query-planner Patch
@apollo/query-graphs Patch
@apollo/gateway Patch
@apollo/composition Patch
@apollo/federation-internals Patch
@apollo/subgraph Patch
apollo-federation-integration-testsuite Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

Copy link

netlify bot commented Sep 9, 2024

Deploy Preview for apollo-federation-docs canceled.

Name Link
🔨 Latest commit a04979f
🔍 Latest deploy log https://app.netlify.com/sites/apollo-federation-docs/deploys/670ee0cfcc43e40008ab2eae

Copy link

codesandbox-ci bot commented Sep 9, 2024

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

@clenfest clenfest changed the title Clenfest/zzz inefficient query plans when using @context Sep 10, 2024
@@ -965,6 +954,22 @@ function federateSubgraphs(supergraph: Schema, subgraphs: QueryGraph[]): QueryGr

}

// add contextual argument maps to builder
for (const [i, subgraph] of subgraphs.entries()) {
Copy link
Member

@dariuszkuc dariuszkuc Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previously this was calculated in a single loop above -> any reason to move it down here? I think we only need the builder.setContextMaps(...) outside of the loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's like that because we only need to do the calculations once at the end. The fact that it was inside the loop was causing naming colisions.

Copy link
Contributor

@sachindshinde sachindshinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed this synchronously with Chris, will summarize the feedback:

  1. The changes to query graph creation are fine, and can be in their own PR (it's been split off into Fixes issue where there can be naming collisions in contextual parameter names #3155 ).
  2. handleRequires() needs to be split up, so that we may reuse the first half of it. Specifically, the part that applies the condition's OpPathTree and hoists (and maybe merges) the created nodes up the ancestors of the fetch dependency graph where possible. This function would return the main group that the start of the conditions merged into, along with any unmerged created groups.
    • The second half of handleRequires() is effectively the part specific to @requires, and takes in the outputs of the first function (along with e.g. current group/path). It's responsible for adding the inputs to the node querying the @requires field, and setting up parent/child relationships.
    • The idea here is that we can make an analogous second-half function for @fromContext-containing nodes, which will add the context arguments/input rewrites to the node, along with setting up parent/child relationships.
    • IIRC we've been choosing to put the subgraph jump (if needed) at the @context type and not near the @fromContext field, so I believe we'd need to insert the subgraph jump in the first-half function.

Copy link
Contributor

@sachindshinde sachindshinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The splitting of handleRequires() generally looks fine to me (would change a bit in the results passed between functions), but some changes are needed in how it's called in computeGroupsForTree(). Going to pair program on that with @clenfest.

@@ -4227,7 +4229,7 @@ function computeGroupsForTree(
});
updateCreatedGroups(createdGroups, newGroup);
newGroup.addParents(conditionsGroups.map((conditionGroup) => {
// If `conditionGroup` parent is `group`, that is the same as `newGroup` current parent, then we can infer the path of `newGroup` into
// If `conditionGroup` parent is `group`, that is the same as `groupCopy` current parent, then we can infer the path of `groupCopy` into
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the @key-handling logic, which still uses newGroup (it's not actually copying the current group, but genuinely creating a new group based off the @key edge's tail).

@@ -4363,11 +4365,16 @@ function computeGroupsForTree(
contextToConditionsGroups,
};

if (conditions && edge.conditions) {
if (conditions) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that conditions being truthy from the OpPathTree just tells us something wants to fetch data here, for use either immediately (@requires edge, InterfaceObjectFakeDownCast edge) or later (@fromContext edge), potentially multiple of these. conditions meant truthy means we'd need to execute the first-half function handleRequiresTree(). edge.conditions being additionally truthy means this edge requires at least some of those conditions to be provided as input (via _entities), so we'd also want to execute the @requires second-half function, createPostRequiresGroup().

path: path.forNewKeyFetch(createFetchInitialPath(dependencyGraph.supergraphSchema, entityType, context)),
createdGroups,
};
return { group, path, createdGroups, unmergedGroups: [], extraParent: undefined, newGroupIsUnneeded: undefined };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit misleading here to indicate unmergedGroups as [], as no merging of created groups takes place at all in this branch (so it's more like unmergedGroups should be createdGroups).

I think it'd be more clear here to:

  1. Swap createdGroups with a noCreatedGroups boolean.
  2. Have unmergedGroups be createdGroups when no merging took place.
  3. Use unmergedGroups in the other half of handleRequires(), in the place of createdGroups.

const newContextToConditionsGroups = new Map<string, FetchGroup[]>([...contextToConditionsGroups]);
for (const context of contextToSelection) {
newContextToConditionsGroups.set(context, [group]);
if (requireResult.createdGroups.length === 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was curious about why we had this if block, synced with @clenfest but we didn't see a reason here.

newContextToConditionsGroups.set(context, [group, ...conditionsGroups]);
// If a new group is created for the conditions, make sure that we add that as a dependency
// so that using a context will not occur within the same fetch group as it is retrieved
if (!requireResult.newGroupIsUnneeded || group.isTopLevel) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why we'd only set the condition groups when this boolean expression is true, in general we need these condition groups so the downstream @fromContext will be able to depend on these.

@svc-apollo-docs
Copy link
Collaborator

svc-apollo-docs commented Oct 15, 2024

✅ Docs Preview Ready

No new or changed pages found.

@apollo-bot2
Copy link

Detected Secrets

🔴 Secrets Found

If you are seeing this message, it means that the secret scanning tool that Apollo uses to prevent secrets from entering our repositories has identified a secret in your branch. Information about what was detected and steps to move forward are below.

If the secret scanner detected a valid secret, please take immediate action to remove it from your branch. Apollo also recommends that you invalidate the secret in the system it grants access to. While there are additional steps to prepare the branch for merging, protecting the secret is the top priority.

If the scanner detected a value that is not actually secret or it detected a value that was a secret but has been invalidated, follow the steps in the "Resolution Process" section below to mark it as a false/benign positive. One common reason for benign positive detections is test values. For example, if you are committing a value (like a private key, JWT, etc.) that would usually be secret but is being used in a test case to validate functionality, a detection will be generated. This is safe to allowlist.

Values Detected

Signature File Commit Start Line Start Column Link To Secret
generic-api-key query-planner-js/src/tests/buildPlan.test.ts 98882688d195e195234d8af24fd64fdd6d105a2b 9178 15 Link
generic-api-key query-planner-js/src/tests/buildPlan.test.ts ff5e2fac1ed4731c3392546a361291b3905b494b 10334 16 Link
generic-api-key query-planner-js/src/tests/buildPlan.test.ts ff5e2fac1ed4731c3392546a361291b3905b494b 10339 16 Link

Resolution Process

[!NOTE]
Apollo does not use all of the native options provided by Gitleaks to perform allowlisting. We use Gitleaks' standard line comments but rely on a custom convention written in a .gitleaks.toml file at the root of each repository for allowlisting detections in Git history.

To resolve the issue, you will need to:

  1. Prevent new detections of the "secret":
    • Add a comment containing gitleaks:allow to the line containing the detected value.
    • Skip this step if it's not feasible. For example, the detection is in file types like JSON or Markdown, you've completely removed the value, etc.
  2. Configure the scanner to ignore the "secret" in git history.
    • Follow the instructions below for either "Allowlist by Commit Hash" or "Allowlist by File Path"
    • You will need to ensure that a .gitleaks.toml file exists at the root of this repository in your branch. If not, create one!

Allowlist by Commit Hash

This is the preferred option. Update the .gitleaks.toml file with the following contents:

[[ rules ]]
    id = "signature-from-table-above"
    [ rules.allowlist ]
        commits = [
            "full-commit-hash-1",
            "full-commit-hash-2"
        ]

If a [[rules]] block already exists for the signature involved in your detection, simply append your list of commits.

If a matching [[rules]] block is already present but has a paths attribute rather than commits, proceed to the next option.

This configuration tells the scanner to ignore the specified signature in the list of commits provided.

Allowlist by File Path

This is a backup option to allowlist specific detections. Add/update the .gitleaks.toml file with the following contents:

[[ rules ]]
    id = "signature-from-table-above"
    [ rules.allowlist ]
        paths = [
            '''path/to/detected/file1.json$''',
            '''path/to/other/detected/file.md$'''
        ]

This configuration tells the scanner to ignore any detections with the specified signature generated for the list of files.


Getting Help

The Apollo Security team has been notified and is available to assist in resolving this issue. Please tag us on this PR using @apollographql/security if you need assistance!


How do I know I fixed this correctly?

Once you have resolved the issue completely, this message will disappear! If you're still seeing this message, there is more to do prior to merging.

Copy link
Contributor

@sachindshinde sachindshinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished pairing with @clenfest , code should be simpler now around __typename additions and we should be reusing the @requires optimizations for @fromContext selection sets now.

@clenfest clenfest merged commit 345661c into main Oct 15, 2024
17 of 18 checks passed
@clenfest clenfest deleted the clenfest/zzz branch October 15, 2024 22:07
sachindshinde pushed a commit that referenced this pull request Oct 16, 2024
This PR was opened by the [Changesets
release](https://github.com/changesets/action) GitHub action. When
you're ready to do a release, you can merge this and the packages will
be published to npm automatically. If you're not ready to do a release
yet, that's fine, whenever you add more changesets to main, this PR will
be updated.


# Releases
## @apollo/[email protected]

### Patch Changes

- Updated dependencies
\[[`345661c558773e4eb5d5f0b28464a8d1acdc2a2d`](345661c),
[`e00e1c9892b48ac89823597113989830877966ef`](e00e1c9),
[`cc4573471696ef78d04fa00c4cf8e5c50314ba9f`](cc45734),
[`062572b3253e8640b60a0bf58b83945094b76b6f`](062572b),
[`df5eb3cb0e2b4802fcd425ab9c23714de2707db3`](df5eb3c),
[`1c99cb0dcc6c639ac351210932623ab0bd6907e4`](1c99cb0)]:
    -   @apollo/[email protected]
    -   @apollo/[email protected]

## @apollo/[email protected]

### Patch Changes

- Updated dependencies
\[[`345661c558773e4eb5d5f0b28464a8d1acdc2a2d`](345661c),
[`7fe1465cf35c2efe37ac5c73fac2b7ea04173318`](7fe1465),
[`cc4573471696ef78d04fa00c4cf8e5c50314ba9f`](cc45734),
[`062572b3253e8640b60a0bf58b83945094b76b6f`](062572b),
[`df5eb3cb0e2b4802fcd425ab9c23714de2707db3`](df5eb3c),
[`1c99cb0dcc6c639ac351210932623ab0bd6907e4`](1c99cb0)]:
    -   @apollo/[email protected]
    -   @apollo/[email protected]
    -   @apollo/[email protected]

## @apollo/[email protected]

### Patch Changes

- fix: normalize field set selection sets
([#3162](#3162))

`FieldSet` scalar represents a selection set without outer braces. This
means that users could potentially specify some selections that could be
normalized (i.e. eliminate duplicate field selections, hoist/collapse
unnecessary inline fragments, etc). Previously we were using `@requires`
field set selection AS-IS for edge conditions. With this change we will
now normalize the `FieldSet` selections before using them as fetch node
conditions.

- Fixed missing referenced variables in the `variableUsages` field of
fetch operations
([#3166](#3166))

Query variables used in fetch operation should be listed in the
`variableUsages` field. However, there was a bug where variables
referenced by query-level directives could be missing in the field.

- Fix fragment generation recursion logic to apply minification on all
subselections.
([#3158](#3158))

- Fixed a bug that `__typename` with applied directives gets lost in
fetch operations.
([#3164](#3164))

The sibling typename optimization used by query planner simplifies
operations by folding `__typename` selections into their sibling
selections. However, that optimization does not account for directives
or aliases. The bug was applying the optimization even if the
`__typename` has directives on it, which caused the selection to lose
its directives. Now, `__typename` with directives (or aliases) are
excluded from the optimization.

## @apollo/[email protected]

### Patch Changes

- Fixes edge case where contextual arguments can yield inefficient query
plans. Also fixes naming of query plan arguments which can be a problem
when using contextual variables in multiple subgraphs
([#3140](#3140))

- When eliminating suboptimal indirect paths during query planning,
properly check for a direct `@key` edge at the end of the potential
direct path.
([#3160](#3160))

- Fixed missing referenced variables in the `variableUsages` field of
fetch operations
([#3166](#3166))

Query variables used in fetch operation should be listed in the
`variableUsages` field. However, there was a bug where variables
referenced by query-level directives could be missing in the field.

- Updated dependencies
\[[`cc4573471696ef78d04fa00c4cf8e5c50314ba9f`](cc45734),
[`062572b3253e8640b60a0bf58b83945094b76b6f`](062572b),
[`df5eb3cb0e2b4802fcd425ab9c23714de2707db3`](df5eb3c),
[`1c99cb0dcc6c639ac351210932623ab0bd6907e4`](1c99cb0)]:
    -   @apollo/[email protected]

## @apollo/[email protected]

### Patch Changes

- Fixes edge case where contextual arguments can yield inefficient query
plans. Also fixes naming of query plan arguments which can be a problem
when using contextual variables in multiple subgraphs
([#3140](#3140))

- Ensure all useless fetch groups are removed
([#3163](#3163))

When removing "useless" fetch nodes/groups we remove them in-place while
still iterating over the same list. This leads to potentially skipping
processing of some the children fetch nodes, as when we remove nodes we
left shift all remaining children but the iterator keeps the old
position unchanged effectively skipping next child.

- fix: normalize field set selection sets
([#3162](#3162))

`FieldSet` scalar represents a selection set without outer braces. This
means that users could potentially specify some selections that could be
normalized (i.e. eliminate duplicate field selections, hoist/collapse
unnecessary inline fragments, etc). Previously we were using `@requires`
field set selection AS-IS for edge conditions. With this change we will
now normalize the `FieldSet` selections before using them as fetch node
conditions.

- Fixed missing referenced variables in the `variableUsages` field of
fetch operations
([#3166](#3166))

Query variables used in fetch operation should be listed in the
`variableUsages` field. However, there was a bug where variables
referenced by query-level directives could be missing in the field.

- Fixed a bug that `__typename` with applied directives gets lost in
fetch operations.
([#3164](#3164))

The sibling typename optimization used by query planner simplifies
operations by folding `__typename` selections into their sibling
selections. However, that optimization does not account for directives
or aliases. The bug was applying the optimization even if the
`__typename` has directives on it, which caused the selection to lose
its directives. Now, `__typename` with directives (or aliases) are
excluded from the optimization.

- Updated dependencies
\[[`345661c558773e4eb5d5f0b28464a8d1acdc2a2d`](345661c),
[`e00e1c9892b48ac89823597113989830877966ef`](e00e1c9),
[`cc4573471696ef78d04fa00c4cf8e5c50314ba9f`](cc45734),
[`062572b3253e8640b60a0bf58b83945094b76b6f`](062572b),
[`df5eb3cb0e2b4802fcd425ab9c23714de2707db3`](df5eb3c),
[`1c99cb0dcc6c639ac351210932623ab0bd6907e4`](1c99cb0)]:
    -   @apollo/[email protected]
    -   @apollo/[email protected]

## @apollo/[email protected]

### Patch Changes

- Updated dependencies
\[[`cc4573471696ef78d04fa00c4cf8e5c50314ba9f`](cc45734),
[`062572b3253e8640b60a0bf58b83945094b76b6f`](062572b),
[`df5eb3cb0e2b4802fcd425ab9c23714de2707db3`](df5eb3c),
[`1c99cb0dcc6c639ac351210932623ab0bd6907e4`](1c99cb0)]:
    -   @apollo/[email protected]

## [email protected]

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants