Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSHARP-5321: Optimize client-side projections to only fetch the parts of the document that are needed for the projection. #1589

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

rstam
Copy link
Contributor

@rstam rstam commented Jan 9, 2025

No description provided.

… of the document that are needed for the projection.
@rstam rstam requested a review from a team as a code owner January 9, 2025 00:07
typeof(ClientSideProjectionSnippetsDeserializer<,,,,>)
];

public const int MaxNumberOfSnippets = 4; // could be expanded up to 16
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stopped at 4 because I wanted to get farther in the review process before adding support for more snippets.

And we need to decide how many to add support for. The max would be 16 (based on the Func types available).

var deserializerGenericTypeDefinition = __deserializerGenericTypeDefinitions[snippetTypes.Length];
var deserializerGenericTypeArguments = snippetTypes.Append(projectionType).ToArray();
var deserializerType = deserializerGenericTypeDefinition.MakeGenericType(deserializerGenericTypeArguments);
return (IBsonSerializer)Activator.CreateInstance(deserializerType, [snippetDeserializers, projector]);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I decided it is better to use reflection ONCE to create the deserializer rather than to use reflection to call the projector delegate. That is because the projecter delegate has to be called for EVERY document returned from the server.

using MongoDB.Bson.Serialization;
using MongoDB.Bson.Serialization.Serializers;

namespace MongoDB.Driver
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A "snippet" is the name I have given to a piece of the projector expression that can be evaluated server-side. The rest of the projecction is done client-side.

When finding the snippets we try to keep as much of the work as possible server-side which more often than not will reduce the amount of data sent to the client.

@@ -35,6 +35,9 @@ public static bool IsInt32Constant(this AstExpression expression, out int value)
public static bool IsMaxInt32(this AstExpression expression)
=> expression.IsInt32Constant(out var value) && value == int.MaxValue;

public static bool IsRootVar(this AstExpression expression)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was performed so many times that I created a method to centralize the implementation and simplify the places where the test is needed.

{
var rewrittenArg = (AstExpression)AstNodeReplacer.Replace(node.In, (node.As, _element));
var accumulatorExpression = AstExpression.UnaryAccumulator(AstUnaryAccumulatorOperator.Push, rewrittenArg);
var accumulatorFieldName = _accumulators.AddAccumulatorExpression(accumulatorExpression);
var root = AstExpression.Var("ROOT", isCurrent: true);
return AstExpression.GetField(root, accumulatorFieldName);
return AstExpression.GetField(AstExpression.RootVar, accumulatorFieldName);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AstExpression.RootVar is the way we should always refer to the variable for $$ROOT.

By using AstExpression.RootVar we ensure that the variable is created the same way every time (I found a few places where it was not). A secondary benefit is that there is only one instance of the RootVar that can be re-used over and over.

}

return pipeline;
return projectStage == null ?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

projectStage will be null when the client-side projection needs the whole document as input.

AssertStages(stages, Array.Empty<string>());
serializer.Should().BeOfType<ClientSideProjectionDeserializer<DocumentWithArray, int>>();
var stages = Translate(collection, queryable, out var outputSerializer);
AssertStages(stages, "{ $project : { _0 : '$A', _id : 0 } }");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of tests involving client-side projections had to be modified to pass now that we have optimized client-side projections.

.Select(x => Add(x.X, 1));

var stages = Translate(collection, queryable);
AssertStages(stages, "{ $project : { _0 : '$X', _id : 0 } }");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the constant 1 is not a snippet. It remained client-side.

.Select(x => Add(x.A.Sum(), 4));

var stages = Translate(collection, queryable);
AssertStages(stages, "{ $project : { _0 : { $sum : '$A' }, _id : 0 } }");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the sum is done server-side, avoiding fetching a potentially large array to the client.

@@ -38,7 +36,7 @@ public static string TranslateToDottedFieldName(this LambdaExpression fieldSelec
{
throw new ArgumentException($"ValueType '{parameterSerializer.ValueType.FullName}' of parameterSerializer does not match parameter type '{parameterExpression.Type.FullName}'.", nameof(parameterSerializer));
}
var parameterSymbol = context.CreateSymbolWithVarName(parameterExpression, varName: "ROOT", parameterSerializer, isCurrent: true);
var parameterSymbol = context.CreateRootSymbol(parameterExpression, parameterSerializer);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating a root symbol is done often enough that I have added a new CreateRootSymbol helper method to help ensure that the root symbol is always created correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant