Feature/59184 order by stages and gates on project list #17530

EinLama · 2025-01-03T10:11:05Z

Ticket

https://community.openproject.org/wp/59184

What are you trying to accomplish?

Screenshots

What approach did you choose and why?

Projects are related to the LifeCycleSteps table, which in turn depends on the LifeCycleStepDefinition table, and the definition is the attribute we sort on. I therefore looked for a way to query the relevant rows without a lot of aggregation and GROUP BY magic.

I found two approaches that work and seem to be performant. The first used a CTE, the second a subquery. In both cases, I join the projects table on a subset of life cycle step definitions. This gets rid of obsolete rows while not requiring further aggregation.

Both approaches seem to have a similar performance impact. I found the subquery more readable, so I chose that.

There is a caveat: since we can order by multiple columns, we must ensure that the subquery identifier (the alias, as in SELECT * FROM identifier) is not used twice. I added the definition id to that name to make it unique within a query.
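The alias collision described above can be illustrated without a database. The following is a minimal sketch in plain Ruby and not the PR's implementation: `subquery_alias` and `life_cycle_join` are hypothetical helper names, and the SQL is simplified string building for illustration only.

```ruby
# Hypothetical helper: derive the join alias from the definition id so that
# two life cycle orders in one query never share an alias.
def subquery_alias(definition_id)
  "life_cycle_steps_#{Integer(definition_id)}"
end

# Hypothetical helper: build one LEFT JOIN fragment per ordered definition.
# Integer() rejects anything that is not a whole number before interpolation.
def life_cycle_join(definition_id)
  id = Integer(definition_id)
  name = subquery_alias(id)
  <<~SQL.strip
    LEFT JOIN (
      SELECT steps.* FROM project_life_cycle_steps steps
      WHERE steps.active = true AND steps.definition_id = #{id}
    ) #{name} ON #{name}.project_id = projects.id
  SQL
end

# Ordering by two definitions yields two joins with distinct aliases.
joins = [3, 7].map { |id| life_cycle_join(id) }
puts joins
```

Because each alias embeds the definition id, ordering by several definitions in one query produces one distinct join per definition.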

Merge checklist

Added/updated tests
~~Added/updated documentation in Lookbook (patterns, previews, etc)~~
Tested major browsers (Chrome, Firefox, Edge, ...)

app/models/queries/projects/orders/life_cycle_step_order.rb

ulferts

I like where this is heading, @EinLama. The code structure is good and at least on the limited data I have, performance looks good and the results are also as expected.

There is however a permission check that needs to be added when ordering. This might still require some work to get in. The rest is just small stuff.

ulferts · 2025-01-08T15:58:24Z

app/models/queries/projects/orders/life_cycle_step_order.rb

+              WHERE
+                steps.active = true
+                AND def.id = #{life_cycle_step_definition.id}
+            ) #{subquery_table_name} ON #{subquery_table_name}.project_id = projects.id


I don't think that the JOIN with the definitions is necessary. You could write

SELECT steps.*, steps.definition_id AS def_id
FROM project_life_cycle_steps steps
WHERE steps.active = true
  AND steps.definition_id = 3

and have the same results.

But what is necessary to add is the check for the view_project_stages_and_gates permission in each project. Otherwise, given two projects in which the user has the permission in one project and lacks it in the other, both end up being sorted on the value. But the user should not receive any info on that value so when sorting, it needs to be treated as NULL. The easiest way of doing that is to not join the value in such a case.

In the screenshot below, the user does not have the permission in the first project and has it in the second, while in the third project the value is not set.

The Project.allowed_to method might come in handy here. Working in a call to Project.allowed_to(User.current, :view_project_stages_and_gates).to_sql might do the trick.
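The effect of "not joining the value" can be sketched in memory. This is an illustration under stated assumptions, not OpenProject code: `Step`, `sort_value_by_project` and the sample data are hypothetical, and `viewable_project_ids` stands in for whatever the permission check returns.

```ruby
# Stand-in rows; in the PR these would come from the database.
Step = Struct.new(:project_id, :start_date)

projects = [1, 2, 3]                   # project ids
steps = [Step.new(1, "2025-03-01"),    # user lacks the permission here
         Step.new(2, "2025-01-15")]    # user has the permission here
                                       # project 3 has no value set
viewable_project_ids = [2, 3]          # assumed result of the permission check

# Emulates a LEFT JOIN on steps restricted to viewable projects:
# projects without a visible step get nil, mirroring SQL NULL.
def sort_value_by_project(projects, steps, viewable_ids)
  visible = steps.select { |s| viewable_ids.include?(s.project_id) }
  projects.to_h { |id| [id, visible.find { |s| s.project_id == id }&.start_date] }
end

values = sort_value_by_project(projects, steps, viewable_project_ids)

# Sort ascending with nil last (PostgreSQL's default for ASC), ties by id.
ordered = projects.sort_by { |id| [values[id].nil? ? 1 : 0, values[id].to_s, id] }

p values
p ordered
```

Project 1 has a date, but since it is not visible to the user it sorts exactly like project 3, which has no value at all.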

Thank you for noticing this! Good point about simplifying the query, I started with the definitions and never questioned my initial assumption after that.

About the permission: I have added an implementation for filtering out rows that do not match the new criteria. But I'm wondering whether there is a better way to do so. Please have a look and let me know what you think 🕵🏻

I experimented around with it and measured the time on my data set. It does have a larger number of projects but unfortunately lacks a substantial number of portfolio elements so the measurement might be flawed. Overall I found very little impact, with the timing when sorting by 3 lifecycle elements always being around 20ms.

However, I still think that the better way of dealing with this query is in the following way:

# SQL
LEFT JOIN (
  SELECT steps.*, steps.definition_id as def_id
  FROM project_life_cycle_steps steps
  WHERE
    steps.active = true
    AND steps.definition_id = :definition_id
    AND steps.project_id IN (
      #{viewable_project_ids.to_sql}
    )
) #{subquery_table_name} ON #{subquery_table_name}.project_id = projects.id

# ...

# Ensure that only life cycle columns viewable to the current user are considered
# for ordering the query result.
def viewable_project_ids
  Project.allowed_to(User.current, :view_project_stages_and_gates).select(:id)
end

This avoids fetching the project ids (as done by the pluck before) once for each column sorted by. Granted, this will be cached by the SQL cache but it is still an overhead. On top of that, I find that easier to read. I benchmarked the two approaches and on my data the approaches have the following runtime when issuing:

Benchmark.bm do |x|
  x.report { 1000.times { query_ordered_by_three_lifecycle_columns.results.to_a.count } }
end

The PR's approach needed

       user     system      total        real
  26.923800   1.216307  28.140107 ( 56.359955)

Whereas the changed one takes just

       user     system      total        real
  19.154100   0.573853  19.727953 ( 32.417628)

This was done within the Rails console, so the PR's times are worsened by the fact that the query cache wasn't used. But to me the results are good enough to suggest that it is at least not worse to have the subquery executed together with the rest.

I also tried to move the viewable_project_ids to a CTE to avoid issuing the same subquery thrice but that did not lead to measurable changes so I don't think this is necessary at this point in time.
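For readers who want to repeat such a comparison, here is a minimal sketch using Ruby's standard `Benchmark` module. The two lambdas are placeholders, not the real queries; `variant_a`, `variant_b` and `iterations` are hypothetical names. In the output, `user` and `system` are CPU time of the Ruby process, `total` is their sum, and `real` is elapsed wall-clock time, which also includes time spent waiting (for example on the database).

```ruby
require "benchmark"

# Placeholders for the two variants under comparison; both compute the
# same result in different ways. Swap in the real calls when measuring.
variant_a = -> { (1..1_000).map { |i| i * 2 }.sum }
variant_b = -> { (1..1_000).sum * 2 }

iterations = 200

# Benchmark.bm prints one labelled row per report with the columns
# user, system, total and real, and returns the measurements.
results = Benchmark.bm(10) do |x|
  x.report("variant a") { iterations.times { variant_a.call } }
  x.report("variant b") { iterations.times { variant_b.call } }
end
```

Labelling the reports and running both variants in the same `Benchmark.bm` block keeps the rows directly comparable.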

app/models/queries/projects/orders/life_cycle_step_order.rb

ulferts · 2025-01-08T16:23:30Z

app/models/queries/projects/orders/life_cycle_step_order.rb

+  end
+
+  def available?
+    life_cycle_step_definition.present?


Strictly speaking, the check for view_project_stages_and_gates should also be in here. The code should check if the user has that permission in any project.

The behaviour is currently as expected only because of an imprecision in the way the selectable orders work.

The Projects::ConfigureViewModalComponent passes the selectable_columns to the Queries::SortByComponent. Within that component, the columns are then mapped to order statements without checking their available? method again. In short, this only works because the conditions for the columns are currently the same as for ordering. But this might change.

Even now, while the option is not available in the UI, it is possible to craft a URL so that the order is applied even if the user does not have the permission. The same is possible via the API. Mind you, it will not be a problem any more once the permissions are checked on the values, but ideally the user would get an error message in this case, which requires a check for this permission here as well.

I wouldn't want to include the rework of the Modal and SortBy component in this PR but rather open a new ticket/PR to address this. In here then, one would only need to add the permission check. Do you agree?

Interesting, good catch! I agree that this might come up in the future. So I will add the permission check here and we'll have to revisit sorting and the Modal in a separate task.

I have added the permission check along with a spec for it.
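The shape of such a check can be sketched with stand-in objects. This is an assumption-laden illustration, not the code that was merged: `Definition`, `FakeUser`, `LifeCycleStepOrderSketch` and the `allowed_in_any_project?` helper defined here are all hypothetical stand-ins written for this example.

```ruby
# Stand-ins for illustration only; the real classes live in OpenProject.
Definition = Struct.new(:id)

FakeUser = Struct.new(:permissions_by_project) do
  # Hypothetical helper: true if the permission is granted in at least
  # one project.
  def allowed_in_any_project?(permission)
    permissions_by_project.values.any? { |perms| perms.include?(permission) }
  end
end

class LifeCycleStepOrderSketch
  def initialize(definition, user)
    @definition = definition
    @user = user
  end

  # The order is only offered when the definition exists AND the user may
  # view stages and gates in at least one project.
  def available?
    !@definition.nil? &&
      @user.allowed_in_any_project?(:view_project_stages_and_gates)
  end
end

permitted = FakeUser.new({ 1 => [:view_project_stages_and_gates], 2 => [] })
unpermitted = FakeUser.new({ 1 => [], 2 => [:view_project] })

puts LifeCycleStepOrderSketch.new(Definition.new(3), permitted).available?
puts LifeCycleStepOrderSketch.new(Definition.new(3), unpermitted).available?
```

With a check like this in place, a crafted URL or API request that asks for the order without the permission can be rejected instead of silently applied.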

app/models/queries/projects/orders/life_cycle_step_order.rb

Using a named subquery will break as those have to be unique. The same issue applies to CTEs - they need a unique name per query. To solve this, I have allowed queries to use CTEs and define their name. The name will be derived from the definition id, which is unique per query. Therefore, you can now order by multiple life cycle definitions at once.

ulferts

I think the query can be simplified, which should also improve the performance, admittedly only slightly.

ulferts · 2025-01-10T15:39:14Z

github-advanced-security bot found potential problems Jan 3, 2025

app/models/queries/projects/orders/life_cycle_step_order.rb (fixed)

EinLama force-pushed the feature/59184-order-by-stages-and-gates-on-project-list branch 2 times, most recently from f551cb5 to 73a8642 Compare January 6, 2025 08:44

github-advanced-security bot found potential problems Jan 6, 2025

app/models/queries/projects/orders/life_cycle_step_order.rb (fixed)

app/models/queries/projects/orders/life_cycle_step_order.rb (fixed)

EinLama force-pushed the feature/59184-order-by-stages-and-gates-on-project-list branch from eb18a6e to b161d9a Compare January 6, 2025 13:59

github-advanced-security bot found potential problems Jan 6, 2025

app/models/queries/projects/orders/life_cycle_step_order.rb (fixed)

EinLama force-pushed the feature/59184-order-by-stages-and-gates-on-project-list branch 3 times, most recently from e972026 to a6ee983 Compare January 7, 2025 10:01

EinLama added feature needs review labels Jan 7, 2025

EinLama marked this pull request as ready for review January 7, 2025 12:55

ulferts removed the needs review label Jan 8, 2025

ulferts requested changes Jan 8, 2025

EinLama force-pushed the feature/59184-order-by-stages-and-gates-on-project-list branch from db349fb to c8555d9 Compare January 9, 2025 16:37

EinLama added the needs review label Jan 10, 2025

EinLama commented Jan 10, 2025

EinLama added 13 commits January 10, 2025 11:43

[#59184] order by life_cycle_step.start_date

3f595f5

[#59184] first specs

589a5b2

[#59184] spec: order by stage

f7bc985

[#59184] improve spec

f143d13

[#59184] spec for desc ordering

db90500

[#59184] order by end_date secondarily

146c6a0

[#59184] specs for ordering life cycle gates

e24b864

[#59184] spec for multi-sorting

d2ceef8

[#59184] details

df7bf68

[#59184] simplify the regexp for the ordering key

23ab08e

There is no need to filter for available definitions before ordering, this also gets rid of a Brakeman warning.

[#59184] use Arel to build the query to get rid of injection warnings

5e245e9

[#59184] simplify by using subqueries in favor of CTE

0f5cabe

It is supposed to be a bit more performant for a case like this, too.

EinLama added 7 commits January 10, 2025 11:43

[#59184] small refactors, comments, etc.

64c3b1a

[#59184] remove unnecessary scope method

41189c3

[#59184] simplify subquery

0463dff

This is so much better

[#59184] check view-permission when ordering

af04542

[#59184] optimize SQL query for large arrays

0aa0586

[#59184] consider view-permission in #available?

cf90c0e

[#59184] check feature flag before ordering life cycles

e5eee68

EinLama force-pushed the feature/59184-order-by-stages-and-gates-on-project-list branch from d559d92 to e5eee68 Compare January 10, 2025 10:43

ulferts requested changes Jan 10, 2025
