Maintain materialized view data status. #501

Merged: 1 commit, Jul 11, 2024


@avamingli avamingli commented Jul 7, 2024

A materialized view's data can be considered up to date with its base tables if there have been no write operations on them since the view's last Refresh.

If any base table in the query tree of a materialized view is modified, the view data is no longer up to date, and we cannot use it to answer queries.

This commit maintains the data status of a materialized view. It applies to normal materialized views and to IVM with deferred refresh. IVM with immediate refresh is always up to date.

When a base table has a write operation, we try to update the view data status as follows:

  • 'u'(up to date):

    • Create Materialized View
    • Refresh
  • 'e'(expired):

    • Update base tables
    • Delete base tables
    • Refresh With No Data
    • Truncate base tables
    • Create Materialized View With No Data
  • 'i'(insert only):

    • Insert into base tables
    • Copy From
    • Copy From on Segments
  • 'r'(up to date but reorganized):

    • Cluster base tables
    • Vacuum Full base tables
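
The list above can be read as a lookup from operation to the status that operation tries to set. The following is only an illustrative model in Python, not the code from this commit: the enum name, the operation labels, and the `requested_status` helper are all hypothetical. Only the four status characters and the grouping of operations come from the description.

```python
from enum import Enum


class MvDataStatus(Enum):
    """One-character status codes as listed in the PR description."""
    UP_TO_DATE = "u"
    EXPIRED = "e"
    INSERT_ONLY = "i"
    REORGANIZED = "r"  # up to date but reorganized


# Operation labels are hypothetical strings chosen for this sketch.
REQUESTED_STATUS = {
    "CREATE MATERIALIZED VIEW": MvDataStatus.UP_TO_DATE,
    "REFRESH": MvDataStatus.UP_TO_DATE,
    "UPDATE": MvDataStatus.EXPIRED,
    "DELETE": MvDataStatus.EXPIRED,
    "REFRESH WITH NO DATA": MvDataStatus.EXPIRED,
    "TRUNCATE": MvDataStatus.EXPIRED,
    "CREATE MATERIALIZED VIEW WITH NO DATA": MvDataStatus.EXPIRED,
    "INSERT": MvDataStatus.INSERT_ONLY,
    "COPY FROM": MvDataStatus.INSERT_ONLY,
    "COPY FROM ON SEGMENTS": MvDataStatus.INSERT_ONLY,
    "CLUSTER": MvDataStatus.REORGANIZED,
    "VACUUM FULL": MvDataStatus.REORGANIZED,
}


def requested_status(operation: str) -> MvDataStatus:
    """Return the status an operation on a base table tries to set."""
    return REQUESTED_STATUS[operation.upper()]
```

For instance, `requested_status("insert")` yields `MvDataStatus.INSERT_ONLY`. Whether that status is actually stored also depends on the view's current status, as the description notes further down.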

Insert, Update, Delete, and Copy From operations take effect only if the number of rows actually affected is greater than 0. For example:
  insert into t1 select * from t2;
We don't need to update the view status if t2 has zero rows.
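
The affected-rows rule can be sketched as a small guard. This is a hypothetical Python model, not the commit's implementation: the function name and operation labels are invented for illustration, and treating every other operation (for example Truncate) as always taking effect is an assumption drawn from the fact that the description names only these four.

```python
# Operations the description says depend on the affected row count.
ROW_COUNTED_OPERATIONS = frozenset({"INSERT", "UPDATE", "DELETE", "COPY FROM"})


def should_update_status(operation: str, rows_affected: int) -> bool:
    """Return True if the operation should try to change the view data status."""
    if operation.upper() in ROW_COUNTED_OPERATIONS:
        # A write that touched no rows leaves the base table unchanged.
        return rows_affected > 0
    # Assumption: operations outside the set always take effect.
    return True
```

With this model, `insert into t1 select * from t2;` on an empty `t2` corresponds to `should_update_status("INSERT", 0)`, which is false, so views over `t1` keep their current status.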

The resulting status is decided by both the current status and the status we try to set. For example, trying to mark an expired view as insert only will not succeed.
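
One way to picture this rule is a resolver that takes the current and the requested status. The sketch below is a guess at the shape of the logic, written in Python for illustration only. The description confirms a single case (insert only requested on an expired view has no effect); the staleness ordering `u < r < i < e` and the rule that only Create or Refresh can make a view fresher are assumptions, and `resolve_status` is a hypothetical name.

```python
# Assumed ordering from freshest to stalest. Only the 'i'-on-'e' case
# is stated in the PR description; the rest is a guess for illustration.
STALENESS = {"u": 0, "r": 1, "i": 2, "e": 3}


def resolve_status(current: str, requested: str) -> str:
    """Combine the current status with the one an operation tries to set."""
    if requested == "u":
        # Create Materialized View or Refresh rebuilds the data.
        return "u"
    if STALENESS[requested] > STALENESS[current]:
        # The request makes the view staler, so it is accepted.
        return requested
    # Otherwise the current status is kept, e.g. 'i' on 'e' stays 'e'.
    return current
```

Under these assumptions an expired view stays expired until it is refreshed, which matches the example in the description.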

This does not work in utility mode. If a user modifies data in that mode, the affected views should be refreshed before they are used to answer queries.

AQUMV (Answer Query Using Materialized Views) can use normal materialized views after this commit.

Authored-by: Zhang Mingli [email protected]


@avamingli avamingli requested a review from my-ship-it July 7, 2024 08:38
@avamingli avamingli force-pushed the mv_data branch 9 times, most recently from 71866fd to ec60ec5 July 10, 2024 09:44
@avamingli
Contributor Author

Enabled this feature in SingleNode mode.


@my-ship-it my-ship-it left a comment


LGTM except one comment

@my-ship-it my-ship-it merged commit 3d48d86 into apache:main Jul 11, 2024
11 checks passed