Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Auto qc #4

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Auto qc #4

wants to merge 5 commits into from

Conversation

zherbz
Copy link

@zherbz zherbz commented May 2, 2024

Repo containing notebooks to auto-qc plan HDF files both locally and from S3 for Max WSE Errors and WSE Time-to-Peak

@zherbz zherbz requested review from thwllms and wolfpack-daragon May 2, 2024 19:35
Copy link
Collaborator

@wolfpack-daragon wolfpack-daragon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the tools and notebooks. The two folium maps (time to peak errors and wse errors) provide good info for modelers. Specific to the code. I'm guessing we'll want to extract the reading from hdf functions into a new PR in rashdf library, go through the review and deploy process, then test importing them here before merging to main? @thwllms can guide us. Here are my suggestions for that:

utils.py
get_domain_names (port directly to rashdf after evaluation duplication/conflicts)
decode_hdf_projection (port directly to rashdf after evaluation duplication/conflicts)
get_projection (port directly to rashdf after evaluation duplication/conflicts)
get_cell_pts (port directly to rashdf after evaluation duplication/conflicts)
get_perimeter (port directly to rashdf after evaluation duplication/conflicts)

hdf_wse.py
package_wse_error_outputs (split up by read hdf parts and processing parts)
read_hdf_wse_errors (split up by read hdf parts and processing parts)
read_hdf_wse_ttp (split up by read hdf parts and processing parts)
wse_error_qc (split up by read hdf parts and processing parts)
wse_ttp_qc (split up by read hdf parts and processing parts)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull the MBI reference out

# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
# MBI data folder (uncomment after first commit to keep large files out of version control)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove reference to MBI


Returns
-------
m
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be a bit more explicit: "m: folium.Map"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

template files can be removed from PR

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tox file can be removed since we're not using in this project

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the .env file is in .gitignore, users won't actually be able to see the template. In this readme, it would be helpful to explain where the .env file lives and its template (so that user can copy and paste)

@thwllms
Copy link

thwllms commented May 9, 2024

Hoping to look at this in detail soon, but just wanted to point out to @zherbz and @sray014 that both of your PRs include environment stuff which has the potential for conflict. Wondering if it would be better to structure this repo into different folders for each sub-project:

ffrd-notebooks
├── auto-qc/
├── storms-summary/
└── README.md

@thwllms
Copy link

thwllms commented May 15, 2024

utils.py
get_domain_names (port directly to rashdf after evaluation duplication/conflicts)
decode_hdf_projection (port directly to rashdf after evaluation duplication/conflicts)
get_projection (port directly to rashdf after evaluation duplication/conflicts)
get_cell_pts (port directly to rashdf after evaluation duplication/conflicts) get_perimeter (port directly to rashdf after evaluation duplication/conflicts)

I think all of these above are already part of rashdf.

hdf_wse.py
package_wse_error_outputs (split up by read hdf parts and processing parts)
read_hdf_wse_errors (split up by read hdf parts and processing parts)
read_hdf_wse_ttp (split up by read hdf parts and processing parts)
wse_error_qc (split up by read hdf parts and processing parts)
wse_ttp_qc (split up by read hdf parts and processing parts)

Agree that we need to add the ability to extract WSE values and error to rashdf. But I think this PR could be merged as-is to ffrd-notebooks and we can work out how to incorporate that functionality into rashdf later.

Copy link

@thwllms thwllms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks pretty good overall as a demo of leveraging RAS HDF data to automatically evaluate certain model results. I generally agree with @wolfpack-daragon's comments. If you want to get the WSE/error values incorporated into rashdf before merging this then I support that. Per my previous comment I think we should probably stick different sets of notebooks into different top-level folders in this repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants