DKAN-4287 Make harvest_run id not a primary key. #4346

Draft
wants to merge 5 commits into
base: 2.x

swirtSJW
Contributor

@swirtSJW swirtSJW commented Nov 20, 2024

Fixes #4287

Describe your changes

QA Steps

This list is INCOMPLETE at this time.

From existing data

  • git checkout 2.18.3 (the tag before runs became an entity)
  • init the site
  • ddev drush dkan:sample-content:create
  • ddev drush cron
    [] Validate that none of these commands print errors in the terminal.

From new data

[] validate that the Run table list displays ___

  • Add manual QA steps in checklist format for a reviewer to perform. Be as specific as possible, provide examples if appropriate.

Checklist before requesting review

If any of these are left unchecked, please provide an explanation

  • I have updated or added tests to cover my code
  • I have updated or added documentation

@swirtSJW swirtSJW force-pushed the DKAN-4287-de-key-harvest-run branch 2 times, most recently from 8ce1554 to 59ea0a1 Compare November 20, 2024 22:34
@swirtSJW swirtSJW self-assigned this Dec 4, 2024
@swirtSJW swirtSJW force-pushed the DKAN-4287-de-key-harvest-run branch 9 times, most recently from 9a08467 to f1d22db Compare December 10, 2024 21:16
*
* @return \Drupal\harvest\HarvestRunInterface|\Drupal\Core\Entity\EntityInterface|null
* The loaded entity or NULL if none could be loaded.
*
* @deprecated in dkan:2.19.11 and is removed from dkan:3.0.0 Use HarvestService::load().
Contributor Author

@swirtSJW swirtSJW Dec 10, 2024

It is possible that this can't be deprecated. Everywhere it is still called, the caller has no access to the actual ID. This is mainly in tests that pull specific previous run instances.

@swirtSJW swirtSJW force-pushed the DKAN-4287-de-key-harvest-run branch 4 times, most recently from b44beae to d87e4fa Compare December 11, 2024 04:32
@swirtSJW
Contributor Author

There are still two test failures, and it is not immediately clear to me why they are failing.
[image]

@swirtSJW swirtSJW force-pushed the DKAN-4287-de-key-harvest-run branch 2 times, most recently from c5c9048 to e3e46fa Compare December 11, 2024 14:09
@swirtSJW
Contributor Author

Got it down to just one test failure:
testGetExtractedUuids
Failed asserting that two arrays are equal.

@dafeder
Member

dafeder commented Dec 11, 2024

@swirtSJW I don't see any test failures at the moment?

@swirtSJW
Contributor Author

@paul-m Can you take a first pass at code review on this?
Tests are passing, but I will hold off on adding QA steps until I perform this whole upgrade on PDC. Right now this succeeds on a vanilla install but has not been fully run on an existing site.

@swirtSJW swirtSJW requested a review from paul-m December 11, 2024 22:56
@swirtSJW swirtSJW force-pushed the DKAN-4287-de-key-harvest-run branch from e3e46fa to b91e610 Compare December 11, 2024 23:06
Contributor

@paul-m paul-m left a comment

Under the 2.x branch, I used sample_content to generate a harvest to run (drush dkan:sample-content:create), and then ran it (using cron).
Then I switched to this branch and ran drush updb.
Everything seemed to work... I did notice that the one harvest run in my harvest_runs table didn't have a UUID.

So I went ahead and did a new installation with this branch, and then ran the sample content harvest, and it did have a UUID.

So perhaps the update process doesn't generate UUIDs the way just creating an entity does...

Also, I'd suggest that the utility functions in harvest.install, such as harvest_get_temp_run_ids(), should be in HarvestUtility. That way it's all in one place and labeled as part of the update process.

@swirtSJW swirtSJW force-pushed the DKAN-4287-de-key-harvest-run branch 4 times, most recently from 5cb3055 to 2407716 Compare December 12, 2024 23:03
@swirtSJW
Contributor Author

> Everything seemed to work... I did notice that the one harvest run in my harvest_runs table didn't have a UUID

It looks like a plain database write does not trigger the UUID generation that a normal Drupal entity save does, so I now add the UUID explicitly.
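
To illustrate the point about direct writes, here is a minimal sketch in Python with SQLite. It is not the PR's code: the table layout, column names, and the helpers `make_demo_db`, `insert_run_raw`, and `backfill_missing_uuids` are all hypothetical. It only shows that a raw insert leaves the uuid column empty unless the writer fills it in, and how existing rows could be backfilled.

```python
import sqlite3
import uuid


def make_demo_db():
    """Create an in-memory table loosely shaped like harvest_runs (assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE harvest_runs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "harvest_plan_id TEXT NOT NULL, "
        "timestamp INTEGER NOT NULL, "
        "uuid TEXT)"
    )
    return conn


def insert_run_raw(conn, plan_id, timestamp):
    """Write a run row directly, with no entity layer involved.

    Nothing generates the uuid for us on this path, so it is set explicitly.
    """
    run_uuid = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO harvest_runs (harvest_plan_id, timestamp, uuid) "
        "VALUES (?, ?, ?)",
        (plan_id, timestamp, run_uuid),
    )
    return run_uuid


def backfill_missing_uuids(conn):
    """Assign a UUID to every row that an earlier raw write left without one."""
    rows = conn.execute(
        "SELECT id FROM harvest_runs WHERE uuid IS NULL OR uuid = ''"
    ).fetchall()
    for (row_id,) in rows:
        conn.execute(
            "UPDATE harvest_runs SET uuid = ? WHERE id = ?",
            (str(uuid.uuid4()), row_id),
        )
    return len(rows)
```

Calling `backfill_missing_uuids(conn)` a second time should report zero rows, which makes it a cheap check that nothing was missed.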

@swirtSJW swirtSJW force-pushed the DKAN-4287-de-key-harvest-run branch 4 times, most recently from f6a4484 to e6d855f Compare December 13, 2024 22:33
@swirtSJW
Contributor Author

I am stumped by this test failure.

There was 1 error:

1) Drupal\Tests\datastore\Unit\Plugin\QueueWorker\ImportJobTest::testBasics
TypeError: Drupal\datastore\Plugin\QueueWorker\ImportJob::getParser(): Return value must be of type Contracts\ParserInterface, CsvParser\Parser\Csv returned

It seems completely unrelated to the code in this PR.

Beyond this test, @paul-m, I made all the changes you requested.

@paul-m
Contributor

paul-m commented Dec 19, 2024

Over in #4177 we deprecated some interfaces, which led to the failures.

@swirtSJW swirtSJW force-pushed the DKAN-4287-de-key-harvest-run branch from 73e27a4 to 01d7288 Compare December 31, 2024 17:17
@swirtSJW
Contributor Author

swirtSJW commented Jan 2, 2025

Here is the run from PDC with existing data:

 [notice] Database updates start.
 ----------------- ------------------------------- --------------- ----------------------------------------------------- 
  Module            Update ID                       Type            Description                                          
 ----------------- ------------------------------- --------------- ----------------------------------------------------- 
  dkan              9002                            hook_update_n   9002 - Enable data dictionary widget.                
  harvest           8006                            hook_update_n   8006 - Ensure the entity manager knows about         
                                                                    harvest_run entities.                                
  harvest           8007                            hook_update_n   8007 - Move data from harvest_ID_hashes tables to    
                                                                    harvest_hash entity. This will move all harvest      
                                                                    hash information to the updated schema, including    
                                                                    data which does not have a corresponding hash plan   
                                                                    ID. Outdated tables will be removed.                 
  harvest           8008                            hook_update_n   8008 - Move entries from harvest_[ID]_runs to        
                                                                    harvest_runs. This finishes the process started by   
                                                                    harvest_update_8007.                                 
  harvest           8009                            hook_update_n   8009 - Update harvest_run schema to add timestamp,   
                                                                    uuid, and true id. @see                              
                                                                    https://github.com/GetDKAN/dkan/issues/4287          
  harvest           8010                            hook_update_n   8010 - Move data from temp table back into           
                                                                    harvest_run. @see                                    
                                                                    https://github.com/GetDKAN/dkan/issues/4287          
  harvest           8011                            hook_update_n   8011 - Move entries from harvest_[ID]_runs to        
                                                                    harvest_runs. This finishes the process started by   
                                                                    harvest_update_8007 and re-runs 8008.                
  metastore         8009                            hook_update_n   8009 - Update existing data dictionary nodes to use  
                                                                    corrected schema.                                    
  metastore_admin   8012                            hook_update_n   8012 - Reinstall DKAN Metastore Admin configuration  
                                                                    to include new dkan menu items.                      
  search_api        fix_index_dependencies_orders   post-update     Re-save Search API index configurations to fix       
                                                                    dependencies order.                                  
 ----------------- ------------------------------- --------------- ----------------------------------------------------- 


 // Do you wish to run the specified pending updates?: yes.                                                             

> >  [notice] Update started: harvest_update_8006
> >  [notice] Update completed: harvest_update_8006
> >  [notice] Update started: harvest_update_8007
> >  [notice] Converting hashes for dialysis__data
> >  [notice] Converting hashes for dialysis__data_umkecc
> >  [notice] Converting hashes for home_health__data
> >  [notice] Converting hashes for home_health__data_two
> >  [notice] Converting hashes for home_health__HomeHealthCAHPS
> >  [notice] Converting hashes for hospice__data
> >  [notice] Converting hashes for hospital__data
> >  [notice] Converting hashes for inpatient__data
> >  [notice] Converting hashes for long_term_care_hospital__data
> >  [notice] Converting hashes for nursing_home__data
> >  [notice] Converting hashes for office_visit_cost__data
> >  [notice] Converting hashes for physician__data
> >  [notice] Converting hashes for supplier__data
> >  [notice] Update completed: harvest_update_8007
> >  [notice] Update started: harvest_update_8008
> >  [notice] Update completed: harvest_update_8008
> >  [notice] Update started: harvest_update_8009
> >  [notice] Table harvest_runs moved to harvest_runs_temp. 
> > Old harvest_run entity removed. 
> > New harvest_run entity installed. 
> > 
> >  [notice] Update completed: harvest_update_8009
> >  [notice] Update started: harvest_update_8010
> >  [notice] Processed: 0/0.
> > Data in harvest_runs updated to new schema:
> > Temporary table dropped.
> > 
> >  [notice] Update completed: harvest_update_8010
> >  [notice] Update started: dkan_update_9002
> >  [notice] Update completed: dkan_update_9002
> >  [notice] Update started: harvest_update_8011
> >  [notice] Converting runs for dialysis__data
> >  [notice] Converting runs for dialysis__data_umkecc
> >  [notice] Converting runs for home_health__data
> >  [notice] Converting runs for home_health__data_two
> >  [notice] Converting runs for home_health__HomeHealthCAHPS
> >  [notice] Converting runs for hospice__data
> >  [notice] Converting runs for hospital__data
> >  [notice] Converting runs for inpatient__data
> >  [notice] Converting runs for long_term_care_hospital__data
> >  [notice] Converting runs for nursing_home__data
> >  [notice] Converting runs for office_visit_cost__data
> >  [notice] Converting runs for physician__data
> >  [notice] Converting runs for supplier__data
> >  [notice] Harvest plan specific run tables coalesced into table harvest_runs.
> >  [notice] Update completed: harvest_update_8011
> >  [notice] Update started: metastore_update_8009
> >  [notice] Updated 0 dictionaries. If you have overridden DKAN's core schemas,
> >     you must update your site's data dictionary schema after this update. Copy
> >     modules/contrib/dkan/schema/collections/data-dictionary.json over you local
> >     site version before attempting to read or write any data dictionaries.
> >  [notice] Update completed: metastore_update_8009
> >  [notice] Update started: metastore_admin_update_8012
> >  [notice] Update completed: metastore_admin_update_8012
> >  [notice] Update started: search_api_post_update_fix_index_dependencies_orders
> >  [notice] Update completed: search_api_post_update_fix_index_dependencies_orders
>  [success] Finished performing updates.
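
The two-step pattern reported by updates 8009 and 8010 above (move the table aside, install the new schema, copy the rows back, drop the temporary table) can be sketched roughly as follows. This is a simplified illustration in Python with SQLite, not the actual update hooks: the column set is assumed, and `make_old_schema` and `migrate_runs` are hypothetical helpers. The old table uses the timestamp as its primary key, which is why two runs with the same timestamp collide. The new table has a separate auto-incrementing id.

```python
import sqlite3
import uuid


def make_old_schema(conn):
    """Old layout (assumed): the run timestamp doubles as the primary key."""
    conn.execute(
        "CREATE TABLE harvest_runs ("
        "id INTEGER PRIMARY KEY, "
        "harvest_plan_id TEXT NOT NULL)"
    )


def migrate_runs(conn):
    """Rebuild harvest_runs with a true id, a timestamp column, and a uuid."""
    # Step 1: move the existing table out of the way and create the new schema.
    conn.execute("ALTER TABLE harvest_runs RENAME TO harvest_runs_temp")
    conn.execute(
        "CREATE TABLE harvest_runs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "timestamp INTEGER NOT NULL, "
        "uuid TEXT NOT NULL, "
        "harvest_plan_id TEXT NOT NULL)"
    )
    # Step 2: copy rows back; the old id becomes the timestamp value.
    old_rows = conn.execute(
        "SELECT id, harvest_plan_id FROM harvest_runs_temp ORDER BY id"
    ).fetchall()
    for old_id, plan_id in old_rows:
        conn.execute(
            "INSERT INTO harvest_runs (timestamp, uuid, harvest_plan_id) "
            "VALUES (?, ?, ?)",
            (old_id, str(uuid.uuid4()), plan_id),
        )
    # Step 3: the temporary table is no longer needed.
    conn.execute("DROP TABLE harvest_runs_temp")
    return len(old_rows)
```

After `migrate_runs`, two rows may share a timestamp because uniqueness is carried by the new id column rather than by the timestamp.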

@swirtSJW
Contributor Author

swirtSJW commented Jan 2, 2025

The harvest_runs table after the updates of existing data:
[image]

@swirtSJW
Contributor Author

swirtSJW commented Jan 9, 2025

I thought I had this buttoned up, but I just discovered that the list of imports is empty.
[image]

@swirtSJW swirtSJW force-pushed the DKAN-4287-de-key-harvest-run branch 2 times, most recently from 7e4d773 to 5a7b090 Compare January 9, 2025 03:41
@swirtSJW swirtSJW force-pushed the DKAN-4287-de-key-harvest-run branch from 5a7b090 to 4d2d5d6 Compare January 9, 2025 18:00
Successfully merging this pull request may close these issues.

Converting runs from old harvest_ID_runs to harvest_runs fails if duplicate time stamps
3 participants