Speed up dependencies indexing #243
It should work. The only problem is that the last file is not cloned if it has the same base name.
So in this case you need to update the metafile (just copy it from the previous file?) and the already calculated dependencies.
Thanks for all those pointers. I think that I have implemented your suggestion. Added some tests to make sure that we get the correct number of files listed in dependencies for a couple of datasets. Can you cross check that:
|
Codecov Report
@@         Coverage Diff         @@
##             dev    #243   +/-  ##
=====================================
  Coverage   0.00%   0.00%
=====================================
  Files         31      31
  Lines       1446    1480   +34
=====================================
- Misses      1446    1480   +34
|
For some reason GitHub doesn't let me see the changes when I am logged in. I'll put my review in comments.
Otherwise, it looks good to me |
Done |
Tell me when I can retest |
Agreed. Done. Mind you, this changes the order of the input arguments.
Good point. Actually the whole thing about the folder organization of derivatives and raw is mostly a suggestion, but I agree that we can make the default to nest the derivatives in the raw.
Yup I know but I am pretty sure (as far as I can recall) I had to do things that way because the behavior of Octave and Matlab is slightly different with gunzip. Need to double check. |
From a quick look at the docs, the only difference I could spot is that Matlab automatically creates the folder if it does not exist. But in any case, this is unrelated to the dependencies calculation. |
Latest commits implement #185 but relax the constraints to allow more than one instance of the string
So the difference in gunzip behavior (and not Here is an MWE below. Will try to add a work around with an Not sure how if

Script

src_folder = fullfile(pwd, 'input');
tgt_folder = fullfile(pwd, 'output');
% set up
copyfile('sub-01_T1w.nii.gz', src_folder);
% do
gunzip(fullfile(src_folder, 'sub-01_T1w.nii.gz'), tgt_folder)

Starting point

├── gunzip_cmp_matlab_octave.m
├── input
├── output
└── sub-01_T1w.nii.gz

Results

With Matlab

├── gunzip_cmp_matlab_octave.m
├── input
│ └── sub-01_T1w.nii.gz
├── output
│ └── sub-01_T1w.nii
└── sub-01_T1w.nii.gz

With Octave

├── gunzip_cmp_matlab_octave.m
├── input
├── output
│ └── sub-01_T1w.nii
└── sub-01_T1w.nii.gz |
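For reference, one possible way to get the same result on both interpreters is sketched below. This is only an illustration under assumptions, not what this PR implements: the scratch folder layout, the dummy file content and the variable names are all made up for the example. The idea is to copy the archive into the target folder first, unzip that copy in place, and delete the copy only if the interpreter left it behind, so the original in the source folder is never touched.

```matlab
% Sketch only: make gunzip leave the source archive alone on Matlab and Octave.
% All paths and names below are hypothetical and live in a scratch folder.
work_dir = tempname();
src_folder = fullfile(work_dir, 'input');
tgt_folder = fullfile(work_dir, 'output');
mkdir(work_dir);
mkdir(src_folder);
mkdir(tgt_folder);

% set up: create a dummy file and compress it to get a .gz archive
raw_file = fullfile(src_folder, 'sub-01_T1w.nii');
fid = fopen(raw_file, 'w');
fprintf(fid, 'dummy content');
fclose(fid);
gzip(raw_file);
src_file = [raw_file '.gz'];

% do: work on a copy placed in the target folder, never on the source
tmp_copy = fullfile(tgt_folder, 'sub-01_T1w.nii.gz');
copyfile(src_file, tmp_copy);
gunzip(tmp_copy); % output lands next to the copy, i.e. in tgt_folder

% clean up: remove the copied archive if it is still there after gunzip
if exist(tmp_copy, 'file') == 2
  delete(tmp_copy);
end

unzipped_file = fullfile(tgt_folder, 'sub-01_T1w.nii');
```

After this runs, `input` should still contain `sub-01_T1w.nii.gz` and `output` should contain only `sub-01_T1w.nii`, whichever interpreter is used. The price is one extra copy of the compressed file, which may matter for large datasets.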
giving up and silence the tests for octave
It should be good for another cross check. Added a test for copy derivatives and unzip to try to make sure Octave and Matlab behave the same way on Linux. Will quickly run things on Windows. |
Looks good to me. Performance on my tests is the same:
>> tic; BIDS = bids.layout('funct_3D', false); toc;
Elapsed time is 13.879293 seconds.
>> size(bids.query(BIDS, 'data'))
ans =
4793 1
It is still a long wait, but I have the impression that we are hitting the limits of Matlab itself.
I checked "Approve", but for some reason it didn't work. For me it works and I "approve" the merge :-D
P.S. How is the progress on regexp for file selection? I can't find the corresponding merge request |
Sorry. Was done on local branch that I merged here: |
OK, that's why I still have json: you exclude them only for datasets following the schema. |
It was suggested by @arnodelorme in here |
Then will it be possible to get json files only if they are not metadata for data? Also, if someone wants to get the metadata json, they can be retrieved with |
Would you be OK if I opened a separate issue for this? I feel that otherwise we end up changing behavior in this PR, which was supposed to be only about optimization and has already devolved into way more than that. I am tempted to have an issue + meeting where several of the interested people can come together and agree on what we would like the API and the behavior to look like for a first version. |
It could be a good idea, if I am invited) |
most definitely |
fixes #209
supersedes #185
copy_to_derivatives