Slow gc.collect() on close() #489
Comments
Good catch!
- Resolve #489 Signed-off-by: Hiroshi Miura <[email protected]>
Strange... there is no big difference in the benchmark score... Anything wrong?
Here is test code. @capyvara, could you improve the benchmark test code? I think the target data is smaller than in your scenario. https://github.com/miurahr/py7zr/blob/master/tests/test_benchmark.py
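A benchmark closer to the reported scenario might look like the sketch below; the archive name, file count, and sizes are assumptions chosen to mirror the report, not values from tests/test_benchmark.py. Opening and closing the archive once per file makes each iteration go through close(), where the explicit collection happens.

```python
import io
import os
import time

import py7zr

ARCHIVE = "many_small_files.7z"   # hypothetical path
NUM_FILES = 500                   # mirrors the 500-file scenario in the report
FILE_SIZE = 90 * 1024             # ~90 KB per file

# Build an archive of many small, compressible files.
with py7zr.SevenZipFile(ARCHIVE, "w") as archive:
    block = os.urandom(64)
    for i in range(NUM_FILES):
        data = io.BytesIO(block * (FILE_SIZE // len(block)))
        archive.writef(data, f"file_{i:04d}.bin")

# Read each file in its own open/close cycle so every iteration
# hits close() and the gc.collect() call reported above.
start = time.perf_counter()
for i in range(NUM_FILES):
    with py7zr.SevenZipFile(ARCHIVE, "r") as archive:
        archive.read(targets=[f"file_{i:04d}.bin"])
elapsed = time.perf_counter() - start
print(f"{NUM_FILES / elapsed:.1f} files/s")
```

Note that the open/close-per-file pattern also re-parses the archive header each time, so the absolute numbers overstate the cost somewhat; the relative effect of the collection call is what matters.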
@miurahr I'm not fully sure we can test this automatically. Testing inside a Jupyter notebook was even worse: it was reading something like 1 item/s.
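The notebook observation is consistent with how the collector behaves: gc.collect() traverses every tracked container object in the process, so its cost grows with the number of live objects, and a long-running notebook session tends to keep many alive. A minimal, library-independent illustration (the object count is arbitrary):

```python
import gc
import time

def time_collect() -> float:
    """Return the duration of one full collection, in seconds."""
    start = time.perf_counter()
    gc.collect()
    return time.perf_counter() - start

baseline = time_collect()

# Simulate a long-lived session by keeping millions of tracked
# container objects alive while the collection runs.
ballast = [[i] for i in range(2_000_000)]

loaded = time_collect()
print(f"near-empty process: {baseline * 1000:.2f} ms per gc.collect()")
print(f"with live objects:  {loaded * 1000:.2f} ms per gc.collect()")
```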
Especially when reading many small files, the time spent in `gc.collect()` is greater than the time spent decompressing the data.

Example: 500 ~90 KB files (600 KB uncompressed), LZMA; it can only read ~12 files per second. Commenting out the `gc.collect()` in `_var_release()`, it speeds up to ~138 files per second, 10x faster.

IMHO it should not be the library's responsibility to force a manual GC run; users should manage their own memory, and the library should only ensure it does not leak memory of its own.

I see a commit from about one year ago that added that call, but I'm not sure what problem it was trying to solve.
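One way to check that claim in a given environment is to profile a read loop and look for gc.collect in the cumulative times. This is a sketch, with the archive path and member names as placeholders matching the benchmark above:

```python
import cProfile
import pstats

import py7zr

def read_many(archive_path: str, names: list[str]) -> None:
    # Open/close per file so each iteration runs the close() path.
    for name in names:
        with py7zr.SevenZipFile(archive_path, "r") as archive:
            archive.read(targets=[name])

profiler = cProfile.Profile()
profiler.enable()
read_many("many_small_files.7z", [f"file_{i:04d}.bin" for i in range(500)])
profiler.disable()

# If the report holds, "{built-in method gc.collect}" should rank
# near the top of the cumulative-time listing.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)
```

As a user-side mitigation while the explicit call remains, gc.freeze() (Python 3.7+) moves all currently tracked objects into a permanent generation that collections skip, which keeps each gc.collect() cheap in a process that already holds many live objects.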