Extend pyhf contrib download to allow for uncompressed targets #1111
Comments
See #1090. We can probably abstract this away a little bit.
Hm, one thing that needs to be considered, though, is that this currently reads streams, so for
magic_headers = {
    "gzip": b"\x1f\x8b",
    "zip": b"PK",
    "JSON": re.compile(br"^\s*{"),
}
there's the question of how to take fileobj = BytesIO(response.content) and check the header as there is no
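One way to sidestep the header-checking question is to inspect the leading bytes of the payload before wrapping it in a BytesIO. The following is a minimal sketch, not pyhf code: `detect_file_type` is a hypothetical helper, and the table is the one from the comment above with the brace escaped in the JSON pattern.

```python
import re

# Magic headers from the comment above: byte prefixes or compiled patterns
magic_headers = {
    "gzip": b"\x1f\x8b",
    "zip": b"PK",
    "JSON": re.compile(rb"^\s*\{"),
}


def detect_file_type(content):
    """Return the first matching key of magic_headers, or None (hypothetical helper)."""
    for file_type, header in magic_headers.items():
        if isinstance(header, bytes):
            # Plain byte prefix: compare against the start of the payload
            if content.startswith(header):
                return file_type
        elif header.match(content):
            # Compiled pattern: match() anchors at the start of the payload
            return file_type
    return None
```

Because `response.content` is already fully in memory, calling `detect_file_type(response.content)` consumes nothing, and the BytesIO can be created afterwards.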
So sketching things out, there's the possibility of doing something like
import tarfile
from io import BytesIO

import requests

# Leading magic bytes, keyed by the compression suffix that tarfile expects
magic_headers = {
    "gz": b"\x1f\x8b",  # gzip
    "bz2": b"BZh",  # bzip2 (b"PK" is the zip magic number, which tarfile cannot read)
}
header_len = max(len(header) for header in magic_headers.values())
with requests.get(archive_url) as response:
    if compress:
        with open(output_directory, "wb") as archive:
            archive.write(response.content)
    else:
        fileobj = BytesIO(response.content)
        # Read the header once; reading inside the loop would advance the stream
        leading_bytes = fileobj.read(header_len)
        file_type = None
        for _file_type, header in magic_headers.items():
            if leading_bytes.startswith(header):
                file_type = _file_type
        # Fall back to an uncompressed tar stream when no header matches
        mode = f"r|{file_type}" if file_type else "r|"
        with tarfile.open(mode=mode, fileobj=BytesIO(response.content)) as archive:
            archive.extractall(output_directory)
but this probably isn't a step towards a solution, as any(?) benefits of using a stream seem negated (I'm actually not really sure what the advantages of a stream over a file are here for
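As an alternative to checking magic bytes by hand, tarfile can detect the compression itself when given a seekable file object. A minimal sketch, assuming the whole payload is already in memory; `extract_member_names` is a hypothetical helper used only for illustration:

```python
import io
import tarfile


def extract_member_names(payload):
    """List member names of an in-memory tar archive, or return None if it is not one."""
    try:
        # "r:*" lets tarfile detect gzip, bz2, or xz compression on its own;
        # it requires a seekable file object, which BytesIO provides.
        with tarfile.open(mode="r:*", fileobj=io.BytesIO(payload)) as archive:
            return archive.getnames()
    except tarfile.ReadError:
        # Not a tar archive at all, for example a bare JSON file
        return None
```

A `None` result could then be the signal to write the payload to disk unchanged, which is the uncompressed-target case this issue is about.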
Seeking is important: if you're going to
But then again, adding round-trip times to read two bytes is bad for latency if you're accessing something remote, like HTTP through requests. In that case, it would probably make more sense to subclass
But then again, the tarfile.open documentation says that
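For a stream that cannot seek, one possible way to look at the first two bytes without consuming them is to wrap the stream in `io.BufferedReader` and use `peek()`. This is only a sketch of that idea, not what the thread settled on: `peek_header` is a hypothetical helper, and whether a given HTTP response object can be wrapped this way is not verified here.

```python
import io


def peek_header(stream, size=2):
    """Wrap a binary stream and return (buffered_stream, leading_bytes) without consuming data."""
    buffered = io.BufferedReader(stream)
    # peek() may return more or fewer bytes than requested, so slice explicitly
    header = buffered.peek(size)[:size]
    return buffered, header
```

The returned buffered stream still yields the full content from its first byte, so it can be handed on to whatever reads the archive next.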
Description
This feature request was a suggestion of @lukasheinrich's:
This should be easy to implement and I think will mostly just require making pyhf/src/pyhf/contrib/utils.py (lines 55 to 58 in c1727ce) able to deal with non-compressed targets.