feat: Add API to download likelihood patchset archives #1046
Merged
49 commits
a2123dd Add download command (matthewfeickert)
c193320 Add exception for invalid host (matthewfeickert)
f2c8751 Verify archive host is valid (matthewfeickert)
92ca11f Add tests for download (matthewfeickert)
29455e9 Add raises to docstring (matthewfeickert)
beaa9d5 Add all the flags (matthewfeickert)
ff8e8b1 Make pyflakes happy (matthewfeickert)
70f6367 Print output to screen for verbose tests (matthewfeickert)
f9570a8 Add comment on motivation for wait (matthewfeickert)
0588447 Visually seperate logic (matthewfeickert)
ec77de8 Use Python 3.6 compliant code (matthewfeickert)
ffe4d33 Make download be pure-Python (matthewfeickert)
bb99c9b Use term 'host' for consistency (matthewfeickert)
001bacc Check explicilty for failure state (matthewfeickert)
1645b51 Use only Python stdlib for CLI (matthewfeickert)
1eb04f9 Attempt to deal with Python 3.6 having different warnings (matthewfeickert)
bc03e95 Migrate from patchset to contrib (matthewfeickert)
4d62266 Add example to docstring (matthewfeickert)
f21a69e Fix docstrings (matthewfeickert)
909fdac Use 'pyhf contrib download' API to reinforce contrib nature (matthewfeickert)
ea23305 Use urllib.parse.urlparse (matthewfeickert)
f77156c Use requests for opening archives (matthewfeickert)
c882c24 Add note that contrib extra is required (matthewfeickert)
36a8dd0 Add www to match pattern in tests (matthewfeickert)
2f8d1b3 Make requests a true dependency (matthewfeickert)
7f3bb72 Try just the minimal for an optional dependency (matthewfeickert)
d9a3dbb Make note of contrib more clear (matthewfeickert)
47b7ad1 Move import inside of download function (matthewfeickert)
75e1889 Add test for missing requests module (matthewfeickert)
6e8ccd7 Guard contrib functions beyond cli in try except (matthewfeickert)
3797d44 Update tests for guarded pyhf contrib download (matthewfeickert)
7e23fa9 Add Contrib to Python API docs (matthewfeickert)
8bddb71 Correct mislabeling of analysis (matthewfeickert)
3fcbcba Add Python API to contrib download (matthewfeickert)
3717003 Wrap Python API in CLI API (matthewfeickert)
83bd2ef Allow for POSIX tar archives that are not gzip (matthewfeickert)
d04564c Sort to avoid doctest error (matthewfeickert)
ef71fca Fixup of test_scripts (matthewfeickert)
506e59e Use del of modules to force state (matthewfeickert)
0ca56bc Fix typo (matthewfeickert)
39b8feb Use mock after watching Anthony Sottile's video (matthewfeickert)
8fc005a Revert "Allow for POSIX tar archives that are not gzip" (matthewfeickert)
0af5ba4 Add TOOD for removal of Python 3.6 (matthewfeickert)
560fe64 Add note (matthewfeickert)
2afb313 Use log.error instead of print (matthewfeickert)
0e05559 Make pyflakes happy (matthewfeickert)
8178fa3 ERROR not INFO (matthewfeickert)
d8e2761 Remove unneeded variable (matthewfeickert)
b72e399 Don't rely on newlines being in output given systems (matthewfeickert)
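
Several of the commits listed here (75e1889, 506e59e, 39b8feb) deal with testing what happens when `requests` is not installed. The PR's test files are not shown in this view, so the following is only a minimal sketch of one common way to simulate a missing module with `unittest.mock`, not the tests that were merged. The helper name `is_importable` is hypothetical and not part of pyhf.

```python
import importlib
import sys
from unittest import mock


def is_importable(name):
    """Return True if `name` can be imported right now (hypothetical helper)."""
    try:
        importlib.import_module(name)
    except ModuleNotFoundError:
        return False
    return True


# A None entry in sys.modules makes the import system raise
# ModuleNotFoundError for that name, whether or not it is installed.
with mock.patch.dict(sys.modules, {"requests": None}):
    requests_visible_while_hidden = is_importable("requests")

# patch.dict restores sys.modules on exit, so a stdlib module hidden the
# same way becomes importable again afterwards.
with mock.patch.dict(sys.modules, {"tarfile": None}):
    tarfile_visible_while_hidden = is_importable("tarfile")
tarfile_visible_afterwards = is_importable("tarfile")
```

Compared with deleting entries from `sys.modules` by hand, the context manager puts the original state back even if the test body raises, which may be why the later commit moved to `mock`.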
@@ -0,0 +1,71 @@
"""CLI for functionality that will get migrated out eventually."""
import logging
import click
from pathlib import Path

from ..contrib import utils

logging.basicConfig()
log = logging.getLogger(__name__)


@click.group(name="contrib")
def cli():
    """
    Contrib experimental operations.

    .. note::

        Requires installation of the ``contrib`` extra.

        .. code-block:: shell

            $ python -m pip install pyhf[contrib]
    """


@cli.command()
@click.argument("archive-url", default="-")
@click.argument("output-directory", default="-")
@click.option("-v", "--verbose", is_flag=True, help="Enables verbose mode")
@click.option(
    "-f", "--force", is_flag=True, help="Force download from non-approved host"
)
@click.option(
    "-c",
    "--compress",
    is_flag=True,
    help="Keep the archive in a compressed tar.gz form",
)
def download(archive_url, output_directory, verbose, force, compress):
    """
    Download the patchset archive from the remote URL and extract it in a
    directory at the path given.

    Example:

    .. code-block:: shell

        $ pyhf contrib download --verbose https://www.hepdata.net/record/resource/1408476?view=true 1Lbb-likelihoods

        \b
        1Lbb-likelihoods/patchset.json
        1Lbb-likelihoods/README.md
        1Lbb-likelihoods/BkgOnly.json

    Raises:
        :class:`~pyhf.exceptions.InvalidArchiveHost`: if the provided archive host name is not known to be valid
    """
    try:
        utils.download(archive_url, output_directory, force, compress)

        if verbose:
            file_list = [str(file) for file in list(Path(output_directory).glob("*"))]
            print("\n".join(file_list))
    except AttributeError as excep:
        exception_info = (
            str(excep)
            + "\nInstallation of the contrib extra is required to use the contrib CLI API"
            + "\nPlease install with: python -m pip install pyhf[contrib]\n"
        )
        log.error(exception_info)
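
The CLI command above catches `AttributeError` rather than an import error. A plausible reading is that when `requests` is missing the `utils` module still imports but never defines `download`, so the failure only surfaces as a missing attribute at call time. The sketch below imitates that situation with an empty stand-in module; `run_download` and the module name `utils_without_requests` are hypothetical names used only for illustration.

```python
import logging
import types

logging.basicConfig()
log = logging.getLogger(__name__)

# Stand-in for a utils module whose guarded import failed: the module object
# exists, but the `download` function was never defined on it.
utils = types.ModuleType("utils_without_requests")


def run_download(archive_url, output_directory):
    """Call utils.download and turn a missing function into a logged hint."""
    try:
        utils.download(archive_url, output_directory)
    except AttributeError as excep:
        log.error(
            "%s\nPlease install with: python -m pip install pyhf[contrib]", excep
        )
        return False
    return True


succeeded = run_download(
    "https://www.hepdata.net/record/resource/1408476?view=true", "1Lbb-likelihoods"
)
```

One trade-off of this pattern is that an `AttributeError` raised for an unrelated reason inside `download` would be reported with the same installation hint.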
@@ -0,0 +1,66 @@
"""Helper utilities for common tasks."""

from urllib.parse import urlparse
import tarfile
from io import BytesIO
import logging
from .. import exceptions

logging.basicConfig()
log = logging.getLogger(__name__)

try:
    import requests

    def download(archive_url, output_directory, force=False, compress=False):
        """
        Download the patchset archive from the remote URL and extract it in a
        directory at the path given.

        Example:

            >>> from pyhf.contrib.utils import download
            >>> download("https://www.hepdata.net/record/resource/1408476?view=true", "1Lbb-likelihoods")
            >>> import os
            >>> sorted(os.listdir("1Lbb-likelihoods"))
            ['BkgOnly.json', 'README.md', 'patchset.json']
            >>> download("https://www.hepdata.net/record/resource/1408476?view=true", "1Lbb-likelihoods.tar.gz", compress=True)
            >>> import glob
            >>> glob.glob("1Lbb-likelihoods.tar.gz")
            ['1Lbb-likelihoods.tar.gz']

        Args:
            archive_url (`str`): The URL of the :class:`~pyhf.patchset.PatchSet` archive to download.
            output_directory (`str`): Name of the directory to unpack the archive into.
            force (`Bool`): Force download from non-approved host. Default is ``False``.
            compress (`Bool`): Keep the archive in a compressed ``tar.gz`` form. Default is ``False``.

        Raises:
            :class:`~pyhf.exceptions.InvalidArchiveHost`: if the provided archive host name is not known to be valid
        """
        if not force:
            valid_hosts = ["www.hepdata.net", "doi.org"]
            netloc = urlparse(archive_url).netloc
            if netloc not in valid_hosts:
                raise exceptions.InvalidArchiveHost(
                    f"{netloc} is not an approved archive host: {', '.join(str(host) for host in valid_hosts)}\n"
                    + "To download an archive from this host use the --force option."
                )

        with requests.get(archive_url) as response:
            if compress:
                with open(output_directory, "wb") as archive:
                    archive.write(response.content)
            else:
                with tarfile.open(
                    mode="r|gz", fileobj=BytesIO(response.content)
                ) as archive:
                    archive.extractall(output_directory)


except ModuleNotFoundError as excep:
    log.error(
        str(excep)
        + "\nInstallation of the contrib extra is required to use pyhf.contrib.utils.download"
        + "\nPlease install with: python -m pip install pyhf[contrib]\n"
    )

kratsg marked this conversation as resolved.
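
The extraction branch of `download` can be tried without network access. The sketch below builds a small gzip-compressed tar archive in memory and unpacks it with the same `tarfile.open(mode="r|gz", fileobj=BytesIO(...))` call used in the diff. The helper names `make_archive_bytes` and `extract_archive_bytes` and the file contents are made up for illustration; they stand in for `response.content` from a real archive.

```python
import os
import tarfile
import tempfile
from io import BytesIO


def make_archive_bytes(files):
    """Build a gzip-compressed tar archive in memory from {name: bytes}."""
    buffer = BytesIO()
    with tarfile.open(mode="w:gz", fileobj=buffer) as archive:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, BytesIO(payload))
    return buffer.getvalue()


def extract_archive_bytes(content, output_directory):
    """Unpack archive bytes as a gzip stream and list what was extracted."""
    with tarfile.open(mode="r|gz", fileobj=BytesIO(content)) as archive:
        archive.extractall(output_directory)
    return sorted(os.listdir(output_directory))


# Placeholder payloads, not the contents of any real likelihood archive.
content = make_archive_bytes({"patchset.json": b"{}", "README.md": b"# placeholder"})
with tempfile.TemporaryDirectory() as tmp_dir:
    extracted = extract_archive_bytes(content, tmp_dir)
```

The `"r|gz"` mode reads the archive as a forward-only gzip stream, so an archive that is not gzip-compressed would fail to open here. Recent Python versions also accept a `filter` argument to `extractall`, which may be worth considering when `--force` is used with a non-approved host.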
should we make the url be the `doi.org` one?

@lukasheinrich I would love that, but we don't have a DOI for that likelihood tarball. Do you know of one? Or are you suggesting that we use a different likelihood for the example (the multi-b that does have the DOI)?
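
For context on this exchange, the host check in the diff compares the URL's `netloc` against a fixed list, so both a `www.hepdata.net` URL and a `doi.org` URL would be accepted. A minimal sketch of that check in isolation follows; the function name `is_approved_host` is hypothetical, and the DOI path is a placeholder rather than a real identifier.

```python
from urllib.parse import urlparse

# Host list copied from the diff above.
VALID_HOSTS = ["www.hepdata.net", "doi.org"]


def is_approved_host(archive_url, valid_hosts=VALID_HOSTS):
    """Return True if the URL's network location is in the approved list."""
    return urlparse(archive_url).netloc in valid_hosts


hepdata_ok = is_approved_host(
    "https://www.hepdata.net/record/resource/1408476?view=true"
)
# Placeholder DOI path for illustration only; not a real DOI.
doi_ok = is_approved_host("https://doi.org/10.0000/placeholder")
# The comparison is exact, so the same site without "www." is rejected.
bare_hepdata_ok = is_approved_host("https://hepdata.net/record/resource/1408476")
other_ok = is_approved_host("https://example.com/archive.tar.gz")
```

Because `netloc` is compared verbatim, a URL that includes a port or omits the `www.` prefix would need `--force` even though it points at an approved site.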