Fix bug #543 #599
9c4b61a
to
3218a09
Compare
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #599      +/-   ##
==========================================
+ Coverage   81.62%   81.68%   +0.06%
==========================================
  Files          69       69
  Lines        7103     7089      -14
==========================================
- Hits         5798     5791       -7
+ Misses       1305     1298       -7
I can confirm that this indeed fixes the exception raised in #543, and the PR looks good to me. However, I'm not fully confident that I know enough about datalinks to catch any nuances.
My only comment is whether the new methods could be private, so that they do not show up so prominently in the user API.
👍 on the testing utilities.
pyvo/dal/adhoc.py (outdated)
@@ -169,6 +169,109 @@ class DatalinkResultsMixin(AdhocServiceResultsMixin):
    """
    Mixin for datalink functionallity for results classes.
    """
    def iter_datalinks_from_dlblock(self, datalink_service):
I would think both of these new methods could be prefixed with `_`, unless you envision them being used by end users.
Agree. While I see the merits of invoking only this method (the other one seems more adhoc - hence the best efforts that are needed here), for the regular user it's just an implementation detail and it shouldn't crop up into the API.
@andamian - could you please have a quick look at this?
This looks good to me. I'm not sure about the usefulness of utype and ucd heuristics since the field names are defined in `ObsCore` but I guess it leaves the door open for future data formats to use different names.
So far, it contains helper methods to halfway conveniently create VOTables and DALResults.
To do that, we split the original iter_datalinks into two cases: one where the datalinks come from a table via a datalink meta RESOURCE, and the other where we believe there are datalink products in the table. In the second case (the one that was broken before), we try the name access_format as attribute and column name, and then a utype and two UCDs (generic and SIA1) each. We could add SSAP UCDs, but I'll only do that if someone asks for it.
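As a rough illustration of the lookup order described above (name first, then utype, then UCDs), here is a minimal sketch. It is not the code in this PR: `Field`, `find_field` and `find_format_field` are hypothetical names, and the identifier lists only use values mentioned in this thread plus the SIA1 `VOX:Image_Format` UCD; the exact lists pyVO tries are an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Field:
    """Hypothetical stand-in for a VOTable FIELD; not a pyVO class."""
    name: str
    utype: Optional[str] = None
    ucd: Optional[str] = None


def find_field(fields: Sequence[Field], name: str,
               utypes: Sequence[str] = (),
               ucds: Sequence[str] = ()) -> Optional[Field]:
    """Return the first field matching by name, then by utype, then by UCD."""
    # 1. The conventional column name wins if it is present.
    for field in fields:
        if field.name == name:
            return field
    # 2. Fall back to utypes, compared case-insensitively.
    for utype in utypes:
        for field in fields:
            if field.utype and field.utype.lower() == utype.lower():
                return field
    # 3. Finally try UCDs (exact match in this sketch).
    for ucd in ucds:
        for field in fields:
            if field.ucd == ucd:
                return field
    return None


def find_format_field(fields: Sequence[Field]) -> Optional[Field]:
    """Guess which field carries the media type of the product."""
    return find_field(
        fields, "access_format",
        utypes=("obscore:access.format",),          # assumed list
        ucds=("meta.code.mime", "VOX:Image_Format"))  # generic, SIA1


sia1_fields = [Field("url", ucd="VOX:Image_AccessReference"),
               Field("fmt", ucd="VOX:Image_Format")]
print(find_format_field(sia1_fields).name)
```

The ordering matters: a table that has a column literally called access_format is never second-guessed by the utype or UCD fallbacks.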
3218a09
to
cbe2490
Compare
This also now compares utypes case-insensitively in getbyutype as required by the SSAP standard. All this is ugly VO legacy, but I am rather confident that one of these days we will be glad we have these heuristics in.
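To make the case-insensitive comparison concrete, here is a small sketch; it is not pyVO's getbyutype. `get_by_utype` is a hypothetical helper working on plain (name, utype) pairs, and the SSA utypes are only illustrative values.

```python
def get_by_utype(fields, utype):
    """Return the name of the first field whose utype matches, ignoring case."""
    wanted = utype.lower()
    for name, field_utype in fields:
        # Fields without a utype can never match.
        if field_utype is not None and field_utype.lower() == wanted:
            return name
    return None


# Illustrative (name, utype) pairs as an SSAP response might declare them.
ssa_fields = [("acref", "ssa:Access.Reference"),
              ("mime", "ssa:Access.Format"),
              ("note", None)]

# A case-sensitive comparison would miss this lookup.
print(get_by_utype(ssa_fields, "ssa:access.format"))
```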
On Tue, Sep 24, 2024 at 03:11:32PM -0700, Adrian wrote:
> Agree. While I see the merits of invoking only this method (the
> other one seems more adhoc - hence the best efforts that are needed
> here), for the regular user it's just an implementation detail and
> it shouldn't crop up into the API.
Jup -- the methods are now marked private, too; I certainly did not
mean them to be user-visible API.
On Tue, Sep 24, 2024 at 03:29:28PM -0700, Adrian wrote:
> This looks good to me. I'm not sure about the usefulness of utype
> and ucd heuristics since the field names are defined in `ObsCore`
> but I guess it leaves the door open for future data formats to use
> different names.
This is mainly to build a basis to support SSAP and SIAv1.
Except I forgot to actually add these, which I did in another commit.
That, in turn, uncovered that getbyutype has compared utypes
case-sensitively, which at least SSAP insists is wrong (Sigh!). So,
I've fixed that on top.
Could you briefly re-review? Thanks!
Aha. It makes sense now. However I wasn't able to track down all those UTYPEs and UCDs. It looks like SIAv1 refers to UCDs (the VOX... ones). Where do "obscore:access.format" and "obscore:access.reference" come from? I also couldn't find any reference of "meta.code.mime" but maybe they are specified in other documents. But it looks good otherwise.
On Wed, Sep 25, 2024 at 10:22:23AM -0700, Adrian wrote:
> Aha. It makes sense now. However I wasn't able to track down all
> those UTYPEs and UCDs. It looks like SIAv1 refers to UCDs (the
> VOX... ones). Where do "obscore:access.format" and
> "obscore:access.reference" come from? I also couldn't find any
> reference of "meta.code.mime" but maybe they are specified in other
> documents. But it looks good otherwise.
The saga about "data model identifiers" in the VO is a bad mess. The
SIA1 "magic" UCDs were a first experiment nobody was happy with,
which engendered the utypes which only made everything worse because
nobody bothered to properly specify how to make and interpret them
(not to mention what they actually reference and the horrible
confusion about their "prefixes" (the stuff in front of the colon)).
Well: It's legacy we have to live with; *perhaps* MIVOT will lead a
way to a future with less ad-hoc guessing.
The obscore utypes, anyway, are in the Obscore spec,
http://ivoa.net/documents/ObsCore/, and the SSA ones are in the
SSAP document, http://ivoa.net/documents/SSA/. And while, as you
say, the utypes *in obscore* are immaterial (as the column names are
fixed), in SSA it is actually them that identify the fields.
Accepting the obscore utypes in a non-obscore context is, I think,
harmless, while it may occasionally help some data publisher out of a
fix, and perhaps may even be useful on results of ad-hoc TAP queries
("SELECT access_url as cube_uri...").
If you're fine with the PR, would you merge it? Thanks!
Thank you for the contribution @msdemlei
Up front, for the record: there is a possible API change here. iter_datalink would have returned [None] rather than [] when it did not find datalink-returnable things. But this seems (a) insane and (b) it couldn't happen because pyVO would crash before it did this.
Having said that, this is a (possible) fix for bug #543. The code in (2) in the bug now runs but yields no datalinks; that is the correct behaviour in this case because the media type is not datalink's.
In addition, there are quite a few new heuristics in here to recognise possible pairs of product URLs and media types. I'm afraid these kinds of heuristics are the best we can do.
Also in here are testing utilities, which are currently undocumented; I'd first like to see whether they are actually useful. Their purpose is to make it as simple as possible to produce DALResults objects for tests, which should allow us to reduce the amount of test data in our repo.
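To give an idea of what generating test data in code rather than storing it in the repo could look like, here is a minimal standard-library sketch. It is not the testing module added in this PR: `make_votable` is a hypothetical helper, and the field names and URL are made up for the example.

```python
import xml.etree.ElementTree as ET

VOTABLE_NS = "http://www.ivoa.net/xml/VOTable/v1.3"


def make_votable(fields, rows):
    """Serialise a minimal VOTable document.

    fields: list of dicts holding FIELD attributes (all values strings).
    rows: list of tuples, one value per field.
    """
    votable = ET.Element("VOTABLE", {"version": "1.3", "xmlns": VOTABLE_NS})
    resource = ET.SubElement(votable, "RESOURCE", {"type": "results"})
    table = ET.SubElement(resource, "TABLE")
    for attrs in fields:
        ET.SubElement(table, "FIELD", attrs)
    data = ET.SubElement(table, "DATA")
    tabledata = ET.SubElement(data, "TABLEDATA")
    for row in rows:
        tr = ET.SubElement(tabledata, "TR")
        for value in row:
            ET.SubElement(tr, "TD").text = str(value)
    return ET.tostring(votable, encoding="unicode")


# A two-column table pairing a product URL with its media type.
doc = make_votable(
    [{"name": "access_url", "datatype": "char", "arraysize": "*"},
     {"name": "access_format", "datatype": "char", "arraysize": "*"}],
    [("http://example.org/a.fits", "image/fits")])
print(doc)
```

A document built this way could then be handed to a VOTable parser inside a test, so each test states its own input instead of pointing at a file in the repo.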