Better exception #355

andamian · 2022-09-14T22:02:35Z

Possible solution for #352

codecov · 2022-09-14T22:14:58Z

Codecov Report

Merging #355 (e47c049) into main (907b164) will increase coverage by 0.05%.
The diff coverage is 61.90%.

❗ Current head e47c049 differs from pull request most recent head 2dc0bdf. Consider uploading reports for the commit 2dc0bdf to get more accurate results

@@            Coverage Diff             @@
##             main     #355      +/-   ##
==========================================
+ Coverage   78.36%   78.42%   +0.05%     
==========================================
  Files          46       46              
  Lines        5506     5520      +14     
==========================================
+ Hits         4315     4329      +14     
  Misses       1191     1191

Impacted Files	Coverage Δ
pyvo/dal/adhoc.py	`66.19% <33.33%> (-0.40%)`	⬇️
pyvo/dal/query.py	`84.77% <50.00%> (-0.40%)`	⬇️
pyvo/dal/exceptions.py	`85.00% <77.77%> (+6.62%)`	⬆️
pyvo/dal/tap.py	`70.98% <100.00%> (+0.36%)`	⬆️

andamian · 2022-09-14T22:19:10Z

Is the solution in this draft acceptable @msdemlei?
It turns the trace reported in the issue into:

Traceback (most recent call last):
  File "/Users/adriand/Documents/work/github/pyvo/pyvo/dal/adhoc.py", line 252, in getdataset
    response.raise_for_status()
  File "/Users/adriand/Documents/work/github/pyvo/.tox/py39-test/lib/python3.9/site-packages/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error:  for url: https://almascience.eso.org/dataPortal/member.uid___A001_X14c3_X125d.J1743-0350_ph.spw16.mfs.I.mask.fits.gz

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/adriand/Documents/work/github/pyvo/pyvo/dal/adhoc.py", line 254, in getdataset
    raise DALServiceError.from_except(ex, url)
pyvo.dal.exceptions.DALServiceError: 403 Forbidden: anonymous cannot access member.uid___A001_X14c3_X125d.J1743-0350_ph.spw16.mfs.I.mask.fits.gz for https://almascience.eso.org/dataPortal/member.uid___A001_X14c3_X125d.J1743-0350_ph.spw16.mfs.I.mask.fits.gz

This is just a draft to get an idea of whether I'm on the right track. Note that it raises a different exception type, which affects existing clients that are coded to expect the first one.
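The wrapping shown in the trace above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code in this PR: `TransportError`, `ServiceError`, `from_http_error` and `fetch` are hypothetical names, and the fake response only mimics the `status_code`, `reason` and `text` attributes that an HTTP client response typically exposes.

```python
from types import SimpleNamespace


class TransportError(Exception):
    """Hypothetical stand-in for the HTTP client's error; carries .response."""


class ServiceError(Exception):
    """Hypothetical stand-in for a library-specific service error."""

    def __init__(self, reason, code=None, url=None):
        super().__init__(reason)
        self.code = code
        self.url = url

    @classmethod
    def from_http_error(cls, exc, url=None):
        response = getattr(exc, "response", None)
        if response is None:
            # Nothing came back from the server: keep the original message.
            return cls(str(exc), url=url)
        # Prefer the plain-text body sent by the server over the bare reason.
        detail = (response.text or "").strip() or response.reason
        message = f"{response.status_code} {detail}"
        if url:
            message += f" for {url}"
        return cls(message, code=response.status_code, url=url)


def fetch(url, do_request):
    """Run do_request(url); re-raise transport failures as ServiceError."""
    try:
        return do_request(url)
    except TransportError as exc:
        raise ServiceError.from_http_error(exc, url) from exc


def failing_request(url):
    # Simulated 403 reply with a plain-text explanation in the body.
    error = TransportError("403 Client Error")
    error.response = SimpleNamespace(
        status_code=403,
        reason="Forbidden",
        text="Forbidden: anonymous cannot access file.fits",
    )
    raise error


try:
    fetch("https://example.org/file.fits", failing_request)
except ServiceError as err:
    print(err)
```

Using `raise ... from exc` keeps the original error reachable as `__cause__`, so callers that need the low-level details can still get at them even though the exception type has changed.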

I've also improved TAP error messages.

I couldn't find any service that returns the payload of the error message in VOTable format. The CADC TAP and SIA services do that, but for sync calls only. PyVO hits the async end points even for sync queries (which does not make sense).

bsipocz · 2022-09-14T22:25:11Z

PyVO hits the async end points even for sync queries (which does not make sense).

smells like a bug?

andamian · 2022-09-14T23:12:27Z

It could very much be the case. I don't understand why in this particular case it was leaking the requests exception when in the other cases it would wrap it in its own exception. This fix uses the simple text payload in the exception message.

msdemlei · 2022-09-15T13:04:12Z

On Wed, Sep 14, 2022 at 03:19:21PM -0700, Adrian wrote: Is the solution in this draft acceptable @msdemlei?

It is certainly a step forward. As I said, I haven't properly reviewed what the protocol handlers do when extracting errors from VOTables, DALI or SCS style, how consistent their error raising and message extraction is, and how much repeated code there is in there. But that's for another day. One of the things we ought to also consider is unifying the overflow warnings (where, by the way, I'd add a colon between "Potential causes" and "MAXREC"). So... you have my vote. Thanks!

tomdonaldson

Thanks, @andamian . Other than my one comment and the change log, I'm happy with this change. I suspect the test coverage will be difficult to improve so I'm not worried about that.

tomdonaldson · 2022-09-16T20:38:04Z

pyvo/dal/tap.py

@@ -821,7 +821,7 @@ def raise_if_error(self):
            if theres an error
        """
        if self.phase in {"ERROR", "ABORTED"}:
-            raise DALQueryError("Query Error", self.phase, self.url)
+            raise DALQueryError("Query Error: " + self.job.message, self.phase, self.url)


Two thoughts here.

1. I couldn't convince myself quickly that self.job.message would have a value here. Would it be worth checking for that first?

2. I have a general concern with the use of the job and phase properties here. My real problem is with how those "properties" are defined, but I don't suggest any changes there for this PR. Those "properties" are both written to do an _update() every time they are accessed, and _update() hits the network at the TAP service job URL. So for these two lines, we would do 3 GETs of the same information just to craft an error message. That's not efficient, and if the error is related to communication problems with the service, then this makes it worse.

   One of the reasons I don't like these properties is that there's no good way to be sure it's safe to use the underscore versions (i.e., _job and _phase, which don't call _update()). In this case I think it would be safe to use the underscore versions on all 3 accesses, but it is certain to be safe to leave the self.phase in the if and use self._job... and self._phase in the error call, since the phase check would have called _update().
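To make the cost concrete, here is a small sketch that counts the refreshes triggered by the access pattern under review. The `Job` class is a hypothetical stand-in, not pyvo's job object: its `_update()` only increments a counter where the real one would issue an HTTP GET.

```python
class Job:
    """Hypothetical job whose public properties refresh on every access."""

    def __init__(self):
        self.fetch_count = 0
        self._phase = None
        self._message = None

    def _update(self):
        # Stand-in for an HTTP GET of the job resource.
        self.fetch_count += 1
        self._phase = "ERROR"
        self._message = "table not found"

    @property
    def phase(self):
        self._update()
        return self._phase

    @property
    def message(self):
        self._update()
        return self._message


def describe_error(job):
    """Mirror the pattern in the diff: phase, message, phase again."""
    if job.phase in {"ERROR", "ABORTED"}:
        return f"Query Error: {job.message} ({job.phase})"
    return None


job = Job()
text = describe_error(job)
print(text, "refreshes:", job.fetch_count)
```

Each public property read costs one refresh, so building a single error message ends up asking the server three times for the same document.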

@tomdonaldson for 1:

import pyvo
tap = pyvo.dal.TAPService("https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/youcat")
tap.run_async("select top 2 * from caom2.Observations")

Currently, the error is:

  File "/Users/adriand/Documents/work/github/pyvo/pyvo/dal/tap.py", line 824, in raise_if_error
    raise DALQueryError("Query Error ", self.url)
pyvo.dal.exceptions.DALQueryError: Query Error

With the job error message, it becomes:

pyvo.dal.exceptions.DALQueryError: Query Error: IllegalArgumentException:Table [ caom2.Observations ] is not found in TapSchema. Possible reasons: table does not exist or permission is denied.

which is an improvement.

I think this might address #125 too

Point 2 now has its own issue: #365

msdemlei · 2022-09-19T09:05:56Z

On Fri, Sep 16, 2022 at 01:40:03PM -0700, Tom wrote: properties here. My real problem is with how those "properties" are defined, but I don't suggest any changes there for this PR. Those "properties" are both written to do an `_update()` **every** time they are accessed, and `_update()` hits network at the TAP service job URL. So for these two lines, we would do 3 GETs of the same information just to craft an error message. That's not efficient and if the error is related to communication problems with the service, then this makes it worse.

Uh. I agree with that. I suppose as a general rule, I'd now say a simple property access should never hit the network, as that's expensive and there's so much that can go wrong, whereas a property access in Python looks innocuous. But what's done is done, and we probably shouldn't change the API here, at least not until pyvo 2.

I suppose the quick solution here is to write something like:

    cur_phase = self.phase
    if cur_phase in {"ERROR", "ABORTED"}:
        raise DALQueryError("Query Error: " + self.job.message, cur_phase, self.url)

That's down to two network accesses. Still not pretty given the server may be totally out of whack, but some progress.

Longer-term, I suspect for cases like this it would be nice to have a context manager that would retrieve the job resource once and then return an object exposing the properties as they were at that time. I'm thinking of something like:

    with job.get_from_remote() as frozen:
        if frozen.phase ...

If other people also think that's a good idea, I might try my hand at something like this.
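One possible shape for the context-manager idea is sketched below. Only the name `get_from_remote()` comes from the comment above; `RemoteJob`, `JobSnapshot`, `_fetch()` and the use of `RuntimeError` are assumptions made for illustration, and nothing like this exists in pyvo as far as this thread shows.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JobSnapshot:
    """Immutable view of the job as it was when fetched."""
    phase: str
    message: Optional[str]


class RemoteJob:
    """Hypothetical job; _fetch() stands in for one HTTP GET."""

    def __init__(self):
        self.fetch_count = 0

    def _fetch(self):
        self.fetch_count += 1
        return {"phase": "ERROR", "message": "table not found"}

    @contextmanager
    def get_from_remote(self):
        # Retrieve the job resource once, then expose a frozen copy.
        data = self._fetch()
        yield JobSnapshot(phase=data["phase"], message=data["message"])


def raise_if_error(job):
    with job.get_from_remote() as frozen:
        # Every read below uses the snapshot; no further requests happen.
        if frozen.phase in {"ERROR", "ABORTED"}:
            raise RuntimeError(f"Query Error: {frozen.message}")


job = RemoteJob()
try:
    raise_if_error(job)
except RuntimeError as exc:
    print(exc, "requests:", job.fetch_count)
```

Because the snapshot is a frozen dataclass, code inside the `with` block cannot accidentally mutate it, and all reads are guaranteed to describe the same server state.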

tomdonaldson · 2022-09-19T14:32:03Z

I agree that we can't/shouldn't change the underlying behavior now. Longer-term changes should be made carefully, but I don't yet have a sense for the "best" approach. I think a context manager like you suggest would be helpful, but the scope and benefit would be limited to that `with` block. There may be broader rules we could follow based on the state, but it's not obvious where that behavior would live (e.g., if the PHASE is in some completed state, is there ever a reason to ask the server again?). Another question might be: should the API user ever get the choice of forcing or preventing a refresh?

For this limited case, since _update() updates all the job elements, I think it's safe to use self._job in the error generation. Since it's on self it's not a privacy violation, it's just choosing to use the cached value (like being in the context manager, but less explicitly). Based on where raise_if_error() is called, we could probably get away with using _phase as well, but that's hacky and less safe since we can't guarantee where raise_if_error() will be called in the future.
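The approach described here, one refresh through the public property followed by reads of the cached values, could look roughly like this. `AsyncJob` and `QueryError` are hypothetical stand-ins rather than pyvo classes, and the check for an empty message is an assumption that addresses the first review point about the message possibly having no value.

```python
class QueryError(Exception):
    """Hypothetical stand-in for the library's query error."""


class AsyncJob:
    """Hypothetical job; _update() stands in for one HTTP GET."""

    def __init__(self, phase, message):
        self._remote = {"phase": phase, "message": message}
        self._phase = None
        self._message = None
        self.fetch_count = 0

    def _update(self):
        # Refreshes every cached element in one go.
        self.fetch_count += 1
        self._phase = self._remote["phase"]
        self._message = self._remote["message"]

    @property
    def phase(self):
        self._update()
        return self._phase

    def raise_if_error(self):
        # The public property triggers the only refresh in this method.
        if self.phase in {"ERROR", "ABORTED"}:
            reason = "Query Error"
            if self._message:
                # Append the server's explanation only when there is one.
                reason += ": " + self._message
            raise QueryError(reason, self._phase)


job = AsyncJob("ERROR", "table not found")
try:
    job.raise_if_error()
except QueryError as exc:
    print(exc.args, "refreshes:", job.fetch_count)
```

Reading the cached values is safe in this spot only because the `if` has just called `_update()`; the same shortcut elsewhere would need its own justification.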

msdemlei · 2022-09-20T08:02:44Z

On Mon, Sep 19, 2022 at 07:32:13AM -0700, Tom wrote: For this limited case, since `_update()` updates all the job elements, I think it's safe to use `self._job` in the error generation. Since it's on `self` it's not a privacy violation,

I agree with that. Who's going to change the PR?

andamian · 2022-09-20T22:43:02Z

@bsipocz @tomdonaldson - I think this one is ready.

bsipocz

LGTM, with one remaining tiny question that I didn't dig into enough to answer for myself.

pyvo/dal/exceptions.py

bsipocz · 2022-09-21T13:16:09Z

While this is waiting for another approval, I've rebased it to make sure it passes the remote and doctests.

tomdonaldson

This looks great. Thanks @andamian !

bsipocz · 2022-09-23T00:05:59Z

Thanks @andamian!

andamian marked this pull request as draft September 14, 2022 22:03

andamian added the bug label Sep 14, 2022

tomdonaldson added this to the v1.4 milestone Sep 16, 2022

tomdonaldson requested changes Sep 16, 2022

andamian marked this pull request as ready for review September 20, 2022 22:24

bsipocz approved these changes Sep 20, 2022

Adrian Damian added 3 commits September 21, 2022 06:11

Initial version

4358e82

Fixed small bug

7a15ea9

Changed after review

2dc0bdf

bsipocz force-pushed the better_exception branch from e47c049 to 2dc0bdf Compare September 21, 2022 13:15

tomdonaldson approved these changes Sep 22, 2022

bsipocz merged commit 3191766 into astropy:main Sep 23, 2022
