Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Adapt to breaking change in google-cloud-bigquery 0.32.0.dev1 #152

Merged
merged 8 commits into from
Apr 3, 2018

Conversation

tswast
Copy link
Collaborator

@tswast tswast commented Mar 22, 2018

There was a breaking change in 0.32.0 which changed the way
configuration for the query job gets loaded. Also, it added the
'description' field to the schema resource, so this change updates the
schema comparison logic to account for that.

Note: I've created this PR based on the code currently in master in the google-cloud-bigquery repo. The tests pass locally, but they won't pass anywhere else until google-cloud-bigquery 0.32.0 is released. I've added the DO NOT MERGE label to this PR for now.

Since pandas-gbq will be broken when google-cloud-bigquery does its next release, I think it would make sense to review this before than so it can be merged when everything is ready. Once it goes in, I can create a pandas-gbq release to minimize the breakage time.

@tswast tswast added the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Mar 22, 2018
@tswast tswast self-assigned this Mar 22, 2018
@tswast tswast requested review from parthea and max-sixty March 22, 2018 23:06
del field['mode']
fields_remote = self._clean_schema_fields(
self.schema(dataset_id, table_id))
fields_local = self._clean_schema_fields(schema['fields'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

much nicer

@max-sixty
Copy link
Contributor

I think we should either make this backwards compatible or check the version on import and raise. Otherwise people are going to get very obscure error messages if their imports aren't correct (which is very possible with pinned installs)

@tswast
Copy link
Collaborator Author

tswast commented Mar 23, 2018

Agreed. I had a hard time figuring out how to make this one backwards compatible since old versions would accept the config but then the API call to BigQuery fails later on.

I've updated the version number in _check_google_client_version(), so folks should get an understandable error message.

@max-sixty
Copy link
Contributor

Though IIUC it's only backward incompatible for those passing in config dicts, so a very small share of users. Not worth a big lift, then

@max-sixty
Copy link
Contributor

I've updated the version number in _check_google_client_version(), so folks should get an understandable error message.

Ah great! (some crossed wires with my last msg)

"inside config while it is specified "
"as parameter")
query = config['query']['query']
del config['query']['query']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no worth changing, but these two could be query = config['query'].pop('query')

@max-sixty
Copy link
Contributor

We could update the final build to also use master from google.cloud.bigquery as well as pandas - would be generally helpful as well as for this

@tswast
Copy link
Collaborator Author

tswast commented Mar 23, 2018

@maxim-lian Good idea. I've made the "MASTER" build use the version of google-cloud-python hosted on GitHub. 5f427a0

I'll keep an eye on the MASTER build on Travis to see if it works. https://travis-ci.org/tswast/pandas-gbq/builds/357586225 https://travis-ci.org/tswast/pandas-gbq/builds/357588439 https://travis-ci.org/tswast/pandas-gbq/builds/357606304

.travis.yml Outdated
@@ -28,6 +28,7 @@ install:
conda install -q numpy pytz python-dateutil;
PRE_WHEELS="https://7933911d6844c6c53a7d-47bd50c35cd79bd838daf386af554a83.ssl.cf2.rackcdn.com";
pip install --pre --upgrade --timeout=60 -f $PRE_WHEELS pandas;
pip install -e 'git+https://github.com/GoogleCloudPlatform/google-cloud-python.git#egg=version_subpkg&subdirectory=bigquery';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've struggled with these from time to time. I think the egg=version_subpkg is like foo and can be excluded. But if this works then great

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably. I don't really want to mess with it, though. I got it from this answer on installing with pip from a git subdirectory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works!

@tswast
Copy link
Collaborator Author

tswast commented Mar 23, 2018

Finally got a green build on MASTER. https://travis-ci.org/tswast/pandas-gbq/jobs/357606308 Got failures on the other builds as expected, since google-cloud-bigquery hasn't released 0.32.0 yet.

tswast added 4 commits April 3, 2018 10:03
There was a breaking change in 0.32.0.dev1 which changed the way
configuration for the query job gets loaded. Also, it added the
'description' field to the schema resource, so this change updates the
schema comparison logic to account for that.

Updates the MASTER build in CI to also build with google-cloud-bigquery
at MASTER.
@codecov-io
Copy link

codecov-io commented Apr 3, 2018

Codecov Report

Merging #152 into master will decrease coverage by 42.49%.
The diff coverage is 64.7%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #152      +/-   ##
==========================================
- Coverage   78.87%   36.37%   -42.5%     
==========================================
  Files           8       10       +2     
  Lines        1529     1545      +16     
==========================================
- Hits         1206      562     -644     
- Misses        323      983     +660
Impacted Files Coverage Δ
pandas_gbq/tests/test_gbq.py 28.08% <0%> (-63.21%) ⬇️
pandas_gbq/tests/test__query.py 100% <100%> (ø)
pandas_gbq/gbq.py 31.97% <26.31%> (-44.96%) ⬇️
pandas_gbq/_query.py 84.61% <84.61%> (ø)
pandas_gbq/_load.py 62.5% <0%> (-32.5%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9666965...c636448. Read the comment docs.

@tswast
Copy link
Collaborator Author

tswast commented Apr 3, 2018

@maxim-lian I figured out how to make it backwards compatible so that we don't have to force upgrade from 0.29.0 versions. Basically, I use the same version detection code we're already doing in the _check_google_client_version() function but in a new function to construct the query configuration.

Please take a look. With this new way of doing it, I think we can merge without waiting to sync with a release of google-cloud-bigquery.

@tswast
Copy link
Collaborator Author

tswast commented Apr 3, 2018

@max-sixty
Copy link
Contributor

Very nice! Good job on the backward-compat.

@max-sixty
Copy link
Contributor

Re _query.py - I haven't seen other libraries that use private module names. It's fine and not worth changing, but generally libraries put the public API in pandas_gbq.__init__, and then there's no need to make the module names private.

@tswast
Copy link
Collaborator Author

tswast commented Apr 3, 2018

generally libraries put the public API in pandas_gbq.init, and then there's no need to make the module names private.

Makes sense. Filed #154 to follow-up with some module renames.

@tswast tswast changed the title BUG: Update pandas-gbq to latest version of google-cloud-bigquery BUG: Adapt to breaking change in google-cloud-bigquery 0.32.0.dev1 Apr 3, 2018
@tswast tswast merged commit 6294f7a into googleapis:master Apr 3, 2018
@tswast tswast deleted the gcb-0.32.0 branch April 3, 2018 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do not merge Indicates a pull request not ready for merge, due to either quality or timing.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants