BUG: use original schema when appending #318

Merged 5 commits on Apr 29, 2020


ShantanuKumar (Contributor) commented Apr 25, 2020

Fixes #315
Don't overwrite table schema when appending to an existing table

TODO:

  • Adds relevant tests.
  • Lint passes.
  • Add to changelog.

Don't overwrite table schema when appending to an existing table

ShantanuKumar (Contributor, Author) commented Apr 25, 2020

When I lint with black locally, it removes the u prefix from a unicode string in the test file here:
https://github.com/pydata/pandas-gbq/blob/5538469fe147f8b25b89926fdb748a29cfc851dc/tests/system/test_gbq.py#L1106

I had to add it back manually.
But the lint check appears to be failing on CircleCI because black removes the prefix when it runs there (although that doesn't seem to be the case).
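
For context, in Python 3 the u prefix does not change the value or type of a string literal, so adding or removing it only affects formatting checks, not test behavior. A minimal illustration (the sample string is made up, not the one in the test file):

```python
# In Python 3 every str literal is already unicode; the u prefix is
# accepted for backwards compatibility but has no effect.
plain = "Ünïcödé"
prefixed = u"Ünïcödé"

same_value = plain == prefixed
same_type = type(plain) is type(prefixed)
```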

tswast (Collaborator) commented Apr 29, 2020

This is making a few more table GET HTTP requests than necessary. If you don't mind, I'll push a commit or two to this to refactor.

pandas-gbq already gets the table metadata when checking if a table
exists. This refactoring avoids extra calls to get the table metadata
when checking the schema.
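
The idea can be sketched roughly as follows. This is not the pandas-gbq code: `FakeClient` and `fetch_table_or_none` are hypothetical names, and the fake raises `KeyError` purely for illustration. It only shows one metadata lookup serving both the existence check and the schema check.

```python
class FakeClient:
    """Hypothetical stand-in for a BigQuery client; counts metadata lookups."""

    def __init__(self, tables):
        self._tables = tables
        self.get_calls = 0

    def get_table(self, table_id):
        # Each call here represents one table GET HTTP request.
        self.get_calls += 1
        if table_id not in self._tables:
            raise KeyError(table_id)
        return self._tables[table_id]


def fetch_table_or_none(client, table_id):
    """Return the table metadata, or None if the table does not exist."""
    try:
        return client.get_table(table_id)
    except KeyError:
        return None


client = FakeClient(
    {"dataset.events": {"schema": {"fields": [{"name": "a", "type": "FLOAT"}]}}}
)

# One lookup answers both questions: does the table exist, and what is its schema?
table = fetch_table_or_none(client, "dataset.events")
table_exists = table is not None
existing_schema = table["schema"] if table_exists else None
```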

Also fix a bug where update_schema appends columns that aren't in the
dataframe to the schema sent in the API request.
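
A rough sketch of the intended behavior, assuming schemas are plain dicts with a "fields" list. The helper name `schema_for_append` is hypothetical and this is not the library's `update_schema` implementation: the existing table's field definitions win for columns the dataframe has, and table columns missing from the dataframe are not added to the request.

```python
def schema_for_append(dataframe_schema, table_schema):
    """Hypothetical helper: build the schema to send when appending.

    For each dataframe column, reuse the existing table's field definition
    if the table has a field with the same name. Table columns that are
    absent from the dataframe are left out.
    """
    table_fields = {field["name"]: field for field in table_schema["fields"]}
    fields = [
        table_fields.get(field["name"], field)
        for field in dataframe_schema["fields"]
    ]
    return {"fields": fields}


# Schema inferred from the dataframe: column "a" was guessed as INTEGER.
dataframe_schema = {
    "fields": [
        {"name": "a", "type": "INTEGER"},
        {"name": "b", "type": "STRING"},
    ]
}
# Schema of the existing table: "a" is FLOAT, and there is an extra column "c".
table_schema = {
    "fields": [
        {"name": "a", "type": "FLOAT"},
        {"name": "b", "type": "STRING"},
        {"name": "c", "type": "TIMESTAMP"},
    ]
}

append_schema = schema_for_append(dataframe_schema, table_schema)
```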

tswast (Collaborator) commented Apr 29, 2020

I just pushed d8c11aa with the suggested refactoring

@tswast tswast merged commit 3114e24 into googleapis:master Apr 29, 2020
@ShantanuKumar ShantanuKumar deleted the use_existing_table_schema branch April 30, 2020 19:11
Successfully merging this pull request may close these issues.

Pandas should get the schema from bigquery if pushing to a table that already exists