docs: fix reading dtypes #598

Merged
merged 4 commits into from
May 10, 2023


yokomotod
Contributor

Hello. I ran into the same confusion as #579, so I have tried to update the docs.

Not only the BQ DATE type but also the TIME, TIMESTAMP, and FLOAT64 types seem to be documented incorrectly.

It seems to be due to a breaking change in google-cloud-bigquery v3.
https://cloud.google.com/python/docs/reference/bigquery/latest/upgrading#changes-to-data-types-loading-a-pandas-dataframe

I have confirmed the correct dtypes as:

>>> import pandas

>>> sql1 = """
SELECT
  TRUE AS BOOL,
  123 AS INT64,
  123.456 AS FLOAT64,

  TIME '12:30:00.45' AS TIME,
  DATE "2023-01-01" AS DATE,
  DATETIME "2023-01-01 12:30:00.45" AS DATETIME,
  TIMESTAMP "2023-01-01 12:30:00.45" AS TIMESTAMP
"""

>>> pandas.read_gbq(sql1).dtypes
BOOL                        boolean
INT64                         Int64
FLOAT64                     float64
TIME                         dbtime
DATE                         dbdate
DATETIME             datetime64[ns]
TIMESTAMP       datetime64[ns, UTC]
dtype: object

>>> sql2 = """
SELECT
  DATE "2023-01-01" AS DATE,
  DATETIME "2023-01-01 12:30:00.45" AS DATETIME,
  TIMESTAMP "2023-01-01 12:30:00.45" AS TIMESTAMP,
UNION ALL
SELECT
  DATE "2263-04-12" AS DATE,
  DATETIME "2263-04-12 12:30:00.45" AS DATETIME,
  TIMESTAMP "2263-04-12 12:30:00.45" AS TIMESTAMP
"""

>>> pandas.read_gbq(sql2).dtypes
DATE         object
DATETIME     object
TIMESTAMP    object
dtype: object
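
For context on the second query: my understanding is that the columns fall back to object because 2263-04-12 lies just past the end of the range that datetime64[ns] can represent (a signed 64-bit count of nanoseconds since the Unix epoch). The sketch below checks this with the standard library only. The helper names `to_epoch_ns` and `fits_datetime64_ns` are hypothetical and are not part of pandas or pandas-gbq.

```python
from datetime import datetime, timedelta

# datetime64[ns] stores a signed 64-bit count of nanoseconds since the Unix epoch.
# The most negative int64 value is reserved for NaT, so the usable range is symmetric.
_EPOCH = datetime(1970, 1, 1)
_MAX_NS = 2**63 - 1
_MIN_NS = -(2**63) + 1


def to_epoch_ns(value: datetime) -> int:
    """Return nanoseconds since the epoch for a naive datetime (exact integer math)."""
    delta = value - _EPOCH
    return (delta.days * 86_400 + delta.seconds) * 10**9 + delta.microseconds * 1_000


def fits_datetime64_ns(value: datetime) -> bool:
    """Hypothetical helper: True if the value is representable as datetime64[ns]."""
    return _MIN_NS <= to_epoch_ns(value) <= _MAX_NS


# Last representable instant, truncated to microseconds (datetime's resolution).
last_representable = _EPOCH + timedelta(microseconds=_MAX_NS // 1_000)

print(last_representable)  # 2262-04-11 23:47:16.854775
print(fits_datetime64_ns(datetime(2023, 1, 1, 12, 30, 0, 450_000)))   # True
print(fits_datetime64_ns(datetime(2263, 4, 12, 12, 30, 0, 450_000)))  # False
```

So the first query's values fit in datetime64[ns], while the 2263-04-12 row in the second query does not, which is consistent with the object dtypes shown above.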

Fixes #579 🦕

@yokomotod yokomotod requested a review from a team as a code owner December 28, 2022 06:17
@yokomotod yokomotod requested review from a team and Neenu1995 December 28, 2022 06:17
@google-cla

google-cla bot commented Dec 28, 2022

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@product-auto-label product-auto-label bot added size: s Pull request size is small. api: bigquery Issues related to the googleapis/python-bigquery-pandas API. labels Dec 28, 2022
@parthea parthea added kokoro:run Add this label to force Kokoro to re-run the tests. owlbot:run Add this label to trigger the Owlbot post processor. labels Jan 4, 2023
@gcf-owl-bot gcf-owl-bot bot removed the owlbot:run Add this label to trigger the Owlbot post processor. label Jan 4, 2023
@yoshi-kokoro yoshi-kokoro removed the kokoro:run Add this label to force Kokoro to re-run the tests. label Jan 4, 2023
@yokomotod
Contributor Author

Some tests failed with

google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started

Could someone help me resolve this so the PR can be reviewed and merged?
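
For anyone reproducing the failure locally, a small preflight check along these lines may help narrow it down. It is only a sketch: `credentials_file_from_env` is a hypothetical helper, and it looks only at the GOOGLE_APPLICATION_CREDENTIALS variable named in the error message, not at any other place the auth library may search for credentials.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def credentials_file_from_env(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Path]:
    """Hypothetical helper: return the key file named by
    GOOGLE_APPLICATION_CREDENTIALS, or None if the variable is unset,
    empty, or does not point at an existing file."""
    env = os.environ if env is None else env
    raw = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not raw:
        return None
    path = Path(raw).expanduser()
    return path if path.is_file() else None


if __name__ == "__main__":
    found = credentials_file_from_env()
    if found is None:
        print("GOOGLE_APPLICATION_CREDENTIALS is unset or does not point at a file")
    else:
        print(f"Using credentials file: {found}")
```

A result of None here would be consistent with the DefaultCredentialsError above, although a non-None result does not guarantee that the file holds valid credentials.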

@tswast tswast requested a review from a team as a code owner May 10, 2023 14:11
@tswast
Collaborator

tswast commented May 10, 2023

Thanks for the contribution!

As an aside, in the future I'd like pandas-gbq to align its dtype handling with the newer dtype_backend parameter (#621), which is available in other pandas I/O methods such as read_csv.

@tswast tswast added kokoro:force-run Add this label to force Kokoro to re-run the tests. owlbot:run Add this label to trigger the Owlbot post processor. labels May 10, 2023
@gcf-owl-bot gcf-owl-bot bot removed the owlbot:run Add this label to trigger the Owlbot post processor. label May 10, 2023
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 10, 2023
@tswast tswast merged commit b45651d into googleapis:main May 10, 2023
Labels
api: bigquery Issues related to the googleapis/python-bigquery-pandas API. size: s Pull request size is small.
Development

Successfully merging this pull request may close these issues.

DATE converted to dbdate instead of datetime64[ns] dtype
5 participants