Skip to content

Latest commit

 

History

History
541 lines (409 loc) · 35.4 KB

README.md

File metadata and controls

541 lines (409 loc) · 35.4 KB

Google Cloud Datastore Client for Java

Java idiomatic client for Cloud Datastore.

Maven Stability

Quickstart

If you are using Maven with BOM, add this to your pom.xml file:

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.52.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-datastore</artifactId>
  </dependency>

If you are using Maven without the BOM, add this to your dependencies:

<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-datastore</artifactId>
  <version>2.25.1</version>
</dependency>

If you are using Gradle 5.x or later, add this to your dependencies:

implementation platform('com.google.cloud:libraries-bom:26.52.0')

implementation 'com.google.cloud:google-cloud-datastore'

If you are using Gradle without BOM, add this to your dependencies:

implementation 'com.google.cloud:google-cloud-datastore:2.25.2'

If you are using SBT, add this to your dependencies:

libraryDependencies += "com.google.cloud" % "google-cloud-datastore" % "2.25.2"

Authentication

See the Authentication section in the base directory's README.

Authorization

The client application making API calls must be granted authorization scopes required for the desired Cloud Datastore APIs, and the authenticated principal must have the IAM role(s) required to access GCP resources using the Cloud Datastore API calls.

Getting Started

Prerequisites

You will need a Google Cloud Platform Console project with the Cloud Datastore API enabled. You will need to enable billing to use Google Cloud Datastore. Follow these instructions to get your project set up. You will also need to set up the local development environment by installing the Google Cloud Command Line Interface and running the following commands in command line: gcloud auth login and gcloud config set project [YOUR PROJECT ID].

Installation and setup

You'll need to obtain the google-cloud-datastore library. See the Quickstart section to add google-cloud-datastore as a dependency in your code.

About Cloud Datastore

Cloud Datastore is a fully managed, schemaless database for\nstoring non-relational data. Cloud Datastore automatically scales with\nyour users and supports ACID transactions, high availability of reads and\nwrites, strong consistency for reads and ancestor queries, and eventual\nconsistency for all other queries.

See the Cloud Datastore client library docs to learn how to use this Cloud Datastore Client Library.

See the [Google Cloud Datastore docs][cloud-datastore-activation] for more details on how to activate Cloud Datastore for your project.

See the [Datastore client library docs][datastore-client-lib-docs] to learn how to interact with the Cloud Datastore using this Client Library.

Creating an authorized service object

To make authenticated requests to Google Cloud Datastore, you must create a service object with credentials. You can then make API calls by calling methods on the Datastore service object. The simplest way to authenticate is to use Application Default Credentials. These credentials are automatically inferred from your environment, so you only need the following code to create your service object:

import com.google.cloud.datastore.Datastore;
import com.google.cloud.datastore.DatastoreOptions;

Datastore datastore = DatastoreOptions.getDefaultInstance().getService();

For other authentication options, see the Authentication page.

Storing data

Objects in Datastore are known as entities. Entities are grouped by "kind" and have keys for easy access. In this code snippet, we will create a new entity representing a person and store that data by the person's email. First, add the following imports at the top of your file:

import com.google.cloud.datastore.Entity;
import com.google.cloud.datastore.Key;
import com.google.cloud.datastore.KeyFactory;

Then add the following code to put an entity in Datastore.

KeyFactory keyFactory = datastore.newKeyFactory().setKind("Person");
Key key = keyFactory.newKey("[email protected]");
Entity entity = Entity.newBuilder(key)
    .set("name", "John Doe")
    .set("age", 51)
    .set("favorite_food", "pizza")
    .build();
datastore.put(entity);

Later, if you want to get this entity back, add the following to your code:

Entity johnEntity = datastore.get(key);

Running a query

In addition to retrieving entities by their keys, you can perform queries to retrieve entities by the values of their properties. A typical query includes an entity kind, filters to select entities with matching values, and sort orders to sequence the results. google-cloud-datastore supports two types of queries: StructuredQuery (that allows you to construct query elements) and GqlQuery (which operates using GQL syntax) in string format. In this tutorial, we will use a simple StructuredQuery.

Suppose that you've added more people to Datastore, and now you want to find all people whose favorite food is pizza. Import the following:

import com.google.cloud.datastore.Query;
import com.google.cloud.datastore.QueryResults;
import com.google.cloud.datastore.StructuredQuery;
import com.google.cloud.datastore.StructuredQuery.PropertyFilter;

Then add the following code to your program:

Query<Entity> query = Query.newEntityQueryBuilder()
    .setKind("Person")
    .setFilter(PropertyFilter.eq("favorite_food", "pizza"))
    .build();
QueryResults<Entity> results = datastore.run(query);
while (results.hasNext()) {
  Entity currentEntity = results.next();
  System.out.println(currentEntity.getString("name") + ", you're invited to a pizza party!");
}

Cloud Datastore relies on indexing to run queries. Indexing is turned on by default for most types of properties. To read more about indexing, see the Cloud Datastore Index Configuration documentation.

Updating data

Another thing you'll probably want to do is update your data. The following snippet shows how to update a Datastore entity if it exists.

KeyFactory keyFactory = datastore.newKeyFactory().setKind("keyKind");
Key key = keyFactory.newKey("keyName");
Entity entity = datastore.get(key);
if (entity != null) {
  System.out.println("Updating access_time for " + entity.getString("name"));
  entity = Entity.newBuilder(entity)
      .set("access_time", DateTime.now())
      .build();
  datastore.update(entity);
}

The complete source code can be found at UpdateEntity.java.

Complete source code

In AddEntitiesAndRunQuery.java we put together all the code to store data and run queries into one program. The program assumes that you are running on Compute Engine or from your own desktop. To run the example on App Engine, simply move the code from the main method to your application's servlet class and change the print statements to display on your webpage.

gRPC Java Datastore Client User Guide

In this feature launch, the Java Datastore client now offers gRPC as a transport layer option with experimental support. Using gRPC connection pooling enables distributing RPCs over multiple connections which may improve performance.

Installation Instructions

The client can be built from the grpc-experimental branch on GitHub. For private preview, you can also download the artifact with the instructions provided below.

  1. Download the datastore private preview package with dependencies:
curl -o <path-to-downloaded-jar>  https://datastore-sdk-feature-release.web.app/google-cloud-datastore-2.20.0-grpc-experimental-1-SNAPSHOT-jar-with-dependencies.jar
  1. Run the following commands to install JDK locally:
mvn install:install-file -Dfile=<path-to-downloaded-jar> -DgroupId=com.google.cloud -DartifactId=google-cloud-datastore -Dversion=2.20.0-grpc
  1. Edit your pom.xml to add above package to <dependencies/> section:
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-datastore</artifactId>
<version>2.20.0-grpc-experimental-1-SNAPSHOT</version>
</dependency>

And if you have not yet, add below to <repositories/> section:

<repository>
<id>local-repo</id>
<url>file://${user.home}/.m2/repository</url>
</repository>

How to Use

To opt-in to the gRPC transport behavior, simply add the below line of code (setTransportOptions) to your Datastore client instantiation.

Example:

DatastoreOptions datastoreOptions =
     DatastoreOptions.newBuilder()
             .setProjectId("my-project")
             .setDatabaseId("my-database")
             .setTransportOptions(GrpcTransportOptions.newBuilder().build())
             .build();

Setting the transport options explicitly to GrpcTransportOptions will signal the client to use gRPC instead of HTTP when making calls to the server.

To revert back to the existing stable behavior and transport, simply remove the transport options line or replace it with HttpTransportOptions. Please note this will require an application rebuild and restart. Example:

// will default to existing HTTP transport behavior
DatastoreOptions datastoreOptions = DatastoreOptions.newBuilder()
    .setProjectId("my-project")
    .setDatabaseId("my-database")
    .build();

// will also default to existing HTTP transport behavior
DatastoreOptions datastoreOptions =
            DatastoreOptions.newBuilder()
              .setProjectId("my-project")
              .setDatabaseId("my-database")
              .setTransportOptions(HttpTransportOptions.newBuilder()
              .setConnectTimeout(1000)
              .build()).build();

Note: client instantiations that already use setTransportOptions with HttpTransportOptions will continue to have the same behavior. Only transports that are explicitly set to gRPC will change.

Verify Datastore Transport Options Type

To verify which type of TransportOptions you have successfully configured, you can use the below lines of code to compare transport options type:

// checks if using gRPC transport options
boolean isGRPC = datastore.getOptions().getTransportOptions() instanceof GrpcTransportOptions;

// checks if using HTTP transport options
boolean isHTTP = datastore.getOptions().getTransportOptions() instanceof HTTPTransportOptions;

New Features

There are new gRPC specific features available to use in this update.

Connection Pool

A connection pool, also known as a channel pool, is a cache of database connections that are shared and reused to improve connection latency and performance. With this update, now you will be able to configure the channel pool to improve application performance. This section guides you in determining the optimal connection pool size and configuring it within the Java datastore client. To customize the number of channels your client uses, you can update the channel provider in the DatastoreOptions.

Determine the best connection pool size

The default connection pool size is right for most applications, and in most cases there's no need to change it.

However sometimes you may want to change your connection pool size due to high throughput or buffered requests. Ideally, to leave room for traffic fluctuations, a connection pool has about twice the number of connections it takes for maximum saturation. Because a connection can handle a maximum of 100 concurrent requests, between 10 and 50 outstanding requests per connection is optimal. The limit of 100 concurrent streams per gRPC connection is enforced in Google's middleware layer, and you are not able to reconfigure this number.

The following steps help you calculate the optimal number of connections in your channel pool using estimate per-client QPS and average latency numbers.

To calculate the optimal connections, gather the following information:

  1. The maximum number of queries per second (QPS) per client when your application is running a typical workload.
  2. The average latency (the response time for a single request) in ms.
  3. Determine the number of requests that you can send serially per second by dividing 1,000 by the average latency value.
  4. Divide the QPS in seconds by the number of serial requests per second.
  5. Divide the result by 50 requests per channel to determine the minimum optimal channel pool size. (If your calculation is less than 2, use at least 2 channels anyway, to ensure redundancy.)
  6. Divide the same result by 10 requests per channel to determine the maximum optimal channel pool size.

These steps are expressed in the following equations:

(QPS ÷ (1,000 ÷ latency ms)) ÷ 50 streams = Minimum optimal number of connections
(QPS ÷ (1,000 ÷ latency ms)) ÷ 10 streams = Maximum optimal number of connections
Example

Your application typically sends 50,000 requests per second, and the average latency is 10 ms. Divide 1,000 by 10 ms to determine that you can send 100 requests serially per second. Divide that number into 50,000 to get the parallelism needed to send 50,000 QPS: 500. Each channel can have at most 100 requests out concurrently, and your target channel utilization is between 10 and 50 concurrent streams. Therefore, to calculate the minimum, divide 500 by 50 to get 10. To find the maximum, divide 500 by 10 to get 50. This means that your channel pool size for this example should be between 10 and 50 connections.

It is also important to monitor your traffic after making changes and adjust the number of connections in your pool if necessary.

Set the pool size

The following code sample demonstrates how to configure the channel pool in the client libraries using DatastoreOptions. See ChannelPoolSettings and Performance Best Practices for more information on channel pools and best practices for performance.

Code Example

InstantiatingGrpcChannelProvider channelProvider =
      DatastoreSettings.defaultGrpcTransportProviderBuilder()
              .setChannelPoolSettings(
                     ChannelPoolSettings.builder()
                              .setInitialChannelCount(MIN_VAL)
                              .setMaxChannelCount(MAX_VAL)
                              .build())
              .build();

DatastoreOptions options = DatastoreOptions.newBuilder()
                          .setProjectId("my-project")
                          .setChannelProvider(channelProvider)
                          .setTransportOptions(GrpcTransportOptions.newBuilder().build())
                          .build();

Testing

This library has tools to help write tests for code that uses the Datastore.

On your machine

You can test against a temporary local Datastore by following these steps:

  1. Install Cloud SDK and start the emulator

To determine which host/port the emulator is running on:

$ gcloud beta emulators datastore env-init

# Sample output:
#   export DATASTORE_EMULATOR_HOST=localhost:8759
  1. Point your client to the emulator
DatastoreOptions options = DatastoreOptions.newBuilder()
.setProjectId(DatastoreOptions.getDefaultProjectId())
.setHost(System.getenv("DATASTORE_EMULATOR_HOST"))
.setCredentials(NoCredentials.getInstance())
.setRetrySettings(ServiceOptions.getNoRetrySettings())
.build();
Datastore datastore = options.getService();
  1. Run your tests

Example Applications

  • Bookshelf - An App Engine app that manages a virtual bookshelf.
    • This app uses google-cloud to interface with Cloud Datastore and Cloud Storage. It also uses Cloud SQL, another Google Cloud Platform service.
  • Flexible Environment/Datastore example - A simple app that uses Cloud Datastore to list the last 10 IP addresses that visited your site.
  • SparkDemo - An example of using google-cloud-datastore from within the SparkJava and App Engine Flexible Environment frameworks.
    • Read about how it works on the example's README page.

Samples

Samples are in the samples/ directory.

Sample Source Code Try it
Quickstart Sample source code Open in Cloud Shell
Avg Aggregation On Kind source code Open in Cloud Shell
Avg Aggregation With Limit source code Open in Cloud Shell
Avg Aggregation With Order By source code Open in Cloud Shell
Avg Aggregation With Property Filter source code Open in Cloud Shell
Count Aggregation In Transaction source code Open in Cloud Shell
Count Aggregation On Kind source code Open in Cloud Shell
Count Aggregation With Gql Query source code Open in Cloud Shell
Count Aggregation With Limit source code Open in Cloud Shell
Count Aggregation With Order By source code Open in Cloud Shell
Count Aggregation With Property Filter source code Open in Cloud Shell
Count Aggregation With Stale Read source code Open in Cloud Shell
Multiple Aggregations In Gql Query source code Open in Cloud Shell
Multiple Aggregations In Structured Query source code Open in Cloud Shell
Sum Aggregation On Kind source code Open in Cloud Shell
Sum Aggregation With Limit source code Open in Cloud Shell
Sum Aggregation With Order By source code Open in Cloud Shell
Sum Aggregation With Property Filter source code Open in Cloud Shell
Indexing Consideration Query source code Open in Cloud Shell
Create a union between two filters source code Open in Cloud Shell
Order Fields Query source code Open in Cloud Shell
Query Profile Explain source code Open in Cloud Shell
Query Profile Explain Aggregation source code Open in Cloud Shell
Query Profile Explain Analyze source code Open in Cloud Shell
Query Profile Explain Analyze Aggregation source code Open in Cloud Shell
Task List source code Open in Cloud Shell

Troubleshooting

To get help, follow the instructions in the shared Troubleshooting document.

Transport

Cloud Datastore uses both gRPC and HTTP/JSON for the transport layer.

Supported Java Versions

Java 8 or above is required for using this client.

Google's Java client libraries, Google Cloud Client Libraries and Google Cloud API Libraries, follow the Oracle Java SE support roadmap (see the Oracle Java SE Product Releases section).

For new development

In general, new feature development occurs with support for the lowest Java LTS version covered by Oracle's Premier Support (which typically lasts 5 years from initial General Availability). If the minimum required JVM for a given library is changed, it is accompanied by a semver major release.

Java 11 and (in September 2021) Java 17 are the best choices for new development.

Keeping production systems current

Google tests its client libraries with all current LTS versions covered by Oracle's Extended Support (which typically lasts 8 years from initial General Availability).

Legacy support

Google's client libraries support legacy versions of Java runtimes with long term stable libraries that don't receive feature updates on a best efforts basis as it may not be possible to backport all patches.

Google provides updates on a best efforts basis to apps that continue to use Java 7, though apps might need to upgrade to current versions of the library that supports their JVM.

Where to find specific information

The latest versions and the supported Java versions are identified on the individual GitHub repository github.com/GoogleAPIs/java-SERVICENAME and on google-cloud-java.

Versioning

This library follows Semantic Versioning.

Contributing

Contributions to this library are always welcome and highly encouraged.

See CONTRIBUTING for more information how to get started.

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms. See Code of Conduct for more information.

License

Apache 2.0 - See LICENSE for more information.

CI Status

Java Version Status
Java 8 Kokoro CI
Java 8 OSX Kokoro CI
Java 8 Windows Kokoro CI
Java 11 Kokoro CI

Java is a registered trademark of Oracle and/or its affiliates.