googleapis / nodejs-datastore

Node.js client for Google Cloud Datastore: a highly-scalable NoSQL database for your web and mobile applications.

Home Page: https://cloud.google.com/datastore/

Errors importing lots of entities to DataStore using the emulator

JustinBeckwith opened this issue · comments

From @glenpike on May 23, 2018 10:26

- [x] Search the issues already opened: https://github.com/GoogleCloudPlatform/google-cloud-node/issues
- [x] Search StackOverflow: http://stackoverflow.com/questions/tagged/google-cloud-platform+node.js
- [ ] Check our Troubleshooting guide: https://googlecloudplatform.github.io/google-cloud-node/#/docs/guides/troubleshooting (link returns 404)
- [ ] Check our FAQ: https://googlecloudplatform.github.io/google-cloud-node/#/docs/guides/faq (link returns 404)

If you are still having issues, please be sure to include as much information as possible:

Environment details

  • gcloud SDK: 202.0.0
  • OS: OSX El Capitan (10.11.6) Using about 12GB / 16GB memory
  • Node.js version: v8.11.2
  • npm version: v5.6.0
  • google-cloud-node version:
├─┬ @google-cloud/datastore@1.4.0
│ ├─┬ @google-cloud/common@0.16.2
├─┬ @google-cloud/logging-bunyan@0.5.0
│ └─┬ @google-cloud/logging@1.1.1
│   ├─┬ @google-cloud/common@0.13.6
│   ├─┬ @google-cloud/common-grpc@0.4.3
├─┬ @google-cloud/storage@1.7.0
│ ├─┬ @google-cloud/common@0.17.0

Using DataStore via: gstore-node@4.2.1

Steps to reproduce

Looping through a list of data, creating a model for each item, and then calling a function that saves each one with upsert:

    const { body } = ctx;
    const promises = [];

    // Create an entity for every new model and queue its upsert.
    body.new.forEach((model) => {
        const createdEntity = FromModel.create(model, model.id);
        promises.push(createdEntity.upsert());
    });

    const response = await Promise.all(promises);

Trying to import about 2.5k models, we get a lot of errors that look like they come from grpc. The workaround is to split the data into chunks, e.g. importing a quarter of the data at a time works (see the sketch below).
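
A minimal sketch of that chunking workaround, assuming the same gstore FromModel model and a hypothetical CHUNK_SIZE; batches are awaited one at a time so only a bounded number of upserts are in flight against the emulator:

    const CHUNK_SIZE = 500; // hypothetical batch size, tune to your data

    async function importInChunks(models) {
        const responses = [];
        for (let i = 0; i < models.length; i += CHUNK_SIZE) {
            const chunk = models.slice(i, i + CHUNK_SIZE);
            // Wait for each batch to finish before starting the next one,
            // so the emulator never sees thousands of concurrent requests.
            const saved = await Promise.all(
                chunk.map((model) => FromModel.create(model, model.id).upsert())
            );
            responses.push(...saved);
        }
        return responses;
    }

    const response = await importInChunks(body.new);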

The log of errors looks like this ('...' replaces runs of repeated lines):

10:02:46.826Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
10:02:46.826Z ERROR import: 13 INTERNAL: Half-closed without a request
10:02:46.827Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
...
10:02:46.841Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
10:02:46.841Z ERROR import: 13 INTERNAL: Half-closed without a request
10:02:46.841Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
10:02:46.841Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
10:02:46.841Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
10:02:46.841Z ERROR import: 13 INTERNAL: Half-closed without a request
10:02:46.841Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
...
10:02:46.858Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
10:02:46.858Z ERROR import: 13 INTERNAL: Half-closed without a request
10:02:46.858Z ERROR import: 13 INTERNAL: Half-closed without a request
10:02:46.858Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
10:02:46.858Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
10:02:46.858Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
10:02:46.858Z ERROR import: 13 INTERNAL: Half-closed without a request
10:02:46.859Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
10:02:46.873Z ERROR import: 13 INTERNAL: Half-closed without a request
10:02:46.874Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
...
10:02:46.898Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
10:02:46.898Z ERROR import: 13 INTERNAL: Half-closed without a request
10:02:46.898Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
10:02:46.898Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
10:02:46.899Z ERROR import: 13 INTERNAL: Half-closed without a request
10:02:46.899Z ERROR import: 1 CANCELLED: Received RST_STREAM with error code 8
10:03:43.145Z ERROR import: 4 DEADLINE_EXCEEDED: Deadline Exceeded
10:03:43.145Z ERROR import: 4 DEADLINE_EXCEEDED: Deadline Exceeded

The DEADLINE_EXCEEDED error seems to correspond with this in the emulator:

[datastore] May 23, 2018 11:03:16 AM com.google.cloud.datastore.emulator.impl.LocalDatastoreFileStub$7 run
[datastore] INFO: Time to persist datastore: 198 ms
[datastore] Exception in thread "LocalDatastoreService-1" java.lang.OutOfMemoryError: unable to create new native thread
[datastore] 	at java.lang.Thread.start0(Native Method)
[datastore] 	at java.lang.Thread.start(Thread.java:714)
[datastore] 	at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:950)
[datastore] 	at java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1018)
[datastore] 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
[datastore] 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[datastore] 	at java.lang.Thread.run(Thread.java:745)
[datastore] Exception in thread "LocalDatastoreService-4" java.lang.OutOfMemoryError: unable to create new native thread
[datastore] 	at java.lang.Thread.start0(Native Method)
[datastore] 	at java.lang.Thread.start(Thread.java:714)
[datastore] 	at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:950)
[datastore] 	at java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1018)
[datastore] 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
[datastore] 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[datastore] 	at java.lang.Thread.run(Thread.java:745)

Copied from original issue: googleapis/google-cloud-node#2822

@glenpike does this repro use the Datastore service, or only the emulator? I'm trying to figure out whether this is an issue with the client library or the emulator :)

Hi @JustinBeckwith - this happens with the emulator; on the 'live' system it behaves fine.

This problem is persistent in the emulator.

I have a reasonably small production Datastore instance - barely over 1GB including indexes. I exported a small fraction of that data, just a few of the entity types, constituting about 125MB in the storage bucket. But I have been utterly unable to import that data into the emulator. No matter how much memory I give to the running process, it eventually dies with OOM errors (it only completed when I gave the Docker container I was running it in 8GB of memory). Total size on disk was about 160MB. The runtime memory requirements relative to the total dataset size seem more than a little out of whack.

I'm just using a basic import command via curl, exactly as the documentation suggests (and the documentation never makes a single mention of memory management). We're talking about tens of thousands of entities here, not millions; this ought to be a trivial workload for any database. Are there any workarounds? I'm on a host with 16GB of memory and plenty of disk space.
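
For reference, the emulator import request is roughly this shape; the project ID, port, and export path below are placeholders, and this assumes the emulator is listening on localhost:8081:

    curl -X POST "http://localhost:8081/v1/projects/my-project-id:import" \
        -H 'Content-Type: application/json' \
        -d '{"input_url": "/path/to/export/my-export.overall_export_metadata"}'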

+1 - I encounter this issue when following the recommended documentation (posting via cURL). A single kind with a few thousand records (~2GB) should not choke this up. Any ideas?

@stephenplusplus could I trouble you to take a look?

I encountered the same issue while importing around 1GB of data, so the problem is in the emulator only. The solution is to increase the memory allocated to the JVM. It will eat up your entire CPU, but it will work.

Stop your emulator.

Then pass the following JVM options to the emulator's Java process to increase its memory:

    -Xms512m -Xmx1152m -XX:MaxPermSize=256m -XX:MaxNewSize=256m

-Xms: initial heap size
-Xmx: maximum heap size

Increase these values according to your needs and run your import again. One way to pass them to the emulator is sketched below.
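
A possible way to apply these options, assuming the emulator's Java process honors the standard JAVA_TOOL_OPTIONS environment variable (normal HotSpot behaviour) rather than any emulator-specific flag:

    # Set the heap options in the environment the emulator is started from.
    # Note: on Java 8+ the -XX:MaxPermSize option above is ignored, so only
    # the heap sizes are passed here; tune the values to your dataset.
    export JAVA_TOOL_OPTIONS="-Xms512m -Xmx1152m"
    gcloud beta emulators datastore start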

Hope this works for you...!