MatteoDiPaolo / googleTakeoutLocations-to-geoJson

Node stream based solution to translate Google Takeout Locations History json to GeoJson

Tutorial

Youtube Preview

Description

This Node.js script translates a Google Takeout location history file into GeoJson.

  • What is GeoJson? --> read here
  • What is Google location history? --> read here
  • Where can I get my Google account location history? --> look here

Google location history archive

What follows is the content structure of a Google location history archive:

  • .zip archive
    • .html info file
    • locations history folder
      • .json locations history file

Google location history json file

A basic example of an input file

{
  "locations": [
    {
      "timestampMs": "1507330772000",
      "latitudeE7": 419058658,
      "longitudeE7": 125218684,
      "accuracy": 16,
      "velocity": 0,
      "altitude": 66,
      "activity": [
        {
          "timestampMs": "1507049587082",
          "activity": [
            {
              "type": "TILTING",
              "confidence": 100
            }
          ]
        },
        {
          "timestampMs": "1507049736368",
          "activity": [
            {
              "type": "IN_VEHICLE",
              "confidence": 33
            },
            {
              "type": "STILL",
              "confidence": 33
            },
            {
              "type": "UNKNOWN",
              "confidence": 17
            },
            {
              "type": "ON_FOOT",
              "confidence": 12
            },
            {
              "type": "WALKING",
              "confidence": 12
            },
            {
              "type": "ON_BICYCLE",
              "confidence": 6
            }
          ]
        }
      ]
    }
  ]
}

Google location history in GeoJson format

A basic example of an output file

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          12.5218684,
          41.9058658
        ]
      },
      "properties": {
        "timestamp": "2017-10-06T22:59:32.000Z",
        "accuracy": 16,
        "velocity": null,
        "altitude": 66
      }
    }
  ]
}
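
The mapping between the two examples above can be sketched as a single function. This is a minimal illustration, not the script's actual code: the name `toGeoJsonFeature` is hypothetical, and the `|| null` fallback is an assumption chosen only because it reproduces the sample output, where a velocity of 0 becomes null.

```javascript
// Hypothetical helper: converts one Takeout location into a GeoJSON Feature.
const toGeoJsonFeature = (location) => ({
  type: 'Feature',
  geometry: {
    type: 'Point',
    // E7 values are degrees multiplied by 10^7; GeoJSON order is [longitude, latitude].
    coordinates: [location.longitudeE7 / 1e7, location.latitudeE7 / 1e7],
  },
  properties: {
    // timestampMs is a string holding milliseconds since the Unix epoch.
    timestamp: new Date(Number(location.timestampMs)).toISOString(),
    // Assumption: falsy values (including 0) are mapped to null, as in the sample output.
    accuracy: location.accuracy || null,
    velocity: location.velocity || null,
    altitude: location.altitude || null,
  },
});

const sampleLocation = {
  timestampMs: '1507330772000',
  latitudeE7: 419058658,
  longitudeE7: 125218684,
  accuracy: 16,
  velocity: 0,
  altitude: 66,
};

console.log(JSON.stringify(toGeoJsonFeature(sampleLocation), null, 2));
```

The nested activity array of the input is not carried over here because the sample output does not contain it.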

Stream based approach

Although the transformation itself is straightforward, a simple array map is not enough: the input file can be too large to load and parse in one piece.

Error arising without using streams

  • File: 2.json
  • Size: 1.16 GB
  • Error message: Cannot create a string longer than 0x3fffffe7 characters
  • Error code: ERR_STRING_TOO_LONG

Node.js cannot buffer the whole file because:

  • the file content is longer than the maximum string length Node.js can create.
  • the file is larger than what Node.js is able to hold in memory at once.
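
The string limit mentioned in the error can be inspected at runtime. The following is a minimal sketch: `fitsInSingleString` is a hypothetical helper, and the byte count used in the example is an approximation of the 1.16 GB file described above.

```javascript
const { constants } = require('buffer');

// Hypothetical helper: true when a file of `sizeInBytes` is guaranteed to fit in one string.
// A UTF-8 file never decodes to more UTF-16 code units than it has bytes,
// so a byte size within the limit is always safe, while a larger one may fail.
const fitsInSingleString = (sizeInBytes) => sizeInBytes <= constants.MAX_STRING_LENGTH;

// The limit depends on the Node.js version and platform.
console.log(constants.MAX_STRING_LENGTH);

// Approximate size of the 1.16 GB input file (an assumption, not the exact byte count).
console.log(fitsInSingleString(1.16e9));
```

A check like this could be used to decide up front whether a file can be read with a single call or has to be streamed.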

Stream solution

Defined in the streamProcessing file.

  1. [Read] --- fileToStream --> Turns the input file into a stream.
  2. [Transform] --- streamParser --> Consumes text and produces a stream of data items corresponding to high-level tokens.
  3. [Transform] --- streamPicker --> A token filter: it selects objects from the stream, ignores the rest, and produces a stream of objects (locations in our case).
  4. [Transform] --- streamArrayer --> Assumes that the input token stream represents an array of objects and streams out assembled JavaScript objects (locations in our case).
  5. [Transform] --- streamGeoJsoner --> Transforms Google Takeout locations into GeoJson locations.
  6. [Transform] --- streamStringer --> Stringifies GeoJson locations.
  7. [Write] --- streamToFile --> Writes the stream to the output file.

README_1.png
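
The last stages of the chain above can be sketched with Node's built-in stream module alone. This is an illustration under stated assumptions, not the content of the streamProcessing file: stages 1 to 4 are replaced by an in-memory source, stage 7 by an in-memory sink, and the names `createGeoJsoner`, `createStringer` and `runSketch` are hypothetical.

```javascript
const { Readable, Transform, Writable, pipeline } = require('stream');

// Stage 5 (sketch): object-mode Transform, one Takeout location in, one GeoJSON Feature out.
const createGeoJsoner = () => new Transform({
  objectMode: true,
  transform(location, _encoding, callback) {
    callback(null, {
      type: 'Feature',
      geometry: {
        type: 'Point',
        coordinates: [location.longitudeE7 / 1e7, location.latitudeE7 / 1e7],
      },
      properties: { timestamp: new Date(Number(location.timestampMs)).toISOString() },
    });
  },
});

// Stage 6 (sketch): objects in, text out, framed as a single FeatureCollection.
const createStringer = () => {
  let isFirst = true;
  return new Transform({
    writableObjectMode: true,
    transform(feature, _encoding, callback) {
      const prefix = isFirst ? '{"type":"FeatureCollection","features":[' : ',';
      isFirst = false;
      callback(null, prefix + JSON.stringify(feature));
    },
    flush(callback) {
      // Close the collection, or emit an empty one when no feature arrived.
      callback(null, isFirst ? '{"type":"FeatureCollection","features":[]}' : ']}');
    },
  });
};

// Stand-ins for stages 1-4 and 7: an in-memory source and an in-memory sink.
const runSketch = (locations) => new Promise((resolve, reject) => {
  const chunks = [];
  const sink = new Writable({
    write(chunk, _encoding, callback) {
      chunks.push(chunk);
      callback();
    },
  });
  pipeline(Readable.from(locations), createGeoJsoner(), createStringer(), sink, (error) => {
    if (error) reject(error);
    else resolve(Buffer.concat(chunks).toString('utf8'));
  });
});

runSketch([{ timestampMs: '1507330772000', latitudeE7: 419058658, longitudeE7: 125218684 }])
  .then((geoJsonText) => console.log(geoJsonText));
```

Because every stage handles one location at a time, memory use stays roughly constant regardless of the input size, which is the point of the stream based approach.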

Prerequisites

  • node v13.7.0

Install

  • npm i

Test

  • npm run test

Run

  • Copy one or more Google location history json files inside the ./input folder.
  • npm run start [-- fromTimestampMs toTimestampMs]
  • Access the GeoJson files results of the translation inside the ./output folder.

Params

You can optionally set a time range window in order to filter locations out of the output.
Timestamp parameters must be defined as milliseconds epoch timestamps.

  • fromTimestampMs: lower bound timestamp.
  • toTimestampMs: upper bound timestamp.

Note that you can also set only one of the two bounds, either the lower or the upper one. To set only the upper bound, pass a placeholder in place of the lower bound (x in example 4 below).
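
How such optional bounds could be applied is sketched below. The helper names `parseBound` and `createRangeFilter` are hypothetical, and both the inclusive comparison and the "non-numeric means no bound" rule are assumptions, not confirmed behavior of the script.

```javascript
// Hypothetical helper: turns a CLI argument into a numeric bound,
// or null when the argument is missing or not a number.
const parseBound = (rawValue) => {
  if (rawValue === undefined || rawValue.trim() === '') return null;
  const parsed = Number(rawValue);
  return Number.isFinite(parsed) ? parsed : null;
};

// Hypothetical predicate factory: a null bound means "unbounded" on that side.
// Assumption: both bounds are inclusive.
const createRangeFilter = (fromTimestampMs, toTimestampMs) => (location) => {
  const timestamp = Number(location.timestampMs);
  if (fromTimestampMs !== null && timestamp < fromTimestampMs) return false;
  if (toTimestampMs !== null && timestamp > toTimestampMs) return false;
  return true;
};

// In a real script the two values would come from process.argv.slice(2).
const [rawFrom, rawTo] = ['x', '1577836799000'];
const isInRange = createRangeFilter(parseBound(rawFrom), parseBound(rawTo));

console.log(isInRange({ timestampMs: '1507330772000' })); // true: before the upper bound
console.log(isInRange({ timestampMs: '1577836800000' })); // false: after the upper bound
```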

Examples

  1. Translate files without applying time filtering:
    • npm run start
  2. Keep only locations with a timestamp between Tuesday, 1 January 2019 00:00:00 UTC and Tuesday, 31 December 2019 23:59:59 UTC:
    • npm run start -- 1546300800000 1577836799000
  3. Keep only locations with a timestamp after Tuesday, 1 January 2019 00:00:00 UTC:
    • npm run start -- 1546300800000
  4. Keep only locations with a timestamp before Tuesday, 31 December 2019 23:59:59 UTC:
    • npm run start -- x 1577836799000
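
The millisecond values used in these examples can be derived with the standard `Date.UTC` function. The variable names below are only illustrative, and the dates are interpreted as UTC, which is what the example values correspond to.

```javascript
// Date.UTC takes a zero-based month and returns milliseconds since the Unix epoch.
const fromTimestampMs = Date.UTC(2019, 0, 1, 0, 0, 0); // 1546300800000
const toTimestampMs = Date.UTC(2019, 11, 31, 23, 59, 59); // 1577836799000

// Prints the command line of example 2.
console.log(`npm run start -- ${fromTimestampMs} ${toTimestampMs}`);
```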

Execution example output

Input folder content

  • 1_locations.json
  • 4_locations.json
  • x_empty.json --> this is going to fail!

Process logs

README_2.png

More cool logs

Command: npm run start over 5 input files.

Please note the journey of the 2.json file:

  • 1167.11297 MB --> 1.16 GB
  • 3688763 --> almost 4 million locations processed
  • 551.443 secs --> just over 9 minutes of processing

README_3.png

Command: npm run start -- 1546300800000 1577836799000 over 5 input files.

Please note how the number of GeoJson locations in the output decreases.

README_4.png

Possible bug
