firebase / geofire-js

GeoFire for JavaScript - Realtime location queries with Firebase

Feature request: Optional custom data

stillgbx opened this issue · comments

My current project is based on Polymer, Firebase, and the Maps API.
I have to show the user the number of records within the area they are viewing, using a clustering technique.
All is fine with GeoFire and thousands of records (it requires some tweaks to the Polymer data binding system).

The user can filter records by type using the on/off switch buttons:
[Screenshot of the filter switches, 2016-09-27]

If GeoFire lets me add custom data, then I can achieve this in a simple way without server side code.

For my use case, and for performance reasons, the added data must be relatively tiny.
For example:

"2016-001": {
  "g" : "u07t4v857n",
  "l" : [ 47.31595982215497, 5.042859856945543 ],
  "t" : "a"
}

"t" stands for "type".

I understand that adding custom data has an impact on performance.
It should be up to the GeoFire user to decide the best trade-off between added data and performance, depending on the context.
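
As a rough sketch of how such a compact entry could be used on the client, assume the query results arrive as objects that carry the proposed "t" field. This is not part of the GeoFire API: `countByType` and `enabledTypes` are illustrative names, and every entry except the first holds made-up placeholder values.

```javascript
// Index entries shaped like the example above.
// Only '2016-001' comes from the issue; the others are placeholders.
const entries = {
  '2016-001': { g: 'u07t4v857n', l: [47.31595982215497, 5.042859856945543], t: 'a' },
  '2016-002': { g: 'u07t4v857n', l: [47.3161, 5.0431], t: 'b' },
  '2016-003': { g: 'u07t4v857n', l: [47.3158, 5.0427], t: 'a' },
};

// Count the entries whose type is currently switched on in the UI.
function countByType(entries, enabledTypes) {
  const counts = {};
  for (const key of Object.keys(entries)) {
    const type = entries[key].t;
    if (!enabledTypes.has(type)) continue; // type is filtered out
    counts[type] = (counts[type] || 0) + 1;
  }
  return counts;
}

const counts = countByType(entries, new Set(['a']));
console.log(counts); // { a: 2 }
```

Because the type travels with the index entry, the count can be recomputed on every switch toggle without another round trip.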

I'll link a PR to this issue.

Thanks for opening this issue and for the pull request. Great to see both of those!

This has been an oft-requested feature (#40 and #76 are related) and I've been a bit hesitant to include it in the official API. I will take a look at your implementation and see what kind of tradeoffs you made. It will probably be a little while before I can review your PR and think about the impact of this change. I appreciate your patience!

Thank you Jacob.
You'll see that the implementation is pretty simple. Let me know if I can do more.

commented

any updates @jwngr?

commented

Such a feature would indeed be helpful.

current flow

  • retrieve nearby keys
  • fire .once() per key to retrieve nearby data

number of network requests needed is 1 + results.length: with ten results we would need 11 requests, which is too many!

desired flow

  • retrieve nearby keys and associated nearby data at once

number of network requests needed is 1.
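
To make the current flow concrete, here is a minimal sketch of the per-key reads. It assumes `db` is a Firebase Realtime Database instance (for example `firebase.database()`) and that the records live under a hypothetical `records` node; `fetchRecords` is an illustrative helper, not part of GeoFire.

```javascript
// Current flow, step 2: one read per key returned by the geo query.
// `db` is assumed to behave like firebase.database();
// 'records' is a hypothetical node holding the data for each key.
function fetchRecords(db, keys) {
  const reads = keys.map(key =>
    db
      .ref(`records/${key}`)
      .once('value')
      .then(snapshot => [key, snapshot.val()])
  );
  // The reads are issued concurrently, but there is still one per key.
  return Promise.all(reads).then(pairs => Object.fromEntries(pairs));
}
```

With ten keys this issues ten reads on top of the geo query itself, which is the 1 + results.length count described above.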

As for performance, I am not sure how it would be affected here. The calculations would be independent of the data, wouldn't they? Everything would remain the same, except that set() could take an extra parameter to add some custom data and query() could be configured to pass that data along; the calculations would still deal only with 'g' and 'l'. Why would performance be compromised?

commented

alright, i think i got it, geoFire probably does a .once() on the whole node where all the geolocation data is stored (which could be huge), then runs an algorithm on the client side to fetch the nearby keys, this makes sense.

commented

Are there any best practices for such a case, that is, looking up nearby keys and their data?

@sonaye - I have a modified GeoFire-JS fork that can store data on the GeoFire index. I just updated it a few minutes ago to bring it in line with GeoFire v4.1.2

Try it out and let me know if it works for you.

https://github.com/mikepugh/geofire-js

commented

@mikepugh i've looked quickly at your work, haven't tested it yet, thanks for the effort.

i've been thinking about this issue lately, even thought about building a simple new algorithm from scratch for learning purposes (based on https://en.wikipedia.org/wiki/Great-circle_distance).

one factor that could significantly improve performance with firebase would be to separate the data from the location keys. as far as i understand (i didn't verify this), geofire first requests the geo node with all its children (there could be one or a million, but each is small), then loops through them quickly and efficiently to get their distance from our coords of interest, re-orders them nearest to farthest, applies a filter by radius, and gives us back the keys.

based on the info above, we don't need to download all the data for the initial lookup, right? if you have 1,000 records, that's a lot of data over the network. now let's get even more realistic, say 10,000: that's just not acceptable.

the initial thought that came to my mind was to separate the geo node into two nodes: the current node, which contains the location, and another one with the data associated with that location, something like:

geo: {
  "coords": {
    "$key": {
      "g": "9q8yjgjmsf",
      "l": {
        "0": 37.6346234,
        "1": -122.4352054
      }
    }
  },
  "data": {
    "$key": {
      "title": "lorem ipsum",
      "likes": 7 
    }
  }
}

this way, for the lookup -done by geofire- we would use ref('geo/coords'), then we can do a multi-path update on all the keys we found:

// retrieved from geofire, # of requests so far = 1
const keys = [
  'key',
  'another-key',
  'yet-another-key'
];

// to avoid collision of requests at the same instant
const unique = new Date().getTime();

// generate the multi-path objects needed
const pathsFound = {};
const pathsToBeRemoved = {};

keys
  .forEach(key => {
    pathsFound[`geo/data/${key}/found-${unique}`] = true;
    pathsToBeRemoved[`geo/data/${key}/found-${unique}`] = null;
  });

db
  .ref()
  .update(pathsFound) // flag found data, # of requests so far = 2
  .then(() =>
    db
      .ref('geo/data')
      .orderByChild(`found-${unique}`)
      .equalTo(true)
      .once('value', data => { // fetch data, # of requests so far = 3
        // do stuff with data ..
      })
      .then(() =>
        db
          .ref()
          .update(pathsToBeRemoved) // remove flag, # of requests so far = 4
      )
  );

the total number of requests is always 4, whether you have one geo record or a million. multi-path updates guarantee that either all the listed paths are changed successfully or nothing is altered, and with the async api that firebase provides, we can follow a systematic path here.

this is a short summary of how i got around this issue at the moment, and without the need to modify the code of geofire-js.

@sonaye - geofire-js takes advantage of an index created on the geo node, so that it doesn't have to download the entire dataset in order to find nearby keys.

I don't know what your exact use case is, but mine was to show nearby search results to my end users. Those search results contain a small amount of data and so it is much more efficient to make the single GeoFire call to firebase and get all of the data I need in one shot. My database has ~400K entries and searches are very fast with this approach. The end user only gets ~30-50 results for any given search.

commented

@mikelehen lorem ipsum 7 is 13 bytes in size, so for 400k that's 5.2 megabytes per call. i take it that geofire doesn't download that much data?

can you provide me with any perf figures? the size of the object downloaded for a typical search request.

geofire-js takes advantage of an index created on the geo node, so that it doesn't have to download the entire dataset in order to find nearby keys

can you further elaborate on that?

my use case is nearly identical to yours.

i might conduct a little experiment comparing the two approaches later on.

@sonaye that is right - GeoFire is not downloading all of that data when I execute a query. I don't have precise perf figures for you but basically GeoFire makes a couple of calls out to Firebase and streams the results to my app over the websocket connection. For a sample search near my location, Firebase sent me ~10 frames and each frame was between 1 KB and 6 KB. Each frame contained multiple search results - I think I had about 30 results displayed to the user, but GeoFire provided ~60 results.

The way GeoFire works is it takes your search lat/long coordinates and a search radius, and it calculates GeoHash values for the bounding box of that circle. Since GeoFire requires a Firebase index on the geohash value for all keys stored on the geo node, it is then able to very quickly query those indexes and stream the results to you. It does stream more results than what is strictly within your search radius since it's working off a bounding box. GeoFire then locally trims out results which aren't within your search radius.

So it does get a bit more than absolutely necessary but it's still very fast and reasonably lightweight.
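
The local trimming step can be sketched roughly as follows. This is not GeoFire's actual source: it uses the standard haversine great-circle formula, takes the radius in kilometers, and assumes candidates are keyed objects with an "l" field holding [lat, lng]. `distanceKm` and `trimToRadius` are illustrative names.

```javascript
const EARTH_RADIUS_KM = 6371; // mean Earth radius

function toRadians(degrees) {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance in km between two [lat, lng] pairs (haversine formula).
function distanceKm([lat1, lng1], [lat2, lng2]) {
  const dLat = toRadians(lat2 - lat1);
  const dLng = toRadians(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Keep only the keys of bounding-box candidates that fall inside the circle.
function trimToRadius(candidates, center, radiusKm) {
  return Object.keys(candidates).filter(
    key => distanceKm(center, candidates[key].l) <= radiusKm
  );
}
```

The index query only has to narrow the set down to the bounding box; this per-candidate check is cheap enough to run on every streamed result.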

Some of the data I store on my geo node include:

  • Display Name
  • Phone #
  • E-mail
  • Company Name
  • Certifications
  • Id

The extra data isn't huge, so I haven't had any performance issues with it.

The recommended approach, whereby I store that data on some other node in Firebase and query it directly by id, was much slower. But my data set is fairly static: I'm tracking local companies, so their locations don't move around, and at most my app is going to display ~75 results to the end user. So your mileage may vary with my solution.

Could Compound queries help with this?

Such as: Get me all Italian themed restaurants within the area.

Example data:

{
  "g": "...",
  "l": [0,0],
  "type": "restaurant",
  "theme": ["italian"]
}

Proposed: logical AND by chaining multiple where() calls.

geoFire.compoundQuery({center: [lat,lng], radius: 300}).where('type', '==', 'restaurant').where('theme', 'array-contains', 'italian')

Make sure to create a composite index on 'type' and 'theme' for '/locations'.

This could also be useful to set an expiry timestamp and only pull back non expired records.

Such as: Upcoming events that are nearby.

geoFire.compoundQuery({center: [lat,lng], radius: 300}).where('expires', '>', currentTimestamp)

More reading, specifically on the query limitations (for example, there are no != or OR queries, but you can run multiple queries and merge the results):
https://firebase.google.com/docs/firestore/query-data/queries

commented

data in firebase:

{
  "bike1": {
    "name": "b",
    "ph": 9070987
  },
  "bike2": {
    "name": "a",
    "ph": 90754647
  }
}

in the code:

let geoFire = new GeoFire(Firebase.database().ref());
geoFire.set('bike2', [centerLat, centerLon]);

after geoFire.set.. in the firebase:

{
  "bike1": {
    "name": "b",
    "ph": 9070987
  },
  "bike2": {
    "g": "w1gw9f2z",
    "l": [32, 120]
  }
}

where are the other fields of bike2?

i tried to add an additional child like set('loc'...) but in that case

// keep the query object so that event listeners can be attached to it
var geoQuery = geoFire.query({
  center: [32, 120],
  radius: 100
});

var onReadyRegistration = geoQuery.on("ready", function() {
  console.log("GeoQuery has loaded and fired all other events for initial data");
});

does not find any bikes