protomaps / PMTiles

Cloud-optimized + compressed single-file tile archives for vector and raster maps

Home Page: https://protomaps.com/docs/pmtiles/

Getting HTTP request length and offset from xyz coordinates

am2222 opened this issue

Hi,
Is there any way to find the length and offset values for a tile based on its xyz coordinates?
It would be great if PMTiles could expose such a function.

why do you need the raw offset/length instead of the data itself?

I am working on a custom layer for Mapbox. Because of its limitations, what we can do is pass a URL with range parameters to its getTile function, so that Mapbox can download the tile itself.

Can you post an example of such a getTile in an open source library like MapLibre GL JS?

@bdon Thanks for the response.
I know MapLibre already supports PMTiles, but since I am using Mapbox in my setup I cannot switch now. Instead, I decided to write a custom source for Mapbox to support PMTiles.

Mapbox custom sources have two main abstract functions. One is load, which loads the tileset's metadata. That is simple to implement by loading the metadata from the PMTiles instance: we just load the header and it is good to go.

https://github.com/mapbox/mapbox-gl-js/blob/bcc0e9760fd82db35a444cffff601f1dcab9e2fa/src/source/vector_tile_source.js#L107

However, the issue is with the second abstract function, loadTile, which is responsible for getting each tile by its x,y,z coordinates.
Here is the Mapbox source: https://github.com/mapbox/mapbox-gl-js/blob/bcc0e9760fd82db35a444cffff601f1dcab9e2fa/src/source/vector_tile_source.js#L218

I want to follow Mapbox's internals as closely as I can. In their source they essentially take a URL and header params, so the idea is to get the offset and length for each x,y,z tile and pass them to params.request. Mapbox should then be able to send the request and handle the rest itself. I know this way we cannot manage edge cases, but at least Mapbox's web workers will handle the entire thing. Does this approach make sense?

My current approach is to handle downloading the tiles with the PMTiles library and pass the data to Mapbox internals. This works, but it has a memory leak somewhere when we overzoom the data.

Right now, this custom source loads PMTiles in Mapbox:

class PmtileSourceImpl extends VectorTileSourceImpl {
    constructor(...args) {
      super(...args);
      this.type = SOURCE_TYPE;
      this.protocol = new pmtiles.Protocol();
  
      let PMTILES_URL =
       "https://protomaps.github.io/PMTiles/protomaps(vector)ODbL_firenze.pmtiles";
  
    //   let pmTileMapboxSource = new pmtiles.MapboxSource(PMTILES_URL)
      const p = new pmtiles.PMTiles(PMTILES_URL);
  
      // this is so we share one instance across the JS code and the map renderer
      this.protocol.add(p);
      this.instance = p;
      this.type='vector'
      this.scheme='tms'
    }
  
    load1(callback) {
      this._loaded = false;

      this._tileJSONRequest = this.instance
        .getMetadataAttempt()
        .then((tileJSON) => {
          this._tileJSONRequest = null;
          this._loaded = true;
          // metadata.tiles = [`${pmtiles_http_url}?${pmtiles_querystring}`]
          // return new Response(JSON.stringify(metadata), { status: 206 })
          extend(this, tileJSON);
       
        })
        .catch((err) => {
    
          if (callback) callback(err);
        });

    }
    loadTile(tile, callback) {
      let that = this;
      // $FlowFixMe[missing-this-annot]
      function done(err, data) {
        // debugger;
        delete tile.request;
  
        if (tile.aborted) return callback(null);
  
        // $FlowFixMe[prop-missing] - generic Error type doesn't have status
        if (err && err.status !== 404) {
          return callback(err);
        }
  
        if (data && data.resourceTiming)
          tile.resourceTiming = data.resourceTiming;
  
        if (this.map._refreshExpiredTiles && data) tile.setExpiryData(data);
        tile.loadVectorData(data, this.map.painter);
  
        // cacheEntryPossiblyAdded(this.dispatcher);
  
        callback(null);
  
        if (tile.reloadCallback) {
          this.loadTile(tile, tile.reloadCallback);
          tile.reloadCallback = null;
        }
      }
  
      const url = this.map._requestManager.normalizeTileURL(
        tile.tileID.canonical.url(this.tiles, this.scheme)
      );
      const request = this.map._requestManager.transformRequest(url, "Tile");
  
      const params = {
        request,
        data: "data",
        uid: tile.uid,
        tileID: tile.tileID,
        tileZoom: tile.tileZoom,
        zoom: tile.tileID.overscaledZ,
        tileSize: this.tileSize * tile.tileID.overscaleFactor(),
        type: "vector",
        source: this.id,
        scope: this.scope,
        // pixelRatio: browser.devicePixelRatio,
        showCollisionBoxes: this.map.showCollisionBoxes,
        promoteId: this.promoteId,
        isSymbolTile: tile.isSymbolTile,
        // brightness: this.map.style ? (this.map.style.getBrightness() || 0.0) : 0.0,
        extraShadowCaster: tile.isExtraShadowCaster,
      };
      params.request.collectResourceTiming = this._collectResourceTiming;
  
      if (!tile.actor || tile.state === "expired") {
        tile.actor = this._tileWorkers[url] =
          this._tileWorkers[url] || this.dispatcher.getActor();
  
        const afterLoad = (error, data, cacheControl, expires) => {
          params.data = {
            cacheControl: cacheControl,
            expires: expires,
            rawData: data,
          };
          if (tile.actor)
            tile.actor.send("loadTile", params, done.bind(this), undefined, true);
        };
  
        this.protocol.tile({ ...tile, url }, afterLoad);
 
      } else if (tile.state === "loading") {
        // schedule tile reloading after it has been loaded
        tile.reloadCallback = callback;
      } else {
        tile.request = tile.actor.send("reloadTile", params, done.bind(this));
      }
  
    }
  
    loadTile1(tile, callback) {
     
      const url = this.map._requestManager.normalizeTileURL(tile.tileID.canonical.url(this.tiles, this.scheme));
      const request = this.map._requestManager.transformRequest(url, 'Tile');
      const {z,x,y}=tile.tileID.canonical
      const pmtileRequest= this.instance.getZxy(z,x,y) // get request header info
      const params = {
          request,
          data: undefined,
          uid: tile.uid,
          tileID: tile.tileID,
          tileZoom: tile.tileZoom,
          zoom: tile.tileID.overscaledZ,
          tileSize: this.tileSize * tile.tileID.overscaleFactor(),
          type: 'vector',
          source: this.id,
          scope: this.scope,
          // pixelRatio: browser.devicePixelRatio,
          showCollisionBoxes: this.map.showCollisionBoxes,
          promoteId: this.promoteId,
          isSymbolTile: tile.isSymbolTile,
          // brightness: this.map.style ? (this.map.style.getBrightness() || 0.0) : 0.0,
          extraShadowCaster: tile.isExtraShadowCaster
      };
      // rest handled by mapbox
    }
  }
  

@am2222 we can't reference any code in mapbox-gl-js because it's not open source, and that would affect the license of this project. i'm not able to help further on this issue.

It would be possible to change https://github.com/protomaps/PMTiles/blob/main/js/index.ts#L916 to return a raw offset/length, but you would also need the decompression type from the header to be able to interpret the data correctly. If there's a use case for this in an open source library we can consider adding that feature!
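To illustrate what a raw offset/length would be used for: an HTTP Range header addresses bytes inclusively, so a tile stored at `offset` with `length` bytes corresponds to `bytes=offset-(offset+length-1)`. The sketch below is a hypothetical helper, not part of the PMTiles API; the name `tileRangeRequest` and the shape of its return value are assumptions made for illustration. Whoever sends this request would still need the tile compression from the archive header to decode the response body.

```javascript
// Hypothetical helper (not a PMTiles function): describe the HTTP request
// that fetches one tile, given its byte offset and length in the archive.
function tileRangeRequest(archiveUrl, offset, length) {
  if (!Number.isInteger(offset) || offset < 0) {
    throw new RangeError("offset must be a non-negative integer");
  }
  if (!Number.isInteger(length) || length <= 0) {
    throw new RangeError("length must be a positive integer");
  }
  // Range headers are inclusive on both ends, hence the -1.
  const lastByte = offset + length - 1;
  return {
    url: archiveUrl,
    headers: { Range: `bytes=${offset}-${lastByte}` },
  };
}

// Example with made-up numbers: a 500-byte tile starting at byte 1000.
const req = tileRangeRequest("https://example.com/archive.pmtiles", 1000, 500);
console.log(req.headers.Range);
```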

@bdon I did not mean to reference anything from Mapbox here. Just shared the approach I am taking.

@am2222 I understand, I would experiment with the approach I took above to get the raw offset/length. Keep in mind that doing the fetch from web workers may not be efficient because the PMTiles object needs to share state between requests so it doesn't need to download all the headers/directories each time.

Thanks @bdon !
I think right now my best shot is to download the tiles in the main thread and pass the data to the layer renderer.
But what about offloading some of the PMTiles processing to a web worker? I feel that sharing the state and tile cache in a web worker would help performance. Do you think this makes sense?

The fetching of HTTP requests should be in a centralized thread, on the UI thread or a single worker, because multiple tiles will share most of their prerequisite requests for headers/directories.
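A minimal sketch of why centralizing helps, assuming a hypothetical `SharedRangeCache` class and an injected `fetchRange(offset, length)` function (neither is part of the PMTiles library): when every tile lookup goes through one object, lookups that need the same header or directory bytes can share a single in-flight request instead of each issuing their own.

```javascript
// Hypothetical cache keyed by byte range. Concurrent callers asking for the
// same range receive the same promise, so only one request is sent.
class SharedRangeCache {
  constructor(fetchRange) {
    // fetchRange: (offset, length) => Promise<ArrayBuffer>, supplied by the caller.
    this.fetchRange = fetchRange;
    this.pending = new Map();
  }

  get(offset, length) {
    const key = `${offset}:${length}`;
    let promise = this.pending.get(key);
    if (!promise) {
      promise = this.fetchRange(offset, length).catch((err) => {
        // Drop failed entries so a later call can retry.
        this.pending.delete(key);
        throw err;
      });
      this.pending.set(key, promise);
    }
    return promise;
  }
}
```

If each web worker held its own instance, the cache would be empty in every worker and the shared ranges would be downloaded once per worker, which is the inefficiency described above.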

The other things going on in the PMTiles library are directory decoding and decompression. On a recent browser the decompression should happen using the native DecompressionStream API, so it is asynchronous. The directory decoding should be quite fast to do on the main thread, but let me know if you see slow parts of the code there.
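For reference, asynchronous decompression with the native API can look roughly like the sketch below. It assumes gzip-compressed input and a runtime that provides `DecompressionStream` and `Response` (recent browsers, Node.js 18+); the function name `gunzip` is chosen here for illustration and is not a claim about how the library names or structures this step.

```javascript
// Decompress a gzip-compressed ArrayBuffer or Uint8Array without blocking:
// the work happens inside the stream pipeline and the result is a promise.
async function gunzip(compressed) {
  const source = new Response(compressed).body;
  const decompressed = source.pipeThrough(new DecompressionStream("gzip"));
  return new Response(decompressed).arrayBuffer();
}
```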