stac-utils / pystac

Python library for working with any SpatioTemporal Asset Catalog (STAC)

Home Page: https://pystac.readthedocs.io

Saving - float32 Error in Item

mendesdyl opened this issue

This is the first issue that I am raising on GitHub, so go easy, please.
I keep getting this error whenever I try to save the catalog, no matter what I change. To my knowledge, I don't have a stray float anywhere in my code, and I don't even know where to begin debugging. I get the same error for the item alone (not the catalog) when I run this line:

print(json.dumps(item.to_dict(), indent=4))

Below is the code that got me here.

import netCDF4
import pystac
from shapely.geometry import Polygon, mapping

def dataset_to_geojson(dataset):
    # THE dataset is a NetCDF4
    # Assuming 'lat' and 'lon' are the variable names for latitude and longitude
    lat = dataset.variables['lat'][:]
    lon = dataset.variables['lon'][:]

    # Calculate the bounding box
    bbox = [min(lon), min(lat), max(lon), max(lat)]

    # Create the footprint polygon
    footprint = Polygon([
        [min(lon), min(lat)],
        [min(lon), max(lat)],
        [max(lon), max(lat)],
        [max(lon), min(lat)],
        [min(lon), min(lat)]  # Closing the polygon
    ])

    return bbox, mapping(footprint)

def toISO8601(stringDate):
    formattedDate = stringDate.strftime("%Y-%m-%dT%H:%M:%SZ")
    return formattedDate

dataset = netCDF4.Dataset('https://www.star.nesdis.noaa.gov/thredds/dodsC/CoastWatch/VIIRS/npp-n20-s3a/chloci/DailyGlobalDINEOF/WW00/2022')
time = dataset.variables['time']
start_time = netCDF4.num2date(time[0], time.units, calendar=time.calendar)
end_time = netCDF4.num2date(time[-1], time.units, calendar=time.calendar)
bbox, geojson = dataset_to_geojson(dataset)
catalog = pystac.Catalog(id='NOAA MSL12 multi-sensor DINEOF global gap-filled products (aggregated)',
                         description='This my first attempt at a catalog.')
item = pystac.Item(id='2022 Agg Chla Global 9km Daily SQ DINEOF',
                 geometry=geojson,
                 bbox=bbox,
                 datetime=None,
                 properties={
                     'start_datetime': toISO8601(start_time),
                     'end_datetime': toISO8601(end_time)
                 })

item.add_asset(
    key='OpenDAP',
    asset=pystac.Asset(
        href='https://www.star.nesdis.noaa.gov/thredds/dodsC/CoastWatch/VIIRS/npp-n20-s3a/chloci/DailyGlobalDINEOF/WW00/2022',
        media_type=pystac.MediaType.HTML,
        title='OpenDAP Data Link'
    )
)
catalog.add_item(item)
catalog.normalize_hrefs(dir[0])
catalog.save(catalog_type=pystac.CatalogType.SELF_CONTAINED)

That last line raises the following error:
TypeError: Object of type float32 is not JSON serializable

Hi! Welcome to the land of GitHub! It looks like the issue is that the lat and lon values are stored as float32 in the NetCDF file, so the numbers in your bounding box and geometry end up with type numpy.float32, which the standard json module cannot serialize. You can check that by inspecting bbox:

>>> type(bbox[0])
numpy.float32
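
To see the failure in isolation, here is a minimal sketch, assuming NumPy is installed. The value 1.5 is an arbitrary stand-in for a coordinate read from the file, not data from the dataset above.

```python
import json

import numpy as np

# Arbitrary stand-in for a coordinate read from a float32 NetCDF variable.
value = np.float32(1.5)

try:
    json.dumps([value])
    message = None
except TypeError as exc:
    # The json module does not know how to encode numpy.float32.
    message = str(exc)

# .item() returns the equivalent built-in Python float, which json accepts.
converted = json.dumps([value.item()])

print(message)
print(converted)
```

The same TypeError appears whether the float32 sits in a bare list, as here, or deep inside the dictionary that an item is serialized from.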

There are a couple of ways to get around this. The most straightforward is to call .item() on each value, which converts a NumPy scalar to the equivalent Python type:

from shapely import geometry

def dataset_to_geojson(dataset):
    # THE dataset is a NetCDF4
    # Assuming 'lat' and 'lon' are the variable names for latitude and longitude
    lat = dataset.variables['lat'][:]
    lon = dataset.variables['lon'][:]
    min_lon = min(lon).item()
    max_lon = max(lon).item()
    min_lat = min(lat).item()
    max_lat = max(lat).item()

    # Calculate the bounding box
    bbox = [min_lon, min_lat, max_lon, max_lat]

    # Create the footprint polygon
    footprint = geometry.Polygon([
        [min_lon, min_lat],
        [min_lon, max_lat],
        [max_lon, max_lat],
        [max_lon, min_lat],
        [min_lon, min_lat]  # Closing the polygon
    ])

    return bbox, geometry.mapping(footprint)
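
If such values are scattered through nested lists and dictionaries, converting each one by hand gets tedious. Another way around it, sketched here under assumptions, is to clean the whole structure before passing it to pystac.Item. The helper name to_builtin is hypothetical (it is not part of pystac or NumPy), and it assumes every non-built-in value is a NumPy-style scalar that exposes .item().

```python
def to_builtin(value):
    """Recursively replace NumPy-style scalars with plain Python values.

    Hypothetical helper: anything that is not a built-in type but has an
    .item() method is assumed to be a scalar and is converted with it.
    """
    if isinstance(value, dict):
        return {key: to_builtin(val) for key, val in value.items()}
    if isinstance(value, (list, tuple)):
        # Tuples become lists, which is what JSON would produce anyway.
        return [to_builtin(val) for val in value]
    if value is None or isinstance(value, (str, bytes, bool, int, float)):
        return value
    item = getattr(value, "item", None)
    if callable(item):
        return item()
    # Unknown types are returned unchanged; json may still reject them.
    return value
```

With this in place, something like bbox = to_builtin(bbox) and geojson = to_builtin(geojson) before creating the item should leave only built-in numbers in both.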

As a side note, shapely.geometry also has a box function, so you can probably build the footprint more easily. It takes (minx, miny, maxx, maxy), which matches the order of bbox:

footprint = geometry.box(*bbox)

Let me know if that works for you!

That worked. Thank you very much.