hypertidy / ncmeta

Tidy NetCDF metadata

Home Page: https://hypertidy.github.io/ncmeta/

nc_coord_var issues

mdsumner opened this issue

"day" is not found:

ncmeta::nc_coord_var("rasterwise/extdata/rectilinear/ACCfronts_nc4.nc")
# A tibble: 3 x 6
  variable X     Y     Z     T     bounds
  <chr>    <chr> <chr> <chr> <chr> <chr> 
1 lon      lon   NA    NA    NA    NA    
2 lat      NA    lat   NA    NA    NA    
3 front    lon   lat   NA    NA    NA 

No rows:

ncmeta::nc_coord_var("rasterwise/extdata/NSIDC/alaska_2007_2008_swe_v01.nc")

(from https://github.com/mdsumner/stars/issues/4#issuecomment-487438234)

My day is pretty stacked here. I'll have a look at this tonight (your morning, I suppose).

No problem, I just had to park it! I had a look but don't fully understand the code yet. It's definitely working, though, and it allows me to barrel on with the units for dims and variables.

On ACCfronts_nc4.nc -- that's a totally non-standard treatment of time.

$ ncdump -h ACCfronts_nc4.nc 
netcdf ACCfronts_nc4 {
dimensions:
	lon = 1081 ;
	lat = 363 ;
	time = 854 ;
variables:
	float lon(lon) ;
		lon:units = "degrees_east" ;
		lon:long_name = "longitude" ;
	float lat(lat) ;
		lat:units = "degrees_north" ;
		lat:long_name = "latitude" ;
	float month(time) ;
		month:long_name = "month" ;
	float year(time) ;
		year:long_name = "year" ;
	float day(time) ;
		day:long_name = "day" ;
		day:units = "central time of +/- 10-15 day window" ;
	float front(time, lat, lon) ;
		front:units = "valid range from 0 to 12" ;
		front:long_name = "frontal indices" ;
		front:description = "Indices:   0 - south of sBdy;  1 - between SACCF-S & sBdy;  2 - SACCF-N & SACCF-S;  3 - PF-S & SACCF-N;  4 - PF-M & PF-S;  5 - PF-N & PF-M;  6 - SAF-S & PF-N;  7 - SAF-M & SAF-S;  8 - SAF-N & SAF-M;  9 - SAZ-S & SAF-N; 10 - SAZ-M & SAZ-S; 11 - SAZ-N & SAZ-M; 12 - north of SAZ-N. " ;

// global attributes:
		:Conventions = "none" ;
		:institution = "CSIRO Marine and Atmospheric Research" ;
		:title = "ACC frontal indices: ACC fronts are mapped using combined MSLA data; Sokolov & Rintoul, JGR, 2009.  " ;
		:description = "ACC fronts are mapped using local (estimated in  30-deg sectors) frontal labels. Navigation of the ACC jets around shallow bottom topography is also taken into account. For further details see Sokolov & Rintoul, JGR, 2009a,b.            " ;
		:history = "Generated on Tue Nov  3 10:49:18 EST 2009." ;

To interpret this you need to read the year, month, and day variables and turn the combination into dates... Doable, but this file doesn't advertise any conventions, so it's a total one-off as far as I know. Have you seen this pattern more than once? I could code it up, but it might be needless complexity. Other than the dimension named "time" there's really nothing to go on here to guess that this is a T dimension -- as far as the code is concerned it could just as well be a Z dimension! I could see adding a list of dimension / variable names to grep for to try and match these edge cases -- but it's probably easier to let the user optionally specify coordinate variables?
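
As a rough sketch of that manual decoding, assuming the year, month and day variables have already been read into plain numeric vectors holding whole numbers (the values below are made up for illustration, not taken from ACCfronts_nc4.nc), base R can combine them into dates:

```r
# Hypothetical values standing in for the file's year(time), month(time)
# and day(time) variables; in practice they would be read from the file first.
year  <- c(1992, 1992, 1993)
month <- c(10, 11, 1)
day   <- c(14, 21, 6)

# ISOdate() builds date-times in GMT; as.Date() then drops the time of day.
# This assumes whole-number inputs; fractional days would need rounding first.
dates <- as.Date(ISOdate(year, month, day))
print(dates)
# [1] "1992-10-14" "1992-11-21" "1993-01-06"
```

Nothing in the file says the three variables belong together, so this only works once a person has decided how to read them.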

I think that NSIDC file came from me... it is also available here: https://nsidc.org/data/NSIDC-0736
This had me super confused till I literally looked at it sideways. You are NOT supposed to do strings like this!!! This should be 274 strings of length 10 not 10 strings of length 274!

If this was written the right way and there were a coordinates attribute, or standard_name attribute, or axis attribute, to go on, the function would get it -- but there's really nothing to go on here.
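
For illustration only, here is a minimal sketch of the kind of attribute-based guess being described. The helper guess_axis() is hypothetical and is not ncmeta's implementation; it assumes a variable's attributes are available as a named list and checks the CF-style hints mentioned above (axis, standard_name, and the degrees_east / degrees_north units):

```r
# Hypothetical helper, not part of ncmeta: guess an axis letter from a named
# list of attributes belonging to one variable.
guess_axis <- function(atts) {
  # An explicit axis attribute ("X", "Y", "Z" or "T") wins.
  if (!is.null(atts[["axis"]])) return(toupper(atts[["axis"]]))

  # Next, a recognised standard_name.
  by_name <- c(longitude = "X", latitude = "Y", time = "T")
  sn <- atts[["standard_name"]]
  if (!is.null(sn) && sn %in% names(by_name)) return(unname(by_name[sn]))

  # Finally, units that only make sense for one horizontal axis.
  units <- atts[["units"]]
  if (!is.null(units)) {
    if (units == "degrees_east") return("X")
    if (units == "degrees_north") return("Y")
  }

  # Nothing to go on.
  NA_character_
}

guess_axis(list(units = "degrees_east", long_name = "longitude"))  # "X"
guess_axis(list(units = "[]"))                                     # NA
```

The second call mirrors the time variable in the NSIDC file: with only units = "[]" there is no hint to match.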

$ ncdump -v time alaska_2007_2008_swe_v01.nc
netcdf alaska_2007_2008_swe_v01 {
dimensions:
	west_east = 327 ;
	south_north = 237 ;
	days = 274 ;
	charlength = 10 ;
variables:
	double lat(south_north, west_east) ;
		lat:coordinates = "XLONG XLAT" ;
		lat:units = "degrees_north" ;
	double lon(south_north, west_east) ;
		lon:coordinates = "XLONG XLAT" ;
		lon:units = "degrees_east" ;
	double elevation(south_north, west_east) ;
		elevation:coordinates = "XLONG XLAT" ;
		elevation:units = "meters" ;
	double swe(days, south_north, west_east) ;
		swe:coordinates = "XLONG XLAT" ;
		swe:units = "mm" ;
	double mask(south_north, west_east) ;
		mask:coordinates = "XLONG XLAT" ;
		mask:units = "[]" ;
	char time(charlength, days) ;
		time:units = "[]" ;

// global attributes:
		:title = "OUTPUT FROM WRF V3.6.1 MODEL" ;
		:creation_date = "22-May-2017 16:16:33" ;
		:wateryear = "2008" ;
		:dx = 9000.f ;
		:dy = 9000.f ;
		:map_projection = "Lambert Conformal" ;
data:

 time =
  "2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222",
  "0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
  "0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
  "7777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777788888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888",
  "----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------",
  "1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
  "0000000000000000000000000000000111111111111111111111111111111222222222222222222222222222222211111111111111111111111111111112222222222222222222222222222233333333333333333333333333333334444444444444444444444444444445555555555555555555555555555555666666666666666666666666666666",
  "----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------",
  "0000000001111111111222222222233000000000111111111122222222223000000000111111111122222222223300000000011111111112222222222330000000001111111111222222222200000000011111111112222222222330000000001111111111222222222230000000001111111111222222222233000000000111111111122222222223",
  "1234567890123456789012345678901123456789012345678901234567890123456789012345678901234567890112345678901234567890123456789011234567890123456789012345678912345678901234567890123456789011234567890123456789012345678901234567890123456789012345678901123456789012345678901234567890" ;
}

p.s. That character axis order issue above is one of the goofiest things I've ever seen in a NetCDF file. It demonstrates how axis order works in NetCDF really well! And is totally the wrong way to write character arrays!

Oh, I didn't notice the decomposed date thing in ACCfronts. All is well: no automatic detection is possible!

That string thing had me stumped for ages. tidync now breaks them up with strsplit(, "") but my stars work uses rawToChar. I think the string split is the best way to do it, though RNetCDF maintains the dimension with rawchar = TRUE, but then you need to convert the values individually to character. Maybe raw values are right, though. It's total nonsense for data to say it has a dimension when it does not. I am considering bringing that up with RNetCDF itself: these things just aren't automatable the way they are (definitely "the netcdf way" in my experience!)
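
To illustrate the strsplit approach, here is a minimal base R sketch. It assumes the variable comes back as one long string per character position, as in the ncdump output above; the input is cut down to three days for illustration, and this is not the tidync code itself:

```r
# Stand-in for the sideways char variable: 10 strings (one per character
# position), each as long as the number of days (3 here instead of 274).
sideways <- c("222", "000", "000", "777", "---",
              "111", "000", "---", "000", "123")

# Split every string into single characters and stack the pieces as rows,
# giving a 10 x 3 matrix (character position by day).
chars <- do.call(rbind, strsplit(sideways, ""))

# Paste down each column to recover one 10-character string per day.
dates <- apply(chars, 2, paste, collapse = "")
print(dates)
# [1] "2007-10-01" "2007-10-02" "2007-10-03"
```

Once the strings are the right way round, as.Date(dates) would turn them into proper dates.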

Just for kicks...

library(RNetCDF)
nc <- create.nc("char.nc", clobber = TRUE)
dim.def.nc(nc, "char", 5)
dim.def.nc(nc, "string", 1)

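# RNetCDF lists dimensions fastest-varying first, the reverse of ncdump's order.
# For NC_CHAR the first (fastest) dimension is the string length.
var.def.nc(nc, "right", "NC_CHAR", c("char", "string"))
var.put.nc(nc, "right", "right")

# Dimensions swapped: the string length is now 1, so this holds five 1-character strings.
var.def.nc(nc, "wrong", "NC_CHAR", c("string", "char"))
var.put.nc(nc, "wrong", "wrong")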
# Warning message:
# Strings truncated to length 1 

var.put.nc(nc, "wrong", "w", c(1, 1))
var.put.nc(nc, "wrong", "r", c(1, 2))
var.put.nc(nc, "wrong", "o", c(1, 3))
var.put.nc(nc, "wrong", "n", c(1, 4))
var.put.nc(nc, "wrong", "g", c(1, 5))

var.get.nc(nc, "wrong")
# [1] "w" "r" "o" "n" "g"

var.get.nc(nc, "right")
# [1] "right"

close.nc(nc)

$ ncdump char.nc
netcdf char {
dimensions:
	char = 5 ;
	string = 1 ;
variables:
	char right(string, char) ;
	char wrong(char, string) ;
data:

 right =
  "right" ;

 wrong =
  "w",
  "r",
  "o",
  "n",
  "g" ;
}

very nice ;)