calliope-project / calliope

A multi-scale energy systems modelling framework

Home Page: https://www.callio.pe

Cannot define numeric index items when loading tabular data

brynpickering opened this issue

What happened?

If you want a data dimension to be numeric (e.g. years), you can't load it from file, because pandas loads the index from CSV as strings. If you load the same data from YAML, the index is numeric. If you define some of the data in YAML and some in tabular data, you get duplicated index items: one in string format and the other in numeric format.
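A minimal pandas sketch of the duplication, outside of Calliope itself. The dimension name `vintage` and the values are made up for illustration, and the string labels are built by hand to mimic what the tabular loader is described as producing:

```python
import pandas as pd

# Labels as they would arrive from tabular data: strings (assumed, built by hand).
from_csv = pd.Series(
    [1.0, 2.0],
    index=pd.Index(["2030", "2040"], dtype=object, name="vintage"),
)

# The same labels as they would arrive from parsed YAML: integers.
from_yaml = pd.Series(
    [10.0, 20.0],
    index=pd.Index([2030, 2040], name="vintage"),
)

# Combining both sources keeps "2030" and 2030 as separate index items.
combined = pd.concat([from_csv, from_yaml])
n_labels = combined.index.nunique()
print(combined.index.tolist())
```

Here the combined dimension ends up with four distinct labels instead of two, because the string `"2030"` and the integer `2030` never compare equal.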

Why might we care?

If you have numeric dimensions, you can use them in the math. E.g., you could subtract technology lifetime from the investment year in pathway analysis. This isn't currently possible because the investment year is a timestamp, and you can't subtract an integer (lifetime) from a timestamp.
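The difference can be reproduced in plain pandas. This is only a sketch of the arithmetic, not of Calliope's math: the `lifetime` value and the dimension name `investsteps` are placeholders.

```python
import pandas as pd

lifetime = 25  # years; hypothetical technology lifetime

# Numeric investment years: integer arithmetic on the index just works.
investment_years = pd.Index([2030, 2040, 2050], name="investsteps")
years_minus_lifetime = investment_years - lifetime

# Timestamp investment years: pandas rejects subtracting a bare integer.
try:
    pd.Timestamp("2050-01-01") - lifetime
    timestamp_subtraction_ok = True
except TypeError:
    timestamp_subtraction_ok = False
```

With integer labels the subtraction gives another set of years, whereas the timestamp case raises a `TypeError` in current pandas versions.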

A few possible workarounds

  1. If you want a numeric index, just use YAML.
  2. Use the YAML index dtype to coerce the tabular data index dtype to numeric. This only works if you also define some data in YAML.
  3. Always try to coerce all tabular data dimensions to a numeric dtype (int first, then float if that fails?). This works even if you don't define any data in YAML but still expect numeric data in CSV to lead to numeric dimensions.

(3) would seem "safest" as it doesn't rely on any YAML having been defined. However, it could be a problem if a user wanted their numeric-like data to actually be strings.
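As a rough sketch of what (3) could look like, the hypothetical helper below (`coerce_labels` is not an existing Calliope function) leans on `pandas.to_numeric`, which yields integers where every label parses as one, floats otherwise, and raises if any label is non-numeric:

```python
import pandas as pd


def coerce_labels(index: pd.Index) -> pd.Index:
    """Return a numeric version of `index` if every label parses as a number.

    Hypothetical helper for illustration: non-numeric indexes are returned
    unchanged rather than raising.
    """
    try:
        return pd.to_numeric(index)
    except (ValueError, TypeError):
        return index


years = coerce_labels(pd.Index(["2030", "2040"], dtype=object, name="vintage"))
shares = coerce_labels(pd.Index(["0.5", "2"], dtype=object))
techs = coerce_labels(pd.Index(["pv", "wind"], dtype=object, name="techs"))

# Mixed string/integer labels collapse onto the same numeric items.
mixed = coerce_labels(pd.Index(["2030", 2030], dtype=object)).unique()
```

The caveat raised above applies directly: any dimension whose labels merely look numeric would also be converted, so a real implementation would probably need an opt-out for users who want those labels kept as strings.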

Which operating systems have you used?

  • macOS
  • Windows
  • Linux

Version

v0.7.0.dev2

Relevant log output

No response