Multiple rows and columns introduce false index items
irm-codebase opened this issue · comments
What happened?
Loading a file with multiple rows and multiple columns sometimes adds fake index items.
See the file:
vintagesteps,2020,2030,2040,2050,2030,2040,2050,2040,2050,2050
investsteps,2020,2020,2020,2020,2030,2030,2030,2040,2040,2050
techs,,,,,,,,,,
geothermal,1,1,1,1,1,1,1,1,1,1
hydropower,1,1,1,1,1,1,1,1,1,1
waste,1,1,0.0,0,1,1,0.0,1,1,1
bioenergy,1,1,0.8,0,1,1,0.8,1,1,1
oil,1,1,1,1,1,1,1,1,1,1
coal,1,1,1,1,1,1,1,1,1,1
ccgt,1,1,0.0,0,1,1,0.0,1,1,1
wind,1,1,0.0,0,1,1,0.0,1,1,1
pv,1,1,1,0.2,1,1,1,1,1,1
battery_li,1,1,0.5,0,1,1,0.5,1,1,1
battery_phs,1,1,1,1,1,1,1,1,1,1
Loaded via:
vintage_availability_techs:
  source: data_sources/investstep_series/available_vintages_techs.csv
  rows: techs
  columns: [vintagesteps, investsteps]
  add_dimensions:
    parameters: available_vintages
In this case, a fake index called techs will be added.
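For anyone trying to reproduce this outside the model, here is a minimal pandas sketch that reads a shortened copy of the file above. It assumes the loader ends up doing something similar to `pandas.read_csv` with two header rows and the first column as the index. The actual internal call is not shown in this issue, and `read_wide_table` is a hypothetical helper name.

```python
import io

import pandas as pd

# Shortened copy of the file above: two column-header rows, then the
# `techs` name row, then data rows.
CSV_TEXT = """\
vintagesteps,2020,2030,2040
investsteps,2020,2020,2020
techs,,,
geothermal,1,1,1
waste,1,1,0.0
"""


def read_wide_table(text: str) -> pd.DataFrame:
    """Read a CSV with two column-header rows and one row-index column.

    Assumption: this mirrors `rows: techs` and
    `columns: [vintagesteps, investsteps]`; it is not the library's own code.
    """
    return pd.read_csv(io.StringIO(text), header=[0, 1], index_col=0)


df = read_wide_table(CSV_TEXT)
print("index name:", df.index.name)
print("index items:", list(df.index))
print("column level names:", list(df.columns.names))
# A fake index item would show up as `techs` among the index items.
print("fake item present:", "techs" in df.index)
```

If `techs` appears in the list of index items rather than as the index name, the name row has been read as data.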
Which operating systems have you used?
- macOS
- Windows
- Linux
Version
v0.7
Relevant log output
No response
Another example:
This one fails. Removing the header fixes the issue. Interestingly, the behavior changes depending on whether or not you are using a debugger.
nodes,techs,parameters,values
NORD,ccgt,initial_flow_cap,20000000
NORD,hydropower,initial_flow_cap,11191600
NORD,wind,initial_flow_cap,115600
NORD,pv,initial_flow_cap,8319100
NORD,battery_phs,initial_flow_cap,5064300
NORD,battery_phs,initial_storage_cap,469050200
NORD,waste,initial_flow_cap,384700
NORD,bioenergy,initial_flow_cap,2159900
CNOR,ccgt,initial_flow_cap,2000000
CNOR,hydropower,initial_flow_cap,1100900
CNOR,wind,initial_flow_cap,133600
CNOR,pv,initial_flow_cap,2270800
CNOR,waste,initial_flow_cap,23000
data_sources:
  # Initial setup
  initial_tech_capacity_params:
    source: data_sources/initial_capacity_techs_kw.csv
    rows: [nodes, techs, parameters]
OK, so this is a limitation of what we can ask of pandas.
A workaround:
data_sources:
  # Initial setup
  initial_tech_capacity_params:
    source: data_sources/initial_capacity_techs_kw.csv
    rows: [nodes, techs, parameters]
    columns: [values]
    drop: values
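A rough pandas analogue of this workaround, as a sketch only: the header row is read as a one-item column index (`values`), which is then dropped to leave a series indexed by the three row dimensions. This assumes `columns: [values]` plus `drop: values` behaves roughly like the `squeeze` call below; the data is a shortened copy of the file above.

```python
import io

import pandas as pd

# Shortened copy of initial_capacity_techs_kw.csv from this issue.
CSV_TEXT = """\
nodes,techs,parameters,values
NORD,ccgt,initial_flow_cap,20000000
NORD,battery_phs,initial_storage_cap,469050200
CNOR,pv,initial_flow_cap,2270800
"""

# Treat the first line as the header, and the first three columns as row dims.
df = pd.read_csv(io.StringIO(CSV_TEXT), header=0, index_col=[0, 1, 2])

# The only column is `values`; dropping that dimension leaves a Series.
series = df.squeeze(axis="columns")

print(list(series.index.names))
print(series.loc[("NORD", "ccgt", "initial_flow_cap")])
```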
I can only reproduce this issue with your second example. The first one loads just fine.
Odd, that's the one I saw as most problematic. I'll give an update if I can reproduce it...
For the second: I would like to propose that this type of "dropping" should be the standard, to ensure the files given to the model are "stand alone". Otherwise, you'd need to always consult two files. This way, you have good data practices "baked in".
What do you think?
Hmmm, that is true...
The only way to make it possible would be to force one type of table (i.e. rows only), which would make the input very inflexible.
Plan: enforce a header to always exist in a CSV, even if it is just one row. We will set header=0 as the bare minimum internally.
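To illustrate why `header=0` matters, a small sketch comparing the two settings on a shortened copy of the second example file. It assumes the value is passed straight through to `pandas.read_csv`; `first_level_items` is a hypothetical helper, not part of the library.

```python
import io

import pandas as pd

# Shortened copy of the second example file from this issue.
CSV_TEXT = """\
nodes,techs,parameters,values
NORD,ccgt,initial_flow_cap,20000000
CNOR,pv,initial_flow_cap,2270800
"""


def first_level_items(text: str, header) -> set:
    """Return the items found in the first row dimension (`nodes`)."""
    df = pd.read_csv(io.StringIO(text), header=header, index_col=[0, 1, 2])
    return set(df.index.get_level_values(0))


# With no header, the header line is read as data and becomes an index item.
without_header = first_level_items(CSV_TEXT, header=None)
# With header=0, the first line is consumed as the header instead.
with_header = first_level_items(CSV_TEXT, header=0)

print("header=None:", sorted(without_header))
print("header=0:   ", sorted(with_header))
```

With `header=None`, `nodes` shows up as a fake item next to `NORD` and `CNOR`; with `header=0` it does not.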