Learning county from pixels: Corn yield prediction with attention-weighted multiple instance learning
In this paper we use 2022, you can choose the year you want.
2018 and before: link
Download file: county -> cb_2018_us_county_500k
2019 and after: link
Download file: Counties -> 500,000 (national) shapefile [11 MB]
Unzip the .zip file and only need the .shp file
CEC: Cation Exchange Capacity
SOM: Soil Organic Matter
AWC: Avail. Water Holding Capacity
You need the GEE python installation first link
You need anaconda or miniconda, run:
conda env create -f geeenv.yaml
to create the conda environment.
Then run:
python get_data_mir.py \
--account projects/ee-xwang2696/assets/ \
--folder data \
--shp_file_path mir/cb_2016_us_county_500k \
--cec_file_path mir/cec \
--som_file_path mir/som \
--awc_file_path mir/awc \
--states 38 46 27 55 19 17 18 39 29 20 31 26 \
2>&1 | tee log.txt
to download the dataset. You need to modify the arguments "--account --folder --shp_file_path --cec_file_path --som_file_path --awc_file_path" based your own setting. The meaning of these parameters are shown in the argparse.
You need anaconda or miniconda, run:
conda env create -f attmir.yaml
to create the conda environment.
run:
python preprocess.py \
--process post \
--input_root ../input/ \
--output_root ../dataset/ \
--data_dir .././1100pixels/data/ \
--N 100 \
--startday 70 \
--endday 278 \
--startyear 2008 \
--endyear 2022 \
--usevar All \
--statefp 38 46 27 55 19 17 18 39 29 20 31 26
to make a clean and usable dataset. You need to modify the arguments "--process --input_root --output_root --data_dir --N" based your own setting. The meaning of these parameters are shown in the argparse.
method="att"
mkdir -p ../output/log/${method}/
mkdir -p ../output/result/${method}/
python AttMIR.py \
--input_root ../dataset/100from3000/ \
--output_root ../output/result/${method}/ \
--N 100 \
--lr 0.0001 \
--epoch 500 \
--train_info 500 \
--num_heads 1 \
--hidden_size 159 \
--visualize \
--testyears 2022 \
--enddays 278 \
--usevars NDWI GCI EVI LSTday LSTnight ppt tmax tmean tmin vpdmax vpdmin awc cec som historical year \
|& tee ../output/log/${method}/1.txt
You need to modify the arguments "--input_root --output_root --N" based your own setting. The meaning of these parameters are shown in the argparse.
method="mlp"
mkdir -p ../output/log/${method}/
mkdir -p ../output/result/${method}/
python MIR_baseline.py \
--input_root ../dataset/2from1100/ \
--output_root ../output/result/${method}/ \
--model mlp \
--N 2 \
--visualize \
--testyears 2018 2019 2020 2021 2022 \
--enddays 278 \
--usevars All \
|& tee ../output/log/${method}/1.txt
You need to modify the arguments "--input_root --output_root --N" based your own setting. The meaning of these parameters are shown in the argparse.
method="lr"
mkdir -p ../output/log/${method}/
mkdir -p ../output/result/${method}/
python MIR_baseline.py \
--input_root ../dataset/2from1100/ \
--output_root ../output/result/${method}/ \
--model lr \
--N 2 \
--visualize \
--testyears 2018 2019 2020 2021 2022 \
--enddays 278 \
--usevars All \
|& tee ../output/log/${method}/1.txt
You need to modify the arguments "--input_root --output_root --N" based your own setting. The meaning of these parameters are shown in the argparse.
method="ridge"
mkdir -p ../output/log/${method}/
mkdir -p ../output/result/${method}/
python MIR_baseline.py \
--input_root ../dataset/2from1100/ \
--output_root ../output/result/${method}/ \
--model ridge \
--N 2 \
--visualize \
--testyears 2018 2019 2020 2021 2022 \
--enddays 278 \
--usevars All \
|& tee ../output/log/${method}/1.txt
You need to modify the arguments "--input_root --output_root --N" based your own setting. The meaning of these parameters are shown in the argparse.
method="randomforest"
mkdir -p ../output/log/${method}/
mkdir -p ../output/result/${method}/
python MIR_baseline.py \
--input_root ../dataset/2from1100/ \
--output_root ../output/result/${method}/ \
--model randomforest \
--N 2 \
--visualize \
--testyears 2018 2019 2020 2021 2022 \
--enddays 278 \
--usevars All \
|& tee ../output/log/${method}/1.txt
You need to modify the arguments "--input_root --output_root --N" based your own setting. The meaning of these parameters are shown in the argparse.