This repository contains all the scripts used for the Python class for JRFs at IITM.
Some of the tutorials and resources to follow:
- http://pure.iiasa.ac.at/id/eprint/14952/1/xarray-tutorial-egu2017-answers.pdf
- https://rabernat.github.io/research_computing/xarray.html
- Ocean Data Analysis https://currents.soest.hawaii.edu/ocn_data_analysis/exercise_data.html#id1
- Parallelization http://xarray.pydata.org/en/stable/dask.html
- Satellite Data Analysis https://github.com/nansencenter/nansat-lectures
- https://github.com/NCAR/CESM_postprocessing CESM Postprocessing
- https://github.com/NCAR/PyCect This repo is used to compare the results of a set of new CAM simulations against the accepted ensemble
- https://github.com/nichannah/ocean-regrid Regrid ocean reanalysis data from normal to tripolar grids
- https://github.com/jswhit/gfstonc Read GFS sigma and sfc files in python
- f2py
- Pandas
- https://github.com/tmiyachi/data2gfs Make python version of this using f2py
- Shallow water equation model using pyspharm https://github.com/jswhit/pyspharm and https://www.aosc.umd.edu/~dkleist/docs/shtns/doc/html/shallow_water_8py-example.html explaining the code
- Scientific Computing Lectures https://github.com/jrjohansson/scientific-python-lectures
- Geopandas satellite data analysis https://towardsdatascience.com/satellite-imagery-access-and-analysis-in-python-jupyter-notebooks-387971ece84b
- Rasterio https://medium.com/analytics-vidhya/satellite-imagery-analysis-with-python-3f8ccf8a7c32
- Eo-learn https://medium.com/dataseries/satellite-imagery-analysis-with-python-ii-8001e5c41a52
- Satpy
- Use of Landsat and Sentinel datasets
- Pyunicorn
- Keras, tensorflow, pytorch, django, theano, scikit-learn, bokeh, pandas, seaborn, plotly, scrapy
- Python tutorial https://carpentrieslab.github.io/python-aos-lesson/ plotting CMIP data - highlight
- Python for oceanography http://www.soest.hawaii.edu/oceanography/courses/OCN681/python.html
- Python tools for oceanography https://pyoceans.github.io/sea-py/
- Python Land Surface Modelling https://www.geosci-model-dev.net/12/2781/2019/
- Python hydrology tools https://github.com/raoulcollenteur/Python-Hydrology-Tools
- Docker
- Python and GIS https://automating-gis-processes.github.io/CSC18/lessons/L1/overview.html
- https://automating-gis-processes.github.io/2016/
- https://geohackweek.github.io/raster/
- https://github.com/pangeo-data/pangeo
- https://github.com/pangeo-data/awesome-open-climate-science
- https://uwescience.github.io/sat-image-analysis/resources.html
- Radar data analysis https://data.world/datasets/radar https://arm-doe.github.io/pyart/ https://docs.wradlib.org/
- https://www.earthdatascience.org/courses/use-data-open-source-python/multispectral-remote-sensing/landsat-in-Python/
- Deep Learning on Satellite Imagery https://github.com/robmarkcole/satellite-image-deep-learning
- Google Earth Engine https://sites.google.com/view/eeindia-advanced-summit/summit-resources
- https://geohackweek.github.io/GEE-Python-API/
- https://github.com/google/earthengine-api/tree/master/python/examples/ipynb
- http://www.jerico-ri.eu/download/summer%20school%20-%20the%20netherlands/Genna%20Donchyts%20-%20GEE%20Training.pdf
- https://www.earthdatascience.org/tutorials/intro-google-earth-engine-python-api/
- Installing Google Earth Engine and requesting access https://github.com/google/earthengine-api/issues/27
- https://github.com/giswqs/earthengine-py-notebooks
- Google Earth Engine image to numpy https://mygeoblog.com/2019/08/21/google-earth-engine-to-numpy/
- Stippling to show statistical significance bradyrx/esmtools#13
- Resampling from swath to grid https://github.com/TerraFusion/pytaf
- Making a docker container for data science https://towardsdatascience.com/docker-for-data-scientists-5732501f0ba4
- Docker commands:
Run interactively: docker run -it manmeet3591/dl_iitm:latest
Install the necessary libraries
Open a new terminal, run docker images to see the id, and then run the following commands
$ docker tag id_ manmeet3591/dl_iitm:v2
$ docker push manmeet3591/dl_iitm:v2
Projects for the class
https://docs.google.com/spreadsheets/d/1m2ZIJ_To8IbE18Teb70a7BVZg0o29sOM6rlgFkE2b3E/edit#gid=0
https://docs.google.com/document/d/12h9bcIdBPJUFc_fJssJe8hVBzledq2Dtk5-9OpKHbfg/edit
-
Homogeneous regions India shape files: https://github.com/Cassimsannan/Shapefiles
-
Download CMIP6 data: https://github.com/TaufiqHassan/acccmip6
-
Download MSWEP data from Google drive:
Setup rclone: https://www.youtube.com/watch?v=vPs9K_VC-lg
$ rclone sync -v --exclude 3hourly/ --drive-shared-with-me GoogleDrive:/MSWEP_V280 /lus/dal/cccr_rnd/manmeet/AI_IITM/WeatherBench/data/dataserv.ub.tum.de/mswep/.
- Run jupyter notebook from docker container
$ docker run --rm -it --entrypoint bash -p 8891:8891 manmeet3591/tensortrade
Inside the container: jupyter-notebook --ip 0.0.0.0 --port=8891 --no-browser --allow-root &
In the browser: http://localhost:8891/
- Create any number of subplots with matplotlib (here with a cartopy projection)
fig, ax = plt.subplots(ncols=2, nrows=4, figsize=(11.69, 8.27), subplot_kw={'projection': ccrs.PlateCarree()})
-
Google Earth Engine timelapse gif generator: https://9611d0317f71.ngrok.io/voila/render/timelapse.ipynb
-
Handling expver dimension in a netcdf file downloaded as ERA5 data
ds.reduce(np.nansum, 'expver') Solution from Marco Venturini: https://confluence.ecmwf.int/pages/viewpage.action?pageId=173385064
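A minimal NumPy-only sketch of why the nansum reduction merges the two slices. It assumes the usual ERA5 layout in which the expver=1 (final) and expver=5 (preliminary ERA5T) slices are each NaN where the other holds the value; the numbers below are made up for illustration.

```python
import numpy as np

# Hypothetical data: axis 0 is expver (2 slices), axis 1 is time (4 steps).
# Each slice is NaN where the other one holds the valid value.
data = np.array([
    [280.0, 281.0, np.nan, np.nan],   # expver=1 slice
    [np.nan, np.nan, 282.0, 283.0],   # expver=5 slice
])

# Same operation that ds.reduce(np.nansum, 'expver') applies per variable:
# NaNs count as zero, so each time step keeps its single valid value.
merged = np.nansum(data, axis=0)
print(merged)

# Caveat: a time step that is NaN in both slices becomes 0.0, not NaN.
both_missing = np.nansum(np.array([[np.nan], [np.nan]]), axis=0)
print(both_missing)
```

Because of the caveat in the last lines, it may be worth checking for time steps that are missing in both slices before trusting the merged values.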
-
GeoTIFF to netcdf and exporting data from Google Earth Engine https://medium.com/@wenzhao.li1989/nco-translate-geotiff-files-exported-from-gee-to-a-netcdf-file-with-correct-time-dimension-ce97a8f3043f
-
Make pipeline to avoid test data leaking into train https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html
-
Potential evapotranspiration (PET) from netcdf file https://climate-indices.readthedocs.io/en/latest/#
-
t-distributed Stochastic Neighbor Embedding (t-SNE) versus PCA https://stats.stackexchange.com/questions/238538/are-there-cases-where-pca-is-more-suitable-than-t-sne
-
Stationarity of time series: https://towardsdatascience.com/stationarity-in-time-series-analysis-90c94f27322
-
SARIMAX model: https://towardsdatascience.com/end-to-end-time-series-analysis-and-forecasting-a-trio-of-sarimax-lstm-and-prophet-part-1-306367e57db8
-
Prevent Google Colab from disconnecting (the snippet below targets Colab's connect button; paste it into the browser console) https://stackoverflow.com/questions/57113226/how-to-prevent-google-colab-from-disconnecting
function ClickConnect(){ console.log("Working"); document.querySelector("colab-toolbar-button#connect").click() } setInterval(ClickConnect,60000)
-
Solving NVIDIA driver installation issues https://stackoverflow.com/questions/42984743/nvidia-smi-has-failed-because-it-couldnt-communicate-with-the-nvidia-driver/51113428#51113428
-
Installing NVIDIA drivers https://www.itzgeek.com/post/how-to-install-nvidia-drivers-on-ubuntu-20-04-ubuntu-18-04.html
-
Install cuda https://www.tensorflow.org/install/gpu
-
bashrc commands for cuda
export CUDA_HOME=/usr/local/cuda-11.0
export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64:/usr/local/cuda-11.0/lib:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-11.0/bin:$PATH
Run once in a shell (not in bashrc) to create the libcusolver symlink: https://stackoverflow.com/questions/63199164/how-to-install-libcusolver-so-11
sudo ln -s /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.10 /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.11
The same command with sudo reading the password from standard input (-S):
echo | sudo -S ln -s /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.10 /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.11
-
Test time augmentation https://towardsdatascience.com/test-time-augmentation-tta-and-how-to-perform-it-with-keras-4ac19b67fb4d
-
Geometric Deep Learning https://geometricdeeplearning.com/lectures/
-
DeepSphere Spherical convolutions using Graph convolutions https://github.com/deepsphere/deepsphere-pytorch
-
Interactively logging into Pratyush GPU
$ qsub -I -l select=1:ncpus=1:naccelerators=1:accelerator_model="Tesla_P100-PCIE-12GB" -q gpu
$ source activate py36
$ module load cudatoolkit
$ aprun -n 1 jupyter-notebook --no-browser --ip=0.0.0.0 --port=8890 >> NOTEBOOK_LOGFILE1 2>&1
$ tail -f NOTEBOOK_LOGFILE1
Press Ctrl+C to stop following the log
$ ssh -N -f -L localhost:8888:node:8890 cccr_rnd@nid00019 (Here nid should be the one seen in NOTEBOOK_LOGFILE1)
$ source activate py36
$ module load cudatoolkit
$ firefox&
To check if a port is being used
$ netstat -antp | grep -i port_id
The above statement will only work in the interactive login node.
Now start the notebook by noting the link from NOTEBOOK_LOGFILE1
-
Markdown tool https://dillinger.io/
-
LRP for explainable AI (XAI): https://github.com/albermax/innvestigate Application paper: https://arxiv.org/abs/2103.10005
-
Lower tropospheric stability (LTS = θ700hPa − θ1000hPa, in kelvin) is defined as the difference in potential temperature (θ) between the 700-hPa level and the surface
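A small sketch of the LTS calculation, assuming the standard dry-air potential temperature θ = T (p0/p)^κ with p0 = 1000 hPa and κ ≈ 0.286. The function names and the input temperatures are hypothetical, chosen only for illustration.

```python
P0 = 1000.0      # reference pressure in hPa
KAPPA = 0.286    # approximate R_d / c_p for dry air


def potential_temperature(temp_k, pressure_hpa):
    """Potential temperature (K) from temperature (K) and pressure (hPa)."""
    return temp_k * (P0 / pressure_hpa) ** KAPPA


def lower_tropospheric_stability(t700_k, t1000_k):
    """LTS = theta(700 hPa) - theta(1000 hPa), in kelvin."""
    theta_700 = potential_temperature(t700_k, 700.0)
    theta_1000 = potential_temperature(t1000_k, 1000.0)
    return theta_700 - theta_1000


# Hypothetical temperatures at the two pressure levels
lts = lower_tropospheric_stability(t700_k=283.0, t1000_k=298.0)
print(round(lts, 2))
```

The same functions also work element-wise on NumPy arrays or xarray DataArrays, since they only use arithmetic operators.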
-
Shallow copy and deep copy in python https://stackoverflow.com/questions/41125834/trying-to-do-a-shallow-copy-on-list-in-python
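A short illustration of the difference using the standard library copy module; the list contents are arbitrary.

```python
import copy

original = [[1, 2], [3, 4]]

shallow = copy.copy(original)     # new outer list, but the same inner lists
deep = copy.deepcopy(original)    # new outer list and new inner lists

original[0].append(99)            # mutate a nested list in place

print(shallow[0])   # shares the inner list with original, so it sees the change
print(deep[0])      # independent copy, so it does not
```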
-
Climate indices https://climate-indices.readthedocs.io/en/latest/
-
Copying a folder to Box using lftp (reverse mirror uploads the local folder): mirror -R folder
-
Transforming argparse python code to jupyter notebook: https://stackoverflow.com/questions/37534440/passing-command-line-arguments-to-argv-in-jupyter-ipython-notebook
Simply add the following lines:
import sys; sys.argv = ['']
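A minimal sketch of how this plays out with argparse. The --epochs option is a hypothetical example, not taken from any script in this repository.

```python
import argparse
import sys

# In a notebook, sys.argv carries the kernel's own arguments, which argparse
# would reject. Resetting it makes parse_args() see no command-line arguments.
sys.argv = ['']

parser = argparse.ArgumentParser()
parser.add_argument('--epochs', type=int, default=10)   # hypothetical option

args_default = parser.parse_args()                   # falls back to defaults
print(args_default.epochs)

# Alternative: pass the arguments explicitly as a list instead of via sys.argv
args_custom = parser.parse_args(['--epochs', '5'])
print(args_custom.epochs)
```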
- Install pytorch with cuda
- Running docker containers on NVIDIA DGX A100
$ docker pull nvcr.io/nvidia/tensorflow:20.10-tf2-py3
$ docker run --gpus all -it -v /home/cccr_rnd:/apollo nvcr.io/nvidia/tensorflow:20.10-tf2-py3
Troubleshooting
-
Continue in outer loop using multi-loops https://stackoverflow.com/questions/14829640/how-to-continue-in-nested-loops-in-python
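One common way to do this is the for/else construct: the else branch runs only when the inner loop finishes without break, so a break effectively skips to the next outer iteration. A small sketch with made-up data:

```python
rows = [[1, 2, 3], [4, -1, 6], [7, 8, 9]]
kept = []

for row in rows:
    for value in row:
        if value < 0:
            break             # abandon this row and move to the next one
    else:
        kept.append(row)      # reached only if the inner loop did not break

print(kept)
```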
-
Numbering the subplots https://matplotlib.org/3.1.1/gallery/axes_grid1/simple_anchored_artists.html
-
Fortran compilation problems may sometimes be solved by running the command ulimit -s unlimited
-
There are visualization problems in cartopy if the longitude runs from 0 to 360 instead of from -180 to 180
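A possible workaround is to wrap the longitudes into the -180 to 180 range and re-sort the data before plotting. The sketch below uses plain NumPy with made-up values along a single longitude axis.

```python
import numpy as np

lon_0_360 = np.array([0.0, 90.0, 180.0, 270.0, 359.0])
field = np.array([10.0, 20.0, 30.0, 40.0, 50.0])   # hypothetical values along lon

# Wrap longitudes from [0, 360) into [-180, 180)
lon_180 = ((lon_0_360 + 180.0) % 360.0) - 180.0

# Re-sort so that longitude increases monotonically again,
# and reorder the data along the same axis
order = np.argsort(lon_180)
lon_sorted = lon_180[order]
field_sorted = field[order]

print(lon_sorted)
print(field_sorted)
```

With xarray, the analogous step for a dataset ds (a hypothetical name) whose longitude coordinate is called lon would be ds.assign_coords(lon=(((ds.lon + 180) % 360) - 180)).sortby('lon').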
-
Run docker as a non-root user https://docs.docker.com/engine/install/linux-postinstall/
-
When pushing an image for the first time, Docker Hub may sometimes deny the push https://stackoverflow.com/questions/41984399/denied-requested-access-to-the-resource-is-denied-docker
-
Numpy to xarray: foo = xr.DataArray(data, coords=[times, locs], dims=["time", "space"])
data = ds_merra2_jjas.DUSCATAU.sel(time='2002').values[0,:,:]
lats_ = ds_merra2_jjas.DUSCATAU.sel(time='2002').lat.values
lons_ = ds_merra2_jjas.DUSCATAU.sel(time='2002').lon.values
ds_merra2_jjas_new = xr.DataArray(data, coords=[lats_, lons_], dims=["lat", "lon"])
-
Using matplotlib to make map plots
plt.contourf(ds_merra2_jjas.DUSCATAU.sel(time='2002').lon.values,
ds_merra2_jjas.DUSCATAU.sel(time='2002').lat.values,
ds_merra2_jjas.DUSCATAU.sel(time='2002').values[0,:,:],
cmap='bwr')
plt.colorbar()
-
Sometimes an xarray plot may show up blank; selecting the area of interest before plotting usually resolves this.
-
Pattern correlation formula: https://www.mdpi.com/2073-4441/10/1/28 (weights may also be used in the pattern correlation)
For the weights, the following can be followed: https://stackoverflow.com/questions/58881607/calculating-the-cosine-of-latitude-as-weights-for-gridded-data
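A sketch of one common form, the centered pattern correlation with cos(latitude) area weights. This is an assumption about the intended formula, and the one in the linked paper may differ in detail; the function name and test fields are hypothetical.

```python
import numpy as np


def pattern_correlation(a, b, lat):
    """Centered, cos(lat)-weighted correlation of two (lat, lon) fields."""
    # Broadcast cos(lat) weights across longitude and normalise to sum to 1
    w = np.cos(np.deg2rad(lat))[:, np.newaxis] * np.ones_like(a)
    w = w / w.sum()
    # Remove the weighted area mean from each field
    a_anom = a - (w * a).sum()
    b_anom = b - (w * b).sum()
    # Weighted covariance divided by the weighted standard deviations
    cov = (w * a_anom * b_anom).sum()
    return cov / np.sqrt((w * a_anom**2).sum() * (w * b_anom**2).sum())


lat = np.array([-60.0, 0.0, 60.0])
rng = np.random.default_rng(0)
field = rng.normal(size=(3, 4))          # made-up (lat, lon) field

print(pattern_correlation(field, field, lat))     # identical patterns
print(pattern_correlation(field, -field, lat))    # opposite patterns
```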
-
When installing packages otherwise difficult to install like ESMF we can set the compiler environment variables such as CC and FC to force conda to install using that particular compiler. This saves a lot of time and effort. https://stackoverflow.com/questions/59284298/conda-install-c-anaconda-gcc-linux-64-not-being-used Many build tools such as make and CMake search by default for a compiler named simply gcc, so we set environment variables to point these tools to the correct compiler.
-
When using the isin function with sel, it can at present be used only once per call. To apply it twice, assign the intermediate result to a new variable first.
-
Installing PyRQA (Runs only with python 2.7) https://github.com/szhan/pyrqa
conda install https://anaconda.org/conda-forge/pytools/2017.2/download/linux-64/pytools-2017.2-py27_0.tar.bz2
conda install https://anaconda.org/conda-forge/pyopencl/2018.1.1/download/linux-64/pyopencl-2018.1.1-py27_1.tar.bz2
conda install -c conda-forge pocl
pip install Mako
pip install PyRQA
Even after all this, pyrqa did not run smoothly. However, these steps ensured that the environment needed to run pyrqa was set up correctly. Next, clone the GitHub repository; inside the top-level pyrqa directory there is a folder also named pyrqa. Copy that folder to your desired location, rename it (say, PYRQA), and use the library as PYRQA.
-
Logging in to a remote server without a password https://www.thegeekstuff.com/2008/11/3-steps-to-perform-ssh-login-without-password-using-ssh-keygen-ssh-copy-id/
-
Create xarray dataset from dataarrays
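A minimal sketch, assuming the DataArrays share the same coordinates: pass them to xr.Dataset as a dictionary keyed by variable name. The variable names and values here are hypothetical.

```python
import numpy as np
import xarray as xr

lat = np.array([10.0, 20.0])
lon = np.array([70.0, 80.0, 90.0])

# Two made-up DataArrays on the same (lat, lon) grid
temp = xr.DataArray(np.zeros((2, 3)), coords=[lat, lon], dims=["lat", "lon"])
rain = xr.DataArray(np.ones((2, 3)), coords=[lat, lon], dims=["lat", "lon"])

# The dictionary keys become the variable names in the Dataset
ds = xr.Dataset({"temp": temp, "rain": rain})
print(ds)
```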
-
Use tmux to run a process in the background: https://medium.com/@praveendhawan/tmux-run-commands-in-the-background-bad007810318
-
Indentation error when transferring from jupyter notebook to python script https://stackoverflow.com/questions/1024435/how-to-fix-python-indentation
-
Correcting aspect ratio in cartopy: ax[i,j].set_aspect('auto')