Awesome-forests is a curated list of ground-truth forest datasets for the machine learning and forestry community. The list targets data-based biodiversity, carbon, wildfire, ecosystem service, and general ecosystem analysis.
Getting started with data science in forests is HARD, and the lack of organized datasets is one reason why. This list is intended to get you started with analysing your forests.
This is a wide open and inclusive community; please feel free to add your favorite datasets via a pull request.
Photo of a dog in a forest, by Jamie Street on Unsplash
- Tree species classification
- Tree detection
- Biodiversity
- Tree crown segmentation
- Carbon quantification
- Forest type classification
- Change detection
- Wildfire
- Raw geospatial imagery
- Awesome-awesome
- IDtrees NIST NEON (Weecology, University of Florida, NEON, 2020)
  A tree species classification dataset from ≈3 National Forest sites, USA, with ≈400 labeled trees of ≈20 species and airborne RGB, hyperspectral, and lidar imagery.
- Kaggle Forest Cover Type (USFS, 2013?)
  A tree species classification dataset from Roosevelt National Forest, USA, with ≈15k labeled and ≈565k unlabeled trees and cartographic variables.
- Pasadena Urban Trees (Caltech, 2016)
  A tree species classification dataset from urban Pasadena, USA, with ≈80k labeled trees of 18 species and airborne and ground RGB imagery.
- Open AI Challenge: Aerial Imagery of South Pacific Islands (WeRobotics, World Bank, 2018)
  A tree species classification dataset from the Kingdom of Tonga with 50 km² of data on 4 species and airborne RGB imagery.
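As a starting point for the tabular Forest Cover Type entry above, here is a minimal sketch of loading the data and checking its class balance. It assumes scikit-learn is installed and uses its `fetch_covtype` loader, which serves the UCI Covertype table that the Kaggle competition appears to be drawn from (the Kaggle files themselves are not used here). The helper names `load_covertype` and `class_counts` are hypothetical and exist only for this illustration.

```python
from collections import Counter


def load_covertype():
    """Fetch the UCI Covertype table via scikit-learn (downloads on first call)."""
    # Imported lazily so class_counts() below works without scikit-learn installed.
    from sklearn.datasets import fetch_covtype

    # X holds the cartographic variables per sample; y holds integer cover type labels.
    X, y = fetch_covtype(return_X_y=True)
    return X, y


def class_counts(labels):
    """Count samples per cover type label and return a dict sorted by label."""
    return dict(sorted(Counter(int(label) for label in labels).items()))
```

Calling `X, y = load_covertype()` followed by `class_counts(y)` should give a quick view of how unevenly the cover types are represented, which is worth knowing before training a classifier.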
- Raw urban street tree inventory data (USFS, 2006-2013)
  A raw dataset from 49 cities in California, USA, with ≈930k trees and forest structure variables (e.g., tree species, height, DBH, crown).
- New York City Street Tree Map (NYC Parks, ?-2021)
  A raw dataset from urban New York City, USA, with >680k trees of >230 species.
- Raw data for urban trees in California communities (USFS, 2007-2012)
  A raw dataset from urban California, USA, with ≈4k trees and forest structure variables (e.g., tree species, height, DBH, crown).
- NEON Woody Plant Vegetation Structure (NEON)
  A raw dataset from 49 US national forests with forest structure variables (e.g., tree species, height, DBH, low-res. GPS).
- DeepForest WeEcology NEON (Weecology, NEON, University of Florida, 2018)
  A tree detection dataset from ≈22 National Forest sites, USA, with >15k labeled and >400k unlabeled trees and airborne RGB, hyperspectral, and lidar imagery.
- Kaggle Aerial Cactus Identification (CONACYT)
  A cactus detection dataset from Mexico with 17k cacti and airborne RGB imagery.
- Swedish National Forest Data Lab: Forest Damages – Larch Casebearer 1.0 (Swedish Forest Agency, 2021)
  A tree detection and classification dataset from 10 sites with RGB drone imagery. In total, ~102k bounding boxes are annotated as "Lark" or "other", of which ~44.5k also carry a tree damage label in one of four categories.
- see Tree species
- Kaggle iNaturalist (iNaturalist, FGVC8, 2021)
  A flora and fauna species classification dataset from global sites with 2.7M labeled images of 10k species from smartphone imagery.
- Kaggle GeoLifeCLEF 2021 (ImageCLEF, 2021)
  A location-based flora and fauna species recommendation dataset from France with 1.9M labeled images of 31k species, satellite imagery, and cartographic variables.
- see Tree species for now.
- todo: add allometric equations and above- and belowground carbon inventories
- todo
- An Unexpectedly Large Count of Trees in the West African Sahara and Sahel (Brandt et al., 2020)
  A raw dataset of the West African Sahara and Sahel with ≈3k tree crown segmentations.
- BigEarthNet: large-scale Sentinel-2 benchmark (TU Berlin, 2019)
  A multi-label land cover classification dataset from 10 European countries with ≈600k images labeled with CORINE land cover classes, from Sentinel-2 L2A (10m res.) satellite imagery.
- Chesapeake land cover (Chesapeake Conservancy, Microsoft, NAIP, USGS, 2013-2017)
  A land cover classification dataset from the Chesapeake Bay, USA, of a 6x7km² area with high- and low-resolution (NLCD) land cover labels and high- (NAIP, RGB-NIR) and low-resolution (Landsat 8, 13-band) satellite imagery.
- Kaggle Planet: Understanding the Amazon from Space (SCCON, Planet, 2017)
  A land cover classification dataset from the Amazon with deforestation, mining, and cloud labels and RGB-NIR (5m res.) satellite imagery.
- WiDS Datathon 2019: detection of oil palm plantations (Global WiDS Team & West Big Data Innovation Hub, 2019)
  A binary oil palm plantation classification dataset with 20k images from Planet RGB (3m res.) satellite imagery.
- UC Merced land use dataset (UC Merced, 2010)
  A small land cover classification dataset with 2100 images in 21 balanced classes from airborne (0.3m res.) imagery.
- Awesome satellite imagery datasets
  A list with more satellite imagery datasets.
- Dynamic EarthNet challenge (Planet, DLR, TUM, 2021)
  A time-series prediction and multi-class change detection dataset of Europe over 2 years with 75 image time series, 7 land cover labels, and weekly Planet RGB (3m res.) imagery.
- Semantic change detection dataset (SECOND) (Yang et al., 2020)
  A land cover change detection dataset from cities and suburbs in China with ≈5k image pairs, 6 land cover classes, and airborne imagery.
- ForestNet deforestation driver (Jeremy Irvin, Hao Sheng et al., 2020)
  A dataset of 2,756 Landsat 8 satellite images of forest loss events with deforestation driver annotations. The driver annotations are grouped into Plantation, Smallholder Agriculture, Grassland/shrubland, and Other.
- Awesome remote sensing change detection
  A list with more change detection datasets.
- todo: add datasets for fire detection, fuel moisture quantification, wildfire spread prediction, etc.
- Global ecosystem dynamics investigation (GEDI) (NASA, University of Maryland, 2021)
  A satellite lidar dataset of the globe with topography and lidar point clouds (100m res.).
- Norway's international climate and forests initiative imagery program (NICFI) (NICFI, KSAT, Airbus, Planet, 2020)
  A satellite imagery dataset of tropical rainforests with monthly mosaics of RGB (5m res.) satellite imagery.
- National agriculture imagery program (NAIP) (FSA USDA, 2003-2021)
  An airborne imagery dataset of CONUS with RGB-NIR (0.5m res.) imagery.
- see awesome-gis
- Awesome satellite imagery datasets
  A list of more satellite imagery datasets with annotations for deep learning and computer vision.
- Awesome GIS
  A list of GIS resources.
- Awesome-forests contains individual entries from Awesome satellite imagery datasets and Awesome remote sensing change detection