Datasets

For this project, there were 5 datasets that were used to train the machine learning models: Land Surface Temperature (LST), Normalized Difference Vegetation Index (NDVI), Burned Area (BA) and Thermal Anomalies (TA), California Fire Incident Report and the California City and County Boundaries datasets.

dataset

The LST, NDVI, BA, and TA are datasets that are created using Moderate Resolution Imaging Spectroradiometer (MODIS), or satellite, data. After preprocessing the four MODIS datasets, the BA was used as a mask to target fire and no_fire regions, to extract the LST, NDVI and TA values from those regions. Then the fire and no_fire classifications were extracted from the California Fire Incident Report, combined with the LST and NDVI datasets and finally the combined dataset was utilized to train the models to predict fire or no_fire.

snapshot of dataset and class distribution

When the datasets were structured based on the California county data, the BA mask was not utilized. Instead the counties acted in a similar fashion as the BA dataset, and were used to extract the LST, NDVI and TA values based on the county regions. Then the combined datasets were used to train new models based on the county dataset.

Land Surface Temperature

LST Documentation graph

Normalized Difference Vegetation Index

NDVI Documentation graph

Thermal Anomalies

BA/TA Documentation graph

California Fire Incident Report

Fire Incident Report Documentation california fire incident report

California City and County Boundaries

California City and Boundaries california counties