The fire started on January 7th and burned for 24 days before it was contained, destroying large portions of Pacific Palisades, Topanga, and Malibu. It burned 23,707 acres, killed 12 people (most of whom were first responders), and destroyed 6,837 structures.
According to damage assessment reports in the CAL FIRE DINS database, those 6,837 structures were mostly single-family homes, along with multi-family units, apartment buildings, mobile homes, businesses, and public infrastructure. Entire neighborhoods and their infrastructure were gone in less than a month.
Damage assessment follows standardized criteria set by FEMA, FIRESCOPE, and DINS, meaning that personnel must visit each property and manually collect data on each structure's condition.
Terrain, vegetation coverage, and the scope of the fires made locating damaged or destroyed structures very difficult, while identifying structure types became nearly impossible.
You can listen to damage inspectors from LA County Fire and CAL FIRE talk about the process below.
For emergency responders on the ground, having access to as much information as possible about a post-disaster setting could be the difference between life and death. For families, it could be the difference between knowing whether the area around their home is safe, or whether they even have a home to return to. Yet the damage assessment process still relies heavily on trained ground crews conducting visual inspections, a process that can be inefficient, dangerous, and labor intensive in areas with severe destruction, hazardous conditions, or debris.
Convolutional Neural Networks (CNNs) are designed to process data in the form of multiple arrays, exploiting the natural tendency of signals to arrange themselves into compositional hierarchies, where higher-level, more complex features are built out of simpler, lower-level ones.
First, a feature extractor identifies the lower-level elements of an image; then a multilayer perceptron computes the probability that a given class is present, effectively "perceiving" the image from the bottom up.
Think about what makes a house a house. Houses have roofs, they have windows, they have a driveway. The more of those lower-level features an input image contains, the higher the probability that it's a house. This makes CNNs a natural choice for feature identification and classification: they are built to detect the kinds of patterns that occur naturally in remotely sensed imagery.
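To make that idea concrete, here is a minimal sketch in PyTorch (the framework underlying ESRI's deep learning tools) of a feature classifier: a pretrained ResNet-34 backbone acts as the feature extractor, and a small fully connected head turns those features into class probabilities for two hypothetical classes, damaged and undamaged. This is illustrative only, not the model trained in this project.

```python
# Minimal sketch of a CNN feature classifier: a pretrained ResNet-34
# backbone extracts hierarchical features, and a new two-class head
# ("damaged" vs. "undamaged") produces the final probabilities.
# Illustrative only -- not the actual Palisades model.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 2  # damaged, undamaged

# Load a ResNet-34 pretrained on ImageNet; its convolutional layers
# already encode low-level edges and textures up to higher-level shapes.
backbone = models.resnet34(weights=models.ResNet34_Weights.DEFAULT)

# Replace the final fully connected layer (the "perceptron" on top)
# with a head sized for our two classes.
backbone.fc = nn.Linear(backbone.fc.in_features, NUM_CLASSES)

# A single 3-band image chip (batch of 1, 3 channels, 224x224 pixels).
chip = torch.rand(1, 3, 224, 224)

with torch.no_grad():
    logits = backbone(chip)
    probs = torch.softmax(logits, dim=1)  # probability per class

print(probs)  # e.g., tensor([[0.48, 0.52]]) before any fine-tuning
```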
With the scale and severity of wildfires increasing each year, there's a pressing need for improved tools and methodologies for damage assessment.
Advances in remote sensing platforms have made it possible to apply computer vision and deep learning techniques to this problem with high levels of accuracy.
As part of my Master's project, I used high-resolution satellite imagery of the 2025 Palisades Fire's post-disaster zone to train a deep learning model (DLM) in ArcGIS Pro 3.4.2 that classifies building footprints as damaged or undamaged. The goal was to showcase the capabilities and limitations of ESRI's Deep Learning Tools as part of a broader discussion about applying GeoAI to the damage assessment process.
Pacific Palisades & Sunset Mesa
Two neighborhoods in Los Angeles, California that were severely impacted by the wildfires. The study area was limited to urban areas within the recorded Palisades Fire perimeter.
The feature classifier trained in this project uses a CNN architecture called ResNet34, which tackles image classification using residual learning. It requires two inputs: a high-resolution raster and a feature class layer that defines the boundaries of each feature.
Using these inputs, I based my process for training and deploying the model on ESRI's deep learning workflow for object classification: I prepared my imagery, created a training dataset, trained the model, ran the model, and prepared its results for accuracy assessment.
However, ESRI's deep learning tools also require an NVIDIA GPU with CUDA compute capability of 5.0 or higher and a minimum of 6GB of dedicated graphics memory, though they recommend 16GB or more.
My computer doesn't meet these minimum requirements, so I had to get creative. That's where I came up with what I like to call the "Cookie Cutter Method."
The computer used for the analysis is loaded with an NVIDIA GeForce RTX 3060, which is CUDA enabled but only has 4GB of dedicated graphics memory.
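Before resorting to a workaround, it's worth checking what a given machine can actually do. A quick check with PyTorch (which ships with ESRI's Deep Learning Libraries) might look like this sketch:

```python
# Quick check of CUDA availability, compute capability, and dedicated
# GPU memory before attempting ESRI's deep learning tools.
import torch

if not torch.cuda.is_available():
    print("No CUDA-capable GPU detected.")
else:
    props = torch.cuda.get_device_properties(0)
    major, minor = torch.cuda.get_device_capability(0)
    vram_gb = props.total_memory / 1024**3

    print(f"GPU: {props.name}")
    print(f"Compute capability: {major}.{minor} (need >= 5.0)")
    print(f"Dedicated memory: {vram_gb:.1f} GB (ESRI minimum: 6 GB)")

    if (major, minor) >= (5, 0) and vram_gb >= 6:
        print("Meets ESRI's minimum requirements.")
    else:
        print("Below minimum -- consider something like the Cookie Cutter Method.")
```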
Instead of analyzing a large extent all at once, the CCM subdivides the study area into digestible "cookies" the computer can handle, reducing analysis runtime and avoiding tool errors.
Subdivisions can then be merged into a single layer once DLM inferencing is complete.
To get started, I acquired a high-resolution image. The imagery was made available by Yer Çizenler, a non-governmental organization that supports the use of free and open geospatial data in humanitarian activities, and can be accessed via the OpenAerialMap repository. The raster is a 31cm resolution Maxar GE01 mosaic of several images covering the northernmost portion of LA, stretching from Santa Monica to the San Bernardino Forest. The imagery was trimmed down to the boundaries of the study area, as shown below.
Building footprints for Southern California were made available by OpenStreetMap and acquired from Geofabrik's free download server. Since the most recent OSM data files have already had buildings destroyed in the fires removed, an older file from December 2024 was used.
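As a rough illustration of that step outside of ArcGIS, the December 2024 extract could be clipped to the study area with GeoPandas. The file names below follow Geofabrik's typical shapefile naming and are assumptions, not the exact files used here.

```python
# Sketch: clip Geofabrik/OSM building footprints to the study area.
# File names follow Geofabrik's usual shapefile naming and the study
# area boundary is a placeholder -- adjust to your own data.
import geopandas as gpd

buildings = gpd.read_file("gis_osm_buildings_a_free_1.shp")   # Geofabrik buildings layer
study_area = gpd.read_file("palisades_study_area.shp")        # hypothetical boundary

# Match coordinate systems, then clip footprints to the study area boundary.
buildings = buildings.to_crs(study_area.crs)
clipped = gpd.clip(buildings, study_area)

clipped.to_file("buildings_study_area.shp")
print(f"{len(clipped)} building footprints retained")
```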
A polygon covering the study area was subdivided into 20 parts; the processing computer was found to handle an area of about 232 acres at a time while still producing results relatively quickly. This subdivision layer was then used as the basis to cut the other layers.
The imagery was cut with the Split Raster tool. The nearest neighbor resampling method was used because it leaves pixel values untouched, avoiding changes that might affect the land cover values the DLM sees.
The building footprints were cut using the Pairwise Clip tool, a process I tried to automate with Python but couldn't fully, due to the limitations of ArcPy. A time-consuming but necessary process either way.
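For illustration, here is a hypothetical sketch of how the two cutting steps could be scripted in principle. The tool names are real geoprocessing tools, but the paths, layer names, and field names are placeholders; this is not the exact process used in the project.

```python
# Hypothetical sketch of the "Cookie Cutter Method" cutting steps:
# split the imagery along the 20 subdivision polygons, then clip the
# building footprints to each subdivision. Paths and names are
# placeholders, not the project's actual data.
import arcpy

arcpy.env.workspace = r"C:\Palisades\ccm.gdb"
subdivisions = "study_area_subdivisions"      # the 20 "cookie" polygons
imagery = r"C:\Palisades\maxar_mosaic.tif"
footprints = "osm_buildings_dec2024"

# Split the raster along the subdivision polygons, using nearest
# neighbor resampling so pixel values are left untouched.
arcpy.management.SplitRaster(
    in_raster=imagery,
    out_folder=r"C:\Palisades\tiles",
    out_base_name="cookie_",
    split_method="POLYGON_FEATURES",
    format="TIFF",
    resampling_type="NEAREST",
    split_polygon_feature_class=subdivisions,
)

# Clip the building footprints with each subdivision polygon.
with arcpy.da.SearchCursor(subdivisions, ["OID@"]) as cursor:
    for (oid,) in cursor:
        cookie = arcpy.management.MakeFeatureLayer(
            subdivisions, f"cookie_{oid}", f"OBJECTID = {oid}")
        arcpy.analysis.PairwiseClip(
            footprints, cookie, f"buildings_cookie_{oid}")
```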
This is what all of those layers look like once you've cut them with the CCM. I ended up with a total of 40 layers (20 imagery tiles and 20 footprint tiles) that I could feed into my model once it was trained.
Training samples were generated using the Create and Export Training Samples section of the Classify Objects Deep Learning workflow. A training schema was created with two classes, Damaged and Undamaged, and a total of 2,000 corresponding buildings were added to each class. Buildings of various shapes and sizes were selected, since the final model would consider these factors when generating a classification. Houses all tend to look pretty different, so I wanted a diverse dataset.
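The export step of that workflow corresponds to the Export Training Data For Deep Learning geoprocessing tool. A hedged sketch of that call is shown below; the paths, chip size, and field name are placeholders, not the exact settings used here.

```python
# Hedged sketch: export labeled image chips for the two-class schema
# (Damaged / Undamaged) using Export Training Data For Deep Learning.
# Paths, chip size, and field names are placeholders.
import arcpy

arcpy.ia.ExportTrainingDataForDeepLearning(
    in_raster=r"C:\Palisades\maxar_mosaic.tif",
    out_folder=r"C:\Palisades\chips",
    in_class_data="training_buildings",   # polygons labeled Damaged / Undamaged
    image_chip_format="TIFF",
    tile_size_x=256,
    tile_size_y=256,
    metadata_format="Labeled_Tiles",      # metadata format used for feature classifiers
    class_value_field="ClassValue",
)
```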
The Palisades model was trained using the extracted image chips. The model type specified was Feature Classifier, and a total of 50 epochs were used for the model's preliminary training. The validation percentage was set to 8%, since the data consisted of hundreds of image chips. All other parameters were left at their defaults since, according to ESRI, these produce good results.
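For reference, the equivalent geoprocessing call might look something like the sketch below. Only the parameters mentioned above are set, everything else stays at its default, and the paths are placeholders.

```python
# Hedged sketch of the training step: Feature Classifier model type,
# 50 epochs, ResNet-34 backbone, 8% validation split, other parameters
# left at their defaults. Paths are placeholders.
import arcpy

arcpy.ia.TrainDeepLearningModel(
    r"C:\Palisades\chips",                        # exported image chips
    r"C:\Palisades\models\palisades_classifier",  # output model folder
    max_epochs=50,
    model_type="FEATURE_CLASSIFIER",
    backbone_model="RESNET34",
    validation_percentage=8,
)
```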
Once the model is trained, the Train Deep Learning Model tool reports its learning rate, training and validation loss, and average precision score. For the purposes of assessing the training's success, only the training and validation loss and the average precision score were considered.
The validation set approach randomly divides the data into two sets, training and validation. First, the model is fit on the training set, and the held-out validation data is then used to test the fitted model. The resulting training and validation loss rates provide an unbiased sense of model effectiveness.
In the case of the Palisades model, the graph displays a smooth learning curve, indicating that the model’s performance improved steadily and consistently over each epoch. The Palisades model stopped improving at 26 epochs, as shown by the convergence between the training and validation loss curves.
Additionally, it performed with a 99% accuracy on its last epoch.
A confusion matrix graph was also generated by the tool, indicating the model's overall accuracy metrics. Though the rate was small, the model had a 1% chance of misclassifying a building as damaged when it was actually undamaged. This is something I'd see pop up again once I ran inference on the data I'd prepared earlier.
The corresponding input raster and building footprint layers were run through the Palisades model for each of the CCM subdivisions.
The Classify Objects Using Deep Learning tool yielded 20 layers of classified building footprints using the CCM. The layers were all combined into a single layer using the Merge tool.
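Scripted, that inference-and-merge step might look roughly like the sketch below. The tile and layer names follow the hypothetical naming from the earlier sketches, not the project's actual files.

```python
# Hedged sketch: run the trained model on each CCM subdivision, then
# merge the 20 classified outputs into a single layer. Paths and
# naming are placeholders.
import arcpy

model = r"C:\Palisades\models\palisades_classifier\palisades_classifier.dlpk"
outputs = []

for i in range(1, 21):  # the 20 "cookies"
    out_fc = f"classified_buildings_{i}"
    arcpy.ia.ClassifyObjectsUsingDeepLearning(
        in_raster=rf"C:\Palisades\tiles\cookie_{i}.tif",
        out_feature_class=out_fc,
        in_model_definition=model,
        in_features=f"buildings_cookie_{i}",
        class_label_field="ClassLabel",
    )
    outputs.append(out_fc)

# Combine the per-cookie results into a single damage layer.
arcpy.management.Merge(outputs, "classified_buildings_merged")
```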
Buildings that were cut through by the CCM boundaries were reviewed and merged manually into single features.
Once I ran all 40 of my layers through the Palisades model and merged the 20 output layers, the classified buildings could be placed on a map just like this one. As you can see below, each building is labeled as damaged or undamaged.
The Create Accuracy Assessment Points tool was used on the classified data, generating 1,000 points which were ground-truthed manually. Afterwards, the layer containing the classified and ground-truthed accuracy assessment points was run through the Compute Confusion Matrix tool.
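A hedged sketch of those two steps as geoprocessing calls is shown below; the layer names are placeholders, and the manual ground-truthing happens between the two calls.

```python
# Hedged sketch of the accuracy assessment steps: generate 1,000
# equalized stratified random points from the classified buildings,
# ground-truth them manually, then compute the confusion matrix.
# Layer names are placeholders.
import arcpy

arcpy.ia.CreateAccuracyAssessmentPoints(
    "classified_buildings_merged",   # classified building footprints
    "assessment_points",             # output points
    "CLASSIFIED",                    # the input represents the classified map
    1000,                            # number of random points
    "EQUALIZED_STRATIFIED_RANDOM",   # equal points per class
)

# ...manually fill in the ground truth value for each point, then:

arcpy.ia.ComputeConfusionMatrix(
    "assessment_points",
    "confusion_matrix_table",
)
```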
The most common way of assessing a model's accuracy is through the confusion matrix. I used an equalized stratified sampling method with 500 samples in each class, which I ground-truthed manually. The confusion matrix generated by the tool showed that the model tends to over-predict the damaged class, flagging some undamaged buildings as damaged while reporting fewer buildings as undamaged than there really are. We saw hints of this during training, so I wasn't too surprised.
The main metrics shown in a confusion matrix are User’s Accuracy, Producer’s Accuracy, Overall Accuracy and the Kappa coefficient.
User's Accuracy (U_Accuracy on the table) tells us how reliable the map itself is: when the model assigns a building to a class, how likely is that label to be correct on the ground? Its complement captures errors of commission, where the model included buildings in a class they don't belong to.
For the damaged class, only 90% of the buildings the model included were actually damaged; the remaining 10% were undamaged buildings wrongly flagged as damaged. This is the over-representation of the damaged class I mentioned previously.
The undamaged class was more reliable: 98% of the buildings the model labeled undamaged really were undamaged, with the other 2% being damaged buildings it missed.
Producer's Accuracy (P_Accuracy), on the other hand, shows our errors of omission. It represents the probability that a feature on the ground is correctly captured on the map.
For the damaged class, the model caught 98% of the buildings that were actually damaged, incorrectly marking the other 2% as undamaged.
Conversely, it correctly labeled only 90% of the buildings that were actually undamaged, flagging the remaining 10% as damaged.
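To make these metrics concrete, here is a small worked example that computes them from a hypothetical 2x2 confusion matrix whose counts were chosen to be consistent with the percentages above (500 sampled points per map class); the actual matrix is in the full report.

```python
# Worked example: user's accuracy, producer's accuracy, overall
# accuracy, and the kappa coefficient from a hypothetical 2x2
# confusion matrix consistent with the percentages reported above.
import numpy as np

classes = ["Damaged", "Undamaged"]
# Rows = classified (map), columns = ground truth.
cm = np.array([
    [450,  50],   # classified Damaged:   450 truly damaged, 50 truly undamaged
    [ 10, 490],   # classified Undamaged: 10 truly damaged, 490 truly undamaged
])

n = cm.sum()
users_acc = np.diag(cm) / cm.sum(axis=1)      # per map class (commission errors)
producers_acc = np.diag(cm) / cm.sum(axis=0)  # per ground-truth class (omission errors)
overall_acc = np.diag(cm).sum() / n

# Kappa: agreement beyond what random chance would produce.
expected = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / n**2
kappa = (overall_acc - expected) / (1 - expected)

for i, c in enumerate(classes):
    print(f"{c}: U_Accuracy={users_acc[i]:.2f}, P_Accuracy={producers_acc[i]:.2f}")
print(f"Overall accuracy: {overall_acc:.2f}, Kappa: {kappa:.2f}")
# Damaged: U_Accuracy=0.90, P_Accuracy=0.98
# Undamaged: U_Accuracy=0.98, P_Accuracy=0.91
# Overall accuracy: 0.94, Kappa: 0.88
```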
The Palisades model’s performance improved steadily and consistently. The model was able to classify buildings as damaged or undamaged correctly 94% of the time.
It’s possible to integrate DLM technology into the damage assessment process with a limited budget and limited processing power.
The output map generated by the Palisades model would be an excellent starting point for damage assessment crews, especially if this type of analysis is added to standard protocols. Damage assessment data can be combined with other metrics to help paint a broader picture of a disaster zone.
If it works so well, why aren’t agencies all over the US jumping to incorporate this technology into the damage assessment process?
Learned features are not always suited to the characteristics of a target dataset, meaning that emergency management agencies would have to fine-tune pre-trained DLMs, which in turn requires building large and diverse training datasets.
Training, fine-tuning, or deploying a DLM requires a high-powered computer. CUDA-enabled datacenter products cost hundreds of thousands of dollars, and even the more accessible GPUs run into the thousands.
Regulation is built on experience. FEMA is already integrating DLMs into the damage assessment process in an effort to get a clear picture of a disaster zone within 72 hours.
DLMs are malleable. They could be tweaked to identify damaged buildings after earthquakes, hurricanes, wildfires and even after damage resulting from conflicts or in active warzones.
GIS has always needed a hands-on, collaborative approach to new technologies. Applying pre-trained DLMs to new and diverse datasets could identify gaps in performance that could be fine-tuned to create higher accuracy results.
You can read my entire Master's Project and its abstract below if you'd like a deeper dive into my process, results, and thoughts. This webpage was adapted from my oral presentation, given as part of my graduation requirements. A YouTube video of that presentation is coming soon!
Providing preliminary damage reports is essential to residents of post-disaster zones, who need this information while planning their return to their property. As fire size, severity, and frequency increase, it may become harder for local authorities to assess the damage caused by these fires in a timely manner. High-resolution satellite imagery of the 2025 Palisades Fire's post-disaster zone was used to train a deep learning model in ArcGIS Pro that classifies building footprints as damaged or undamaged. The model performed with high scores on several accuracy metrics, showing that off-the-shelf deep learning models can be applied to new data and trained to near-perfect agreement, even on less powerful computers. With deep learning tools becoming more accessible, it may be wise to incorporate them into post-disaster measures to keep the public informed with real-time and accurate information. However, while these tools can be used alongside other demographic data to form relevant and informative damage reports, they suffer from accessibility issues, such as high imagery prices, high computing requirements, and expensive licensing, that could make it difficult to apply this emerging technology in a broad range of scenarios.