
Project methods overview

Data Sources

Data Processing Steps

Constructing a mill database

We mapped mills by first collecting three sets of candidate mills from OpenStreetMap, the Colorado Forest Products Database, and the Wyoming Wood Products Facilities database. For every candidate mill, we used Google Maps to search for the address, then visually interpreted imagery from the Google Satellite basemap layer to detect evidence of milling. Our criteria were 1) visible stacks of logs and/or boards, or 2) visible sawdust piles. These criteria exclude small operations and those that are entirely indoors; we accepted this limitation because our eventual goal was to assess the use of remote sensing data for monitoring mill activity. Each mill was categorized as having visible logs/boards or a sawdust pile, and this information was entered into a shared spreadsheet. We also geocoded the addresses to obtain spatial coordinates for the mill locations.

Data Analysis

segment_geospatial

Using the samgeo package in Python (https://samgeo.gishub.org/), we tested three methods for detecting log piles. We envision this as a quick way to measure the area of log piles at a sawmill, potentially serving as a metric of production. We tested these methods on one site.

We first tested text prompting, which ended up being our best-performing method. Using the code provided on the package website (https://samgeo.gishub.org/examples/text_prompts/), we entered the phrase "find stacks of lumber or stacks of logs". The center of the sawmill (coordinates from OSM) was used as the center of the plot, and the area of interest (AOI) was bounded by the min/max longitude and latitude of the sawmill area. The code exports a tif, which can then be viewed in QGIS or other GIS software. Areas identified as log piles are white pixels, while the background is black. To find or track the area of log piles, multiply the area of one pixel by the number of pixels coded as white.
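The pixel-counting step can be sketched in a few lines of Python. This is a minimal illustration, not the code we ran: the function name `mask_area_m2` and the example mask are hypothetical, and the mask is passed as a plain nested list rather than read from the exported tif. The optional latitude correction assumes the raster is in Web Mercator, where map area overstates ground area by a factor of 1/cos²(latitude); check the projection of your own output before using it.

```python
import math


def mask_area_m2(mask, pixel_width_m, pixel_height_m, latitude_deg=None):
    """Estimate the ground area covered by white (non-zero) pixels.

    mask: 2D sequence of pixel values (0 = background, non-zero = log pile).
    pixel_width_m, pixel_height_m: pixel size in the raster's map units (meters).
    latitude_deg: if given, the raster is ASSUMED to be in Web Mercator and the
        area is scaled by cos(latitude)^2 to undo the projection stretch.
    """
    white_pixels = sum(1 for row in mask for value in row if value != 0)
    area = white_pixels * pixel_width_m * pixel_height_m
    if latitude_deg is not None:
        area *= math.cos(math.radians(latitude_deg)) ** 2
    return area


# Hypothetical 4 x 5 mask containing 6 white pixels.
example_mask = [
    [0, 0, 255, 255, 0],
    [0, 0, 255, 255, 0],
    [0, 0, 0, 0, 0],
    [255, 255, 0, 0, 0],
]

# 6 pixels at 0.5 m x 0.5 m each.
area = mask_area_m2(example_mask, 0.5, 0.5)
```

Repeating this calculation on imagery from different dates would give a simple time series of log pile area for a site.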

We then tested input prompting, which entailed dropping a point on log piles and dropping points on the background to differentiate the two. We followed the code provided on the package website (https://samgeo.gishub.org/examples/input_prompts/). This method highlighted the majority of the sawmill, so it would not be useful for identifying log piles.

Finally, we tested the automatic mask generator (https://samgeo.gishub.org/examples/automatic_mask_generator/), which distinguished all the unique objects in the AOI. This performed fairly well but did not pick up as many log piles as text prompting.

Mapping disturbances within mill isochrones

To understand how mills might be impacted by disturbance, we quantified the overlap of mill isochrones (areas accessible if one could haul wood up to 62 miles) and wildfires (from the Monitoring Trends in Burn Severity, or MTBS, data). Mill isochrones were computed using the mapboxapi R package. While wood can be hauled farther than 62 miles, the API restricts the maximum travel distance to 100 km (~62 miles). Once we overlaid isochrones and fire perimeters, we computed the fraction of each mill's isochrone that had burned.
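The burned-fraction calculation can be illustrated with a small Python sketch. It is not the overlay we ran in R: it assumes both layers have already been rasterized to the same equal-area grid, so that counting cells is equivalent to measuring area. The function name `burned_fraction` and the example grids are hypothetical.

```python
def burned_fraction(isochrone_mask, fire_mask):
    """Fraction of isochrone cells that also fall inside a fire perimeter.

    Both arguments are 2D sequences on the same grid; truthy values mark
    cells inside the isochrone or inside any fire perimeter.
    """
    iso_cells = 0
    burned_cells = 0
    for iso_row, fire_row in zip(isochrone_mask, fire_mask):
        for in_iso, burned in zip(iso_row, fire_row):
            if in_iso:
                iso_cells += 1
                if burned:
                    burned_cells += 1
    if iso_cells == 0:
        raise ValueError("isochrone mask is empty")
    return burned_cells / iso_cells


# Hypothetical grids: 8 isochrone cells, 2 of which burned.
isochrone = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
fires = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

fraction = burned_fraction(isochrone, fires)
```

Because the fire layer is a single boolean mask, areas covered by more than one fire perimeter are counted once.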

Classification in Google Earth Engine (GEE)

  1. Feature Collection and Labeling

Create a FeatureCollection of labeled examples within your area of interest, specifically areas containing "log piles" (https://github.com/CU-ESIIL/FCC24_Group_4/assets/161641043/e1b42e37-8505-4758-891b-30d332b37e7c). This collection supplies the labeled examples that the supervised classifier uses to learn the characteristics of each class. Each feature in the collection is labeled according to its class (e.g., log piles are labeled with a class identifier), enabling the classifier to distinguish the features of interest from the background.

  2. Import NAIP Imagery

NAIP (National Agriculture Imagery Program) imagery was selected for its high spatial resolution and multispectral bands, which provide detailed visible and near-infrared (NIR) information. This makes it well suited to tasks that require fine spatial resolution, such as identifying small or dispersed features like log piles.

  3. Image Preprocessing and Band Selection

Filter the NAIP imagery by geographic bounds and date range so that the analysis uses relevant and timely data for your area of interest. Selecting specific bands ('R', 'G', 'B', 'N') includes both visible and NIR information, which is essential for distinguishing different materials and conditions (e.g., vegetation vs. non-vegetation, wet vs. dry materials).

  4. Training Data Preparation

The training data is prepared by sampling the NAIP imagery at the locations of your labeled features. This step extracts the spectral information from the selected bands at each labeled location, creating a dataset that associates this spectral information with the known class labels. This dataset forms the basis of the training phase, where the classifier learns the relationship between spectral signatures and class labels.
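The four steps above can be sketched with the Earth Engine Python client. This is a hedged outline rather than the script we ran: the function name `build_training_data`, the `class` property name, and the 1 m sampling scale are assumptions for illustration, and the code requires the `earthengine-api` package and an authenticated session (`ee.Initialize()`) before it is called.

```python
NAIP_COLLECTION = "USDA/NAIP/DOQQ"  # NAIP collection ID in the Earth Engine catalog
NAIP_BANDS = ["R", "G", "B", "N"]   # red, green, blue, near-infrared


def build_training_data(labeled_features, aoi, start_date, end_date,
                        class_property="class", scale_m=1):
    """Sample NAIP band values at the locations of labeled features.

    labeled_features: ee.FeatureCollection whose features carry a numeric
        class label (property name is an assumption, default "class").
    aoi: ee.Geometry bounding the area of interest.
    start_date, end_date: date strings such as "2021-01-01".
    Returns the NAIP mosaic and the sampled training FeatureCollection.
    """
    import ee  # imported here so this file loads without the client installed

    # Steps 2 and 3: import NAIP, filter by bounds and date, keep four bands.
    naip = (
        ee.ImageCollection(NAIP_COLLECTION)
        .filterBounds(aoi)
        .filterDate(start_date, end_date)
        .select(NAIP_BANDS)
        .mosaic()
    )

    # Step 4: attach band values to each labeled feature.
    training = naip.sampleRegions(
        collection=labeled_features,
        properties=[class_property],
        scale=scale_m,
    )
    return naip, training
```

The returned `training` collection pairs each labeled location with its four band values, which is the input a classifier needs for training.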

Naive Bayes Classification

We employed a Naive Bayes classifier, a probabilistic model that assumes independence between the features (in this case, the spectral bands). It works by calculating the probability of each pixel belonging to a given class based on the spectral information and the patterns learned during training. The pixel is then classified into the class with the highest probability.

Finally, the classified image is visualized on the map, with pixels colored according to their assigned class. This visualization helps in assessing the classifier's performance and understanding the spatial distribution of the identified features (log piles) within the imagery.
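The per-pixel decision rule of a naive Bayes classifier can be illustrated outside of GEE with a short, self-contained Python sketch. It uses a Gaussian model for each band, which is an assumption chosen for clarity and may differ from the variant GEE implements. The training pixels below are made-up (R, G, B, N) values, not data from our sites.

```python
import math
from collections import defaultdict


def train_gaussian_nb(samples, labels):
    """Fit class priors and per-band mean/variance from training pixels."""
    by_class = defaultdict(list)
    for bands, label in zip(samples, labels):
        by_class[label].append(bands)

    model = {}
    total = len(samples)
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))  # one tuple of values per band
        means = [sum(col) / n for col in columns]
        # A small variance floor avoids division by zero for constant bands.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, 1e-6)
            for col, m in zip(columns, means)
        ]
        model[label] = (n / total, means, variances)
    return model


def classify_pixel(model, bands):
    """Return the class with the highest posterior log-probability."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        # Independence assumption: per-band log-likelihoods are summed.
        score = math.log(prior)
        for value, mean, var in zip(bands, means, variances):
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training pixels: 1 = log pile, 0 = background.
train_pixels = [
    (180, 150, 110, 90), (170, 145, 105, 95), (185, 155, 115, 85),
    (60, 90, 50, 160), (55, 95, 45, 170), (65, 85, 55, 165),
]
train_labels = [1, 1, 1, 0, 0, 0]
model = train_gaussian_nb(train_pixels, train_labels)

predicted = classify_pixel(model, (175, 150, 110, 92))
```

Applying `classify_pixel` to every pixel of an image yields the kind of class map that is then visualized.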

Random Forest Classification

Using the guidelines and syntax from the Google Earth Engine documentation (https://developers.google.com/earth-engine/guides/classification), we attempted a random forest classification using training data of sawmill log piles. We classified the NAIP imagery from the 6 sawmill sites that were not used for training data. The code ran and produced an output, but only at a low resolution, which is likely too coarse to resolve the log piles or any unique spectral signature. When run at a higher resolution (<2 meters), it returned the error "user memory limit exceeded." Future work should include a more robust dataset of log and non-log training data and should be performed in the cloud to utilize higher computing power and memory.

Visualizations

We've collected visualizations in the slide presentation. Below are a few visualizations of our main methods:

Image 1: Contributing sawmill locations in Colorado and Wyoming to OpenStreetMap.

Image 2: Raw satellite imagery and classification created by SamAutomaticMaskGenerator.

Image 3: Classification of log piles in Google Earth Engine.

Image 4: Example of feature collection based on NAIP in Naive Bayes.

Conclusions

We have a map of confirmed sawmills that could potentially be monitored using remote sensing data. Segmentation of key state variables (area covered by logs, wood products, and sawdust) seems possible, but additional work would be needed to generate reliable estimates. Mill detection and monitoring may be an underutilized tool that could support applications related to natural climate solutions, natural resource management, and finance.


Last update: 2024-05-09