
California Wildfire case and the STI Gateway smoke predictions

last modified Jun 06, 2009 11:02 AM

Excerpts of analysis of model results vs. AirNow and other in-situ observations.

Overview

To evaluate predictive performance and capabilities of the STI Gateway CMAQ modeling system, observed smoke concentration data were compared to model predictions for both the Southern (2007) and Northern (2008) California wildfire events. 

The STI Gateway system uses forecast meteorology from a local MM5 model run at 32 km and fire information from the SMARTFIRE fire information system.  Persistence is used for future fire behavior.  Model runs are performed daily, each producing a 72-hour prediction.  The system uses the following modeling pathway for each of the SEMIP modeling steps, enabled by the BlueSky Framework:

[Figure: table pairing each SEMIP modeling step with the corresponding Gateway pathway]

The resultant vertical smoke plumes are then fed into the Community Multiscale Air Quality (CMAQ) model.

This analysis uses the 0-23 forecast hours from each run to make a continuous time series of model output, and compares it with in-situ observations from 15 (2007) and 134 (2008) ground stations.
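The stitching described above can be sketched as follows. This is a minimal illustration, not the analysis code itself: it assumes each run is stored as a list of hourly values indexed by forecast hour, and the function name, data layout, and sample values are all hypothetical.

```python
from datetime import datetime, timedelta

def stitch_first_day(runs, hours_kept=24):
    """Build one continuous hourly series from daily forecast runs.

    `runs` maps each run's start time to its list of hourly values
    (index = forecast hour). Only forecast hours 0..hours_kept-1 of
    each run are kept, so consecutive daily runs do not overlap.
    """
    series = {}
    for start in sorted(runs):
        for hour, value in enumerate(runs[start][:hours_kept]):
            series[start + timedelta(hours=hour)] = value
    return series

# Two hypothetical runs started one day apart (values are made up).
runs = {
    datetime(2008, 6, 25): [float(h) for h in range(72)],
    datetime(2008, 6, 26): [100.0 + h for h in range(72)],
}
series = stitch_first_day(runs)
```

Later forecast hours of each run are simply discarded here, which is why the resulting series has exactly one value per clock hour.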

Result highlights

 

  • Unpaired analyses show the model performing at an exceptional level for a predictive system over space and time.
  • Paired analysis results range from exceptional to fair, with most of the paired results unsurprising.
  • The model does well overall, predicting PM2.5 surface concentrations within a factor of two for most of the observed concentration range (low to high).
  • However, the model misses the ground concentration peaks either in time or in space (by a grid cell).

Considerations for evaluating the smoke modeling pathway predictive results

Expected results along the predictive smoke modeling pathway should differ from most other model performance evaluations.  There are several factors that influence the results of predictive modeling in association with a fire event.

Predictive fire size

  • Tomorrow's fire behavior (growth, size, location) cannot easily be determined
    • To get around this: today's fire behavior is used as tomorrow's

This inherently starts the modeling process with an initialization error.
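As a rough illustration of the persistence assumption, the sketch below copies today's fire record forward unchanged for each forecast day. The record fields and the function name are hypothetical and do not reflect the SMARTFIRE data format.

```python
def persistence_forecast(today_fire, days_ahead):
    """Repeat today's fire record unchanged for each forecast day."""
    return [dict(today_fire, day_offset=d) for d in range(1, days_ahead + 1)]

# A made-up fire record; the field names are assumptions for illustration.
today = {"lat": 34.2, "lon": -117.3, "area_acres": 5000.0}
forecast = persistence_forecast(today, days_ahead=2)
```

Any real change in the fire's size or location between today and the forecast day becomes error in the model input.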

Hourly fire growth

Additionally, the fire growth curve used by the system (in the form of the Time Rate of Consumption) is the static WRAP time profile.  While this reflects an "average" wildfire growth curve, it does not reflect the specific hourly fire growth of the fires modeled.  This mismatch is likely to affect timing of the model's peaks and potentially (through the covariation of this with wind shifts) the location of the model peaks compared with observations.

Fuels information

Additionally, fuels information, particularly for the Southern California fires, is known to be an issue because of the close proximity of the fires to the urban centers and issues with the 1-km grid cell fuel maps at these boundaries.

Error propagation

Additionally, each modeling step has inherent errors, and these errors propagate (see the error propagation discussion) as the smoke modeling pathway becomes longer.  These errors may magnify or cancel each other out.  In-depth analysis at each modeling step is required to characterize the error at each step and the magnitude and sign of the propagated error.

Model-to-Observations

From previous studies and analyses we expect predictive model results to have systematic biases larger (potentially significantly) than those typical of post-analysis model performance evaluations.  We expect to see this particularly in the paired-statistics performance metrics.  In the unpaired portion of the analyses we would hope for an overall acceptable performance of the modeling pathway.

Model predictive performance was evaluated against PM2.5 observation data collected at 15 (2007) and 134 (2008) station sites located near and around the fires.  Data from the 2007 event were primarily influenced by smoke for the entire event, while data collected during the 2008 event may or may not have been influenced by smoke.

Unpaired Results

Unpaired results were used to evaluate overall model performance.  The paired (observation, modeled) data were released from their spatiotemporal pairing allowing for evaluation on the total distribution of the data.  This type of analysis answers the question:

Did the model simulate the range and frequency of PM2.5 concentrations observed during the events?

To answer this, we show Quantile-Quantile, Histogram, and Threshold Histograms below.

Quantile-Quantile Plot

With this plot the unpaired data are examined to determine whether the simulated data have a distribution similar to that of the observed data.  To build this plot, the simulated and observed data are each sorted independently from low to high.  If the simulated distribution exactly matched the observed distribution, the data would fall along the 1:1 line (center line).  Data falling between the 2:1 (upper) and 0.5:1 (lower) lines indicate that the simulated data are within a factor of two of the observed data.
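The construction of the quantile-quantile points can be sketched as below. This is an illustrative sketch with made-up values: it assumes equal sample sizes and strictly positive concentrations, and the function names are hypothetical.

```python
def qq_pairs(observed, simulated):
    """Sort each sample independently and pair values by rank."""
    if len(observed) != len(simulated):
        raise ValueError("this sketch assumes equal sample sizes")
    return list(zip(sorted(observed), sorted(simulated)))

def fraction_within_factor_of_two(pairs):
    """Share of rank-matched pairs lying between the 0.5:1 and 2:1 lines."""
    inside = sum(1 for obs, sim in pairs if 0.5 * obs <= sim <= 2.0 * obs)
    return inside / len(pairs)

# Made-up concentrations; the original space-time pairing is ignored.
observed = [12.0, 5.0, 40.0, 8.0]
simulated = [30.0, 9.0, 6.0, 100.0]
pairs = qq_pairs(observed, simulated)
share = fraction_within_factor_of_two(pairs)
```

Because both lists are sorted independently, a simulated value is compared with the observation of the same rank, not with the observation at the same station and hour.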

[Figure: quantile-quantile plot of simulated vs. observed PM2.5, 2008 northern California event]

The 2008 northern California simulated PM2.5 (ug/m^3) data (y axis) fall within a factor of two of the observed data (x axis).   The distribution of the simulated data lies above the 1:1 line, indicating an overall trend of over-prediction over the duration of this event.  The over-estimation is greatest in the observed concentration range of 400-600 ug/m^3.

Histogram Plot

Histograms are used to examine the frequency of the simulated data with respect to the frequency of the observation data.  The data are binned, and the number of observed or simulated data points that land in each bin is plotted.
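The binning step can be sketched as follows. The bin width, number of bins, and sample values are assumptions chosen for illustration, and the sketch assumes non-negative concentrations.

```python
def bin_counts(values, bin_width, n_bins):
    """Count values per bin; bin i covers [i*bin_width, (i+1)*bin_width).

    Values beyond the last bin edge are counted in the last bin.
    """
    counts = [0] * n_bins
    for v in values:
        index = min(int(v // bin_width), n_bins - 1)
        counts[index] += 1
    return counts

# Made-up concentrations in ug/m^3, binned with the same edges for both series.
observed = [3.0, 7.0, 12.0, 25.0, 80.0]
simulated = [18.0, 21.0, 22.0, 9.0, 45.0]
observed_counts = bin_counts(observed, bin_width=10.0, n_bins=5)
simulated_counts = bin_counts(simulated, bin_width=10.0, n_bins=5)
```

Using identical bin edges for both series is what makes the two sets of counts directly comparable.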

[Figure: histogram of simulated and observed PM2.5, 2008 northern California event]

The 2008 northern California simulated data (red) display a similar frequency (y axis) to the observation data (columns).  At lower PM2.5 concentrations (x axis; ~10 ug/m^3 and less) the simulated data do not match the observed data; instead, the simulated data are more evenly distributed, with a higher number of data points falling into the bins located around 20 ug/m^3.

Overall the histograms show the simulated data matching the observed data in frequency, with a few exceptions at the lower end of the concentration scale.

 

Paired Results

Paired results are used to compare simulated data to observation data in space and time.  As expected, the results from the paired analyses are less favorable than those from the unpaired analyses.  However, because the simulated data are predictive and were generated with input data fields in predictive mode, these results should not be directly compared to model performance analyses done in post-analysis, when the best available input data can be used.

The paired data evaluation attempts to answer the question:

How well do the simulated data represent the observations in space (at the observation station) and time (hour by hour)?

To answer this question Model-over-Observations and Mean Fractional Bias were used to display and analyze the data.

 

Model-over-Observations Plot

The ratio of modeled data to observational data, plotted against the observation data, provides a quick way to examine the paired data and determine whether there are trends in where the simulated data perform best.  Scatter at the lower concentrations is expected; ideally the data should merge into a triangle-like shape with the peak centered on the ratio = 1 line.
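The points for such a plot can be computed as in the sketch below. The function name and values are illustrative; skipping non-positive observations is an assumption made here to avoid division by zero, not a documented step of the analysis.

```python
def model_over_obs(observed, modeled):
    """Return (observation, model/observation) points for paired data.

    Pairs with a non-positive observation are skipped (assumption).
    """
    return [(o, m / o) for o, m in zip(observed, modeled) if o > 0]

# Made-up paired values: same station and hour for each position in the lists.
points = model_over_obs([10.0, 0.0, 400.0], [5.0, 3.0, 200.0])
```

A ratio of 1 means the model matched the observation for that pair; 0.5 and 2 mark the factor-of-two limits.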

[Figure: model-over-observations ratio vs. observed PM2.5, 2008 northern California event]

For the 2008 northern California event, the ratio of simulated data to observations (y axis) shows a general under-estimation of the observed PM2.5 concentrations (x axis) by the predictive data, particularly at high concentrations (400+ ug/m^3).  For predictive simulated data through the smoke modeling pathway, values within a factor of 10-100 of the observations are not uncommon.  The simulated peaks are within a factor of two in some cases.  Because a factor of two is the standard used in post-event analysis, the fact that predictive data reach that range shows that the paired simulated data are, in some cases, exceptional.

 

Mean Fractional Bias Plot

There are many standard model performance metrics that are used to evaluate paired data (observed, modeled).  We chose to display the mean fractional bias (MFB), a basic metric of the bias in the simulated data computed over the data pairs.  MFB is calculated as:

$$\mathrm{MFB} = \frac{1}{N} \sum_{i=1}^{N} \frac{2\,(C_m - C_o)}{C_m + C_o} \times 100\%$$

where Cm and Co are the modeled and observed values, respectively, and N is the total number of data pairs in the sample.  MFB is normalized by both the simulated and observation data and ranges from -200% to +200%, giving equal weight to over- and under-estimation (Seigneur et al., 2000, see reference list).  The related mean fractional error ranges from 0 to 200%.
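A minimal sketch of the calculation follows. It assumes the standard definition of MFB, the mean over pairs of 2(Cm - Co)/(Cm + Co) expressed in percent. The function name is hypothetical, and skipping pairs where Cm + Co is not positive is an assumption made to keep the ratio defined.

```python
def mean_fractional_bias(modeled, observed):
    """Mean fractional bias in percent for paired modeled/observed values."""
    terms = [
        2.0 * (m - o) / (m + o)
        for m, o in zip(modeled, observed)
        if m + o > 0  # skip pairs where the fraction is undefined (assumption)
    ]
    if not terms:
        raise ValueError("no pairs with a positive sum")
    return 100.0 * sum(terms) / len(terms)

# Made-up pairs showing the symmetry and the limits of the metric.
balanced = mean_fractional_bias([10.0, 30.0], [30.0, 10.0])
lower_limit = mean_fractional_bias([0.0], [5.0])
upper_limit = mean_fractional_bias([5.0], [0.0])
```

The first example shows equal and opposite errors cancelling; the other two show the -200% and +200% limits reached when one member of a pair is zero.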

[Figure: mean fractional bias by monitoring site, 2007 southern California event]

 

The MFB (columns) for the southern California wildfire event shows a negative bias in the simulated data at all of the monitoring sites (x axis) from which the observation data were collected.  The dashed lines represent the model performance criteria that post-analyses are held to (+/- 35%).  MFB values that fall near these lines show exceptional predictive model performance.


Categorical forecast model

Because of the issue of the exact timing of model and observational peaks, we examine ways of interpreting the model results to give useful information about the expected observations.

We choose the simplest model, a categorical threshold forecast, in which we look for relationships such that when the model is "high" we expect the observations to be "high."  These can be expressed in the form:

 

Cm ≥ Tm => Co ≥ To

That is, when the model value Cm reaches or exceeds some model threshold level Tm, does the observed value Co tend to reach or exceed some observational threshold level To?  Such relationships can be determined by examining the histogram of observed values under different model thresholds Tm.

 The following histogram highlights the observational values selected by different levels of Tm.

[Figure: threshold histogram of observed PM2.5 for Tm = 0, 5, and 10 ug m-3, 2007 southern California event]

 

This threshold histogram for the 2007 southern California event displays the PM2.5 observed values (x axis) that occurred when the modeled value was at or above a given threshold Tm, for Tm = 0, 5, and 10 ug m-3.  Each histogram is drawn as a line; grey shading is applied between the Tm = 5 and Tm = 10 lines (the grey shaded area is the difference between Tm = 10 and Tm = 5).  Note that Tm = 0 is equivalent to all observed values (open circles).  Ideally, as Tm increases the observed values selected would contain all observed values above some value.

The histogram shows that for a given model threshold many different observational values can occur, although the median observational value does increase.   This result leads to an indication of model skill. 
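The selection behind each threshold histogram can be sketched as below, using made-up (modeled, observed) pairs and a hypothetical function name. The sketch uses the "reaches or exceeds" form of the threshold test.

```python
from statistics import median

def observations_above_model_threshold(pairs, tm):
    """Observed values from (modeled, observed) pairs whose modeled value is >= tm."""
    return [obs for mod, obs in pairs if mod >= tm]

# Made-up paired values in ug m-3.
pairs = [(2.0, 8.0), (6.0, 10.0), (7.0, 30.0), (12.0, 20.0), (15.0, 60.0)]
medians = {
    tm: median(observations_above_model_threshold(pairs, tm))
    for tm in (0, 5, 10)
}
```

In this toy data the median of the selected observations rises with Tm, while each selection still spans a wide range of observed values.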

We have attempted to maximize model skill by examining all integer Tm values from 0 to 50 ug m-3.  The full results are beyond the scope of this page because they are being prepared for publication, but they suggest that the maximum skill for determining when observations go above To = 35 ug m-3 occurs at a model threshold of approximately Tm = 12 ug m-3.  This balances the probability of detection (making sure observations above 35 ug m-3 are detected) against the false alarm rate.
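The trade-off between detection and false alarms can be sketched as follows. The definitions are assumptions, since this page does not give its formulas: probability of detection is taken as hits / (hits + misses), and the false alarm measure as false alarms / (hits + false alarms). The data and function name are made up for illustration.

```python
def contingency_scores(pairs, tm, to):
    """Return (POD, FAR) for the rule 'modeled >= tm implies observed >= to'.

    POD = hits / (hits + misses); FAR = false alarms / (hits + false alarms).
    Both definitions are assumptions made for this sketch.
    """
    hits = sum(1 for m, o in pairs if m >= tm and o >= to)
    misses = sum(1 for m, o in pairs if m < tm and o >= to)
    false_alarms = sum(1 for m, o in pairs if m >= tm and o < to)
    pod = hits / (hits + misses) if hits + misses else float("nan")
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else float("nan")
    return pod, far

# Made-up (modeled, observed) pairs in ug m-3, scored at three model thresholds.
pairs = [(2.0, 8.0), (6.0, 40.0), (13.0, 20.0), (14.0, 50.0), (30.0, 90.0)]
scores = {tm: contingency_scores(pairs, tm, to=35.0) for tm in (5, 12, 20)}
```

Raising Tm lowers both values in this toy data, which is the balance a threshold search has to strike.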

Despite this, the false alarm rate remains high for any Tm.   The major reason for this is the use of directly paired data in this analysis.  If the model time and observation time are allowed to differ slightly, this type of model becomes significantly better.  That is, if the question changes from

The model is above Tm at 2pm so the 2pm observations should be above To

to 

The model peaks above Tm in the afternoon so the afternoon observations should peak above To

the probability of detection and false alarm rates become significantly better.
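The difference between the two questions can be sketched with a single made-up day in which the modeled peak arrives two hours before the observed peak. Treating "afternoon" as hours 12 through 17 is an assumption for illustration, as are the threshold values Tm = 12 and To = 35 ug m-3 used in the example and the function name.

```python
def window_peak_exceeds(hourly, start_hour, end_hour, threshold):
    """True if the maximum over hours start_hour..end_hour-1 reaches the threshold."""
    return max(hourly[start_hour:end_hour]) >= threshold

# Made-up hourly values: the model peaks at 14:00, the observations at 16:00.
modeled = [3.0] * 24
modeled[14] = 20.0
observed = [10.0] * 24
observed[16] = 50.0

# Hour-by-hour pairing at 14:00: the model is high but the observation is not.
hour_by_hour_match = (modeled[14] >= 12.0) == (observed[14] >= 35.0)

# Afternoon window: both series peak above their thresholds.
afternoon_match = (
    window_peak_exceeds(modeled, 12, 18, 12.0)
    == window_peak_exceeds(observed, 12, 18, 35.0)
)
```

Under strict pairing this day counts as a false alarm at 14:00; under the windowed question it counts as a correct forecast.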
