# Section 5 Calibration and Validation

After all of the data were processed, the model was fitted using JAGS. The observation dataset was split into a calibration subset (80%) and a validation subset (20%).

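The split described above can be illustrated with a minimal sketch. The text does not state the unit of the split (individual observations, deployments, or catchments), so a random split of hypothetical deployment IDs is assumed here; `split_ids`, the seed, and the ID format are illustrative only.

```python
import random

def split_ids(ids, calibration_fraction=0.8, seed=0):
    """Randomly partition unique IDs into calibration and validation sets."""
    unique_ids = sorted(set(ids))  # sort first so the result is reproducible
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_cal = round(calibration_fraction * len(unique_ids))
    return set(unique_ids[:n_cal]), set(unique_ids[n_cal:])

# Hypothetical deployment identifiers (not taken from the dataset).
deployment_ids = [f"dep{i:03d}" for i in range(50)]
calibration, validation = split_ids(deployment_ids)
print(len(calibration), len(validation))  # 40 10
```

Splitting by ID rather than by row keeps every observation from one time series in the same subset, which is one common design choice for this kind of data.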
## 5.1 Parameter Estimates

### 5.1.1 Fixed Effects

Figure 5.1 and Table 5.1 present the estimated mean and 95% credible interval (CRI) of each fixed effect parameter. The intercept is omitted from the figure because its value is much larger than those of the other parameters and would skew the scale.

Table 5.1: Estimated Mean and 95% CRI of Fixed Effects

| Variable | Mean | Lower CRI | Upper CRI |
|---|---:|---:|---:|
| intercept | 16.763 | 16.601 | 16.931 |
| AreaSqKM | 0.405 | 0.315 | 0.497 |
| impoundArea | 0.373 | 0.290 | 0.460 |
| agriculture | -0.216 | -0.290 | -0.142 |
| devel_hi | -0.103 | -0.163 | -0.043 |
| forest | -0.482 | -0.565 | -0.400 |
| prcp2 | 0.034 | 0.032 | 0.036 |
| prcp30 | 0.029 | 0.022 | 0.036 |
| prcp2.da | -0.045 | -0.047 | -0.042 |
| prcp30.da | -0.086 | -0.093 | -0.079 |
| airTemp.da | 0.055 | 0.029 | 0.080 |
| airTemp.impoundArea | -0.067 | -0.090 | -0.043 |
| airTemp.agriculture | -0.017 | -0.038 | 0.006 |
| airTemp.forest | -0.018 | -0.041 | 0.005 |
| airTemp.devel_hi | -0.007 | -0.025 | 0.010 |
| airTemp.prcp2 | 0.020 | 0.018 | 0.022 |
| airTemp.prcp30 | -0.051 | -0.055 | -0.047 |
| airTemp.prcp2.da | -0.016 | -0.018 | -0.014 |
| airTemp.prcp30.da | -0.015 | -0.019 | -0.011 |

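As a rough illustration of how a posterior mean and 95% CRI like those in Table 5.1 can be derived from MCMC output, the sketch below summarizes a list of draws for one parameter. It assumes an equal-tailed interval (2.5th and 97.5th percentiles), which is not confirmed by the text; the function names and the simulated draws are hypothetical.

```python
import random
import statistics

def quantile(sorted_vals, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def summarize_posterior(draws, level=0.95):
    """Return (mean, lower, upper) for an equal-tailed credible interval."""
    ordered = sorted(draws)
    tail = (1 - level) / 2
    return (
        statistics.fmean(ordered),
        quantile(ordered, tail),
        quantile(ordered, 1 - tail),
    )

# Simulated draws standing in for one parameter's MCMC samples (assumption).
rng = random.Random(1)
draws = [rng.gauss(0.4, 0.05) for _ in range(4000)]
mean, lower, upper = summarize_posterior(draws)
print(f"mean={mean:.3f} lower={lower:.3f} upper={upper:.3f}")
```

A parameter whose interval excludes zero (for example `forest` in Table 5.1) has a clearly signed effect, while one whose interval spans zero (for example `airTemp.forest`) does not.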
### 5.1.2 HUC8 Random Effects

Figure 5.2 shows the estimated mean and 95% credible interval (CRI) of each random effect for each HUC8. Table 5.2 lists the estimated mean and 95% CRI of each parameter averaged over all HUC8s (mean value with standard deviation in parentheses).

Table 5.2: Mean and 95% CRI of HUC8 Random Effects Averaged Over All HUC8s (Mean Value with Std. Dev. in Parentheses)

| Variable | Count | Mean | Lower CRI | Upper CRI |
|---|---:|---:|---:|---:|
| intercept.huc | 139 | -0.000 (0.527) | -0.801 (0.601) | 0.801 (0.589) |
| airTemp | 139 | 1.950 (0.208) | 1.678 (0.253) | 2.222 (0.223) |
| temp7p | 139 | 1.415 (0.286) | 1.053 (0.312) | 1.779 (0.340) |

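The "mean (std. dev.)" entries can be read as a second level of summary: each HUC8 has its own posterior mean and CRI bounds, and the table reports the average and spread of those values across HUC8s. A minimal sketch of that aggregation follows, assuming the sample standard deviation is used; the group labels and numbers are made up for illustration.

```python
import statistics

def average_over_groups(group_summaries):
    """Average per-group (mean, lower, upper) summaries across groups.

    Returns {column: (mean across groups, sample std. dev. across groups)}.
    """
    columns = list(zip(*group_summaries.values()))
    names = ("mean", "lower", "upper")
    return {
        name: (statistics.fmean(col), statistics.stdev(col))
        for name, col in zip(names, columns)
    }

# Hypothetical per-HUC8 summaries for one random effect term.
huc8_summaries = {
    "huc_a": (1.8, 1.5, 2.1),
    "huc_b": (2.0, 1.7, 2.3),
    "huc_c": (2.2, 1.9, 2.5),
}
averaged = average_over_groups(huc8_summaries)
print(averaged["mean"])
```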
### 5.1.3 Catchment Random Effects

Figure 5.3 shows the distribution of the estimated mean of each random effect term over all catchments. CRIs are not shown because of the large number of individual estimates (7,476, which equals three random effect terms for each of the 2,492 catchments). Table 5.3 lists the estimated mean and 95% CRI of each parameter averaged over all catchments (mean value with standard deviation in parentheses).

Table 5.3: Estimated Mean and 95% CRI of Catchment Random Effects Averaged Over All Catchments (Mean Value with Std. Dev. in Parentheses)

| Variable | Count | Mean | Lower CRI | Upper CRI |
|---|---:|---:|---:|---:|
| intercept.site | 2,492 | 0.000 (1.380) | -0.794 (1.389) | 0.794 (1.413) |
| airTemp | 2,492 | 0.000 (0.348) | -0.302 (0.362) | 0.302 (0.366) |
| temp7p | 2,492 | -0.000 (0.344) | -0.506 (0.404) | 0.506 (0.356) |

### 5.1.4 Year Random Effects

Figure 5.4 and Table 5.4 present the mean and 95% CRI of the intercept term for each year. Recall that the intercept is the only random effect that varies by year.

Table 5.4: Estimated Mean and 95% CRI of Intercept Random Effect for Each Year

| Year | Mean | Lower CRI | Upper CRI |
|---|---:|---:|---:|
| 1991 | -0.153 | -0.514 | 0.159 |
| 1992 | 0.129 | -0.172 | 0.467 |
| 1993 | 0.220 | -0.089 | 0.577 |
| 1994 | 0.051 | -0.233 | 0.339 |
| 1995 | 0.081 | -0.170 | 0.357 |
| 1996 | -0.138 | -0.382 | 0.099 |
| 1997 | 0.137 | -0.064 | 0.339 |
| 1998 | -0.049 | -0.264 | 0.170 |
| 1999 | 0.068 | -0.123 | 0.267 |
| 2000 | -0.300 | -0.419 | -0.182 |
| 2001 | -0.010 | -0.125 | 0.103 |
| 2002 | -0.026 | -0.144 | 0.086 |
| 2003 | -0.111 | -0.223 | -0.001 |
| 2004 | 0.100 | -0.013 | 0.213 |
| 2005 | 0.070 | -0.050 | 0.188 |
| 2006 | -0.111 | -0.217 | -0.007 |
| 2007 | -0.204 | -0.306 | -0.104 |
| 2008 | 0.072 | -0.033 | 0.174 |
| 2009 | 0.051 | -0.049 | 0.152 |
| 2010 | 0.157 | 0.064 | 0.249 |
| 2011 | -0.071 | -0.163 | 0.019 |
| 2012 | 0.203 | 0.117 | 0.292 |
| 2013 | 0.134 | 0.043 | 0.224 |
| 2014 | -0.012 | -0.102 | 0.077 |
| 2015 | -0.208 | -0.300 | -0.120 |
| 2016 | 0.215 | 0.120 | 0.306 |
| 2017 | -0.294 | -0.394 | -0.197 |

## 5.2 Goodness-of-Fit

### 5.2.1 All Observations

Table 5.5 summarizes the model goodness-of-fit for all observations in the calibration and validation datasets. Values in parentheses exclude the temporal autocorrelation term from the prediction calculations and thus represent model performance for ungauged catchments or for time periods when observation data are not available.

Table 5.5: Summary statistics of model calibration and validation (values in parentheses exclude the temporal autocorrelation term)

| Statistic | Calibration | Validation |
|---|---:|---:|
| # Observations | 559,373 | 59,235 |
| # Time Series | 6,617 | 723 |
| # Catchments | 2,492 | 472 |
| # HUC8s | 139 | 96 |
| # Years | 27 | 22 |
| RMSE (degC) | 0.627 (1.102) | 0.674 (1.523) |
| Mean Residual (degC) | 0.009 (0.067) | 0.015 (0.105) |
| Median Residual (degC) | 0.009 (0.079) | 0.010 (0.067) |
| Mean Absolute Residual (degC) | 0.466 (0.834) | 0.501 (1.139) |
| Median Absolute Residual (degC) | 0.359 (0.659) | 0.389 (0.873) |
| Minimum Residual (degC) | -8.520 (-10.373) | -6.321 (-8.017) |
| 1st Percentile Residual (degC) | -1.612 (-2.764) | -1.679 (-3.613) |
| 99th Percentile Residual (degC) | 1.611 (2.817) | 1.709 (4.383) |
| Maximum Residual (degC) | 7.520 (13.295) | 7.610 (9.663) |

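For readers who want to reproduce statistics of the kind listed in Table 5.5 from paired observed and predicted temperatures, a minimal sketch follows. The residual sign convention (observed minus predicted) and the percentile interpolation method are assumptions, and the input values are invented for illustration.

```python
import math
import statistics

def residual_summary(observed, predicted):
    """Summarize residuals, defined here as observed minus predicted (assumption)."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    absolute = [abs(r) for r in residuals]
    # 99 cut points; index 0 is the 1st percentile, index 98 the 99th.
    percentiles = statistics.quantiles(residuals, n=100, method="inclusive")
    return {
        "rmse": math.sqrt(statistics.fmean([r * r for r in residuals])),
        "mean": statistics.fmean(residuals),
        "median": statistics.median(residuals),
        "mean_abs": statistics.fmean(absolute),
        "median_abs": statistics.median(absolute),
        "min": min(residuals),
        "p01": percentiles[0],
        "p99": percentiles[98],
        "max": max(residuals),
    }

# Hypothetical daily mean temperatures in degC.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [9.0, 12.0, 15.0, 18.0]
summary = residual_summary(observed, predicted)
print(round(summary["rmse"], 3), summary["mean"])
```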
Figure 5.5 presents scatterplots of predicted vs. observed daily mean temperature for the calibration and validation datasets. The black line is the 1:1 line of equality. The red line is a linear regression trend line.

Figure 5.6 presents the same scatterplots with the temporal autocorrelation term excluded from the predictions. As in Figure 5.5, the black line is the 1:1 line of equality and the red line is a linear regression trend line.

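The red trend line in these scatterplots can be compared against the 1:1 line through its slope and intercept: a slope near 1 and an intercept near 0 indicate little systematic bias. The sketch below fits an ordinary least-squares line, assuming observed values on the x-axis and predicted values on the y-axis; the axis assignment and the data are assumptions.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observed and predicted temperatures in degC.
observed = [5.0, 10.0, 15.0, 20.0]
predicted = [6.0, 10.0, 14.0, 18.0]
slope, intercept = fit_line(observed, predicted)
print(round(slope, 3), round(intercept, 3))
```

In this made-up example the slope is below 1, meaning the predictions are compressed toward the middle of the range: low temperatures are over-predicted and high temperatures under-predicted.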
### 5.2.2 Deployments

Table 5.6 summarizes the mean, median, minimum, and maximum RMSE across deployments (i.e., continuous time series of observations at a single location) in the calibration and validation datasets.

Table 5.6: Summary statistics of model calibration and validation RMSE for each deployment (values in parentheses exclude the temporal autocorrelation term)

| Statistic | Calibration | Validation |
|---|---:|---:|
| # Time Series | 6,617 | 723 |
| Mean RMSE (degC) | 0.618 (1.002) | 0.665 (1.313) |
| Median RMSE (degC) | 0.584 (0.902) | 0.617 (1.103) |
| Minimum RMSE (degC) | 0.131 (0.137) | 0.259 (0.312) |
| Maximum RMSE (degC) | 3.980 (6.985) | 1.783 (5.698) |

Figure 5.7 shows the distribution of deployment RMSE including the temporal autocorrelation term.

Figure 5.8 shows the distribution of deployment RMSE excluding the temporal autocorrelation term.

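The per-deployment RMSE values summarized in Table 5.6 amount to a group-wise calculation: compute one RMSE per deployment, then summarize the resulting set of values. A minimal sketch is given below; the record layout, function names, and deployment IDs are hypothetical.

```python
import math
import statistics
from collections import defaultdict

def rmse_by_group(records):
    """Compute RMSE per group from (group_id, observed, predicted) records."""
    squared_errors = defaultdict(list)
    for group_id, observed, predicted in records:
        squared_errors[group_id].append((observed - predicted) ** 2)
    return {g: math.sqrt(statistics.fmean(v)) for g, v in squared_errors.items()}

def summarize(values):
    """Mean, median, minimum, and maximum of a collection of RMSE values."""
    values = list(values)
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }

# Hypothetical records: (deployment ID, observed degC, predicted degC).
records = [
    ("dep_a", 10.0, 9.0), ("dep_a", 11.0, 12.0),
    ("dep_b", 15.0, 15.0), ("dep_b", 16.0, 14.0),
    ("dep_c", 8.0, 5.0), ("dep_c", 9.0, 5.0),
]
deployment_rmse = rmse_by_group(records)
rmse_summary = summarize(deployment_rmse.values())
print(rmse_summary)
```

Keying the records by catchment ID instead of deployment ID would give the catchment-level version of the same summary.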
#### 5.2.2.1 Calibration Deployment Examples

Figures 5.9 to 5.12 show the calibration deployments with the highest and lowest RMSE, computed both with and without the temporal autocorrelation term.

#### 5.2.2.2 Validation Deployment Examples

Figures 5.13 to 5.16 show the validation deployments with the highest and lowest RMSE, computed both with and without the temporal autocorrelation term.

### 5.2.3 Catchments

Table 5.7 summarizes the mean, median, minimum, and maximum RMSE across catchments in the calibration and validation datasets.

Table 5.7: Summary of catchment RMSE values for calibration and validation datasets (values in parentheses exclude the temporal autocorrelation term)

| Statistic | Calibration | Validation |
|---|---:|---:|
| # Catchments | 2,492 | 472 |
| Mean RMSE (degC) | 0.576 (0.941) | 0.649 (1.373) |
| Median RMSE (degC) | 0.557 (0.853) | 0.607 (1.122) |
| Minimum RMSE (degC) | 0.180 (0.228) | 0.298 (0.331) |
| Maximum RMSE (degC) | 2.016 (3.356) | 1.783 (5.698) |

Figure 5.17 shows the distribution of catchment RMSE including the temporal autocorrelation term.

Figure 5.18 shows the distribution of catchment RMSE excluding the temporal autocorrelation term.