# Initial results from 8cm hydraulic parameters

In the past two weeks I have begun to thoroughly address the spatial relationships in my data. I had three approaches in mind for mapping the hydraulic parameters. The first is a straightforward universal kriging of each variable independently. This approach is not expected to give the best results, but it provides a baseline to compare the other approaches against. The second is collocated cokriging of each variable with one of the exhaustively sampled variables. This should improve interpolation results if there is a spatial relationship with one of the exhaustive variables. The third is cokriging of all of the hydraulic parameters together, combined with collocated cokriging with an exhaustive variable. The theoretical advantage of approach three is that it can extract information from likely correlations between the hydraulic parameters for better spatial predictions. In other words, the hydraulic parameters are somewhat linked together through the model they describe, and this link can provide additional information for spatial interpolation. The disadvantage is that each additional variable adds another variogram plus cross-covariograms, all of which need to be fit with the linear model of coregionalization. The additional plots make the fitting procedure less likely to find a good fit for all of the variables involved. The resulting model variogram that fits the group of variables as a whole may not fit each variable as well as an independently fitted model variogram would. Said another way, approach two only has to fit 2 variables (1 target variable and 1 collocated variable), so the model variogram will probably fit better than the one in approach three, which has to fit 7 variables (6 target and 1 collocated). The idea is to compare better fit versus more information.
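To put a number on that fitting burden, here is a minimal sketch that counts the plots a linear model of coregionalization has to honor at once. It assumes the usual bookkeeping of one direct variogram per variable and one cross-variogram per pair of variables; the function name `lmc_variogram_count` is mine, not from any package.

```python
def lmc_variogram_count(n_variables: int) -> tuple[int, int]:
    """Return (direct, cross) variogram counts for a joint fit.

    Assumes one direct variogram per variable and one cross-variogram
    per unordered pair of variables.
    """
    direct = n_variables
    cross = n_variables * (n_variables - 1) // 2
    return direct, cross


# Approach two: 1 target + 1 collocated variable.
print(lmc_variogram_count(2))  # (2, 1)
# Approach three: 6 targets + 1 collocated variable.
print(lmc_variogram_count(7))  # (7, 21)
```

Under that assumption, approach two fits 3 plots per target variable, while approach three has to fit 28 plots with a single set of structures.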

Approach one gave quite poor results. This is somewhat expected because there are not very many samples. One thing that I noticed was the lack of a strong spatial structure. My best guess is the following. Imagine an S-shaped curve like the van Genuchten model. You can take 5 samples from the curve, and you need to describe the curve as accurately as possible from those samples. The best approach is to place one sample near the upper boundary, one near the lower boundary, and the rest where you can capture as much of the vertical change as possible. This results in three samples located close together and two located farther apart. The sampling scheme used to collect the soil samples incorporated this same idea. The result is that the samples which are close to each other are also from regions where variation happens over short distances. The variogram shows this as no spatial structure: the shortest distances are represented by points where variation is high, and increasing distance doesn’t result in increasing variance because the maximum and minimum boundaries have already been reached. In other words, past the short steep section of our S curve, moving farther from the center doesn’t increase the variance because the slope is close to 0. This implies the presence of sharp boundaries in the study area. No surprises here, we already know this; the field has sandy paleo-channels. The only way that I have found to overcome this problem directly is to define zones and separate the within-zone relationships from the between-zone relationships. I attempted this previously without success.
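For reference, the quantity plotted in those variograms can be sketched as follows. This is a bare-bones empirical semivariogram, half the mean squared difference between pairs of samples grouped by separation distance, and not the routine from the geostatistics package I actually used. The function name, the bin edges, and the toy transect are illustrative assumptions.

```python
import math
from itertools import combinations


def empirical_semivariogram(coords, values, bin_edges):
    """Semivariance per distance bin: sum((zi - zj)^2) / (2 * pair count).

    coords    : list of coordinate tuples, one per sample
    values    : list of measured values, same order as coords
    bin_edges : increasing distances; bin b covers [edge_b, edge_b+1)
    Bins that contain no pairs return None.
    """
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for (ci, zi), (cj, zj) in combinations(zip(coords, values), 2):
        h = math.dist(ci, cj)  # Euclidean separation of the pair
        for b in range(n_bins):
            if bin_edges[b] <= h < bin_edges[b + 1]:
                sums[b] += (zi - zj) ** 2
                counts[b] += 1
                break
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]


# Toy transect (made-up numbers): three samples one unit apart.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
values = [0.0, 1.0, 2.0]
print(empirical_semivariogram(coords, values, [0.5, 1.5, 2.5]))  # [0.5, 2.0]
```

With only a handful of pairs in the short-distance bins, and those pairs concentrated where the field changes fastest, the first bins can start near the sill, which is how I read the flat variograms above.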

In theory, approach two is more robust to the sharp interfaces because the interfaces are reflected in the EC data. The EC data is incorporated into the interpolation and should improve results. This is what I found. However, the relationships between variables are not what I expected. I was expecting strong relationships and I found weak ones. This is surprising to me in some ways and not in others. We know that tr and ts are affected by bulk density and that n, a, and Ks are affected by texture. Both bulk density and texture affect ECa. Therefore, it is logical to expect some correlations. What I found is that most hydraulic variables had a stronger relationship with elevation than with ECa. The exceptions were ts and n. This makes some sense because elevation is dominated by peat oxidation. The sandy channels and northern edge have less peat on the surface and are clearly higher in elevation. Elevation theoretically captures density and texture to some degree. However, unlike ECa, elevation is not affected by salinity or water content. The highest structural correlation coefficient (the correlation of variables at a range) was around 0.5 and most were less than 0.35. Cross validation results were improved over approach one, as expected, but were still quite poor. The model frequently predicted the average of the data set and did not predict the variability well. Sometimes this is an indication of a neighborhood (the points used to predict the value at the unknown location) with too many points. However, reducing the neighborhood did not improve results. This means that the model doesn’t account for the spatial structure well.
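For readers unfamiliar with the structural correlation coefficient, here is a small sketch of how I understand it is computed. It assumes the standard definition from the linear model of coregionalization: for the structure at a given range, the coefficient between variables i and j is b_ij / sqrt(b_ii * b_jj), where b is that structure's matrix of fitted sills. The matrix below is made up for illustration and is not from my fit.

```python
import math


def structural_correlation(b):
    """Convert one coregionalization (sill) matrix into correlations.

    b is a symmetric list-of-lists for a single structure (one range);
    entry [i][j] of the result is b[i][j] / sqrt(b[i][i] * b[j][j]).
    """
    p = len(b)
    return [
        [b[i][j] / math.sqrt(b[i][i] * b[j][j]) for j in range(p)]
        for i in range(p)
    ]


# Hypothetical sills for two variables at one range.
sills = [[4.0, 3.0],
         [3.0, 9.0]]
r = structural_correlation(sills)
print(r[0][1])  # 0.5
```

Because it is computed per structure, the same pair of variables can be weakly correlated at one range and strongly correlated at another.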

Out of curiosity, I ran an FKA of the exhaustive variables in an attempt to characterize the areas of the field where ECa and elevation disagree. For the most part, everything agreed except areas near drainage ditches and one interesting location. This location had a very high ECa. Because it was an area of high variation, there was also a sample taken there. This sample had average texture and bulk density but had the highest concentration of cations. Areas near the drainage ditches would have higher water content near the surface than the rest of the field and therefore higher ECa. While the drainage ditches also affect elevation because the surface slopes toward them, this is not as apparent in the measurements. This is likely because the surface slope is low (less than 1% gradient) while the resulting slope in ECa from changing water content is higher. These differences confirm that the ECa measurements carry noise that does not affect elevation.

The third approach worked surprisingly well. The linear model of coregionalization actually resulted in a very logical and physically meaningful fit. One of the ranges corresponds to the distance separating paleo-channels, and the structural correlations are much stronger (above 0.9 for some and above 0.5 for many). After approach two failed, I expected approach three to fail also. Instead of hurting the fit, the added plots apparently helped. My guess is that pooling the data together filters the noise out of the spatial structure, much like averaging does. The averaging results in a more representative model instead of one that fits to the noise. Cross validation results all improved greatly with the exception of n, which decreased slightly.

Since I had the model, I ran a quick FKA of the hydraulic parameters. I will be using these results in conjunction with Mahalanobis distance to detect and remove multivariate outliers next week.
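As a rough sketch of the outlier screen I have in mind, the squared Mahalanobis distance of each sample from the multivariate mean can be computed as below. This is plain Python with a small Gaussian-elimination solver so it runs without extra packages. The function names and the toy numbers are illustrative, and the choice of cutoff (commonly a chi-square quantile) is left open.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def mahalanobis_sq(rows):
    """Squared Mahalanobis distance of every row from the sample mean.

    Uses the sample covariance (n - 1 denominator). rows is a list of
    equal-length lists, one per sample.
    """
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    dev = [[r[j] - mean[j] for j in range(p)] for r in rows]
    cov = [[sum(d[i] * d[j] for d in dev) / (n - 1) for j in range(p)]
           for i in range(p)]
    # d^2 = dev^T cov^-1 dev, computed via a linear solve per sample.
    return [sum(di * yi for di, yi in zip(d, solve(cov, d))) for d in dev]


# Made-up two-variable data; the last sample sits apart from the rest.
samples = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [10.0, 0.0]]
d2 = mahalanobis_sq(samples)
print(max(range(len(d2)), key=d2.__getitem__))  # 4
```

Samples with the largest distances are the candidates for removal. With so few samples, a single extreme point also inflates the covariance it is measured against, so the distances are worth inspecting rather than thresholding blindly.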
