Title: | Process Stream Temperature, Intermittency, and Conductivity (STIC) Sensor Data |
---|---|
Description: | A collection of functions for processing raw data from Stream Temperature, Intermittency, and Conductivity (STIC) loggers. 'STICr' (pronounced "sticker") includes functions for tidying, calibrating, classifying, and doing quality checks on data from STIC sensors. Some package functionality is described in Wheeler/Zipper et al. (2023) <doi:10.31223/X5636K>. |
Authors: | Sam Zipper [aut, cre, cph] , Christopher T. Wheeler [aut] , Stephen Cook [ctb] , Delaney Peterson [ctb] , Sarah Godsey [ctb] |
Maintainer: | Sam Zipper <[email protected]> |
License: | AGPL (>= 3) |
Version: | 1.0 |
Built: | 2024-10-29 22:18:19 UTC |
Source: | https://github.com/heal-kgs/sticr |
This function takes the cleaned data frame generated by tidy_hobo_data and the fitted model object generated by get_calibration. It outputs a data frame with the same columns as the input, plus a calibrated specific conductivity column called SpC.
apply_calibration(stic_data, calibration, outside_std_range_flag = TRUE)
stic_data |
A data frame with a column named |
calibration |
a model object relating |
outside_std_range_flag |
a logical argument indicating whether the user would like to include an additional column flagging (with the letter "O") instances where the calibrated SpC value is outside the range of standards used to calibrate it. |
The same data frame as input, except with a new column called SpC. This will be in the same units as the data used to develop the model calibration.
calibration <- get_calibration(calibration_standard_data)
calibrated_df <- apply_calibration(tidy_stic_data, calibration, outside_std_range_flag = TRUE)
head(calibrated_df)
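The calibration step described above can be sketched in base R without the package. This is only an illustration under stated assumptions, not the 'STICr' implementation: the column names `standard` and `flag`, the example values, and the use of `predict()` on an `lm` fit are all hypothetical choices made for the sketch.

```r
# Hypothetical calibration standards (values and column names are assumptions).
standards <- data.frame(
  standard  = c(100, 250, 500, 1000),       # known SpC of each standard
  condUncal = c(9000, 22000, 45000, 90000)  # raw STIC reading in each standard
)

# Linear model relating SpC to uncalibrated conductivity.
fit <- lm(standard ~ condUncal, data = standards)

# Hypothetical tidied STIC readings to calibrate.
stic <- data.frame(condUncal = c(500, 30000, 120000))
stic$SpC <- predict(fit, newdata = stic)

# Flag calibrated values outside the range of the standards with "O".
std_range <- range(standards$standard)
stic$flag <- ifelse(stic$SpC < std_range[1] | stic$SpC > std_range[2], "O", "")
stic
```

In this sketch the first and third readings fall outside the 100 to 1000 range spanned by the standards, so only they receive the "O" flag.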
Calibrated STIC data used for function examples.
calibrated_stic_data
## 'calibrated_stic_data' A data frame with 1000 rows and 4 columns:
Date and time of measurement.
Raw uncalibrated conductivity recorded by STIC logger.
Temperature recorded by STIC logger.
Specific conductance calculated using 'apply_calibration' function.
AIMS project data.
Example calibration data for STIC sensor for conversion from uncalibrated conductivity to specific conductivity ('SpC').
calibration_standard_data
## 'calibration_standard_data' A data frame with 4 rows and 3 columns:
Serial number for STIC sensor.
Specific conductance ('SpC') standard values used for soaking STIC.
Uncalibrated conductivity recorded by STIC when soaked in each standard.
AIMS project data.
Classified STIC data used for function examples.
classified_df
## 'classified_df' A data frame with 1000 rows and 5 columns:
Date and time of measurement.
Raw uncalibrated conductivity recorded by STIC logger.
Temperature recorded by STIC logger.
Specific conductance calculated using 'apply_calibration' function.
Classified STIC data created by 'classify_wetdry' function.
AIMS project data.
This function classifies STIC data into a binary "wet"/"dry" column. Data can be classified according to any classification variable defined by the user. The user can choose one of two classification methods: an absolute numerical threshold, or a chosen percentage of the maximum value of the classification variable.
classify_wetdry(stic_data, classify_var, threshold, method)
stic_data |
A data frame with STIC data, such as that produced by apply_calibration or tidy_hobo_data. |
classify_var |
Name of the column in data frame you want to use for classification. |
threshold |
This is the user-defined threshold for determining wet versus dry based on the designated classification variable. If using the |
method |
The classification method used to generate the binary data, chosen by the user. |
The same data frame as input, but with a new column called "wetdry".
classified_df <- classify_wetdry(calibrated_stic_data,
  classify_var = "SpC",
  method = "absolute",
  threshold = 200
)
head(classified_df)
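The two classification approaches described above can be illustrated with a short base R sketch. It is an assumption-laden illustration rather than the package's code: the example values, the helper column names `wetdry_abs` and `wetdry_pct`, the use of `>=` at the boundary, and expressing the percentage as a fraction are all hypothetical.

```r
# Hypothetical calibrated readings.
stic <- data.frame(SpC = c(0, 50, 150, 400, 800))

# Absolute method: compare the classification variable to a fixed value.
abs_threshold <- 200
stic$wetdry_abs <- ifelse(stic$SpC >= abs_threshold, "wet", "dry")

# Percentage method: the threshold is a fraction of the observed maximum.
pct <- 0.1  # assumed here to be given as a fraction (10%)
pct_threshold <- pct * max(stic$SpC, na.rm = TRUE)
stic$wetdry_pct <- ifelse(stic$SpC >= pct_threshold, "wet", "dry")
stic
```

The same reading can be classified differently by the two methods (150 is "dry" against the absolute threshold of 200 but "wet" against 10% of the maximum), which is why the choice of method and threshold deserves attention.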
Example field observations that could be compared to classified STIC data.
field_obs
## 'field_obs' A data frame with 5 rows and 3 columns:
Date and time of field observation.
Field observation of stream water status ('wet' or 'dry').
Field observations of specific conductance.
Made up data.
This function fits specific conductivity (SpC) standards and uncalibrated conductivity measured by the STIC to a model object. This model can then be used to predict SpC values using apply_calibration. Currently, only linear models are supported.
get_calibration(calibration_data)
calibration_data |
STIC calibration data frame with columns |
A fitted lm model object relating SpC to the uncalibrated conductivity values measured by the STIC.
head(calibration_standard_data)
lm_calibration <- get_calibration(calibration_standard_data)
summary(lm_calibration)
This function provides multiple options for QAQC flagging of processed and classified STIC data frames, such as those generated by the classify_wetdry function. Users can select which operations are to be performed, and a single new QAQC column is created with all flags concatenated. QAQC options currently include: (1) correction and flagging of negative SpC values resulting from the calibration process, i.e., changing the negative values to 0 and flagging the change; and (2) inspecting the wetdry classification time series for potential deviation anomalies based on user-defined windows.
qaqc_stic_data(
  stic_data,
  spc_neg_correction = TRUE,
  inspect_deviation = TRUE,
  deviation_size = NULL,
  window_size = NULL
)
stic_data |
A data frame with classified STIC data, such as that produced by |
spc_neg_correction |
a logical argument indicating whether the user would like to correct negative SpC values resulting from the calibration process to 0.
The character code associated with this correction is |
inspect_deviation |
a logical argument indicating whether the user would like to identify deviation anomalies, in which a series of wet or dry readings less than or equal to 'deviation_size' in length is surrounded on both sides by 'window_size' or more observations of its opposite.
This operation is meant to identify potentially suspect binary wet/dry data points for further examination.
The character code associated with this operation is |
deviation_size |
a numeric argument specifying the maximum size (i.e., number of observations) of a clustered group of points that can be flagged as a deviation |
window_size |
a numeric argument specifying the minimum size (i.e., number of observations) that the deviation must be surrounded by in order to be flagged |
The same data frame as input, but with new QAQC columns or a single, concatenated QAQC column. The QAQC output can include: "C", meaning the calibrated SpC value was negative (from 'spc_neg_correction'); "D", meaning the point was identified as a deviation anomaly based on a moving window (from 'inspect_deviation'); or "O", meaning the calibrated SpC was outside the standard range (from the function apply_calibration).
qaqc_df <- qaqc_stic_data(classified_df,
  spc_neg_correction = TRUE,
  inspect_deviation = TRUE,
  deviation_size = 4,
  window_size = 96
)
head(qaqc_df)
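The two QAQC operations described above can be sketched in base R as follows. This is a hedged illustration, not the package's algorithm: the helper `flag_deviations()` is hypothetical, it uses run-length encoding (`rle()`) as one plausible way to find short runs surrounded by long runs of the opposite state, and it assumes a complete binary series with no missing values.

```r
# Hypothetical helper: flag runs of length <= deviation_size that are
# surrounded on both sides by runs of length >= window_size.
# In a binary wet/dry series, neighbouring runs are always the opposite state.
flag_deviations <- function(wetdry, deviation_size, window_size) {
  runs <- rle(as.character(wetdry))
  n <- length(runs$lengths)
  flagged_run <- rep(FALSE, n)
  if (n >= 3) {
    for (i in 2:(n - 1)) {
      flagged_run[i] <- runs$lengths[i] <= deviation_size &&
        runs$lengths[i - 1] >= window_size &&
        runs$lengths[i + 1] >= window_size
    }
  }
  # Expand run-level flags back to one flag per observation.
  rep(flagged_run, times = runs$lengths)
}

# Hypothetical classified series and calibrated SpC values.
wetdry <- c(rep("wet", 5), rep("dry", 2), rep("wet", 6))
spc <- c(-3, 120, 130, 125, 118, 2, -1, 110, 115, 121, 119, 117, 116)

# Negative SpC correction: set negative values to 0 and remember where.
neg <- spc < 0
spc[neg] <- 0

# Deviation inspection, then concatenate the flags into one QAQC string.
dev <- flag_deviations(wetdry, deviation_size = 2, window_size = 5)
QAQC <- paste0(ifelse(neg, "C", ""), ifelse(dev, "D", ""))
data.frame(wetdry, spc, QAQC)
```

Here the two-observation dry run is flagged "D" because it sits between wet runs of at least five observations, and the seventh observation carries both flags ("CD") because its SpC was also negative.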
This function is intended to allow the user to visually assess the effects of classification threshold uncertainty on STIC classification. It takes the model object used to calibrate SpC, as well as a classified STIC data frame with column names matching those produced by classify_wetdry.
test_threshold(stic_data, calibration)
stic_data |
classified STIC data frame with the variable names of that produced by classify_wetdry |
calibration |
the model object used to calibrate SpC, generated by the get_calibration function and used in apply_calibration |
A time series plot of classified wet/dry observations through time using three different absolute classification thresholds: the y-intercept of the fitted model developed in get_calibration, the y-intercept plus one standard error, and the y-intercept minus one standard error
lm_calibration <- get_calibration(calibration_standard_data)
threshold_testing_plot <- test_threshold(stic_data = classified_df, calibration = lm_calibration)
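The three thresholds described above (the intercept, and the intercept plus or minus one standard error) can be extracted from any fitted `lm` object with base R. The sketch below is illustrative only: the calibration values and the `standard` column name are hypothetical, and applying the thresholds to SpC with `>=` is an assumption about how such thresholds might be used.

```r
# Hypothetical calibration standards (values and column names are assumptions).
standards <- data.frame(
  standard  = c(100, 250, 500, 1000),
  condUncal = c(9000, 22000, 45000, 90000)
)
fit <- lm(standard ~ condUncal, data = standards)

# Intercept estimate and its standard error from the coefficient table.
coefs <- coef(summary(fit))
intercept <- coefs["(Intercept)", "Estimate"]
se <- coefs["(Intercept)", "Std. Error"]

thresholds <- c(
  minus_se  = intercept - se,
  intercept = intercept,
  plus_se   = intercept + se
)

# Count how many hypothetical SpC readings would be "wet" under each threshold.
spc <- c(-5, 0, 3, 10, 50, 200)
wet_counts <- sapply(thresholds, function(th) sum(spc >= th))
wet_counts
```

If the counts differ noticeably across the three thresholds, the classification is sensitive to calibration uncertainty near the wet/dry boundary.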
This function loads raw HOBO STIC CSV files and cleans up columns and headers.
tidy_hobo_data(infile, outfile = FALSE, convert_utc = TRUE)
infile |
filename (including path or URL if needed) for a raw CSV file exported from HOBOware. |
outfile |
filename (including path if needed) to save the tidied data frame. Defaults to |
convert_utc |
a logical argument indicating whether the user would like to convert from the time zone associated with their CSV to UTC |
a tidied data frame with the following column names: datetime, condUncal, tempC.
clean_data <- tidy_hobo_data(
  infile = "https://samzipper.com/data/raw_hobo_data.csv",
  outfile = FALSE,
  convert_utc = TRUE
)
head(clean_data)
Example tidied STIC data for input to calibration and classification process.
tidy_stic_data
## 'tidy_stic_data' A data frame with 1000 rows and 3 columns:
Date and time of measurement.
Raw uncalibrated conductivity recorded by STIC logger.
Temperature recorded by STIC logger.
AIMS project data.
This function trims a tidied HOBO data frame by datetime to eliminate periods where the logger was recording but not placed in the stream network.
trim_hobo_data(
  stic_data,
  time_start = "2021-07-16 18:00:00",
  time_end = "2021-07-27 01:00:00"
)
stic_data |
A data frame with columns named |
time_start |
User enters the time at which the logger was placed in the stream network |
time_end |
User enters the time at which the logger was removed from the stream network |
a tidied data frame with the same columns as the input, but trimmed to the user-defined time window
trimmed_data <- trim_hobo_data(tidy_stic_data,
  time_start = "2021-07-16 18:00:00",
  time_end = "2021-07-27 01:00:00"
)
head(trimmed_data)
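Trimming by datetime amounts to a logical subset on a POSIXct column. The base R sketch below shows the idea with made-up readings. The UTC time zone, the inclusive boundaries, and the example values are assumptions for illustration, not a statement of how trim_hobo_data handles them.

```r
# Hypothetical tidied readings at hourly intervals.
stic <- data.frame(
  datetime = seq(as.POSIXct("2021-07-16 16:00:00", tz = "UTC"),
                 by = "1 hour", length.out = 6),
  condUncal = c(0, 0, 41000, 42500, 43000, 0)
)

# Deployment window (assumed inclusive at both ends).
time_start <- as.POSIXct("2021-07-16 18:00:00", tz = "UTC")
time_end   <- as.POSIXct("2021-07-16 20:00:00", tz = "UTC")

# Keep only rows recorded while the logger was in the stream.
trimmed <- stic[stic$datetime >= time_start & stic$datetime <= time_end, ]
trimmed
```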
This function takes a data frame with field observations of wet/dry status and SpC and generates both a confusion matrix for the wet/dry observations and a scatterplot comparing estimated SpC from the STICs to field-measured values.
validate_stic_data(
  stic_data,
  field_observations,
  max_time_diff,
  join_cols,
  get_SpC
)
stic_data |
classified STIC data frame with the variable names of that produced by classify_wetdry. At a minimum, there must be |
field_observations |
The input data frame of field observations must include a |
max_time_diff |
Maximum allowed time difference (in minutes) between field observation and STIC reading to be counted as a match. |
join_cols |
A named vector of columns that need to be matched between |
get_SpC |
Logical flag whether to get STIC data for SpC ( |
The field_observations data frame with new columns indicating the closest-in-time STIC wetdry classification (wetdry_STIC), SpC measurement (SpC_STIC; only if get_SpC = T), and time difference between the field observation and STIC reading (timediff_min).
stic_validation <- validate_stic_data(
  stic_data = classified_df,
  field_observations = field_obs,
  max_time_diff = 30,
  join_cols = NULL,
  get_SpC = TRUE
)
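The closest-in-time matching described above can be sketched in base R. This is a minimal illustration under stated assumptions rather than the package's method: the example data are made up, the time difference is reported as an absolute value in minutes, and observations with no STIC reading within `max_time_diff` are given NA.

```r
# Hypothetical classified STIC readings every 15 minutes.
stic <- data.frame(
  datetime = as.POSIXct("2021-07-16 12:00:00", tz = "UTC") + 60 * 15 * 0:7,
  wetdry = c("wet", "wet", "wet", "dry", "dry", "dry", "wet", "wet")
)

# Hypothetical field observations.
field <- data.frame(
  datetime = as.POSIXct(c("2021-07-16 12:20:00",
                          "2021-07-16 13:10:00",
                          "2021-07-16 16:00:00"), tz = "UTC"),
  wetdry = c("wet", "dry", "wet")
)

max_time_diff <- 30  # minutes

# For each field observation, find the nearest STIC reading in time.
matches <- lapply(seq_len(nrow(field)), function(i) {
  d <- abs(as.numeric(difftime(stic$datetime, field$datetime[i], units = "mins")))
  j <- which.min(d)
  if (d[j] <= max_time_diff) {
    data.frame(wetdry_STIC = stic$wetdry[j], timediff_min = d[j])
  } else {
    data.frame(wetdry_STIC = NA_character_, timediff_min = NA_real_)
  }
})
field <- cbind(field, do.call(rbind, matches))

# Confusion matrix of field versus STIC classifications (NA rows are dropped).
confusion <- table(field = field$wetdry, STIC = field$wetdry_STIC)
field
confusion
```

The third field observation is more than 30 minutes from any STIC reading, so it is left unmatched and does not contribute to the confusion matrix.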