OPUSraw_to_Preprocessed.Rd
This function loads a set of raw OPUS spectra files into R, processes them into a nested tibble format (prospectr-friendly),
and applies several pre-processing techniques.
Path to a folder containing the raw spectra.
Folder in which the produced files should be saved.
logical: should the final output be saved as RDS in 'save_location'?
logical: should the loaded raw spectra be saved as RDS in 'save_location'? This can be used to reload the raw spectra when calling this function again, which can save considerable time when loading a large number of spectra.
logical: should the final output be returned by the function? Set to T when using the function in a pipeline or assigning the result to a variable. Otherwise set to F, or the output will be printed to the console and cause lag.
Which characters in the sample column strings are the unique sample identifiers? Provide a vector of start and end positions. E.g., the default for BDF spec: full file names are "BDF-XXXX-y.....", with XXXX being the sample name. Therefore, the default is set to c(5,8).
logical: should a 'raw_spc' RDS file stored in 'save_location' from a previous run be loaded instead of reading the raw OPUS files?
Sample names are converted to full uppercase (removes case typos). Parametrised for compatibility (option to turn it off).
Vector of dates in string format used to split all raw OPUS files into groups for read-in. This is needed because recalibration of the FT wavenumbers introduces a small shift which prevents a direct merge of spectra by wavenumbers. Subsets are remerged after resampling to 1 cm-1. Date_split string format should conform to the default POSIXct format: either "YYYY-MM-DD" or "YYYY-MM-DD HH:MM:SS".
Locates the raw spc columns in the aggregated raw spectra dataframe. Default is for scans from 7500-400 cm-1 at 1 cm-1 (actually at about 0.6 cm-1), yielding 13860 spc columns. You can input either the total number of spc columns (e.g. 13860), the start and end index of the spc columns (c(5,13864)), or the full vector of spc column indices (c(5:13864)).
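The argument conventions above (sample identifier positions, the three accepted forms of the spc column locator, and date-based grouping) can be illustrated with a minimal base R sketch. The helper names extract_sample_id, resolve_spc_cols and group_by_date are hypothetical and are not part of this package; the assumption that the spc columns start at column 5 when only a count is given is inferred from the default c(5,13864), and the actual function may handle these inputs differently.

```r
# Hypothetical helpers, NOT the package's implementation.

# Sample ID: take the characters between the start and end position
# of each file name, optionally converting them to uppercase.
extract_sample_id <- function(file_names, id_pos = c(5, 8), to_upper = TRUE) {
  ids <- substr(file_names, id_pos[1], id_pos[2])
  if (to_upper) toupper(ids) else ids
}

# spc columns: accept a total count, a start/end pair, or a full index vector.
# Assumption: with a count only, the spectra start at column `first_col`.
resolve_spc_cols <- function(spc_cols, first_col = 5) {
  if (length(spc_cols) == 1) {
    seq(first_col, length.out = spc_cols)
  } else if (length(spc_cols) == 2) {
    seq(spc_cols[1], spc_cols[2])
  } else {
    spc_cols
  }
}

# Date split: assign each measurement time to a group delimited by the
# split dates (group 1 = before the first split date, and so on).
group_by_date <- function(measured, date_split) {
  breaks <- sort(as.POSIXct(date_split, tz = "UTC"))
  findInterval(as.numeric(measured), as.numeric(breaks)) + 1L
}

# Usage with made-up file names and dates
extract_sample_id(c("BDF-ab12-1.0", "BDF-0042-2.0"))
range(resolve_spc_cols(13860))
measured <- as.POSIXct(c("2022-12-01", "2023-03-01", "2023-07-01"), tz = "UTC")
group_by_date(measured, c("2023-01-15", "2023-06-01"))
```

Grouping by date before merging reflects the reason given for the date split: spectra recorded under different wavenumber calibrations would be read and resampled per group, and only then combined.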
If save_raw=T and reload=F, the initially read-in raw OPUS files are gathered (for details see the simplerspec documentation) and saved as an RDS object named raw_spc in save_location.
If save=T, the final object containing the differently pre-processed spectra sets is saved in save_location as spc_data.
If return=T, the final object containing the differently pre-processed spectra is