--- title: "SMUFASA Workflow Example" author: "Jennifer McNichol" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{SMUFASA Workflow Example} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- # Load Packages ```{r message=FALSE, warning=FALSE} library(QFASA) library(dplyr) library(compositions) ``` # Modeling Inputs Prior to starting make sure that: - Fatty acid names in all files are the same (contain the exact same numbers/characters and punctuation) - There are no fatty acids in the prey file that do not appear in the predator file and visa versa ## Fatty Acid Set - This is the list of FAs to be used in the modelling. - The simplest alternative is to load a `.csv` file that contains a single column with a header row with the names of thee fatty acids listed below (see example file **"FAset.csv"**). - A more complicated alternative is to load a `.csv` file with the full set of FAs and then add code to subset the FAs you wish to use from that set -> *this alternative is useful if you are planning to test multiple sets*. ```{r} data(FAset) fa.set = as.vector(unlist(FAset)) ``` ## Matrix of Predator FA Signatures - The FA signatures in the originating `.csv` file should be proportions summing to 1. - Each predator signature is a row with the FAs in columns (see example file **"predatorFAs.csv"**). The FA signatures are subsetted for the chosen FA set (created above) and renormalized during the modelling so there is no need to subset and/or renormalize prior to loading the .csv file or running `p.SMUFASA` BUT make sure that the same FAs appear in the predator and prey files. - Your predator FA `.csv` file can contain as much tombstone data columns as you like, you must extract the predator FA signatures as separate input in order to run in `p.SMUFASA`. For example: in the code below, the predator `.csv` file ("`predatorFAs.csv`") has 4 tombstone columns (SampleCode, AnimalCode, SampleGroup, Biopsy). Prior to running `p.SMUFASA`, the tombstone (columns 1-4) and FA data (columns 5 onward) are each extracted from the original data frame. The FA data becomes the `predator.matrix` (which is passed to `p.SMUFASA`) and the tombstone data is retained so that it can be recombined with the model output later on. ```{r} data(predatorFAs) tombstone.info = predatorFAs[,1:4] predator.matrix = predatorFAs[,5:(ncol(predatorFAs))] npredators = nrow(predator.matrix) ``` ## Matrix of Prey FA Signatures - The FA signatures in the `.csv` file should be proportions summing to 1. - The prey file should contain all of the individual FA signatures of the prey and their lipid contents (where appropriate). - If you want to only include a subset of prey species, you must extract it prior to input (see code below). ```{r} data(preyFAs) prey.matrix = preyFAs[,-c(1,3)] # Selecting 5 prey species to include spec.red <-c("capelin", "herring", "mackerel", "pilchard", "sandlance") spec.red <- sort(spec.red) prey.red <- prey.matrix %>% filter(Species %in% spec.red) ``` ## Prey Lipid Content - Mean lipid content by species group is calculated from the full prey file using the species group as a summary variable (see code below). - **Note**: If no lipid content correction is going to be applied, then a vector of 1s of length equal to the number of species groups is used as the vector instead. I.e. `FC - rep(1,nrow(prey.matrix))`. - If you've decided on a subset of species, you must extract them from the mean lipid content vector as well. ```{r} FC = preyFAs[,c(2,3)] FC = FC %>% arrange(Species) FC.vec = tapply(FC$lipid,FC$Species,mean,na.rm=TRUE) FC.red <- FC.vec[spec.red] ``` # Running SMUFASA ```{r eval=FALSE} smufasa.est <- p.SMUFASA(predator.matrix, prey.red, FC.red, fa.set) ``` ### p.SMUFASA Output The MUFASA output is a list with 4 components: - Cal_Estimates - Diet_Estimates - nll - Var_Epsilon ## Calibration Coefficient Estimates A vector of length equal to the number of FAs used and whose sum is the total number of FAs used. Thos is a set of calibration coefficients common to all predators used. ````{r eval=FALSE} CalEst <- smufasa.est$Cal_Estimates ```` ## Diet Estimates The diet estimate vector returned by `p.SMUFASA` represents an overall common diet for all predators in `predator.matrix` . **Note**: If you wish to obtain diet estimates for each individual predator see the steps below. ````{r eval=FALSE} DietEst <- smufasa.est$Diet_Estimates ```` ## nll This is a vector of the negative log likelihood values at each iteration of the optimizer. ````{r eval=FALSE} nll <- smufasa.est$nll ```` ## Var_Epsilon This is the optimized diagonal values of the variance-covariance matrix of the errors. See reference in help file for details. ```{r eval=FALSE} VarEps <- smufasa.est$Var_Epsilon ``` # Obtaining Diet Estimates Once a vector of calibration coefficients is obtained via `p.SMUFASA` you can pass this vector to `p.QFASA` or `p.MUFASA` to obtain individual diet estimates. ## QFASA ```{r eval=FALSE} Q = p.QFASA(predator.matrix=predator.matrix, prey.matrix=prey.matrix, cal.mat=CalEst, dist.meas=2, gamma=1, FC=FC.red, start.val=rep(1,nrow(prey.red)), ext.fa=fa.set) ``` QFASA Diet Estimates: ```{r, eval=FALSE} DietEst.Q <- Q$'Diet Estimates' ``` *See* `p.QFASA` *documnetation for further details.* ## MUFASA ```{r,eval=FALSE} M <- p.SMUFASA(predator.matrix, prey.red, cal.est, FC.red, fa.set) DietEst.M <- M$Diet_Estimates ``` *See* `p.MUFASA` *documnetation for further details.*