---
title: "Units and priors"
date: "2024-05-14"
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
  %\VignetteIndexEntry{Units and priors}
output:
  rmarkdown::html_vignette
---


In this vignette, we will discuss two important points which are related to each other: the parameters physical units and how to choose parameter priors. Those two are related since the biological meaning of the priors necessarily depends on the units used for parameters.

## Parameter units

Some parameters of the network models are unitless, but those which aren't have units which depend on the **units used for the data**.

This is an overview of the different types of parameters that are used in a network model:

| Parameter name           | Description                                  | Unit                | Domain                   |
|--------------------------|----------------------------------------------|---------------------|--------------------------|
| `upsilon_<src>_to_<dst>` | Uptake rate from `src` to `dst`              | Depends on the data | $\left[0;+\infty\right[$ |
| `lambda_<src>`           | Loss rate from `src`                         | Depends on the data | $\left[0;+\infty\right[$ |
| `portion.act_<src>`      | Active portion of `src`                      | Unitless            | $\left[0;1\right]$       |
| `eta`                    | Default: coefficient of variation (tracer proportion) | Unitless            | $\left[0;+\infty\right[$ |
| `zeta`                   | Default: coefficient of variation (biomass)           | Unitless            | $\left[0;+\infty\right[$ |

### Uptake rates

Uptake rates (`upsilon_<src>_to_<dst>`) are expressed in **proportion of compartment material transferred per unit of time**. For example: 0.05 $\textrm{day}^{-1}$ means that 5% of `src` is transferred to `dst` per day.

The actual time unit corresponds to the time unit used in the observation data. For example, if the observation data give sampling times as hours since the beginning of the experiment, then the uptake rates will be in $\textrm{hour}^{-1}$.

### Loss rates

Loss rates (`lambda_<src>`) are also expressed in **proportion of compartment material transferred (lost) per unit of time**. For example: 0.1 $\textrm{week}^{-1}$ means that 10% of `src` is lost and exits the network system per week.

Again, the actual time unit corresponds to the time unit used in the observation data. It is the same as for uptake rates.

### Active portions

An active portion is the proportion of a compartment initial standing stock that is being involved in flows in the network, and must be comprised between 0 and 1. For example, an active portion of 0.25 means that only 25% of the initial compartment standing stock is used when calculating uptakes by other compartments and losses, while 75% of the initial compartment standing stock behaves as a refractory portion which is not involved in the network.

Active portions are useful to model compartments for which only some of the content is involved in exchanges at the time scale of the experiment (e.g. algal and bacterial layers growing on dead leaves in a stream). They can explain why consumers feeding selectively on this active portion might exhibit higher isotopic enrichments than the apparent enrichment of the consumed compartment.

### Coefficients of variations

By default, $\eta$ and $\zeta$ are the coefficients of variation for the distribution of observed isotopic proportions and compartment sizes, respectively, around their predicted mean values. As coefficients of variation, they are the ratio between the standard deviation and the mean of the distributions they describe, and are thus unitless.

For $\eta$, several parameterizations are actually implemented in `isotracer` and can be set with `set_prop_family()`:

- `"gamma_cv"`: observed proportions are assumed to follow a **gamma** distribution with **mean** the predicted proportions and **coefficient of variation** $\eta$ (this is the default).

- `"normal_cv"`: observed proportions are assumed to follow a **normal** distribution with **mean** the predicted proportions and **coefficient of variation** $\eta$.

- `"normal_sd"`: observed proportions are assumed to follow a **normal** distribution with **mean** the predicted proportions and the **standard deviation** $\eta$ (note that in this case $\eta$ has the same physical unit as the provided compartment sizes).

- `"beta_phi"`: observed proportions are assumed to follow a **beta** distribution with **mean** the predicted proportions a **precision parameter** $\phi = \eta$.

## Choosing priors

Once we understand what the physical units of each parameter are, we can make sensible choices for priors.

For **uptake and loss rates**, normal priors centered at 0 are reasonable choices since they allow for any positive value but gives more weight to smaller values and can be scaled by their standard deviation. For example, if we consider a stream ecosystem for which the time data was given in days, allowing for 25% of a compartment biomass to be renewed every day is already pretty generous, so a normal prior with standard deviation 1 is a slightly informative but still very permissive prior.

<div class="alert-primary" role="alert">

For a normal prior centered at 0 (and given that priors are truncated at 0 for rates in `isotracer`), about 68% of the probability mass is located between 0 and one standard deviation, and about 95% of the probability mass is between 0 and two standard deviations.

For example, for a prior `normal_p(0, 5)`:

- 68% of the prior probability mass is within [0,5] (one sigma).
- 95% of the prior probability mass is within [0,10] (two sigmas).
- 99.7% of the prior probability mass is within [0,15] (three sigmas).

</div>

For **active portions**, beta priors can be used if the user has any preliminary knowledge about expected active portions, but since the interval is bounded on $\left[0;1\right]$ a uniform prior is proper and acceptable.

For **coefficients of variation**, normal priors are again a reasonable choice. For example, a normal prior `normal_p(0, 2)` with standard deviation 2 is not very informative but gives more weight to coefficients of variation below 200%.

## Setting priors

For now, a few different types of priors can be specified by the user when building a network model. Let's illustrate this with our usual example of an aquarium with NH4, algae and Daphnia:


```r
library(isotracer)
mod <- new_networkModel() %>% set_topo("NH4 -> algae -> daphnia -> NH4")
```

What are the parameters of the model?


```r
params(mod)
```

```
## # A tibble: 8 × 2
##   in_model                 value
##   <chr>                    <dbl>
## 1 eta                         NA
## 2 lambda_algae                NA
## 3 lambda_daphnia              NA
## 4 lambda_NH4                  NA
## 5 upsilon_algae_to_daphnia    NA
## 6 upsilon_daphnia_to_NH4      NA
## 7 upsilon_NH4_to_algae        NA
## 8 zeta                        NA
```

The priors available in **isotracer** are:

- `constant_p(value)`: This is a special prior, where the parameter value is actually fixed to a constant value given by the user.
- `uniform_p(min, max)`: A uniform prior.
- `hcauchy_p(scale)`: The half-Cauchy prior we mentioned earlier, with `hcauchy_p(0.1)` being the default prior.
- `scaled_beta_p(alpha, beta, scale)`: A scaled Beta prior. This prior is basically a regular Beta prior with parameters $\alpha$ and $\beta$, which is then stretched so that it is defined on $\left[0;\textrm{scale}\right]$. A `scaled_beta(alpha, beta, 1)` is thus a regular Beta prior defined on $\left[0;1\right]$. Another way to see it is that if a random variable $X$ follows a scaled Beta distribution with parameters `(alpha, beta, scale)` then $X$/`scale` follows a Beta distribution with parameters `(alpha, beta)`.
- `exponential_p(lambda)`: An exponential prior with rate = `lambda`.
- `gamma_p(alpha, beta)`: A gamma prior with shape = `alpha` and rate = `beta`.

Priors of a network model are set using the `set_priors()` function:


```r
mod <- mod %>% set_priors(uniform_p(0, 10), "zeta")
priors(mod)
```

```
## # A tibble: 8 × 2
##   in_model                 prior                  
##   <chr>                    <list>                 
## 1 eta                      <NULL>                 
## 2 lambda_algae             <NULL>                 
## 3 lambda_daphnia           <NULL>                 
## 4 lambda_NH4               <NULL>                 
## 5 upsilon_algae_to_daphnia <NULL>                 
## 6 upsilon_daphnia_to_NH4   <NULL>                 
## 7 upsilon_NH4_to_algae     <NULL>                 
## 8 zeta                     <uniform(min=0,max=10)>
```

Note that by default `set_priors()` assigns priors to all parameters that match the name provided:


```r
mod <- mod %>% set_priors(scaled_beta_p(4, 8, 5), "lambda")
priors(mod)
```

```
## # A tibble: 8 × 2
##   in_model                 prior                                
##   <chr>                    <list>                               
## 1 eta                      <NULL>                               
## 2 lambda_algae             <scaled_beta(alpha=4,beta=8,scale=5)>
## 3 lambda_daphnia           <scaled_beta(alpha=4,beta=8,scale=5)>
## 4 lambda_NH4               <scaled_beta(alpha=4,beta=8,scale=5)>
## 5 upsilon_algae_to_daphnia <NULL>                               
## 6 upsilon_daphnia_to_NH4   <NULL>                               
## 7 upsilon_NH4_to_algae     <NULL>                               
## 8 zeta                     <uniform(min=0,max=10)>
```

If you want to avoid using partial matching, you can use the argument `use_regexp = FALSE`:


```r
mod <- mod %>% set_priors(constant_p(0.2), "eta", use_regexp = FALSE)
priors(mod)
```

```
## # A tibble: 8 × 2
##   in_model                 prior                                
##   <chr>                    <list>                               
## 1 eta                      <constant(value=0.2)>                
## 2 lambda_algae             <scaled_beta(alpha=4,beta=8,scale=5)>
## 3 lambda_daphnia           <scaled_beta(alpha=4,beta=8,scale=5)>
## 4 lambda_NH4               <scaled_beta(alpha=4,beta=8,scale=5)>
## 5 upsilon_algae_to_daphnia <NULL>                               
## 6 upsilon_daphnia_to_NH4   <NULL>                               
## 7 upsilon_NH4_to_algae     <NULL>                               
## 8 zeta                     <uniform(min=0,max=10)>
```

Notice in the last example how `eta` was affected while `zeta` was not.

<nav aria-label="Page navigation">
 <ul class="pagination justify-content-end">
  <li class="page-item"><a class="page-link" href="tutorial-050-fixed-effects.html">Previous: Including fixed effects of covariates</a></li>
  <li class="page-item"><a class="page-link" href="tutorial-080-mcmc-output-format.html">Next: MCMC output format</a></li>
 </ul>
</nav>