--- title: "Marked Point Process" author: "Philip Mostert" date: "`r Sys.Date()`" bibliography: '`r system.file("References.bib", package="PointedSDMs")`' biblio-style: authoryear output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Marked Point Process} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = FALSE, warning = FALSE, message = FALSE ) ``` Real datasets are often fairly complex, and information on the species other than their location data may also be collected. Additional information could be the length of a specific plant, or the weight of a group of mammals, something which describes the underlying characteristics of the studied species'. Using such information results in a *marked point process (*see @illian2012toolbox for an overview); and this framework may be incorporated in the *PointedSDMs* R package*.* This vignette gives an illustration on how include marks in the model framework, by using data on Eucalyptus globulus (common name: blue gum) on the *Koala Conservation Center on Philip Island* (Australia), collected by a community group between 1993 and 2004 [@moore2010palatability]. The dataset contains a multitude of different marks, however for this study we will be focusing on two: *food*, which is some index of the food value of the tree (calculated as dry matter intake multiplied by foliage palatability), and *koala*, describing the number of koala visits to each tree. No inference of the model is completed in this vignette due to computational intensity of the model. However the *R* script and data are provided below so that the user may carry out inference. ```{r, setup} library(spatstat) library(PointedSDMs) library(sf) library(sp) library(ggplot2) library(inlabru) library(INLA) library(fmesher) ``` ```{r load_data} data(Koala) eucTrees <- Koala$eucTrees boundary <- Koala$boundary ``` ```{r, clean_data, echo = FALSE,fig.width=7, fig.height=5} proj <- "+init=epsg:27700" boundary <- as(boundary, 'sf') st_crs(boundary) <- proj euc <- st_as_sf(x = eucTrees, coords = c('E', 'N'), crs = proj) euc$food <- euc$FOOD/1000 euc <- euc[unlist(st_intersects(boundary, euc)),] class(trees) mesh = fm_mesh_2d_inla(boundary = boundary, max.edge = c(10, 20), offset = c(15,20), cutoff = 2) mesh$crs <- st_crs(proj) ggplot() + geom_sf(data = st_boundary(boundary)) + geom_sf(data = euc, aes(color = koala)) + ggtitle('Plot showing number of koalas at each site') ggplot() + geom_sf(data = st_boundary(boundary)) + geom_sf(data = euc, aes(color = food)) + ggtitle('Plot showing the food value index at each site') ``` Lets first create a model for the tree locations only: in this example, we will assume that these locations are treated as *present-only* data. ```{r, points_only,fig.width=7, fig.height=5} points <- startISDM(euc, Boundary = boundary, Projection = proj, Mesh = mesh) points$specifySpatial(sharedSpatial = TRUE, prior.range = c(120, 0.1), prior.sigma = c(1, 0.1)) pointsModel <- fitISDM(points, options = list(control.inla = list(int.strategy = 'eb'))) pointsPredictions <- predict(pointsModel, mask = boundary, mesh = mesh, predictor = TRUE) pointsPlot <- plot(pointsPredictions, variable = 'mean', plot = FALSE) pointsPlot$predictions$predictions$mean + gg(euc) ``` Now lets add the marks to this framework by specifying the name of the response variable of the marks with the argument `markNames`, and the family of the marks with the argument `markFamily`. Doing so will model each marks as a separate observation process, based on the family specified. ```{r, include_marks,fig.width=7, fig.height=5} marks <- startMarks(euc, Boundary = boundary, Projection = proj, markNames = c('food', 'koala'), markFamily = c('gamma', 'poisson'), Mesh = mesh) marks$specifySpatial(sharedSpatial = TRUE, prior.range = c(120, 0.1), prior.sigma = c(1, 0.1)) marks$specifySpatial(Mark = 'koala', prior.range = c(120, 0.1), prior.sigma = c(1, 0.1)) marks$specifySpatial(Mark = 'food', prior.range = c(60, 0.1), prior.sigma = c(1, 0.1)) marksModel <- fitISDM(marks, options = list(control.inla = list(int.strategy = 'eb'), safe = TRUE)) foodPredictions <- predict(marksModel, mask = boundary, mesh = mesh, marks = 'food', spatial = TRUE, fun = 'exp') koalaPredictions <- predict(marksModel, mask = boundary, mesh = mesh, marks = 'koala', spatial = TRUE) plot(foodPredictions, variable = c('mean', 'sd')) plot(koalaPredictions, variable = c('mean', 'sd')) ``` For the second mark model we only include the *food* mark, but this time we use a *log-linear* model with additive Gaussian noise. This is specified using the `.$updateFormula` slot function, and then by adding the scaling component to the *inlabru* components with the `.$addComponents` slot function to ensure that it is actually estimated. Moreover we assume *penalizing complexity* priors for the two spatial effects, as well as specify *bru_max_iter* in the *options* argument to keep the time to estimate down. ```{r, marks_add_scaling,fig.width=7, fig.height=5} marks2 <- startMarks(euc, Boundary = boundary, Projection = proj, markNames = 'food', markFamily = 'gaussian', Mesh = mesh) marks2$updateFormula(Mark = 'food', newFormula = ~ exp(food_intercept + (shared_spatial + 1e-6)*scaling + food_spatial)) marks2$changeComponents(addComponent = 'scaling(1)') marks2$specifySpatial(sharedSpatial = TRUE, prior.sigma = c(1, 0.01), prior.range = c(120, 0.01)) marks2$specifySpatial(Mark = 'food', prior.sigma = c(1, 0.01), prior.range = c(120, 0.01)) marksModel2 <- fitISDM(marks2, options = list(control.inla = list(int.strategy = 'eb'), bru_max_iter = 2, safe = TRUE)) predsMarks2 <- predict(marksModel2, mask = boundary, mesh = mesh, formula = ~ (food_intercept + (shared_spatial + 1e-6)*scaling + food_spatial)) plot(predsMarks2) ```