---
title: "IPCW Cumulative Cost"
author: Klaus Holst & Thomas Scheike
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    fig_caption: yes
    # fig_width: 7.15
    # fig_height: 5.5 
header-includes: 
  - \usepackage{tikz}
vignette: >
  %\VignetteIndexEntry{IPCW Cumulative Cost}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  #dev="svg",
  dpi=50,
  fig.width=7, fig.height=5.5,
  out.width="600px",
  fig.retina=1,
  comment = "#>"
)
fullVignette <- Sys.getenv("_R_FULL_VIGNETTE_") %in% c("1","TRUE")
library(mets)
```

Overview 
========

We here describe how to do regression modelling for cumulative cost
\begin{align*}
{\cal U}(t) & = \int_0^t Z(s) dN(s)
\end{align*}
where $N(s)$ is a counting process that registers the times at which the cost is realized and accumulated, 
and $Z(t)$ is the cost (or marks) at the event times. The counting process can be 
a mix of random and fixed times. The data would thus be represented in counting process format with
the marks/costs going along with the event times. There are many  additional uses of
such cumulative process, for example, when considering time-lost in a recurrent events setting, that we return to
below. 

We can estimate the marginal mean of the cumulative process
\begin{align*}
\nu(t) & = E ( {\cal U}(t) )  
\end{align*}
possibly for strata with standard errors based on the derived influence function. 

We provide semi-parametric regression modelling using the  proportional model 
\begin{align*}
E ( {\cal U}(t) | X) & = \Lambda_0(t) \exp( X^T \beta).
\end{align*}

In addition for a fixed time-point $t \in [0,\tau]$ we can estimate the mean given covariates
\begin{align*}
E ( {\cal U}(t) | X) & = \exp( X^T \beta) 
\end{align*}
where $\tau$ is some maximum follow-up time. 

 - These quantities are estimated in a setting with independent right-censoring given $X$, and based on IPCW adjusted estimating equations. 
    - similarly to the Ghosh-Lin model for recurrent events 
 - A terminal event can be specified. 

We also estimate the probability of exceeding thresholds over time
\begin{align*}
 P ( {\cal U}(t)  > k ) & = \mu_k(t),
\end{align*}
and in the situation with a terminal this is based on a derived competing risks data that 
keeps track of the competing terminal event. 

\begin{tikzpicture}[
    node distance=3.5cm,
    box/.style={draw, rectangle, minimum width=2.8cm, minimum height=1cm, align=center}
]
% Nodes
\node[box] (start) {At Risk};
\node[box, right of=start, yshift=2.4cm] (exceed) {Exceed-k};
\node[box, right of=start, yshift=-2.4cm] (death) {D};

% Arrows with hazard labels
\draw[->, thick] (start) -- node[below] {\hspace{2 cm} $F_{exceed-k}(t)$} (exceed);
\draw[->, thick] (start) -- node[below] {$F_{D}(t)$} (death);
\end{tikzpicture}

Regression modelling of this quantity is also possible using competing risks 
regression models, using for example, the cifreg function of mets. 


HF-action data
=============

Considering the HF-action data we simulate a severity score for each event. 


```{r}
library(mets)
data(hfactioncpx12)
hf <- hfactioncpx12
hf$severity <- abs((5+rnorm(741)*2))[hf$id]

proc_design <- mets:::proc_design
## marginal mean using formula  
outNZ <- recurrentMarginal(Event(entry,time,status)~strata(treatment)+cluster(id)
			 +marks(severity),hf,cause=1,death.code=2)
plot(outNZ,se=TRUE)
summary(outNZ,times=3) 

outN <- recurrentMarginal(Event(entry,time,status)~strata(treatment)+cluster(id),data=hf,
		       cause=1,death.code=2)
plot(outN,se=TRUE,add=TRUE)
summary(outN,times=3) 
```

For comparison we also compute the IPCW estimates with and without marks at time 3, using the linear model, 
and note that they are identical. Standard errors are however based on different formula that are asymptotically 
equivalent, and we note that they are very similar.

```{r}
outNZ3 <- recregIPCW(Event(entry,time,status)~-1+treatment+cluster(id)+marks(severity),data=hf,
		  cause=1,death.code=2,time=3,cens.model=~strata(treatment),model="lin")
summary(outNZ3)
head(iid(outNZ3))

outN3 <- recregIPCW(Event(entry,time,status)~-1+treatment+cluster(id),data=hf,cause=1,death.code=2,time=3,
		 cens.model=~strata(treatment),model="lin")
summary(outN3)
head(iid(outN3))
```

We also apply the semiparametric proportional cost model with IPCW adjustment: 

```{r}
propNZ <- recreg(Event(entry,time,status)~treatment+marks(severity)+cluster(id),data=hf,cause=1,death.code=2)
summary(propNZ) 
plot(propNZ,main="Baselines")
     
GL <- recreg(Event(entry,time,status)~treatment+cluster(id),hf,cause=1,death.code=2)
summary(GL)
plot(GL,add=TRUE,col=2)
```
Those treated have 14 \% lower cumulative severity and 11\%  lower number of expected events. 


Exceed threshold
=================

Finally, we also estimate the probability of exceeding a cumulative severity at 1,5,10

```{r}
ooNZ <- prob.exceed.recurrent(Event(entry,time,status)~strata(treatment)+cluster(id)+marks(severity),data=hf,
			      cause=1,death.code=2,exceed=c(1,5,10,20))
plot(ooNZ,strata=1)
plot(ooNZ,strata=2,add=TRUE)
summary(ooNZ,times=3)
```

Cumulative time lost for recurrent events 
=========================================

The cumulative time lost for recurrent events has been defined as
\begin{align*}
  {\cal M}(t) = E[ \int_0^\tau (\tau-s) dN(s) ] = \int_0^\tau \mu(s) ds
\end{align*}
where $\mu(t) = E( N(t) )$ is the marginal mean of the recurrent events at time $t$. 


```{r}
hf$lost5 <- 5-hf$time

RecLost <- recregIPCW(Event(entry,time,status)~-1+treatment+cluster(id)+marks(lost5),data=hf,
		   cause=1,death.code=2,time=5,cens.model=~strata(treatment),model="lin")
summary(RecLost)
head(iid(RecLost))
```


SessionInfo
============


```{r}
sessionInfo()
```