---
title: "WIP: Cooking survival data, 5 minute recipes"
author: Klaus Holst & Thomas Scheike
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    fig_caption: yes
    toc: true
    # fig_width: 7.15
    # fig_height: 5.5        
header-includes: 
  - \usepackage{tikz}
  - \usetikzlibrary{positioning, arrows.meta,calc}
vignette: >
  %\VignetteIndexEntry{Cooking survival data, 5 minute recipes}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  ##dev="png",
  dpi=50,
  fig.width=7.15, fig.height=5.5,
  out.width="600px",
  fig.retina=1,
  comment = "#>"  
)
```

# Overview 

Simulation of survival data is important for both
theoretical and practical work. In a practical setting we might wish to
check that standard errors are valid even in a rather small sample,
or that a complicated procedure works as intended.
It is therefore useful to have simple tools for generating survival data
that look as much as possible like a particular data set. In a theoretical
setting we are often interested in evaluating the finite sample
properties of a new procedure in different settings, which are often
motivated by a specific practical problem. The aim is to
provide such tools.

In a nice paper, Bender et al. discussed how to generate
survival data based on the Cox model, restricting attention to some
of the many useful parametric survival models (Weibull, exponential).
We here use piecewise linear baseline functions, which make it easy to simulate
data that closely follow the baseline estimated from the data using semi- or
nonparametric models. This makes it easy to capture important aspects of the
data.

Different survival models can be cooked, and we here give recipes for
hazard and cumulative incidence based simulations. More recipes are
given in the vignette about recurrent events.

  - hazard based.
  - cumulative incidence.
  - recurrent events (see recurrent events vignette).


```{r}
 library(mets)
 options(warn=-1)
 set.seed(10) # to control output in simulations
```

# Hazard based, Cox models  


Given a survival time $T$ with cumulative hazard $\Lambda(t)=\int_0^t \lambda(s) ds$, it
follows that if $E \sim Exp(1)$ (exponential with rate 1), then $\Lambda^{-1}(E)$ has the
same distribution as $T$.

This provides the basis for simulating survival times with a given hazard and is
a consequence of the simple calculation
$$
  P(\Lambda^{-1}(E) > t) = P(E > \Lambda(t)) = \exp( - \Lambda(t)) = P(T > t).
$$

Similarly, if $T$ given $X$ has a hazard of Cox form
$$
  \lambda_0(t) \exp( X^T \beta)
$$
where $\beta$ is a $p$-dimensional regression coefficient and $\lambda_0(t)$ a baseline
hazard function with cumulative baseline hazard $\Lambda_0(t)$,
then it is useful to observe also that
$\Lambda_0^{-1}(E/HR)$ with $HR=\exp(X^T \beta)$ has the same distribution as $T$ given $X$.
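
As a minimal sketch of this recipe, assume a Weibull-type baseline $\Lambda_0(t)=0.1\,t^{1.5}$, chosen here only because its inverse has a closed form. The names `Lam0`, `Lam0inv`, `hr_demo` and `t_demo`, as well as the value $\beta=0.5$, are illustrative assumptions and not part of mets.

```{r}
set.seed(1)
## hypothetical baseline cumulative hazard and its closed-form inverse
Lam0    <- function(t) 0.1 * t^1.5
Lam0inv <- function(e) (e / 0.1)^(1 / 1.5)

n_demo  <- 2000
x_demo  <- rbinom(n_demo, 1, 0.5)      # binary covariate
hr_demo <- exp(0.5 * x_demo)           # HR = exp(X^T beta), assuming beta = 0.5
e_demo  <- rexp(n_demo)                # E ~ Exp(1)
t_demo  <- Lam0inv(e_demo / hr_demo)   # T = Lambda_0^{-1}(E / HR)

## HR * Lambda_0(T) recovers E, so it is again Exp(1) distributed
summary(hr_demo * Lam0(t_demo))
```

The last line is only a sanity check: transforming the simulated times back with the cumulative hazard should return the exponential draws.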

Therefore, if the inverse of the cumulative hazard can be computed, we can generate survival times with
a specified hazard function. One useful observation is that for a piecewise linear continuous
cumulative hazard $\Lambda_l(t)$ on an interval $[0,\tau]$ it is easy to compute the inverse.

Further, we can approximate any cumulative hazard with a piecewise linear continuous
cumulative hazard and then simulate data according to this approximation. Recall that fitting
the Cox model to data gives a piecewise constant cumulative hazard and the regression coefficients, so
with these at hand we can approximate the piecewise constant "Breslow" estimator with a piecewise linear
upper (or lower) bound by simply connecting the values by
straight lines.
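
The inversion of a piecewise linear cumulative hazard can be sketched in base R by interpolating with the roles of time and cumulative hazard swapped. This is only an illustration of the idea, not the implementation used by mets; the helper `pl_inverse` and the knot values are assumptions made for the example.

```{r}
## piecewise linear cumulative hazard given by its knots (illustrative values)
pl_time <- c(0, 10, 20, 30, 40)
pl_chaz <- c(0, 1, 1.5, 2, 2.1)

## inverse by linear interpolation with time and cumulative hazard swapped;
## arguments beyond the last knot are mapped to the end of the interval
pl_inverse <- function(e, time, chaz) {
  out <- approx(x = chaz, y = time, xout = e, ties = "ordered")$y
  out[is.na(out)] <- max(time)
  out
}

pl_inverse(c(0.5, 1.25, 2.05, 5), pl_time, pl_chaz)
```

Applying `pl_inverse` to `rexp(n) / HR` then gives survival times that follow the piecewise linear cumulative hazard, with subjects that do not fail before the last knot placed at that time point.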

# Delayed entry 

If $T$ given $X$ has a hazard of Cox form
$$
  \lambda_0(t) \exp( X^T \beta)
$$
and we wish to generate data according to this hazard for those that are alive at time $s$, that is,
draw from the distribution of $T$ given $T>s$ (all given $X$), then we note that
$$
\Lambda_0^{-1}( \Lambda_0(s) + E/HR)
$$
with $HR=\exp(X^T \beta)$ and $E \sim Exp(1)$ has the distribution we are after.

This is again a consequence of a simple calculation, for $t \geq s$,
$$
  P_X(\Lambda_0^{-1}(\Lambda_0(s)+ E/HR) > t) = P_X(E > HR( \Lambda_0(t) - \Lambda_0(s)) ) = \exp( - HR( \Lambda_0(t) - \Lambda_0(s)) ) = P_X(T>t | T>s)
$$
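
A minimal base R sketch of the delayed entry draw, assuming a piecewise linear cumulative hazard given by knots; the values of `de_s` and `de_hr` and all `de_` names are illustrative only.

```{r}
set.seed(2)
de_time <- c(0, 10, 20, 30, 40)       # knots (illustrative values)
de_chaz <- c(0, 1, 1.5, 2, 2.1)       # cumulative hazard at the knots
de_s    <- 10                         # entry time s
de_hr   <- 1.5                        # hazard ratio HR

de_Lam_s <- approx(de_time, de_chaz, xout = de_s)$y       # Lambda_0(s)
de_arg   <- de_Lam_s + rexp(1000) / de_hr                 # Lambda_0(s) + E/HR
de_t     <- approx(de_chaz, de_time, xout = de_arg)$y     # apply Lambda_0^{-1}
de_t[is.na(de_t)] <- max(de_time)                         # no event before the last knot

range(de_t)
```

All simulated times lie after the entry time, as they should for draws from the distribution of $T$ given $T>s$.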

The engine is to simulate data with a given piecewise linear cumulative hazard. We first generate survival data based on
the cumulative hazard `cumhaz`:

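```{r}
 nsim <- 1000
 ## piecewise linear cumulative hazard: values chaz at the time points breaks
 chaz <-  c(0,1,1.5,2,2.1)
 breaks <- c(0,10,   20,  30,   40)
 cumhaz <- cbind(breaks,chaz)
 ## binary covariate and relative risk exp(X*beta)
 X <- rbinom(nsim,1,0.5)
 beta <- 0.2
 rrcox <- exp(X * beta)
 
 ## survival times from the baseline, and from the Cox model with relative risks rrcox
 pctime <- rchaz(cumhaz,n=nsim)
 pctimecox <- rchaz(cumhaz,rrcox)
```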

Now we look at a simple Cox model:

```{r}
 library(mets)
 n <- nsim
 data(bmt)
 bmt$bmi <- rnorm(408)
 dcut(bmt) <- gage~age
 data <- bmt
 cox1 <- phreg(Surv(time,cause==1)~tcell+platelet+age,data=bmt)

 dd <- sim.phreg(cox1,n,data=bmt)
 dtable(dd,~status)
 scox1 <- phreg(Surv(time,status==1)~tcell+platelet+age,data=dd)
 cbind(coef(cox1),coef(scox1))
 par(mfrow=c(1,1))
 plot(scox1,col=2); plot(cox1,add=TRUE,col=1)

 ## changing the parameters 
 cox10 <- cox1
 cox10$coef <- c(0,0.4,0.3)
 dd <- sim.phreg(cox10,n,data=bmt)
 dtable(dd,~status)
 scox1 <- phreg(Surv(time,status==1)~tcell+platelet+age,data=dd)
 cbind(coef(cox10),coef(scox1))
 par(mfrow=c(1,1))
 plot(scox1,col=2); plot(cox10,add=TRUE,col=1)
```


Multiple Cox models for cause-specific hazards can be combined. We start by drawing the
covariates manually; further below we simply call the `sim.phregs` function, which draws the covariates from
the data.

```{r}
 data(bmt); 
 cox1 <- phreg(Surv(time,cause==1)~tcell+platelet,data=bmt)
 cox2 <- phreg(Surv(time,cause==2)~tcell+platelet,data=bmt)

 X1 <- bmt[,c("tcell","platelet")]
 n <- nsim
 xid <- sample(1:nrow(X1),n,replace=TRUE)
 Z1 <- X1[xid,]
 Z2 <- X1[xid,]
 rr1 <- exp(as.matrix(Z1) %*% cox1$coef)
 rr2 <- exp(as.matrix(Z2) %*% cox2$coef)

 d <-  rcrisk(cox1$cum,cox2$cum,rr1,rr2)
 dd <- cbind(d,Z1)

 scox1 <- phreg(Surv(time,status==1)~tcell+platelet,data=dd)
 scox2 <- phreg(Surv(time,status==2)~tcell+platelet,data=dd)
 par(mfrow=c(1,2))
 plot(cox1); plot(scox1,add=TRUE,col=2)
 plot(cox2); plot(scox2,add=TRUE,col=2)
 cbind(cox1$coef,scox1$coef,cox2$coef,scox2$coef)
```
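
One standard way to combine cause-specific hazards is to draw a latent time for each cause and keep the smallest. The sketch below illustrates this with constant hazards in base R; it is not necessarily how `rcrisk` proceeds internally, and the hazard values and `cs_` names are assumptions made for the example.

```{r}
set.seed(4)
cs_n  <- 5000
cs_t1 <- rexp(cs_n, rate = 0.1)        # latent time with cause 1 hazard 0.1
cs_t2 <- rexp(cs_n, rate = 0.2)        # latent time with cause 2 hazard 0.2
cs_time   <- pmin(cs_t1, cs_t2)        # observed event time
cs_status <- ifelse(cs_t1 < cs_t2, 1, 2)

## cause 1 should occur with probability 0.1 / (0.1 + 0.2) = 1/3
mean(cs_status == 1)
```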

Now a fully nonparametric model with stratified baselines:

```{r}
 data(sTRACE)
 dtable(sTRACE,~chf+diabetes)
 coxs <-   phreg(Surv(time,status==9)~strata(diabetes,chf),data=sTRACE)
 strata <- sample(0:3,nsim,replace=TRUE)
 simb <- sim.phreg(coxs,nsim,data=NULL,strata=strata)

 cc <-   phreg(Surv(time,status)~strata(strata),data=simb)
 plot(coxs,col=1); plot(cc,add=TRUE,col=2)

 simb1 <- sim.phreg(coxs,nsim,data=sTRACE)
 cc1 <-   phreg(Surv(time,status)~strata(diabetes,chf),data=simb1)
 plot(cc1,add=TRUE,col=3)
```

We now fit cause-specific hazard models with 3 causes (censoring as one of them)
and generate competing risks data with hazards taken from the fitted Cox models. 
Here is a situation with stratified baselines for some of the models:

```{r stratified}
 ## stratified baselines with phreg 
 cox0 <- phreg(Surv(time,cause==0)~tcell+platelet,data=bmt)
 cox1 <- phreg(Surv(time,cause==1)~tcell+platelet,data=bmt)
 cox2 <- phreg(Surv(time,cause==2)~strata(tcell)+platelet,data=bmt)
 coxs <- list(cox0,cox1,cox2)
 dd <- sim.phregs(coxs,n,data=bmt)

 ## checking that  cause specific hazards are as given, make n larger
 scox0 <- phreg(Surv(time,status==1)~tcell+platelet,data=dd)
 scox1 <- phreg(Surv(time,status==2)~tcell+platelet,data=dd)
 scox2 <- phreg(Surv(time,status==3)~strata(tcell)+platelet,data=dd)
 cbind(cox0$coef,scox0$coef)
 cbind(cox1$coef,scox1$coef)
 cbind(cox2$coef,scox2$coef)
 par(mfrow=c(1,3))
 plot(cox0); plot(scox0,add=TRUE,col=2); 
 plot(cox1); plot(scox1,add=TRUE,col=2); 
 plot(cox2); plot(scox2,add=TRUE,col=2); 
 
 ########################################
 ## second example 
 ########################################

 cox1 <- phreg(Surv(time,cause==1)~strata(tcell)+platelet,data=bmt)
 cox2 <- phreg(Surv(time,cause==2)~tcell+strata(platelet),data=bmt)
 coxs <- list(cox1,cox2)
 dd <- sim.phregs(coxs,n,data=bmt)
 scox1 <- phreg(Surv(time,status==1)~strata(tcell)+platelet,data=dd)
 scox2 <- phreg(Surv(time,status==2)~tcell+strata(platelet),data=dd)
 cbind(cox1$coef,scox1$coef)
 cbind(cox2$coef,scox2$coef)
 par(mfrow=c(1,2))
 plot(cox1); plot(scox1,add=TRUE); 
 plot(cox2); plot(scox2,add=TRUE); 
```

 - `sim.phreg` simulates from a fitted `phreg` model and can deal with strata
 - `sim.phregs` simulates from several cause-specific hazards on `phreg` form


One more example, with stratified nonparametric baselines:

```{r}
 library(mets)
 n <- nsim
 data(bmt)
 bmt$bmi <- rnorm(408)
 dcut(bmt) <- gage~age
 data <- bmt
 cox1 <- phreg(Surv(time,cause==1)~strata(tcell,platelet),data=bmt)
 cox2 <- phreg(Surv(time,cause==2)~strata(gage,tcell),data=bmt)
 cox3 <- phreg(Surv(time,cause==0)~strata(platelet)+bmi,data=bmt)
 coxs <- list(cox1,cox2,cox3)

 dd <- sim.phregs(coxs,n,data=bmt,extend=0.002)
 dtable(dd,~status)
 scox1 <- phreg(Surv(time,status==1)~strata(tcell,platelet),data=dd)
 scox2 <- phreg(Surv(time,status==2)~strata(gage,tcell),data=dd)
 scox3 <- phreg(Surv(time,status==3)~strata(platelet)+bmi,data=dd)
 cbind(coef(cox1),coef(scox1), coef(cox2),coef(scox2), coef(cox3),coef(scox3))
 par(mfrow=c(1,3))
 plot(scox1,col=2); plot(cox1,add=TRUE,col=1)
 plot(scox2,col=2); plot(cox2,add=TRUE,col=1)
 plot(scox3,col=2); plot(cox3,add=TRUE,col=1)
```


# Multistate models: The Illness Death model 

Using hazard based simulation with delayed entry we can simulate data
from, for example, the general illness-death model. Here the cumulative hazards
of the transitions need to be specified.


\begin{tikzpicture}[
    >=Stealth,
    node distance=4cm,
    state/.style={
        rectangle,
        draw=black,
        thick,
        minimum width=3cm,
        minimum height=1cm,
        align=center
    }
]

  % Top states
  \node[state] (H) {Healthy \\ (1)};
  \node[state] (I) [right=of H] {Ill \\ (2)};

  % Dead centered below
  \node[state] (D) at ($(H)!0.5!(I) + (0,-3)$) {Dead \\ (3)};

  % Two straight arrows between Healthy and Ill
  \draw[->, thick] 
    ($(H.east)+(0,0.15)$) -- ($(I.west)+(0,0.15)$)
    node[midway, above] {$\lambda_{12}$};

  \draw[->, thick] 
    ($(I.west)+(0,-0.15)$) -- ($(H.east)+(0,-0.15)$)
    node[midway, below] {$\lambda_{21}$};

  % Death transitions
  \draw[->, thick] (H) -- node[left] {$\lambda_{13}$} (D);
  \draw[->, thick] (I) -- node[right] {$\lambda_{23}$} (D);

\end{tikzpicture}

We simply give the cumulative hazards for the different transitions to the function `simMultistate`
to simulate data from the model. Subsequently we re-estimate the rates based on the simulated
data to validate the procedure.


```{r}
 data(CPH_HPN_CRBSI)
 dr <- CPH_HPN_CRBSI$terminal
 base1 <- CPH_HPN_CRBSI$crbsi 
 base4 <- CPH_HPN_CRBSI$mechanical
 dr2 <- scalecumhaz(dr,1.5)
 cens <- rbind(c(0,0),c(2000,0.5),c(5110,3))

 iddata <- simMultistate(nsim,base1,base1,dr,dr2,cens=cens)
 dlist(iddata,.~id|id<3,n=0)
  
 ### estimating rates from simulated data  
 c0 <- phreg(Surv(start,stop,status==0)~+1,iddata)
 c3 <- phreg(Surv(start,stop,status==3)~+strata(from),iddata)
 c1 <- phreg(Surv(start,stop,status==1)~+1,subset(iddata,from==2))
 c2 <- phreg(Surv(start,stop,status==2)~+1,subset(iddata,from==1))
 ###
 par(mfrow=c(2,2))
 plot(c0)
 lines(cens,col=2) 
 plot(c3,main="rates 1-> 3 , 2->3")
 lines(dr,col=1,lwd=2)
 lines(dr2,col=2,lwd=2)
 ###
 plot(c1,main="rate 1->2")
 lines(base1,lwd=2)
 ###
 plot(c2,main="rate 2->1")
 lines(base1,lwd=2)
 
```
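
The role of delayed entry in multistate simulation can be sketched for a single subject in base R. The sketch assumes hypothetical cumulative hazards of the form $a t^b$, chosen because their inverse is available in closed form, and is not the algorithm of `simMultistate`; `ms_draw`, `ms_one` and all constants are illustrative.

```{r}
set.seed(5)
## draw from T given T > s for the cumulative hazard a * t^b (closed-form inverse)
ms_draw <- function(s, a, b) ((a * s^b + rexp(1)) / a)^(1 / b)

ms_one <- function(tau = 10) {
  state <- 1; s <- 0; path <- NULL
  while (state != 3 && s < tau) {
    ## competing transitions out of the current state, both with entry at time s
    t_other <- ms_draw(s, a = 0.2, b = 1.2)                       # 1->2 or 2->1
    t_death <- ms_draw(s, a = if (state == 1) 0.05 else 0.075, b = 1.1)
    stop_time <- min(t_other, t_death, tau)
    ## censored at tau, moved to the other live state, or dead (state 3)
    to <- if (stop_time == tau) state else if (t_other < t_death) 3 - state else 3
    path <- rbind(path, data.frame(start = s, stop = stop_time, from = state, to = to))
    state <- to; s <- stop_time
  }
  path
}
ms_one()
```

Each row is one sojourn: the subject enters a state at `start`, and the next transition time is drawn conditionally on being event free at that time.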


# Cumulative incidence 

In this section we discuss how to simulate competing risks data that have a specified cumulative
incidence function. For simplicity we consider a competing risks model with two causes and denote the
cumulative incidence curves, given some covariate $X$, by
$F_1(t,X) = P(T \leq t, \epsilon=1|X)$ and $F_2(t,X) = P(T \leq t, \epsilon=2|X)$.

To generate data with the required cumulative incidence functions a simple approach is to first figure out
if the subject dies and then from what cause, then finally draw the survival time according to the
conditional distribution.

For simplicity we consider survival times in a fixed interval $[0,\tau]$, and first flip a coin with
probability $1-F_1(\tau,X)-F_2(\tau,X)$ of surviving to decide if the subject is a survivor or dies.
If the subject dies we then flip a coin with probabilities $F_1(\tau,X)/(F_1(\tau,X)+F_2(\tau,X))$ and
$F_2(\tau,X)/(F_1(\tau,X)+F_2(\tau,X))$ to decide if it is a cause 1, $\epsilon=1$, or a cause 2, $\epsilon=2$, death.
Finally we draw the survival time using the cumulative incidence distribution.
The timing of a cause $j$ event is thus
$T = \tilde F_j^{-1}(U,X)$ with $\tilde F_j(s,X) = F_j(s,X)/F_j(\tau,X)$ and $U$ a uniform random variable.

Then indeed $P(T \leq t, \epsilon=j|X) = F_j(t,X)$ for $j=1,2$, since cause $j$ is drawn with
probability $F_j(\tau,X)$ and, given cause $j$, $P(T \leq t) = F_j(t,X)/F_j(\tau,X)$.

We again note and use that if $\tilde F_j(s)$ and $F_j(s)$ are piecewise linear
continuous functions then the inverse is easy to compute. 
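
The recipe can be sketched in base R for two piecewise linear cumulative incidence functions without covariates. The knot values and all `cif_` names are illustrative assumptions; the two coin flips are done in a single `sample()` call, which gives the same probabilities.

```{r}
set.seed(3)
## piecewise linear cumulative incidence functions on [0, tau] (illustrative values)
cif_time <- c(0, 2, 4, 6)
cif_F1   <- c(0, 0.20, 0.30, 0.35)
cif_F2   <- c(0, 0.10, 0.25, 0.30)
cif_n    <- 5000
cif_p1 <- max(cif_F1); cif_p2 <- max(cif_F2)      # F_1(tau) and F_2(tau)

## steps 1 and 2: survivor (0), cause 1 or cause 2
cif_cause <- sample(0:2, cif_n, replace = TRUE,
                    prob = c(1 - cif_p1 - cif_p2, cif_p1, cif_p2))

## step 3: event time from the normalized cumulative incidence F_j / F_j(tau)
cif_u <- runif(cif_n)
cif_t <- rep(max(cif_time), cif_n)                # survivors are censored at tau
cif_is1 <- cif_cause == 1; cif_is2 <- cif_cause == 2
cif_t[cif_is1] <- approx(cif_F1 / cif_p1, cif_time, xout = cif_u[cif_is1])$y
cif_t[cif_is2] <- approx(cif_F2 / cif_p2, cif_time, xout = cif_u[cif_is2])$y

## empirical cumulative incidence at t = 2, to compare with 0.20 and 0.10
c(mean(cif_t <= 2 & cif_is1), mean(cif_t <= 2 & cif_is2))
```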


## Cumulative incidence I

We here simulate two causes of death with two binary covariates and cumulative incidence functions of logistic type
\begin{align*}
F_1(t,X)  &= \frac{ \Lambda_1(t,\rho_1) \exp(X^T  \beta)}{1+\Lambda_1(t,\rho_1) \exp(X^T  \beta)}
\end{align*}
and $F_2$ either enforcing the sum condition $F_1+F_2 \leq 1$
\begin{align*}
F_2(t,X)  & =  \frac{ \Lambda_2(t,\rho_2) \exp(X^T  \beta)}{1+\Lambda_2(t,\rho_2) \exp(X^T  \beta)} [ 1- F_1(\tau,X) ]
\end{align*}
or not
\begin{align*}
F_2(t,X)  & =  \frac{ \Lambda_2(t,\rho_2) \exp(X^T  \beta)}{1+\Lambda_2(t,\rho_2) \exp(X^T  \beta)}
\end{align*}

The baselines are given as $\Lambda_j(t,\rho_j) = \rho_j (1- \exp(-t/r_j))$ where $\rho_j$
and $r_j$ are positive constants, and here $\tau=6$.

To simulate the survival time we use a piecewise linear approximation of the cumulative
incidence functions, and the simulated data will thus depend on the grid used for the linear approximation. Our linear
approximation can be made arbitrarily close to any specific smooth cumulative incidence function.
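
How the approximation error depends on the grid can be checked with a small base R sketch. The constants $\rho=0.4$, $r=1$ and the linear predictor $0.3$ are illustrative choices, and `ga_Lam`, `ga_F1` and `ga_error` are hypothetical helpers, not mets functions.

```{r}
## baseline and logistic-type cumulative incidence for one covariate value
ga_Lam <- function(t, rho = 0.4, r = 1) rho * (1 - exp(-t / r))
ga_F1  <- function(t, lp = 0.3) {
  a <- ga_Lam(t) * exp(lp)
  a / (1 + a)
}

## maximal error of the piecewise linear interpolant on [0, 6] for grid size m
ga_error <- function(m) {
  grid <- seq(0, 6, length.out = m)
  fine <- seq(0, 6, length.out = 2001)
  max(abs(approx(grid, ga_F1(grid), xout = fine)$y - ga_F1(fine)))
}

sapply(c(5, 20, 100), ga_error)
```

The error decreases as the grid is refined, in line with the claim that the approximation can be made arbitrarily close.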


```{r}
library(mets)
nsim <- 100
rho1 <- 0.4; rho2 <- 2
beta <- c(0.3,-0.3,-0.3,0.3)

dats <- simul.cifs(nsim,rho1,rho2,beta,rc=0.5,depcens=0,type="logistic")

par(mfrow=c(1,2))
# Fitting regression model with CIF logistic-link 
cif1 <- cifreg(Event(time,status)~Z1+Z2,dats)
summary(cif1)
plot(cif1)
lines(attr(dats,"Lam1"))

dats <- simul.cifs(nsim,rho1,rho2,beta,rc=0.5,depcens=0,type="cloglog")
ciff <- cifregFG(Event(time,status)~Z1+Z2,dats)
summary(ciff)
plot(ciff)
lines(attr(dats,"Lam1"))
```

We can also use parameters taken from fitted models:

```{r}
 data(bmt)
 ################################################################
 #  simulating several causes with specific cumulatives 
 ################################################################
 ## two logistic link models 
 cif1 <-  cifreg(Event(time,cause)~tcell+age,data=bmt,cause=1)
 cif2 <-  cifreg(Event(time,cause)~tcell+age,data=bmt,cause=2)

 dd <- sim.cifs(list(cif1,cif2),nsim,data=bmt)

 ## still logistic link 
 scif1 <-  cifreg(Event(time,cause)~tcell+age,data=dd,cause=1)
 ## 2nd cause not on logistic form due to restriction
 scif2 <-  cifreg(Event(time,cause)~tcell+age,data=dd,cause=2)
    
 cbind(cif1$coef,scif1$coef)
 cbind(cif2$coef,scif2$coef)
 par(mfrow=c(1,2))   
 plot(cif1); plot(scif1,add=TRUE,col=2)
 plot(cif2); plot(scif2,add=TRUE,col=2)
```


## CIF Delayed entry

Now assume that, given covariates, $F_1(t;X) = P(T \leq t, \epsilon=1|X)$ and $F_2(t;X) = P(T \leq t, \epsilon=2|X)$ are two
cumulative incidence functions that satisfy the needed constraints. We wish to generate data that follow these two
piecewise linear cumulative incidence functions with delayed entry at time $s$. We should thus
generate data that follow the cumulative incidence functions
$$
\tilde F_1(t,s;X)=   \frac{F_1(t;X) - F_1(s;X)}{ 1 - F_1(s;X) - F_2(s;X)}
$$
and
$$
\tilde F_2(t,s;X)=   \frac{F_2(t;X) - F_2(s;X)}{ 1 - F_1(s;X) - F_2(s;X)}
$$
This can be done according to the recipe in the previous section.
To be specific (ignoring the $X$ in the formula)
$$
  F_1^{-1}( F_1(s) + U \cdot (1 - F_1(s) - F_2(s)) )
$$
where $U$ is a uniform, will have distribution given by $\tilde F_1(t,s)$.
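
The claim can be verified directly, assuming $F_1$ is strictly increasing on $[s,\tau]$ so that the inverse is well defined. For $s \leq t \leq \tau$,
$$
  P\left( F_1^{-1}( F_1(s) + U \cdot (1 - F_1(s) - F_2(s)) ) \leq t \right)
  = P\left( U \leq \frac{F_1(t) - F_1(s)}{1 - F_1(s) - F_2(s)} \right)
  = \tilde F_1(t,s).
$$
Values of $U$ above $\tilde F_1(\tau,s)$ correspond to subjects without a cause 1 event in $[s,\tau]$.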

# Recurrent events

See also the recurrent events vignette.

 - `sim.recurrent` can simulate based on the two-stage model where
    - the rate of the terminal event among survivors is on Cox form (`phreg`)
    - the rate of the recurrent events among survivors is on Cox form (`phreg`), or
    - the rate of the recurrent events is a marginal Ghosh-Lin model (`recreg`)
    - the simulation is based on approximations with piecewise linear models based on a grid
    - the events can be dependent via frailty random effects (Gamma distributed)
 - `simRecurrentII`, `simRecurrent`, `simRecurrentList`
    - a Gamma frailty model where the rates of the recurrent events and the terminal event are given by
      cumulative baselines and relative risk covariate effects, and thus are on Cox form given the frailty and covariates
    - `simRecurrentList` can take multiple recurrent events and multiple causes of death


Two-stage models 

```{r}
 data(hfactioncpx12)
 hf <- hfactioncpx12
 hf$x <- as.numeric(hf$treatment) 
 n <- 1000

 ##  to fit Cox  models 
 xr <- phreg(Surv(entry,time,status==1)~treatment+cluster(id),data=hf)
 dr <- phreg(Surv(entry,time,status==2)~treatment+cluster(id),data=hf)
 estimate(xr)
 estimate(dr)

 simcoxcox <- sim.recurrent(xr,dr,n=n,data=hf)

 xrs <- phreg(Surv(start,stop,statusD==1)~treatment+cluster(id),data=simcoxcox)
 drs <- phreg(Surv(start,stop,statusD==3)~treatment+cluster(id),data=simcoxcox)
 estimate(xrs)
 estimate(drs)

 par(mfrow=c(1,2))
 plot(xrs); 
 plot(xr,add=TRUE)
###
 plot(drs)
 plot(dr,add=TRUE)

```

And now with Ghosh-Lin and Cox marginals:

```{r}
 recGL <- recreg(Event(entry,time,status)~treatment+cluster(id),hf,death.code=2)
 estimate(recGL)
 estimate(dr)

 simglcox <- sim.recurrent(recGL,dr,n=n,data=hf)

 dtable(simglcox,~statusD)

 GLs <- recreg(Event(start,stop,statusD)~treatment+cluster(id),data=simglcox,death.code=3)
 drs <- phreg(Surv(start,stop,statusD==3)~treatment+cluster(id),data=simglcox)
 estimate(GLs)
 estimate(drs)

 par(mfrow=c(1,2))
 plot(GLs); 
 plot(recGL,add=TRUE)
###
 plot(drs)
 plot(dr,add=TRUE)

```


Frailty models 

```{r}
 data(CPH_HPN_CRBSI)
 dr <- CPH_HPN_CRBSI$terminal
 base1 <- CPH_HPN_CRBSI$crbsi 
 base4 <- CPH_HPN_CRBSI$mechanical

 n <- 100
 rr <- simRecurrent(n,base1,death.cumhaz=dr)
 ###
 par(mfrow=c(1,3))
 showfitsim(causes=1,rr,dr,base1,base1,which=1:2)

 rr <- simRecurrentII(n,base1,base4,death.cumhaz=dr)
 dtable(rr,~death+status)
 showfitsim(causes=2,rr,dr,base1,base4,which=1:2)

 cumhaz <- list(base1,base1,base4)
 drl <- list(dr,base4)
 rr <- simRecurrentList(n,cumhaz,death.cumhaz=drl)
 dtable(rr,~death+status)
 showfitsimList(rr,cumhaz,drl) 
```


# Parametric models

While the semi-parametric Cox model provides substantial flexibility for
simulating survival data, there are situations where a fully parametric
simulation model is convenient or preferable. Here we consider a Weibull model
parametrized so that the cumulative hazard is given by
$$
\Lambda(t) = \lambda \cdot t^s
$$
where $s$ is the **shape parameter** and $\lambda$ the **rate parameter**.
We allow regression on both parameters
\begin{align*}
\lambda := \exp(\beta^\top X), \quad s := \exp(\gamma^\top Z)
\end{align*}
where $X$ and $Z$ are covariate vectors. Specifically, this opens up for exploring
non-proportional hazards when $s$ depends on covariates.
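
To see why covariate effects on the shape give non-proportional hazards, differentiate the cumulative hazard. The symbol $h(t)$ for the hazard and the subscripts 1 and 2, which label two subjects with different covariate values, are introduced here only for illustration:
$$
  h(t) = \frac{d}{dt} \Lambda(t) = \lambda s t^{s-1}, \qquad
  \frac{h_1(t)}{h_2(t)} = \frac{\lambda_1 s_1}{\lambda_2 s_2} \, t^{s_1 - s_2}.
$$
The hazard ratio is constant in $t$ only when $s_1 = s_2$, that is, when the shape does not differ between the two subjects.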

Revisiting the TRACE data example, we can compare the predictions from the Cox
model and the Weibull-Cox model, stratified by `chf` and with a proportional hazard
effect of `age`:

```{r weibull1}
data(sTRACE, package = "mets")
dat <- sTRACE
cox1 <- phreg(Surv(time, status > 0) ~ strata(chf) + I(age - 67), data = sTRACE)
coxw <- phreg_weibull(Surv(time, status > 0) ~ chf + age,
    shape.formula = ~chf,
    data = sTRACE
    )
coxw

tt <- seq(0, max(sTRACE$time), length.out = 100)
newd <- data.frame(chf = c(1, 0), age=67)
pr <- predict(coxw, newdata = newd, times = tt, type="chaz")
plot(cox1, col = 1)
lines(tt, pr[, 1, 1], lty=2, lwd=2)
lines(tt, pr[, 1, 2], lty = 1, lwd = 2)
```

To simulate data we can use the `rweibullcox()` function. Note that the
`stats::rweibull()` function uses a different parametrization where the
cumulative hazard is given by $H(t) = (t/b)^s$, i.e., with the same shape
parameter $s$ but with a scale parameter $b$ that is related to the rate parameter we consider by $\lambda = b^{-s}$.
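
The relation between the two parametrizations can be checked numerically with base R only. The values of `wb_lambda` and `wb_shape` are illustrative, and the `wb_` names are not part of mets.

```{r}
## our parametrization: Lambda(t) = lambda * t^s (illustrative values)
wb_lambda <- 0.5
wb_shape  <- 1.3
wb_scale  <- wb_lambda^(-1 / wb_shape)   # scale b such that lambda = b^(-s)

wb_t <- c(0.5, 1, 2, 4)
## survival function from our parametrization and from stats::pweibull
cbind(ours  = exp(-wb_lambda * wb_t^wb_shape),
      stats = pweibull(wb_t, shape = wb_shape, scale = wb_scale, lower.tail = FALSE))

## hence event times can also be drawn with stats::rweibull
wb_draws <- rweibull(5, shape = wb_shape, scale = wb_scale)
```

The two columns agree, so draws from `stats::rweibull()` with scale $b = \lambda^{-1/s}$ follow the cumulative hazard $\lambda t^s$.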


```{r weibull_sim}
n <- 5000
newd <- mets::dsample(size=n, sTRACE[,c("chf","age")]) # bootstrap covariates
lp <- predict(coxw, newdata=newd, type="lp") # linear-predictors
head(lp)

## simulate event times
tt <- rweibullcox(nrow(lp), rate = exp(lp[,1]), shape= exp(lp[,2]))

# censoring model
censw <- phreg_weibull(Surv(time, status==0) ~ 1, data=sTRACE)
censpar <- exp(coef(censw))
censtime <- pmin(8, rweibullcox(nrow(lp), censpar[1], censpar[2]))

# combined simulated data
newd <- transform(newd, time=pmin(tt, censtime), status=(tt<=censtime))
head(newd)

# estimate weibull model on new data
phreg_weibull(Surv(time,status) ~ chf + age, ~chf, data=newd)
```

All these steps are wrapped in the `simulate` method:
```{r}
# simulate(coxw, n = 5, cens.model = NULL, data=newd, var.names = c("time", "status"))
simulate(coxw, nsim = 5)
```


# SessionInfo


```{r}
sessionInfo()
```