# msPCA
An R Package for Sparse PCA with Multiple Principal Components

## Installation 
This package can be installed from [CRAN](https://CRAN.R-project.org/package=msPCA) directly: 

```r
install.packages("msPCA")
```

Alternatively, it can be installed from this GitHub repository using the `devtools` package. You first need to install `devtools`:

```r
install.packages("devtools")
```

and then run the following commands: 

```r
library(devtools)
install_github('jeanpauphilet/msPCA')
```

## Getting started
The package consists of one main function, `mspca`, which takes as input:
- a data matrix (either the correlation or covariance matrix of the dataset),
- the number of principal components (PCs) to be computed, r,
- a vector of r integers specifying the sparsity of each PC (for example, `c(4, 4)` for r = 2).
  
It returns an object with four fields:
- `x_best` (p x r array containing the sparse PCs),
- `objective_value`,
- `feasibility_violation`,
- `runtime`.

Here is a short example demonstrating how to use the package. First, load the library:
```r
library(msPCA)
```
Then, define the input variables.
```r
library(datasets)
df <- datasets::mtcars
TestMat <- cor(df)
```
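Since the input can be either a correlation or a covariance matrix, the following minimal sketch (base R only) shows how the two relate on the same dataset. The names `S_cov`, `S_cor`, `r`, and `ks` are illustrative and are not part of the package.

```r
# Two candidate inputs built from the same data (base R only)
df <- datasets::mtcars
S_cov <- cov(df)   # covariance matrix: keeps the original scale of each variable
S_cor <- cor(df)   # correlation matrix: covariance of the standardized variables

# The correlation matrix is the covariance matrix rescaled to a unit diagonal
stopifnot(isTRUE(all.equal(S_cor, cov2cor(S_cov))))

# Illustrative bookkeeping for the other two arguments (names are assumptions)
r  <- 2          # number of PCs
ks <- c(4, 4)    # one sparsity level per PC
stopifnot(length(ks) == r, all(ks >= 1), all(ks <= ncol(S_cor)))
```

Either matrix has one row and one column per variable, so the sparsity of a PC cannot exceed the number of columns.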
Finally, call the function to compute 2 PCs, each with sparsity 4:

```r
mspca(TestMat, 2, c(4,4))
```
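To inspect the result, one option is to summarize the p x r matrix of PCs directly. The sketch below is not part of the package: `summarize_pcs` is a hypothetical helper written in base R, and the variance ratio it reports, trace(X'SX) / trace(S), is one common measure rather than necessarily the package's own objective. So that the code runs without msPCA, it is first applied to the two leading dense PCs from `eigen()`; the msPCA call is executed only if the package is installed.

```r
# Hypothetical helper (not part of msPCA): summarize a p x r matrix of PCs X
# against the input covariance or correlation matrix S.
summarize_pcs <- function(S, X) {
  list(
    nonzeros = colSums(X != 0),                                    # nonzero entries per PC
    variance_explained = sum(diag(t(X) %*% S %*% X)) / sum(diag(S)),
    gram = crossprod(X)                                            # X'X; identity if PCs are orthonormal
  )
}

S <- cor(datasets::mtcars)

# Stand-in input: the two leading dense PCs, used only to exercise the helper.
X_dense <- eigen(S, symmetric = TRUE)$vectors[, 1:2]
out <- summarize_pcs(S, X_dense)
print(out)

# With msPCA installed, the same summary applies to the sparse PCs in x_best.
if (requireNamespace("msPCA", quietly = TRUE)) {
  res <- msPCA::mspca(S, 2, c(4, 4))
  print(summarize_pcs(S, res$x_best))
}
```

Comparing the two summaries gives a rough sense of how much variance is traded for sparsity on this dataset.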

<a id="Files"></a>

## Development 
Here, we provide more information about the code structure and organization to help developers who would like to improve the method or build on it.

### Files
- R
  - RcppExports.R<br />
      It offers the R interface, which will call the corresponding C++ functions. Regenerate or change it manually if needed (e.g., if the interface changes). We recommend generating it automatically by using `Rcpp::compileAttributes()`.
  - main.R<br />
      It contains all the functions of the package. For the functions coded in Rcpp (and exported in the RcppExports.R file), this script provides (i) user-friendly names, (ii) documentation. This script also defines useful supporting functions.
- man/ contains the pages of the manual: one page for the package and one per function. They are generated automatically from the comments in R/main.R via the `devtools::document()` command.
- src/ contains the source files of the algorithm, in C++. 
  - ConstantArguments.h<br />
      It contains some parameters of the algorithm that are not directly tuneable by the end user.
  - msPCA_R_CPP.cpp<br />
      It contains the implementation of the algorithm.
  - RcppExports.cpp<br />
      It contains the converted function that can be used by R. Regenerate or change it manually if needed (e.g., if the interface changes). It can be generated using `Rcpp::compileAttributes()`.
  - Makevars<br />
      This is not currently used. Use it to set attributes, such as the version of C++ for compilation.
  - Makevars.win<br />
      This is not currently used. Use it to set attributes, such as the version of C++ for compilation.
- test/ contains some template R notebooks
    - notebook_mtcars.R compares the PCs generated by msPCA on the mtcars dataset with the ones obtained using several alternative packages (elasticnet, PMA, sparsepca)
    - notebook_plot.R provides code to represent the resulting PCs on any 2D-plane
    - notebook_synthetic.R compares the performance of msPCA and elasticnet on synthetically generated data with 2 true sparse PCs. Results are stored in the 'msPCA_synthetic_results.csv' file and graphically represented.
- NAMESPACE<br />
    It is used to build this package. Change it if needed (e.g., if the interface changes).
- DESCRIPTION<br />
    It contains the description of this package.
- LICENSE<br />
    It contains the license information.
- msPCA.Rproj<br />
    It contains the settings of this R project. It is used by RStudio and often does not need to be changed.

### Guidance to future developers
- The core of the algorithm lives in two files in src/: `msPCA_R_CPP.cpp` handles the computation, and `ConstantArguments.h` lists all internal arguments.
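
After editing the C++ sources or the documentation comments, the regeneration commands mentioned in the Files section can be chained into a single step. The following is a minimal sketch, assuming it is run from the package root with `Rcpp` and `devtools` installed; the variable `pkg_root` and the guard around the calls are illustrative choices, not part of the repository.

```r
# Assumption: the working directory is the package root (where DESCRIPTION lives).
pkg_root <- "."

# Run the steps only when the package sources and the tooling are available.
if (file.exists(file.path(pkg_root, "DESCRIPTION")) &&
    requireNamespace("Rcpp", quietly = TRUE) &&
    requireNamespace("devtools", quietly = TRUE)) {
  Rcpp::compileAttributes(pkg_root)  # regenerates R/RcppExports.R and src/RcppExports.cpp
  devtools::document(pkg_root)       # regenerates man/ from the comments in R/main.R
  devtools::load_all(pkg_root)       # recompiles src/ and loads the package for interactive testing
}
```

Running `Rcpp::compileAttributes()` first keeps the R and C++ export files in sync before the documentation is rebuilt.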

