---
title: "OnassisJavaLibs: Java Libraries to support Onassis, Ontology Annotation and Semantic Similarity software"
author: "Eugenia Galeota"
date: "`r Sys.Date()`"
output: BiocStyle::html_document
vignette: >
  %\VignetteIndexEntry{OnassisJavaLibs: Java Libraries to support Onassis, Ontology Annotation and Semantic Similarity software}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
  %\VignettePackage{OnassisJavaLibs}
  %\VignetteDepends{OnassisJavaLibs}
---

# Introduction

OnassisJavaLibs is a data package containing various compiled libraries in jar format and their source code used by `r BiocStyle::Biocpkg("Onassis")` to
<ul>
<li>
 create a conceptmapper pipeline aimed at finding concepts belonging to OBO ontologies  (OBO, OWL or RDF format) in any type of text </li>
 <li> compute different semantic similarity measures between two concepts of an ontology or two sets of concepts. </li>
 </ul>
 
# Contents of the package

Conceptmapper Java (1.8 version) libraries (https://github.com/UCDenver-ccp/ccp-nlp) version 3.3.2, and semantic similarity libraries (https://github.com/UCDenver-ccp/ccp-nlp) version 0.9.5 have been compiled using maven. 

They are available in jar format within the extdata directory of this package and can be located through `r system.file('extdata', 'java', packade='OnassisJavaLibs')`.

# Accessing and compiling the Java source code

The source code for the java libraries is available in the `java` directory of package tarball. Users interested in compiling their own jar files can refer to the following information. 

The `conceptmapper` subdirectory contains the java code to annotate text with concepts from OBO ontologies. 
To create a jar file including all the needed dependencies, the source code can be compiled using maven (Apache Maven 3.3.9 was used).

The `slib` and `similarity` subdirectories contain the java code to determine the semantic similarities between concepts, and can be compiled and installed with the following goals, respectively:

```{r, engine = 'bash', eval = FALSE}
#From the conceptmapper directory
mvn clean compile assembly:single --Dlog4j.configuration=log4j2.properties

#From the slib directory
mvn clean install 

#From the similarity directory
mvn clean install assembly:single 

```


# Use of the package 

The methods and classes implemented in the described Java libraries can be used through R functions and methods available within Onassis. Alternatively, the Java code can be directly executed using `rJava`. For example a dictionary from an OBO ontology file can be created through the following code: 

```{r echo=TRUE, eval=TRUE}

require(rJava)

#Initializing the JVM
.jinit()

#Adding the path to the jar file
jarfilePath <- file.path(system.file('extdata', 'java', 'conceptmapper-0.0.1-SNAPSHOT-jar-with-dependencies.jar', package='OnassisJavaLibs'))

.jaddClassPath(jarfilePath)

#Creating an instance of the OntologyUtil with the sample obo file
ontoutil <- .jnew("edu.ucdenver.ccp.datasource.fileparsers.obo.OntologyUtil", .jnew('java/io/File', file.path(system.file('extdata', 'sample.cs.obo', package='OnassisJavaLibs'))))

#Creating the output file containing the conceptmapper dictionary
outputFile = .jnew("java/io/File", "dict.xml")

#Building of the dictionary from the OBO ontology
dictionary <- J("edu.ucdenver.ccp.nlp.wrapper.conceptmapper.dictionary.obo.OboToDictionary")$buildDictionary(
  outputFile,
  ontoutil,
  .jnull(),
  J("edu.ucdenver.ccp.datasource.fileparsers.obo.OntologyUtil")$SynonymType$EXACT
)
```

To compute the semantic similarity between two terms of the same ontology, classes in the similarity library can be used in this way:

```{r echo=TRUE, eval=TRUE}

#Adding the similarity library containing the similarity class to compute semantic similarities
jarfilePath <- file.path(system.file('extdata', 'java', 'similarity-0.0.1-SNAPSHOT-jar-with-dependencies.jar', package='OnassisJavaLibs'))

.jaddClassPath(jarfilePath)

#Creating an instance of the class Similarity
similarity <- .jnew("iit/comp/epigen/nlp/similarity/Similarity")


#Loading the ontology in a grah structure
file_obo <- file.path(system.file('extdata', 'sample.cs.obo', package='OnassisJavaLibs'))

ontology_graph <- similarity$loadOntology(file_obo)

#Setting the semantic similarity measures 
measure_configuration <- similarity$setPairwiseConfig('resnik', 'seco')

#Terms of the ontologies need to be converted into URIs

term1 <- 'http://purl.obolibrary.org/obo/CL_0000771'
  
term2 <- 'http://purl.obolibrary.org/obo/CL_0000988'

URI1 <- .jcast(similarity$createURI(term1), new.class = "org.openrdf.model.URI", check = FALSE, convert.array = FALSE)

URI2 <- .jcast(similarity$createURI(term2), new.class = "org.openrdf.model.URI", check = FALSE, convert.array = FALSE)

# Computation of the semantic similarity score

similarity_score <- .jcall(similarity, "D", "pair_similarity", URI1, URI2, .jcast(ontology_graph, new.class = "slib.graph.model.graph.G"), measure_configuration)

similarity_score
```

# Acknowledgements 

We would like to thank you the library providers.

The methods for the conceptmapper pipeline and defining the ccp-nlp type system, have been developed and published by the Reagents of the University of Colorado under BSD 3-clause license. 

The methods for computing the semantic similarities instead have been developed and published by the the Ecole des mines d'Alès under the GPL-compatible CeCILL license.

Both licenses are provided within the package.