---
title: "OnassisJavaLibs: Java Libraries to support Onassis, Ontology Annotation and Semantic Similarity software"
author: "Eugenia Galeota"
date: "`r Sys.Date()`"
output: BiocStyle::html_document
vignette: >
  %\VignetteIndexEntry{OnassisJavaLibs: Java Libraries to support Onassis, Ontology Annotation and Semantic Similarity software}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
  %\VignettePackage{OnassisJavaLibs}
  %\VignetteDepends{OnassisJavaLibs}
---
# Introduction
OnassisJavaLibs is a data package containing compiled Java libraries in jar format, together with their source code, used by `r BiocStyle::Biocpkg("Onassis")` to:

- create a conceptmapper pipeline that finds concepts belonging to OBO ontologies (in OBO, OWL or RDF format) in any type of text
- compute different semantic similarity measures between two concepts of an ontology or between two sets of concepts.
# Contents of the package
Conceptmapper Java libraries (https://github.com/UCDenver-ccp/ccp-nlp), version 3.3.2, and semantic similarity libraries (https://github.com/UCDenver-ccp/ccp-nlp), version 0.9.5, have been compiled for Java 1.8 using Maven.
They are available in jar format within the `extdata` directory of this package and can be located through `r system.file('extdata', 'java', package='OnassisJavaLibs')`.
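As a quick check, the bundled jar files can be listed from R. The following is a minimal sketch using only base R; the variable names `jar_dir` and `jar_files` are illustrative, and the block is shown as plain (non-evaluated) code.

```r
# Locate the directory holding the bundled jars; system.file() returns ""
# when the package or the directory cannot be found.
jar_dir <- system.file("extdata", "java", package = "OnassisJavaLibs")

# Collect the jar files shipped with the package (empty if jar_dir is "").
jar_files <- if (nzchar(jar_dir)) {
  list.files(jar_dir, pattern = "\\.jar$", full.names = TRUE)
} else {
  character(0)
}

# Show only the file names.
basename(jar_files)
```

The file names printed here are the ones to pass to `system.file()` when adding a jar to the Java class path.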
# Accessing and compiling the Java source code
The source code for the Java libraries is available in the `java` directory of the package tarball. Users interested in compiling their own jar files can refer to the following information.
The `conceptmapper` subdirectory contains the Java code to annotate text with concepts from OBO ontologies.
To create a jar file that includes all the required dependencies, the source code can be compiled with Maven (Apache Maven 3.3.9 was used).
The `slib` and `similarity` subdirectories contain the Java code to determine the semantic similarities between concepts. Each library can be built from its own directory with the following goals:
```{r, engine = 'bash', eval = FALSE}
#From the conceptmapper directory
mvn clean compile assembly:single -Dlog4j.configuration=log4j2.properties
#From the slib directory
mvn clean install
#From the similarity directory
mvn clean install assembly:single
```
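Before running the goals above, it can help to confirm that Maven and Java are reachable from the current environment. The sketch below uses base R only and assumes the executables are named `mvn` and `java` on the `PATH`; the variable names are illustrative.

```r
# Sys.which() returns "" for executables that are not found on the PATH.
tools <- Sys.which(c("mvn", "java"))
missing_tools <- names(tools)[!nzchar(tools)]

if (length(missing_tools) > 0) {
  message("Not found on PATH: ", paste(missing_tools, collapse = ", "))
} else {
  # Print the Maven version, which also reports the Java version in use.
  system2("mvn", "--version")
}
```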
# Use of the package
The methods and classes implemented in these Java libraries can be used through the R functions and methods available within Onassis. Alternatively, the Java code can be executed directly using `rJava`. For example, a dictionary can be created from an OBO ontology file with the following code:
```{r echo=TRUE, eval=TRUE}
library(rJava)
#Initializing the JVM
.jinit()
#Adding the path to the jar file
jarfilePath <- file.path(system.file('extdata', 'java', 'conceptmapper-0.0.1-SNAPSHOT-jar-with-dependencies.jar', package='OnassisJavaLibs'))
.jaddClassPath(jarfilePath)
#Creating an instance of the OntologyUtil with the sample obo file
ontoutil <- .jnew("edu.ucdenver.ccp.datasource.fileparsers.obo.OntologyUtil", .jnew('java/io/File', file.path(system.file('extdata', 'sample.cs.obo', package='OnassisJavaLibs'))))
#Creating the output file containing the conceptmapper dictionary
outputFile <- .jnew("java/io/File", "dict.xml")
#Building of the dictionary from the OBO ontology
dictionary <- J("edu.ucdenver.ccp.nlp.wrapper.conceptmapper.dictionary.obo.OboToDictionary")$buildDictionary(
outputFile,
ontoutil,
.jnull(),
J("edu.ucdenver.ccp.datasource.fileparsers.obo.OntologyUtil")$SynonymType$EXACT
)
```
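To confirm that the dictionary was written, the output file can be inspected from R. This is a minimal sketch that assumes the file was created as `dict.xml` in the current working directory, as in the chunk above; it makes no assumption about the content of the file beyond it being text.

```r
dict_path <- "dict.xml"

if (file.exists(dict_path)) {
  # Report the file size in bytes and preview the first lines.
  print(file.info(dict_path)$size)
  print(head(readLines(dict_path, warn = FALSE), 10))
} else {
  message("No dictionary found at ", normalizePath(dict_path, mustWork = FALSE))
}
```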
To compute the semantic similarity between two terms of the same ontology, the classes in the similarity library can be used as follows:
```{r echo=TRUE, eval=TRUE}
#Adding the similarity library containing the similarity class to compute semantic similarities
jarfilePath <- file.path(system.file('extdata', 'java', 'similarity-0.0.1-SNAPSHOT-jar-with-dependencies.jar', package='OnassisJavaLibs'))
.jaddClassPath(jarfilePath)
#Creating an instance of the class Similarity
similarity <- .jnew("iit/comp/epigen/nlp/similarity/Similarity")
#Loading the ontology in a graph structure
file_obo <- file.path(system.file('extdata', 'sample.cs.obo', package='OnassisJavaLibs'))
ontology_graph <- similarity$loadOntology(file_obo)
#Setting the semantic similarity measures
measure_configuration <- similarity$setPairwiseConfig('resnik', 'seco')
#Terms of the ontologies need to be converted into URIs
term1 <- 'http://purl.obolibrary.org/obo/CL_0000771'
term2 <- 'http://purl.obolibrary.org/obo/CL_0000988'
URI1 <- .jcast(similarity$createURI(term1), new.class = "org.openrdf.model.URI", check = FALSE, convert.array = FALSE)
URI2 <- .jcast(similarity$createURI(term2), new.class = "org.openrdf.model.URI", check = FALSE, convert.array = FALSE)
# Computation of the semantic similarity score
similarity_score <- .jcall(similarity, "D", "pair_similarity", URI1, URI2, .jcast(ontology_graph, new.class = "slib.graph.model.graph.G"), measure_configuration)
similarity_score
```
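When several pairs of terms need to be compared, the calls above can be wrapped in a small function. The sketch below is only illustrative: `pair_score` is a hypothetical helper that is not part of Onassis or OnassisJavaLibs, and it simply repeats the `createURI` and `pair_similarity` calls shown in the previous chunk.

```r
# Hypothetical helper (not provided by any package): computes the similarity
# of two term URIs using the objects created in the previous chunk.
pair_score <- function(similarity, graph, config, term_a, term_b) {
  # Convert a term string into a URI object, as done for URI1 and URI2 above.
  to_uri <- function(term) {
    rJava::.jcast(similarity$createURI(term),
                  new.class = "org.openrdf.model.URI",
                  check = FALSE, convert.array = FALSE)
  }
  rJava::.jcall(similarity, "D", "pair_similarity",
                to_uri(term_a), to_uri(term_b),
                rJava::.jcast(graph, new.class = "slib.graph.model.graph.G"),
                config)
}

# Usage with the objects defined above:
# pair_score(similarity, ontology_graph, measure_configuration, term1, term2)
```

Passing the `similarity`, graph and configuration objects explicitly keeps the helper independent of global state, so the same ontology graph can be reused across many term pairs without being reloaded.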
# Acknowledgements
We would like to thank the library providers.
The methods for the conceptmapper pipeline and for defining the ccp-nlp type system have been developed and published by the Regents of the University of Colorado under the BSD 3-clause license.
The methods for computing the semantic similarities have been developed and published by the Ecole des mines d'Alès under the GPL-compatible CeCILL license.
Both licenses are provided within the package.