Marvin

Semantic text annotation tools using Wordnet and DBPedia

View the Project on GitHub nikolamilosevic86/Marvin

Welcome to Marvin - Semantic text annotator

Marvin is a semantic text annotation tool that uses external sources to annotate input text. It can also be used as a Java library. Marvin currently supports tagging using Wordnet and DBPedia (the linked data version of Wikipedia). Even our small Marvin semantic annotator already holds a lot of knowledge, which would probably make anyone depressed, so we named it after the depressed robot from The Hitchhiker's Guide to the Galaxy.

Installation

Marvin semantic text annotator is a Java program that can also be used as a Java library by other applications. It is distributed as a single .jar file that contains all of its resources. To use Wordnet, however, Wordnet has to be installed separately from https://wordnet.princeton.edu/wordnet/download/current-version/

After installation, configure the Wordnet path in the file_properties.xml file. The tag that currently reads:

<param name="dictionary_path" value="C:\Program Files (x86)\WordNet\2.1\dict"/>

has to be changed to the correct path on the machine where Wordnet is installed. There are no other installation requirements.
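Before running Marvin, it can help to confirm that the configured path really points at a directory. The following is a minimal sketch, not part of Marvin: the class `DictionaryPathCheck`, the method `dictionaryPath`, the `<config>` wrapper element, and the placeholder path are all assumptions made for illustration. Only the `param` tag with `name="dictionary_path"` comes from this README. The real file_properties.xml may be structured differently.

```java
import java.io.File;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; it is not shipped with Marvin.
class DictionaryPathCheck {

    // Returns the value attribute of <param name="dictionary_path" .../>,
    // or null when no such tag is present in the given XML text.
    static String dictionaryPath(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList params = doc.getElementsByTagName("param");
            for (int i = 0; i < params.getLength(); i++) {
                Element param = (Element) params.item(i);
                if ("dictionary_path".equals(param.getAttribute("name"))) {
                    return param.getAttribute("value");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the configuration XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed layout: the <config> root element and the path are placeholders.
        String xml = "<config>"
                + "<param name=\"dictionary_path\" value=\"/path/to/WordNet/dict\"/>"
                + "</config>";
        String path = dictionaryPath(xml);
        System.out.println("Configured path: " + path);
        System.out.println("Is a directory: " + new File(path).isDirectory());
    }
}
```

To check a real installation, the XML text would be read from your own copy of file_properties.xml instead of the inline string used here.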

Running Marvin semantic text annotator

Running from the command line

To run Marvin semantic annotator from the command line, type:

java -jar Marvin.jar "Sentence to be semantically annotated."

Running Marvin semantic text annotator as a library

To use Marvin semantic text annotator as a library, you also need to install WordNet on the target machine and then include the Marvin.jar file in your Java project.

The library contains methods that query DBPedia and Wordnet, and both methods return a LinkedList of objects called WordMeaningOutputElement:

public class WordMeaningOutputElement {
    public String appearingWord; // what is the word in text
    public String Description; //Definition in Wordnet or abstract from DBPedia
    public String Source; // String Wordnet or DBPedia
    public int startAt; // position where the labelled word starts in your string
    public int endAt; // position where the labelled word ends in your string
    public String id; // id in Wordnet or URI from DBPedia
    public String URL; // link to Wordnet definition of term or DBPedia URI
}
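As an illustration of how such a result list might be consumed, here is a minimal sketch. It does not call Marvin itself, because the query method names are not listed in this README. Instead it declares a local copy of the fields shown above and fills a list by hand. The class `AnnotationSketch`, the helpers `describe` and `fromSource`, the sample values, and the exact `Source` strings ("Wordnet", "DBPedia") are assumptions for illustration only.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class; it is not part of Marvin.
class AnnotationSketch {

    // Local copy of the fields listed in this README, so the sketch
    // compiles without Marvin.jar on the classpath.
    static class WordMeaningOutputElement {
        public String appearingWord;
        public String Description;
        public String Source;
        public int startAt;
        public int endAt;
        public String id;
        public String URL;
    }

    // Formats one annotation as "word [start-end] (Source): Description".
    static String describe(WordMeaningOutputElement e) {
        return e.appearingWord + " [" + e.startAt + "-" + e.endAt + "] ("
                + e.Source + "): " + e.Description;
    }

    // Returns only the annotations whose Source field equals the given name.
    static List<WordMeaningOutputElement> fromSource(
            List<WordMeaningOutputElement> all, String source) {
        List<WordMeaningOutputElement> kept = new LinkedList<>();
        for (WordMeaningOutputElement e : all) {
            if (source.equals(e.Source)) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hand-made sample element; real values would come from the library.
        WordMeaningOutputElement sample = new WordMeaningOutputElement();
        sample.appearingWord = "robot";
        sample.Description = "made-up description for illustration";
        sample.Source = "Wordnet"; // assumed spelling of the source name
        sample.startAt = 9;
        sample.endAt = 14;

        List<WordMeaningOutputElement> results = new LinkedList<>();
        results.add(sample);

        for (WordMeaningOutputElement e : fromSource(results, "Wordnet")) {
            System.out.println(describe(e));
        }
    }
}
```

The sketch deliberately avoids slicing the input string with `startAt` and `endAt`, since this README does not say whether the end position is inclusive or exclusive.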

Roadmap

The project roadmap includes adding new external sources for semantic tagging. Currently the idea is to include the following:

Reference

If you used Marvin, please cite:

Authors and Contributors

This project was created as part of Nikola Milosevic's (@nikolamilosevic86) PhD project at the University of Manchester, supervised by Dr Goran Nenadic (University of Manchester), Cassie Gregson (AstraZeneca) and Robert Hernandez (AstraZeneca). You can find more information about Nikola on his web page, or you may follow him on Twitter.

Support or Contact

Having trouble with this project? Contact nikola.milosevic@manchester.ac.uk and we’ll help you sort it out.