Note: SpOntecular is currently in development.
SpOntecular is a proof of concept for (semi-)automating the extraction of an ontology from technical specifications using the large language model GPT-4x and the semantic framework Apache Jena. The aim is to reduce the manual effort required to identify the individual ontology features.
The following is an outline of SpOntecular's current and planned features.
- Automated extraction of classes and, based on them, derivation of taxonomic (hierarchical) and non-taxonomic relationships as well as cardinality constraints.
- Possibility to
  - provide custom definitions of the ontology features
  - add specific examples to provide more context for few-shot prompting
  - blacklist falsely identified features to exclude them from subsequent extraction cycles
- Functionality to import existing ontologies
- Functionality to download the resulting ontology
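
As an illustration of the customization features listed above, the following is a minimal sketch of how custom definitions, few-shot examples, and a blacklist could fit together. It is not SpOntecular's actual code: the class `PromptSketch`, the record `Example`, the methods `buildPrompt` and `applyBlacklist`, and the prompt wording are all hypothetical, and applying the blacklist as a filter on the extraction result is only one possible design.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class PromptSketch {

    /** Hypothetical few-shot example: an input text and the JSON output expected for it. */
    record Example(String text, String expectedJson) {}

    /** Assembles a prompt from a custom feature definition, few-shot examples, and the corpus. */
    static String buildPrompt(String definition, List<Example> examples, String corpus) {
        StringBuilder sb = new StringBuilder();
        sb.append("Definition: ").append(definition).append("\n");
        for (Example e : examples) {
            sb.append("Example text: ").append(e.text()).append("\n");
            sb.append("Example output: ").append(e.expectedJson()).append("\n");
        }
        sb.append("Text: ").append(corpus).append("\n");
        sb.append("Return the result as a JSON list.");
        return sb.toString();
    }

    /** Removes blacklisted features (compared case-insensitively) from an extraction result. */
    static List<String> applyBlacklist(List<String> extracted, Set<String> blacklist) {
        Set<String> lowered = blacklist.stream()
                .map(String::toLowerCase)
                .collect(Collectors.toSet());
        return extracted.stream()
                .filter(feature -> !lowered.contains(feature.toLowerCase()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String prompt = buildPrompt(
                "A class is a category of things described in the specification.",
                List.of(new Example("The pump drives the valve.", "[\"Pump\",\"Valve\"]")),
                "The controller monitors two sensors.");
        System.out.println(prompt);

        // "Page" was wrongly extracted in an earlier cycle and has been blacklisted.
        System.out.println(applyBlacklist(List.of("Controller", "Page", "Sensor"), Set.of("page")));
    }
}
```

Filtering after extraction keeps the blacklist independent of the prompt text; passing the blacklisted terms to the model inside the prompt would be an equally plausible alternative.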
The extraction process is implemented as a seven-step workflow. The first four steps perform the actual extraction of the individual ontology components using GPT-4. To do this, the text corpus from which the ontology is to be generated is passed to GPT-4 via an API call, together with the appropriate prompt. JSON is the defined output format.
In step 1, the concepts are identified and returned as a JSON list. The results are then passed, together with the text corpus, to step 2 to build the concept hierarchy and to step 3 to identify the non-taxonomic relations. The identified non-taxonomic relations are then passed to step 4 to derive the corresponding cardinalities. In addition to being passed to the subsequent step, each intermediate result is also written to a cache. The cache initially stores the individual components of the ontology so that they can be merged later.
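
The data flow of the first four steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: the interface `LlmClient`, the class `ExtractionPipelineSketch`, the step names used as cache keys, and the in-memory `Map` cache are all hypothetical, and the real GPT-4 API call is replaced by a stub that returns canned JSON.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ExtractionPipelineSketch {

    /** Hypothetical stand-in for the GPT-4 API call: takes a step name and an input, returns JSON. */
    interface LlmClient {
        String complete(String step, String input);
    }

    /** Runs the four extraction steps and caches every intermediate JSON result. */
    static Map<String, String> run(String corpus, LlmClient llm) {
        // LinkedHashMap keeps the results in the order in which the steps ran.
        Map<String, String> cache = new LinkedHashMap<>();

        // Step 1: identify the concepts in the text corpus.
        String concepts = llm.complete("concepts", corpus);
        cache.put("concepts", concepts);

        // Steps 2 and 3 both receive the concepts together with the corpus.
        String corpusWithConcepts = corpus + "\n" + concepts;
        String hierarchy = llm.complete("hierarchy", corpusWithConcepts);
        cache.put("hierarchy", hierarchy);
        String relations = llm.complete("relations", corpusWithConcepts);
        cache.put("relations", relations);

        // Step 4: derive cardinalities from the non-taxonomic relations.
        String cardinalities = llm.complete("cardinalities", relations);
        cache.put("cardinalities", cardinalities);

        return cache;
    }

    public static void main(String[] args) {
        // Stub client: echoes the step name as a one-element JSON list instead of calling an API.
        LlmClient stub = (step, input) -> "[\"" + step + "\"]";
        Map<String, String> cache = run("The pump drives the valve.", stub);
        cache.forEach((step, json) -> System.out.println(step + " -> " + json));
    }
}
```

Keeping each step's output under its own cache key means the later steps of the workflow can read the components independently when merging them.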
Function | Technology |
---|---|
Front-End: Template Engine | Thymeleaf |
Front-End: Interactivity | Alpine.js, htmx |
Backend | Spring Boot |
Document processing | Apache POI, Apache PDFBox |
Ontology processing | Apache Jena |
Containerization | Docker |
- Provide your OpenAI API key as the environment variable `OPENAI_API_KEY`.
- Open `http://localhost:8090` in your browser after startup, or visit the live demo at https://spontencular.konstantinwolters.com