Research Projects

Since the mid-1990s, the Department has also been involved in a large number of national and international research projects. Among the most important of roughly the past 15 years, with substantial funding from the EU, the CSF (Czech Science Foundation, aka GAČR) or another major agency, have been:

  • EU CHIST-ERA CIMPLE – “Countering Creative Information Manipulation with Explainable AI”, 2021-2024, contact Prof. V. Svátek
  • EU H2020 HeartBIT 4.0 – “Application of innovative Medical Data Science for heart diseases”, 2020-2022, contact Prof. P. Berka
  • EU COST CA18209 Nexus Linguarum – “European network for Web-centred linguistic data science”, 2019-2023, contact Prof. V. Svátek (core group member)
  • CSF 18-23964S “Focused categorization power of web ontologies”, 2018-2021, contact Prof. V. Svátek
  • EU H2020, 2015-2017, contact Dr. V. Svátek
  • CSF 14-14076 “COSOL – Categorization of Ontologies in Support of Ontology Life Cycle”, 2014-2016, contact Dr. O. Zamazal
  • EU FP7 LinkedTV, 2011-2015, contacts Dr. V. Svátek and Dr. T. Kliegr
  • EU FP7 LOD2, 2010-2014, contact Dr. V. Svátek
  • EU FP7 MultilingualWeb-LT, 2012-2013, contact J. Kosek
  • CSF 201/08/0802 “Application of Knowledge Engineering Methods in Knowledge Discovery from Databases”, 2008-2012, contact Prof. J. Rauch
  • CSF “PatOMat – Automation of Ontology Pattern Detection and Exploitation”, 2010-2012, contact Dr. V. Svátek
  • CSF “Web Semantization”, 2010-2012, contact Dr. V. Svátek
  • EU FP6 IST “Knowledge-Practices Laboratory” (KP-Lab), 2006-2010, contact Dr. V. Sklenák
  • EU FP6 IST “Knowledge Space of Semantic Inference for Automatic Annotation and Retrieval of Multimedia Content” (K-Space, see CORDIS record), 2006-2008, contact Dr. V. Svátek
  • EU DG SANCO Public Health Programme: “Quality Labeling of Medical Web Content using Multilingual Information Extraction” (MedIEQ), 2006-2008, contact Dr. V. Svátek
  • EU eContent EDC-22249 “Multilingual Content Aggregation System based on TRUST Search Engine” (M-CAST), 2005-2006, contact Dr. V. Sklenák
  • CSF 201/05/0325 “New methods and tools for knowledge discovery in databases”, 2005-2007, contact Prof. J. Rauch
  • CSF “Intelligent analysis of WWW content and structure”, 2003-2005, contact Dr. V. Svátek

The Department also participated as a minor (ad hoc / associate / network) partner in a number of other EU projects, such as TAILOR, MultilingualWeb, PetaMedia, KDubiq, Knowledge Web, OntoWeb, KDnet and MLnet.

Besides the funded projects, there are also long-term informal projects partially aligned with them; they focus on the development and maintenance of software tools and other artifacts such as linked datasets, ontologies and standards. The most prominent ones are listed in the following table.

| Project | Description | Contact | Group |
|---|---|---|---|
| RDFRules | RDFRules is an analytics engine for rule mining in RDF knowledge graphs. See a live demo and the project repository. | Zeman, Kliegr | DMKD+SWOE |
| EasyMiner | EasyMiner inductively learns rules using the “web search” paradigm, over an HTML interface. | Kliegr | DMKD |
| LISp-Miner | LISp-Miner is a desktop data mining tool for effectively mining richly structured association hypotheses from tabular data; it also features a grid-based mining solution (LM-Grid) and generation of (nugget-containing) artificial datasets suitable for data mining education (ReverseMiner). | Rauch | DMKD |
| GAIN | General Analytics INterceptor tracks and aggregates user interaction with web applications. It powers a platform that placed 2nd out of 17 competitors in the ACM RecSys’13 News Recommender Challenge. Collaboration with the Web Intelligence group at the CTU. | Kliegr | DMKD |
| EntityClassifier | EntityClassifier (THD tool) identifies entities in text and assigns them DBpedia types. It exploits the Linked Hypernym Dataset containing millions of entity-type assignments, most of which are novel w.r.t. both DBpedia and Yago 2s. Collaboration with the Web Intelligence group at the CTU. | | |
| PatOMat | PatOMat is a suite of tools for pattern-based transformation of ontologies and vocabularies. Among other things, it includes a graphical plugin for Protégé called GUIPOT. Best demo award at EKAW’12. | Zamazal | SWOE |
| OpenData.CZ | OpenData.CZ is a joint initiative with Charles University, Prague, towards linked open government data publication (in the Czech Republic and the EU in general). In this context the Public Contracts Ontology has been developed, and a number of open datasets have been RDFized. | Svátek | SWOE |
| | Within the project, an RDF data model for fiscal data based on the Data Cube Vocabulary has been developed, and is used by the project partners as well as several other parties. | Svátek | SWOE |
| DBpediaCZ | Czech DBpedia is a national branch of DBpedia. It contains structured information in RDF, automatically extracted from Czech Wikipedia using mapping rules and other inventory. | Zeman | |
| DB-quiz | DB-quiz is a game in which two players strive to connect the three sides of the triangular board by occupying hexagonal fields. A field is occupied by correctly answering a question generated from a fact base: Czech or English DBpedia, or a custom one. The challenging aspects of the game design are question despoilerification and difficulty rating. | Zeman | SWOE |
| SPARQLab | SPARQLab is an online exercise book for learning the SPARQL query language. The current version uses the linked dataset of the Czech Social Security Administration. (Both the application and the dataset documentation are currently in Czech only.) | Svátek | SWOE |
| LODSight | LODSight is a linked dataset summarization tool with a special focus on the vocabularies employed. It aims to give insights into a dataset and its vocabularies at the same time, by distinguishing between different vocabularies and providing examples of concrete values from the dataset as annotations to the summary graph. | Dudáš | |
| PURO | PURO is a formalism designed for “style-neutral” ontological background models from which OWL models in different modeling styles can be generated. Two interconnected tools have been developed for authoring PURO models and generating OWL models from them: PURO Modeler and OBOWLMorph. | Dudáš | |
| OOSP | OOSP, Online Ontology Set Picker, is a tool for rapidly building tailored collections of ontologies based on their various metrics. Such collections can then be used as benchmarks for testing ontology management tools. The source collections used are LOV and Bioportal. | Zamazal | SWOE |
| NEST | NEST is an expert system shell with multiple methods of uncertainty processing and reasoning algorithms. It is available in a desktop version and a web-based version; the newest domain-specific application is an ontology visualization tool recommender. | Berka | IIS |
| AIQ | Extended AIQ test: the Algorithmic Intelligence Quotient test (Legg and Veness, 2013) is a test of machine intelligence that approximates the Universal Intelligence of an agent (Legg and Hutter, 2007). The extended version addresses several issues of the original test. Also available is a tool for semantic analysis of the environment programs used by the test. | Vadinský | IIS |
| JNVDL | JNVDL is a Java-based tool for controlling the processing and validation of compound XML documents. | Kosek | WELT |
| | Support of the Internationalization Tag Set Version 2.0 in the W3C Markup Validator. | Kosek | WELT |