Research group: Data Mining and Knowledge Discovery (DMKD)

DMKD’s color among the KIZI groups is blue, referring to the ‘color’ of the ‘oceans’ of data that can be submitted to data mining and knowledge discovery tools.

News

(For older news see page bottom)

Research focus

The Data Mining and Knowledge Discovery (DMKD) group at KIZI (one of its four research groups, overarched by the virtual Knowledge Engineering Group) undertakes research in analyzing various kinds of data in structured, semi-structured and textual form, and deriving useful knowledge from it. The focal areas of the group currently are:

Research on data mining was present at the Department long before the term was coined: tools for combinatorial data analysis (KAD) and “learning an expert system from observational data” (ESOD, later re-implemented by P. Berka as KEX), both derived from the even earlier GUHA method, appeared in the early 1980s under the supervision of J. Ivánek. Since the mid-1990s the flagship data mining tool of KIZI has been the LISp-Miner system (conceived by J. Rauch and developed by M. Šimůnek), which has recently undergone a major redesign centered around the new LM Workspace module and gained scripting support based on the LISp-Miner Control Language. Most recently, a family of web-oriented data mining tools has arisen under the leadership of T. Kliegr, such as EasyMiner.eu (in 2011), which integrates the CMS-based reporting tool SEWEBAR and leverages background knowledge.

In parallel, there is ongoing work on mining from texts, with special focus on Wikipedia: the Targeted Hypernym Discovery method (THD, now part of the EntityClassifier.eu tool) and the associated LHD dataset.

The research has been supported by a number of research projects. The most important recent one was LinkedTV, an Integrated Project funded under the EU FP7 (2011-2015), within which the text mining tool EntityClassifier.eu and the InBeat.eu recommender were developed (under the supervision of T. Kliegr). There have also been several CSF (Czech Science Foundation) projects, coordinated by J. Rauch. Most recently, the group has joined the EU Horizon 2020 project OpenBudgets.eu, in which fiscal data are analyzed.

The group has also co-organized several international events, most notably RuleML 2014 (T. Kliegr, J. Rauch) and ISMIS 2009 (J. Rauch, P. Berka), as well as several editions of the ECML/PKDD Discovery Challenge (P. Berka) and of the Linked Data Mining Challenge (V. Svátek).

Team

Group leaders: Jan Rauch, Tomáš Kliegr, Petr Berka, Milan Šimůnek

Other group members:

Past members: Jan Bouchner, Barbora Červenková, Milan Dojchinovski, Ivo Lašek, Andrej Hazucha, Martin Labský, Jan Nemrava, David Pejčoch, Radek Škrabal.

Collaborations

Within the University, the DMKD group mainly cooperates with

Within the Czech Republic, there is lasting cooperation with the Web Intelligence group (led by Dr. Tomáš Vitvar) at the Czech Technical University. In particular, PhD students from the CTU group (M. Dojchinovski, J. Kuchař and I. Lašek) have been directly involved in research activities of the LinkedTV project.

At the international level, the group collaborates with numerous foreign partners, either within EU projects (in particular, the EU FP7 IP LinkedTV project) or on an informal basis. Examples of such joint research are:

Selected recent publications

Education

Activities of the group are reflected in several courses taught at the University, most notably the MSc level courses:

A specialized Bc level course is:

A data mining primer is also provided as part of the Bc level course (mandatory for all students of the Informatics specialty):

Finally, there is also a relevant PhD-level course:

Older news:

 


Copyright (C) 2000 - 2017 University of Economics in Prague