Seminar: Mapping the mining model representation to background knowledge representation (using PHP) in the SEWEBAR CMS

Date and time: 5 May 2011, 10:30 - 12:00
Room: 403 NB

Presenter: Stanislav Vojíř

In the data mining process, the source dataset has to be prepared first: for example, by choosing how continuous attributes are cut into intervals or grouped, drawing on knowledge of the problem area. This preparation can be guided by background (domain) knowledge obtained from experts. In the SEWEBAR project, we collect this knowledge from experts in a rich XML-based representation language called BKEF, using a dedicated editor, and store it in the database of our custom-tailored, Joomla!-based CMS. Data mining tools then generate mining models from the dataset, represented in the standardized PMML format. Each column (attribute) of the dataset, as described in PMML, then has to be mapped to the relevant ‘metaattribute’ of the BKEF representation. My thesis addresses this specific type of schema mapping problem with algorithms that automatically suggest mappings from columns to metaattributes and from the values of these columns to BKEF ‘metafields’. The user can also correct the suggested mapping manually. The implementation is written in PHP. (The slides are in Czech.)
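
To make the idea of automatic mapping suggestion more concrete, here is a minimal sketch in Python. It is not the algorithm from the thesis, which the abstract does not describe, and it is not the PHP implementation. It only assumes that a suggestion could be based on the string similarity of names; the function names, the threshold, and the sample column and metaattribute names are all hypothetical.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity of two names, in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def suggest_mapping(columns, metaattributes, threshold=0.6):
    """For each column, suggest the most similar metaattribute name.

    Columns whose best score falls below `threshold` (an assumed cut-off)
    are mapped to None, leaving them for manual correction by the user.
    """
    suggestions = {}
    for column in columns:
        best = max(
            metaattributes,
            key=lambda m: name_similarity(column, m),
            default=None,
        )
        if best is not None and name_similarity(column, best) >= threshold:
            suggestions[column] = best
        else:
            suggestions[column] = None
    return suggestions


# Hypothetical column and metaattribute names, for illustration only.
columns = ["age", "salary_czk", "x17"]
metaattributes = ["Age", "Salary", "District"]
mapping = suggest_mapping(columns, metaattributes)
```

With these made-up names, "age" and "salary_czk" receive a suggestion, while "x17" stays unmapped and would be left to the user. A real mapper would presumably also compare the values of the columns, as the abstract indicates, rather than names alone.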