Seminar: Document Classification with Supervised Latent Feature Selection

Date and time: 8 March 2012, 10:30 - 12:00
Room: 403 NB

Document Classification with Supervised Latent Feature Selection

Presenter: Ondřej Háva

Classifying text documents into categories generally means dealing with a high-dimensional structured representation of the documents. To favor generality over accuracy of the classifier, some dimensionality reduction technique has to be applied. We propose a classification algorithm that utilizes the hidden structure of uncorrelated topics extracted from the training documents and from their known categories, which may not be independent. The proposed classifier takes advantage of the singular value decomposition of the input and target variables and can incorporate various methods of hidden feature selection. We evaluated three feature selection procedures on two different collections of text documents. Note: the slides are in Czech.
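To make the idea in the abstract more concrete, here is a minimal sketch in Python with NumPy. It is not the presenter's algorithm: the toy term-count matrix, the function names `fit_latent_classifier` and `predict`, the scoring rule (squared cosine between latent document features and latent target directions), and the final least-squares step are all assumptions made for illustration. The abstract does not say which three selection procedures were evaluated.

```python
import numpy as np

def fit_latent_classifier(X, Y, n_latent=3, n_keep=2):
    """Hypothetical sketch: SVD of inputs and targets, then supervised
    selection of latent features. Not the method from the talk."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean

    # SVD of the inputs: the columns of U are uncorrelated latent "topics".
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    U, Vt = U[:, :n_latent], Vt[:n_latent]

    # SVD of the targets: category indicators are dependent, so keep only
    # the directions with a non-negligible singular value.
    P, d, _ = np.linalg.svd(Yc, full_matrices=False)
    P = P[:, d > 1e-10 * d.max()]

    # Assumed selection rule: U and P have orthonormal columns, so the
    # entries of U.T @ P are cosines; score each latent feature by its
    # summed squared cosine with the latent target directions.
    scores = ((U.T @ P) ** 2).sum(axis=1)
    keep = np.argsort(scores)[::-1][:n_keep]

    # Project documents onto the selected latent features and fit a
    # linear map to the centered category indicators.
    W = Vt[keep].T
    B, *_ = np.linalg.lstsq(Xc @ W, Yc, rcond=None)
    return {"x_mean": x_mean, "y_mean": y_mean, "W": W, "B": B, "keep": keep}

def predict(model, X):
    Z = (X - model["x_mean"]) @ model["W"]
    return np.argmax(Z @ model["B"] + model["y_mean"], axis=1)

# Toy data (invented): 6 documents x 5 term counts, 2 categories.
X = np.array([[3, 2, 0, 0, 1],
              [2, 3, 0, 1, 0],
              [3, 3, 1, 0, 1],
              [0, 0, 3, 2, 1],
              [0, 1, 2, 3, 0],
              [1, 0, 3, 3, 1]], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1])
Y = np.eye(2)[labels]          # one-hot category indicators

model = fit_latent_classifier(X, Y)
train_pred = predict(model, X)
```

On this toy data the two groups of documents use almost disjoint vocabularies, so the latent feature most aligned with the categories is enough to separate the training documents. Swapping the scoring rule inside `fit_latent_classifier` is where a different hidden feature selection procedure would plug in.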