Seminar: Named Entity Recognition in Czech

Date and time: 23 April 2009, 10:30 - 12:00
Room: 403 NB

Presenter: Zdeněk Žabokrtský

The term named entities refers to names of people, geographical names, names of institutions, and so on. The talk focuses on the recognition (detection and classification) of named entities in Czech texts. The presented experiments use a machine learning approach, namely Support Vector Machines, trained and evaluated on human-annotated instances of named entities in a sample of sentences from the Czech National Corpus. The resulting named entity recognition system is implemented in TectoMT, an open-source software framework that allows various natural language processing tools to be integrated and used in real-world applications such as machine translation.