Seminar: Information in Czech healthcare documents. Where is it hidden and how to extract it?
Date and time | 10 March 2011, 10:30 - 12:00 |
---|---|
Room | 403 NB |
Presenter: Karel Zvára
Czech healthcare documentation usually takes the form of free text. I will give a brief overview of possible target structures (electronic health record), code lists commonly used in the Czech Republic and abroad, and the current state of my PhD thesis. I will show my approach to text tokenization, morphological analysis using dictionaries (a Czech iSpell-derived dictionary, the Czech version of MeSH, the Czech version of ICD-10), and PoS tagging using regular expressions.
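To give a rough idea of what such a pipeline might look like, here is a minimal Python sketch of regex tokenization, dictionary lookup, and a regex fallback for tagging. It is not the presenter's implementation. The names `tokenize`, `tag`, `GENERAL_DICT`, `MESH_TERMS`, `ICD10_CODES`, and `FALLBACK_RULES`, the tag labels, and the tiny lexicons are all assumptions made up for illustration. A real system would load the full dictionaries and would need lemmatization to handle Czech inflection.

```python
import re

# Hypothetical toy lexicons. Real ones would be loaded from the
# iSpell-derived dictionary, Czech MeSH, and Czech ICD-10 files.
GENERAL_DICT = {"diagnóza": "NOUN", "bez": "ADP"}
MESH_TERMS = {"hypertenze"}
ICD10_CODES = {"I10"}

# Token pattern (an assumption): ICD-10-like codes, numbers, words,
# and single punctuation characters, tried in that order.
TOKEN_RE = re.compile(
    r"\b[A-Z]\d{2}(?:\.\d{1,2})?\b"   # code such as I10 or E11.9
    r"|\d+(?:[.,]\d+)?"               # integer or decimal number
    r"|\w+"                           # word (Unicode-aware in Python 3)
    r"|[^\w\s]"                       # single punctuation character
)

# Crude regex fallback rules, used only when no dictionary matches.
# The adjective suffixes are illustrative guesses, not a real grammar.
FALLBACK_RULES = [
    (re.compile(r"^\d+(?:[.,]\d+)?$"), "NUM"),
    (re.compile(r"^[^\w\s]$"), "PUNCT"),
    (re.compile(r"^\w+(?:ový|ická|ický)$"), "ADJ"),
]


def tokenize(text):
    """Split free text into a list of token strings."""
    return TOKEN_RE.findall(text)


def tag(token):
    """Tag one token: code lists first, then dictionaries, then regexes."""
    if token in ICD10_CODES:
        return "ICD10"
    lower = token.lower()
    if lower in MESH_TERMS:
        return "MESH"
    if lower in GENERAL_DICT:
        return GENERAL_DICT[lower]
    for pattern, label in FALLBACK_RULES:
        if pattern.match(token):
            return label
    return "UNK"


if __name__ == "__main__":
    for tok in tokenize("Diagnóza: hypertenze I10, tlak 150."):
        print(tok, tag(tok))
```

Checking the code lists before the general dictionary is a design choice in this sketch: it keeps domain terms from being shadowed by a general-purpose entry. Tokens that match nothing are tagged `UNK` rather than guessed.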