Seminar: RExtractor: a Robust Information Extractor
Date and time | 21. 5. 2015 10:30 - 12:00 |
---|---|
Room | 336 RB |
RExtractor: a Robust Information Extractor
Presenter: Vincent Kríž
A year ago, we presented our initial steps towards linguistic processing of texts to detect entities and the relations between them. This work was an essential part of the INTLIB project, which aims to provide a more efficient and user-friendly tool for querying textual documents than full-text search. Now we present the RExtractor system, which processes input documents with natural language processing tools and subsequently queries the parsed sentences to extract a knowledge base of entities and their relations. The system's workflow is designed to be language- and domain-independent. We demonstrate RExtractor on Czech and English legal documents. In addition, we discuss RExtractor with respect to its deployment in search engines used by customers from a particular domain.