This paper describes a new method for retrieving data from free-text documents in the medical domain. The proposed approach creates a document summary and highlights the most important keywords in the text. To achieve this, we process the document's natural-language text and build a descriptor as an internal representation of the document. The descriptor is a graph of concepts and the relations between them, in which each concept carries points as a measure of its relevance. Using these points, the approach resolves ambiguity, selects the most relevant concepts to display in the summary, and votes on which keywords to highlight in the text. Besides directly representing the identified information in the summary, this work proposes a way to provide an extended summary by using additional knowledge about relations between medications, procedures, diseases, and anatomy. The approach helps speed up analysis and decision making by providing an aggregated summary of a document and highlighting the most meaningful parts of its text. Experimental results demonstrate that the proposed approach can generate summaries and highlight keywords automatically, producing meaningful and highly relevant results.
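To make the role of the descriptor concrete, the following minimal sketch shows one way a concept graph with points could drive ambiguity resolution, summary selection, and keyword highlighting. It is an illustration under stated assumptions, not the implementation used in this work: the class name `Descriptor`, its methods, the scoring rule (one point per mention, half a point per relation), and the example concepts are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Descriptor:
    """Hypothetical descriptor: concepts, relations, and relevance points."""
    points: Dict[str, float] = field(default_factory=dict)
    relations: Set[Tuple[str, str]] = field(default_factory=set)
    # Maps each concept to the surface words that mention it in the text.
    mentions: Dict[str, Set[str]] = field(default_factory=dict)

    def add_mention(self, concept: str, word: str, weight: float = 1.0) -> None:
        # Assumption: every mention of a concept adds a fixed number of points.
        self.points[concept] = self.points.get(concept, 0.0) + weight
        self.mentions.setdefault(concept, set()).add(word)

    def relate(self, a: str, b: str, bonus: float = 0.5) -> None:
        # Assumption: related concepts reinforce each other's relevance.
        self.relations.add((a, b))
        for concept in (a, b):
            self.points[concept] = self.points.get(concept, 0.0) + bonus

    def resolve(self, candidates: List[str]) -> str:
        # Ambiguity resolution: keep the candidate with the most points.
        return max(candidates, key=lambda c: self.points.get(c, 0.0))

    def summary(self, k: int) -> List[str]:
        # Select the k highest-scoring concepts (ties broken alphabetically).
        ranked = sorted(self.points.items(), key=lambda kv: (-kv[1], kv[0]))
        return [concept for concept, _ in ranked[:k]]

    def keywords(self, k: int) -> Set[str]:
        # Highlight the words that mention any concept chosen for the summary.
        return {w for c in self.summary(k) for w in self.mentions.get(c, ())}


# Illustrative input: the abbreviation "MI" is ambiguous between two concepts.
d = Descriptor()
d.add_mention("myocardial infarction", "MI")
d.add_mention("mitral insufficiency", "MI")
d.add_mention("aspirin", "aspirin")
d.add_mention("troponin test", "troponin")
d.relate("aspirin", "myocardial infarction")
d.relate("troponin test", "myocardial infarction")

best = d.resolve(["myocardial infarction", "mitral insufficiency"])
```

In this sketch the relations to a medication and a procedure raise the points of one reading of the ambiguous abbreviation, so that reading is selected and also ranks first in the summary. A real system would need a domain-specific scoring scheme in place of the fixed weights assumed here.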