Background: Unstructured health documents (e.g. discharge summaries) represent an important and unavoidable source of information.
Methods: A semantic annotator identified all the concepts present in the health documents from the clinical data warehouse of the Rouen University Hospital.
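To make the idea of semantic annotation concrete, the following is a minimal sketch of dictionary-based concept matching. It is not the annotator used in the study; the lexicon, the terminology names (TERMINOLOGY_A, TERMINOLOGY_B) and the concept IDs are all hypothetical placeholders chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: surface term -> (terminology, placeholder concept ID).
# These IDs are invented for illustration and are not real terminology codes.
LEXICON = {
    "heart failure": ("TERMINOLOGY_A", "A-0001"),
    "diabetes": ("TERMINOLOGY_A", "A-0002"),
    "insulin": ("TERMINOLOGY_B", "B-0001"),
}


def annotate(text):
    """Return sorted (start, end, terminology, concept_id) tuples for lexicon terms found in text."""
    annotations = []
    lowered = text.lower()  # case-insensitive matching; offsets are unchanged
    for term, (terminology, concept_id) in LEXICON.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            annotations.append((match.start(), match.end(), terminology, concept_id))
    return sorted(annotations)


document = "Patient with diabetes and heart failure; insulin started. Diabetes stable."
annotations = annotate(document)

# Count annotations per terminology, mirroring the per-terminology totals reported in the Results.
per_terminology = Counter(a[2] for a in annotations)
```

A production annotator would typically also handle inflection, abbreviations, negation and overlapping matches, none of which this sketch attempts.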
Results: 2,087,784,055 annotations were generated from a corpus of about 11.9 million documents, an average of 175 annotations per document. SNOMED CT, NCIt and MeSH were the three terminologies that yielded the most annotations.
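The reported average can be checked with a quick calculation. This sketch assumes the corpus size is exactly 11.9 million, whereas the abstract gives it only as an approximation, so the result is itself approximate.

```python
total_annotations = 2_087_784_055
documents = 11_900_000  # assumption: "about 11.9 million" taken as exactly 11.9 million

# Mean number of annotations per document.
average = total_annotations / documents
print(round(average))
```

Rounded to the nearest integer, the result agrees with the 175 annotations per document stated above.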
Discussion: As expected, the most general terminologies, which also have the most translated concepts, were those in which the most concepts were identified.