Many epidemiological studies now rely on the reuse of large healthcare administrative databases. In those studies, most of the time is spent managing data and performing basic statistical analyses, leaving little for complex statistical and medical analysis; as a result, the potential of such databases is sometimes underexploited. The objective of this work is to build SAF4SUHAD, a statistical analysis framework for secondary use of healthcare administrative databases, using literature-based specifications. A literature review was performed on PubMed in four medical domains: caesarean deliveries, cholecystectomies, hip replacement surgeries and bariatric surgeries. We identified 22 papers reporting analyses of large databases. They reported epidemiological indicators (e.g. mean age), which were abstracted into features (e.g. univariate description of a quantitative variable) and then implemented as 32 user-facing functions in the R programming language. For instance, one function draws a histogram and computes the mean with its confidence interval, quantiles, etc. These comprise 4 functions for data management, 9 for univariate analysis, 8 for bivariate analysis and 11 for multivariate analysis, supported by many intermediate functions. The functions were successfully used to analyze a French database of 250 million discharge summaries. The set of ready-to-use R functions defined in this work could help secure repetitive tasks and refocus efforts on expert analysis.
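To illustrate the kind of univariate description mentioned above (histogram, mean with confidence interval, quantiles), here is a minimal R sketch. The function name `describe_quantitative`, its arguments and its return structure are assumptions made for this example and are not taken from SAF4SUHAD. The confidence interval uses a Student t approximation, which the source does not specify.

```r
# Hypothetical helper: name, arguments and output are illustrative only,
# not the SAF4SUHAD interface.
describe_quantitative <- function(x, conf_level = 0.95, plot = TRUE) {
  stopifnot(is.numeric(x))
  x <- x[!is.na(x)]            # drop missing values before summarising
  n <- length(x)
  stopifnot(n >= 2)            # a standard deviation needs at least 2 values

  m  <- mean(x)
  se <- sd(x) / sqrt(n)        # standard error of the mean

  # Student t-based confidence interval for the mean (assumed method)
  t_crit <- qt(1 - (1 - conf_level) / 2, df = n - 1)
  ci <- c(lower = m - t_crit * se, upper = m + t_crit * se)

  # Minimum, quartiles and maximum
  q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))

  if (plot) {
    hist(x, main = "Distribution", xlab = "Value")
  }

  list(n = n, mean = m, ci = ci, quantiles = q)
}

# Usage on simulated ages (made-up data, for demonstration only)
set.seed(1)
ages <- rnorm(200, mean = 62, sd = 12)
res <- describe_quantitative(ages, plot = FALSE)
print(res)
```

Returning a plain list keeps the numeric results reusable in later steps, while the plot remains an optional side effect.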