Ebook: Data Mining to Determine Risk in Medical Decisions
Decisions regarding the risks involved in medical treatments must belong to patients and their physicians – after all, it is the patient's health and life which is at stake. But patients will not be equipped for this decision-making process if they cannot be given some idea as to the risks and benefits of treatment. Such risks are generally estimated by a consensus panel of specialist physicians using supporting medical literature. Unfortunately, this literature does not always provide a good estimate of risk, particularly in the case of rare occurrences. This book demonstrates statistical techniques that can be used to investigate matters of risk. These include kernel density estimation, predictive modeling, association rules and text analysis. It also shows, through example, how these techniques can provide meaningful results, and examines current methods, discussing some of the flaws in models which may lead to misleading results. After a general introduction to the concept of medical risk, the subjects covered include the process by which rare occurrences are investigated in drugs or treatments, the trade-offs between risks and benefits, extrapolation of clinical trial results and the cost of healthcare in relation to risks. It also examines problems such as competing risks, error, and the use of group identities, as well as looking at the issue of futility. The book concludes with a chapter providing a general discussion and summary, and an appendix shows some of the processes for using SAS Enterprise Miner to perform some of the models used in the text.
I am not a medical professional; I am a statistician. I have spent over 20 years researching health outcomes and investigating medical risk using actual data rather than depending upon consensus panels that develop medical guidelines. As a statistician, I can serve as a watchdog for the health care profession and use data to validate, or fail to validate, the findings of consensus panels.
Statisticians regularly investigate the concepts of risk and uncertainty. They also look at the problem of variability of response. There is no question that different patients can and do respond differently. Therefore, the ratio of benefit to risk often has to be considered for individual patients. Unfortunately, the medical profession often assumes that all patients will have an average response and will react in the same way. The use of advanced data mining tools and processes enables researchers to investigate individual patient response and risk. In this book, we will investigate different types of risk that involve treatment as well as policy.
Generally, consensus panels are used to estimate risk and to provide guidelines based upon those risk estimates. Several physicians in a specific field are gathered together and use supporting medical literature to estimate risk until a general consensus is reached. Unfortunately, the studies are not always clear or cannot provide good estimates of risk. This is particularly true concerning the risk of rare occurrences.
Throughout the text, we will demonstrate statistical techniques that can be used to investigate matters of risk. These include kernel density estimation, predictive modeling, association rules, and text analysis. Text analysis, in particular, is useful in defining ranks of risk. It does not require assumptions that are generally and knowingly false. We use data sets that are publicly available, but the techniques can be used on the electronic medical record. In addition, we demonstrate through example how the techniques can provide meaningful results. Moreover, we examine current methods and discuss some of the flaws in the models that can give misleading results.
Chapter 1 provides a general introduction to the concept of medical risk. It gives different types of risks that should be considered in medical research. These include individual risk based upon treatment and risk of a lack of treatment based upon social policy. In particular, we discuss the process of comparative effectiveness analysis and how it can be used to ration treatment to patients, especially patients with terminal illnesses such as cancer. The remaining chapters examine different aspects of risk and how risk is defined or estimated.
Chapter 2 examines the process by which rare occurrences are investigated once a drug, treatment, or medical device is approved for use. Currently, adverse events are voluntarily reported in databases made available by the Centers for Disease Control. There are occasional studies that look at adverse events. Because these types of occurrences are rare, the studies need to involve a large number of people. To reduce the number of people involved, formal studies generally rely upon surrogate endpoints; often these surrogate endpoints are assumed to be related to the actual endpoint.
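To give a sense of why rare occurrences demand large studies, the following is a minimal sketch, not a method taken from this book. It assumes that every subject independently has the same probability of experiencing the event, which is a simplification, and it computes how many subjects are needed to have a given chance of observing at least one event. The function name `subjects_needed` and the example rates are hypothetical and chosen only for illustration.

```python
import math


def subjects_needed(event_rate, confidence=0.95):
    """Smallest n with P(at least one event among n subjects) >= confidence.

    Assumes independent subjects who share one event probability
    (an illustrative simplification, not a claim about any real study).
    """
    if not 0.0 < event_rate < 1.0:
        raise ValueError("event_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # P(no events in n subjects) = (1 - event_rate) ** n.
    # Solve (1 - event_rate) ** n <= 1 - confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - event_rate))


if __name__ == "__main__":
    # Hypothetical event rates: 1 in 100, 1 in 1,000, 1 in 10,000.
    for rate in (0.01, 0.001, 0.0001):
        print(f"event rate {rate}: about {subjects_needed(rate)} subjects")
```

Under these assumptions, the required number of subjects grows roughly in inverse proportion to the event rate, so each tenfold drop in the rate calls for about ten times as many subjects.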
Chapter 3 discusses the importance of trade-offs between medical risk and medical benefits. It is not always important to a patient to choose the treatment or procedure with the least risk; instead, a patient might be willing to accept some risk in order to achieve improved benefits. We look at different ways to identify risk, even when the risk itself is unknown.
Chapter 4 examines the problem of competing risks, when different treatments have different risks, and the patient and physician need to choose between these competing risks. Such competing risks will often occur when choosing between a medical treatment and a surgical treatment. In particular, we discuss comparative effectiveness analysis in detail and discuss some of the unintended consequences of the use of comparative effectiveness models. One of the important aspects of comparative models is the definition of a patient's quality of life. Currently, quality is largely defined in terms of functioning. We demonstrate through text analysis how we can better understand the subjective definition of quality.
Chapter 5 examines the problem of error when assessing risk, or of unknown risk. Because of the problem of examining rare occurrences as discussed in Chapter 2, not all of the risks of a treatment or procedure will be known. The medical profession generally can also have a misperception of risk based upon consensus panels that use little data. If there is such a misperception, we discuss how outcomes research can be used to compute more accurate risk values.
Chapter 6 examines the problem of using group identities rather than individual identities in decision making. Groups can be based upon age, gender, race, and marital status. Group identity is necessary when small samples are used in statistical analysis. However, large databases with longitudinal data can be used with data mining and predictive modeling techniques to examine individual differences, and to make more accurate prescriptions.
Chapter 7 considers the fact that most clinical trials have very specific inclusion/exclusion criteria for patients enrolled in the trial. Often, the results of the trial are generalized to the entire population. This is called extrapolation. Unfortunately, clinical trials often enroll those at the very highest risk, while the extrapolation is to patients at much more moderate risk. There is no real assurance that the extrapolation is useful or valid.
Chapter 8 looks at the issue of futility where treatments are declared to be too futile to be tried on a specific group of patients. We examine the question of how low a probability of success must be before it is considered to be futile. In addition, we examine the claim that decisions of futility are made without regard to the cost of the treatment.
Chapter 9 examines the cost of healthcare in relation to the risks. It examines reasons for the increasing cost of healthcare and provides some suggestions for reducing costs that have already been implemented in some businesses and states. In particular, mandates that are required in insurance policies significantly increase the cost of that insurance.
Chapter 10 provides a general discussion and summary concerning risk and benefit, and how they are applied to some specific diagnoses of concern. It also describes my personal journey through medical risks and decision making.
An appendix shows some of the processes for using SAS Enterprise Miner to perform some of the models used in this text. It includes a discussion of predictive modeling and text analysis.
As we demonstrate in this text, data can be used to investigate risks and benefits without necessarily relying upon consensus panels that are currently used to define these risks. As we will also show, additional studies can clearly demonstrate that these consensus panels do not always provide accurate opinions.
Ultimately, the decisions as to treatment risks should belong to the patient and to the physician. It is, after all, the patient's health and life. However, the patient will not be equipped to support the decision making process without having some idea as to the risks and benefits of treatment. Physicians, too, need to have an accurate idea as to risks and benefits in order to provide choices to the patient in terms of treatment. It is, therefore, of extreme importance to know what those risks and benefits should be.
Note from the publisher: IOS Press regrets that Dr. Patricia Cerrito passed away a few months prior to publication of this book. Despite her illness, she worked with great diligence to complete her manuscript, and we appreciate her dedication to this task. We are confident that her work will prove to be of value to her intended audience and make a significant contribution to the field.