6 Observational Studies on Drug Efficacy
Introduction
As early as 1864, C.A. Wunderlich, a clinician from Leipzig, differentiated between “rational medicine” and “empirical medicine” (16). This distinction is still made today: proponents of the former, as followers of conventional medicine, still deem themselves superior to the latter, while followers of empirical medicine point to past successes and to wide acceptance by the population.
In reality, these categories are more complicated. Medicine, even conventional medicine, is first and foremost a science based on experience. Insights are conscious beliefs or models that we form, based on perceptions of ourselves and our environment, to help us understand our world better. The way in which we arrive at insights from the sensory perceptions of our experience is primarily determined genetically. The actual specifics depend on many historically developed or newly acquired influences and circumstances. Thus, there are various ways of gaining insight, and different insights can be based on the same experiences.
This is also why there are so many different schools and directions within medicine. The designation “complementary medicine,” for example, is more accurate than the term “empirical medicine.” It makes clear that even conventional medicine is based on experience, and that complementary medicine is also based on rational reasoning. The term also means that the methods of complementary medicine, and the considerations and models behind them, do not stand in contrast to the accepted and established methods and considerations of the traditional school of thought in medicine, but are an extension of them.
Therapeutic Efficacy and Proof of Efficacy
According to a ruling by the German Federal Administrative Court (Bundesverwaltungsgericht) in October 1993, therapeutic efficacy is “insufficiently established if it does not follow from the documents provided that the application of a certain medicine will lead to a larger number of therapeutic successes than its nonapplication, according to current scientific insights and knowledge.” (4) Accordingly, efficacy is the property of a drug that brings about more cures in patients, or at least more alleviation of symptoms, than could be achieved without it.
The proof of efficacy rests on three pillars:
– Causality: changes that are expected in a patient through application are compared with changes without application.
– Universality: efficacy must hold not only for individually selected patients, but also for all prospective users of the agent.
– Objectivity: the procedure with which the assertions are made must be clearly delineated so that they can be repeated and verified.
In order to make a statement about the future efficacy of a drug, the experiences and observations of previous applications and nonapplications must be taken into consideration. Statements pertaining to efficacy require inductive reasoning from known observations to unknown, future events, which can be done through the use of probability and statistical analyses.
The probability for a certain result of an event (e. g., curing a patient by use of a particular medicine) is thus a measure of reliability with which a result can be repeatedly obtained and expected in the future. Using this measure of probability, the efficacy of a drug can be described as follows:
The proof of efficacy demands a comparison of the probabilities of therapeutic success, where success is defined by specifying a primary outcome measure. This can be cure, or at least a visible improvement at the end of treatment. When symptoms are treated, the symptom itself can serve as the outcome measure, e. g., reduction of blood pressure when treating hypertension.
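This comparison of probabilities can be sketched in symbols. The notation below ($p_T$, $p_C$, $\Delta$) is an assumption introduced here for illustration and is not taken from the source; the difference of probabilities is only one of several possible effect sizes.

```latex
% p_T: probability of therapeutic success with application of the drug
% p_C: probability of therapeutic success without application (control)
\[
  \text{the drug is efficacious} \iff p_T > p_C ,
  \qquad
  \Delta = p_T - p_C \quad \text{(one possible effect size)}
\]
```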
Controlled Clinical Trials
Once outcome measures and effect sizes are determined, the effect sizes and their confidence intervals (CIs) are estimated. Using an appropriate testing procedure, it is then decided whether efficacy can be claimed. For this, data from both a control group and a test group are needed.
With these data, one can obtain a valid (unbiased) estimate only if the patients in both groups are comparable (structurally similar) with respect to the relevant baseline and treatment conditions. This is ensured in controlled clinical trials, in which patients are allocated to the test group or the control group at random.
Usually a similar number of patients is allocated to the control group and the test group, but unequal allocation ratios are sometimes used. The patients in the test group are treated with the drug to be tested, those in the control group with a comparison agent (either placebo or a standard drug). In all other respects, the treatment conditions and the evaluation of results should be the same for both groups, as determined and documented in the protocol.
Through random allocation, and given equality in all other treatment conditions, structural similarity is ensured. In order to eliminate the influence of knowing the assigned treatment, the drug can be administered double-blind; in other words, neither the treating doctor nor the patient knows which medicine the patient is receiving or which group the patient belongs to (provided that this is ethical and feasible).
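As a minimal sketch of this estimation step, the following Python example computes the difference in success rates between test and control group together with an approximate 95% confidence interval. The choice of the rate difference as effect size, the normal-approximation (Wald) interval, and the patient counts are assumptions made for illustration only; they are not prescribed by the text.

```python
import math

def risk_difference_ci(success_test, n_test, success_ctrl, n_ctrl, z=1.96):
    """Return (difference, lower, upper) for the difference in success rates.

    Uses the normal-approximation (Wald) interval; z=1.96 gives roughly 95%.
    """
    p_t = success_test / n_test
    p_c = success_ctrl / n_ctrl
    diff = p_t - p_c
    se = math.sqrt(p_t * (1 - p_t) / n_test + p_c * (1 - p_c) / n_ctrl)
    return diff, diff - z * se, diff + z * se

# Hypothetical counts: 60 of 100 successes with the test drug, 45 of 100 without.
diff, lower, upper = risk_difference_ci(60, 100, 45, 100)

# One simple decision rule: claim efficacy only if the whole interval lies above 0.
efficacy_claimed = lower > 0
```

The decision rule in the last line is one possible testing procedure; a real trial would fix the procedure and the significance level in the protocol beforehand.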
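Random allocation itself can be illustrated with a short Python sketch. It shuffles the patient list and splits it in half, which is only one simple scheme (real trials often use block or stratified randomization); the function name, the group labels, and the seed are hypothetical.

```python
import random

def randomize(patient_ids, seed=None):
    """Randomly allocate patients to 'test' and 'control' in (nearly) equal numbers."""
    rng = random.Random(seed)          # a fixed seed makes the list reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    allocation = {pid: "test" for pid in shuffled[:half]}
    allocation.update({pid: "control" for pid in shuffled[half:]})
    return allocation

# Allocate 20 hypothetical patients.
allocation = randomize(range(1, 21), seed=2024)
```

Because the allocation depends only on the random number generator, it cannot be influenced by patient characteristics, which is what makes the two groups structurally similar on average.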
Cohort Studies
Particularly within the field of complementary oncology, there are numerous procedures that have been applied for years but whose efficacy and safety have never been tested in controlled clinical trials. Patient records contain information on the results (such as changes in disease status) achieved by these methods. It makes sense to use this information as proof of efficacy and safety. The type of study suited to this purpose is the epidemiological cohort study.
Study Design
Cohort studies are epidemiological population studies. They are intended to help examine the relations between various procedures and factors that influence health (e. g., treatment measures, habits, environmental factors) and the health condition, or changes thereof, in the population.
These measures or factors are not determined in the study, but emerge out of the actual situation.
Herein lies the essential difference from controlled clinical trials, in which the procedures applied to patients are determined at random. Controlled studies are “experiments” on patients in which an answer is wrenched from nature by fixing the general conditions and systematically specifying the treatment of test and control group. In comparison, cohort studies are “observational studies” in which no artificial situation is produced and nature is not coerced, but simply observed systematically.
If cohort studies are to serve as proof of efficacy for a certain drug (test treatment) in a specific type of disease, patients who are representative of this disease must be selected from the general population.
In addition to the test treatment, other treatments for the disease should be in use in this population, and these may then serve as control treatments. Since the treatments are not predetermined, it is not necessary to include only newly treated patients whose findings are collected and documented prospectively. One may also draw on the files of already documented and completed cases and collect the data retrospectively (8). Given the increasing use of good information systems with well-structured databanks in practices and clinics, conducting such studies should become much easier in the future.
When planning and conducting retrospective cohort studies, the same general guidelines apply as have been established by the German Federal Institute for Drugs and Medical Devices (BfArM) (3):
• The study design must specify the responsibilities (director of the study, coordinator, monitoring, biometrics, sponsor).
• The purpose and precise question of the study need to be formulated.
• The selection of patients (or patient files) has to be determined. For this purpose, a representative selection is to be made of eligible treatment facilities (practices, clinics, outpatient follow-up clinics), in which patients with the disease to be studied can receive the test treatment as well as the control treatment.
• The procedures needed to achieve representation must be described.
• Specific inclusion and exclusion criteria for all patients whose data have been collected must be given.
• Inclusion criteria require determination of period of treatment and reasons for treatment (diagnosis, indications, initial state).
• The findings that are collected from patient files must be clearly named (demographic data, medical history, diagnostic findings, performed procedures, initial findings, findings during the course of treatment, special events, treatment results). The relevance of these findings for the study question should be explained. It should be specified which findings are primary and which are secondary outcome measures (or are important for determining these outcome measures), and which are concomitant or confounding measures.
• Test treatments, control treatments, and additional treatments must be named and justified.
• The extent and type of specifications to be documented must be determined (e. g., designation of compounds, pharmaceutical form, dosage, duration, and type of treatment [continuous therapy, intermittent, or as needed]).
• The concept for evaluation (see below) and the rules for compiling the report must be planned.
• The number of patients intended should be justified.
• Data from all patients who meet the inclusion criteria and none of the exclusion criteria are to be collected and documented. If the data are stored in a databank that has been verified for completeness, accuracy, and plausibility, they can simply be transferred to the data set to be evaluated. If the medical histories are present only in paper form, they must be transferred to case report forms (CRF). The accuracy of the transfer should be checked by independent monitors. The data of the case report forms must then be entered into the databank system and verified for completeness, accuracy, and plausibility.
• Besides patient data, characteristics of the treatment centers that are relevant for the study question (e. g., specialization of the treating doctors, type of treatment center) are to be collected and saved. The type of retrieval and documentation (e. g., the databank system used) is declared in the study protocol.
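The selection and verification steps in the list above can be sketched in Python. Everything specific in this example is a hypothetical assumption made for illustration: the field names, the diagnosis codes `D1` and `D2`, the exclusion criterion `in_other_trial`, the study period, and the plausible age range. A real study would take these from its protocol.

```python
from datetime import date

REQUIRED_FIELDS = ("patient_id", "diagnosis", "treatment", "treatment_start", "age")

def is_eligible(record, diagnosis, period_start, period_end):
    """Inclusion: matching diagnosis and treatment start within the study period.
    Exclusion (assumed example criterion): patient took part in another trial."""
    start = record.get("treatment_start")
    included = (
        record.get("diagnosis") == diagnosis
        and start is not None
        and period_start <= start <= period_end
    )
    return included and not record.get("in_other_trial", False)

def check_record(record):
    """Return a list of completeness and plausibility problems (empty if none)."""
    problems = [f"missing {f}" for f in REQUIRED_FIELDS if record.get(f) is None]
    age = record.get("age")
    if age is not None and not 0 <= age <= 120:
        problems.append("implausible age")
    return problems

records = [
    {"patient_id": 1, "diagnosis": "D1", "treatment": "test",
     "treatment_start": date(2001, 3, 1), "age": 54},
    {"patient_id": 2, "diagnosis": "D1", "treatment": "control",
     "treatment_start": date(2002, 6, 1), "age": 61, "in_other_trial": True},
    {"patient_id": 3, "diagnosis": "D1", "treatment": "test",
     "treatment_start": date(2002, 1, 15), "age": 190},
    {"patient_id": 4, "diagnosis": "D2", "treatment": "control",
     "treatment_start": date(2001, 9, 9), "age": 47},
]

# All eligible patients are kept; records with problems are flagged for review
# by a monitor rather than silently dropped.
eligible = [r for r in records
            if is_eligible(r, "D1", date(2000, 1, 1), date(2003, 12, 31))]
eligible_ids = [r["patient_id"] for r in eligible]
flagged = {r["patient_id"]: check_record(r) for r in eligible if check_record(r)}
```

Flagging instead of deleting reflects the requirement that data from all eligible patients be collected; how flagged records are resolved would be laid down in the study protocol.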
Concept for Evaluation
In cohort studies, allocation of treatment to patients occurs primarily according to decisions made by physicians or patients. It can be regarded as a random event whose distribution depends on various parameters of the treatment facility and the patients (“co-variables”). These variables will generally also influence the treatment outcome, so that a direct comparison between test and control group is no longer possible: the two groups can no longer be regarded as structurally similar. This is why the effect size (see above) cannot be estimated directly from the data.
One of the main problems in evaluation is to even out the influence of these co-variables on treatment outcome and thereby allow an unbiased comparison of treatment success between the two therapeutic groups.
There are two approaches to accomplish this:
• In stratification, subgroups (strata) with similar values of their co-variables are formed. The comparison between control group and test group (estimation of effect size) takes place within the subgroups. The results of these comparisons are then summarized across subgroups in a suitable form (9). The matched-pairs technique is a special type of stratification in which each subgroup consists of a pair of patients with similar co-variable values, one of whom receives the test treatment while the other receives the control treatment.
• In analysis of co-variance, the dependency of therapeutic success on the co-variables is modeled by a suitable function (usually a linear function of the scaled and transformed co-variables). With this function, the observed therapeutic results are converted to a common reference value of the co-variables, and these adjusted results are then compared between the treatment groups.
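The first of these approaches, stratification, can be illustrated with a minimal Python sketch. The counts, the stratum names, and the choice of weights (each stratum weighted by its share of all patients) are assumptions for illustration; other summaries across strata are equally possible.

```python
# Hypothetical data: per stratum, (successes, patients) for each treatment group.
# Mild cases mostly received the test treatment, severe cases mostly the control.
strata = {
    "mild":   {"test": (72, 80), "control": (16, 20)},
    "severe": {"test": (8, 20),  "control": (24, 80)},
}

def crude_difference(strata):
    """Difference in success rates, ignoring the strata (biased here)."""
    s_t = sum(s["test"][0] for s in strata.values())
    n_t = sum(s["test"][1] for s in strata.values())
    s_c = sum(s["control"][0] for s in strata.values())
    n_c = sum(s["control"][1] for s in strata.values())
    return s_t / n_t - s_c / n_c

def stratified_difference(strata):
    """Within-stratum differences, weighted by each stratum's share of patients."""
    total = sum(s["test"][1] + s["control"][1] for s in strata.values())
    adjusted = 0.0
    for s in strata.values():
        (s_t, n_t), (s_c, n_c) = s["test"], s["control"]
        adjusted += (n_t + n_c) / total * (s_t / n_t - s_c / n_c)
    return adjusted

crude = crude_difference(strata)          # mixes treatment effect and severity
adjusted = stratified_difference(strata)  # compares like with like
```

In this constructed example the crude comparison overstates the benefit of the test treatment, because severity influences both the choice of treatment and the outcome; within each stratum the difference is much smaller.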
Both approaches become impractical when many co-variables have to be taken into account. These problems are surmounted by use of a balancing score. This is a function of all co-variables that could possibly influence treatment allocation, chosen so that, at a given value of this function, treatment allocation is independent of the co-variables. To achieve adjustment, stratification according to all possible combinations of co-variable values is then no longer necessary: one only needs to stratify according to the values of this function, or to adjust the therapeutic effect in a co-variance analysis with the balancing score as the only co-variable.
A balancing score that lends itself to observational studies is the probability of allocation to the test treatment as a function of the co-variables (“allocation score” or “propensity score”) (12, 13). If the function includes all the co-variables that influence treatment allocation, stratification (or, better, co-variance analysis) according to the propensity score offers an optimal adjustment. In this case, after adjustment using the allocation score, the therapeutic comparison is as unbiased and valid as with random allocation.
Beyond that, the propensity score offers valuable information on the conditions and variables that induce doctors in practice to apply a certain test treatment. Generally, the advantage of cohort studies over randomized studies is that they offer an undistorted picture of the practical application of the medicine, and can also point out risks that cannot be determined in controlled trials (5).
The practicality and effectiveness of a balancing score have been demonstrated in several observational studies (2, 6, 11, 14, 15). An extensive comparison of controlled clinical trials and observational studies showed that the therapeutic effects estimated in observational studies did not differ in quality from, nor were they consistently larger than, the results achieved by randomized, controlled studies (1).
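The following Python sketch illustrates adjustment by propensity score on made-up data with two binary co-variables. As a simplifying assumption, the score is estimated as the observed fraction of patients receiving the test treatment within each co-variable pattern; in practice a regression model such as logistic regression is commonly used instead. All counts and names are hypothetical.

```python
from collections import defaultdict

# Hypothetical counts per co-variable pattern:
# (co-variables, n_test, successes_test, n_control, successes_control)
TABLE = [
    ((0, 0), 10, 6, 30, 15),
    ((0, 1), 20, 14, 20, 12),
    ((1, 0), 20, 10, 20, 8),
    ((1, 1), 30, 12, 10, 3),
]

def expand(table):
    """Turn the count table into one (covariates, treated, success) tuple per patient."""
    patients = []
    for cov, n_t, s_t, n_c, s_c in table:
        patients += [(cov, 1, 1)] * s_t + [(cov, 1, 0)] * (n_t - s_t)
        patients += [(cov, 0, 1)] * s_c + [(cov, 0, 0)] * (n_c - s_c)
    return patients

def propensity_scores(patients):
    """Estimate P(test treatment | co-variables) as the observed fraction treated."""
    treated, total = defaultdict(int), defaultdict(int)
    for cov, t, _ in patients:
        treated[cov] += t
        total[cov] += 1
    return {cov: treated[cov] / total[cov] for cov in total}

def crude_difference(patients):
    """Difference in success rates without any adjustment."""
    s, n = [0, 0], [0, 0]
    for _, t, y in patients:
        s[t] += y
        n[t] += 1
    return s[1] / n[1] - s[0] / n[0]

def adjusted_difference(patients, scores):
    """Stratify by propensity score and average the within-stratum differences.

    Assumes every score stratum contains both test and control patients.
    """
    # score -> [[successes, n] for control, [successes, n] for test]
    cells = defaultdict(lambda: [[0, 0], [0, 0]])
    for cov, t, y in patients:
        cell = cells[round(scores[cov], 2)][t]   # round to avoid float-key mismatches
        cell[0] += y
        cell[1] += 1
    result = 0.0
    for (s_c, n_c), (s_t, n_t) in cells.values():
        result += (n_c + n_t) / len(patients) * (s_t / n_t - s_c / n_c)
    return result

patients = expand(TABLE)
scores = propensity_scores(patients)
crude = crude_difference(patients)
adjusted = adjusted_difference(patients, scores)
```

Note that the patterns (0, 1) and (1, 0) share the same score and are therefore merged into a single stratum: four co-variable patterns reduce to three score strata, which is the practical benefit of stratifying by the score rather than by every combination of co-variables.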