Clinical Papers are written up in a standard scientific style with an introduction of the paper topic, details of the method used in the study, the study results and a discussion of these results. Often clinical papers will also have a summary at the beginning called an abstract.
The British Medical Journal has published a study on Online First, ahead of its appearance in the paper journal, that compared the content of abstracts with that of the main papers.
This study found that the p-values quoted in abstracts are not evenly distributed but are skewed towards values representing statistical significance. The study went on to check the accuracy of the p-values where the data were available. Of the 27 studies that were checked, 5 were found to be quoting incorrect p-values, and a further 9 used the wrong statistical test or altered the data before performing the test. A table summarises these results.
The conclusion of the paper is "Significant results in abstracts are common but should generally be disbelieved". So, when reading a clinical paper, skip the abstract!
Clinical Trials are constructed to statistically test a hypothesis. Once the study has been completed the results are analysed based upon the original hypothesis (primary outcome) and conclusions are drawn based upon the analysis.
By convention, a result is accepted if the play of chance alone would have produced a difference at least as large as the one observed no more than 1 time in 20. This is expressed as a P value of 0.05 or less.
Where studies demonstrate that there is little likelihood that the results occurred by chance the quoted P value will be below 0.05 and the results are said to be statistically significant.
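The 1 in 20 convention can be illustrated with a small simulation. The sketch below is not taken from any trial: it assumes a hypothetical outcome with a known standard deviation of 10 points, 50 patients per arm, and a simple z-test, and it draws both arms from the same distribution so that there is no true effect. Roughly 5% of such trials should still come out with a P value below 0.05.

```python
import random
from statistics import NormalDist, fmean


def two_sided_p(z):
    """Two-sided P value for a standard normal test statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_null_trials(n_trials=4000, n_per_group=50, sd=10.0, seed=1):
    """Return the fraction of no-effect trials that reach P < 0.05.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Standard error of the difference between two group means (known SD).
    se = sd * (2 / n_per_group) ** 0.5
    significant = 0
    for _ in range(n_trials):
        # Both arms come from the same distribution: no true effect exists.
        arm_a = [rng.gauss(0, sd) for _ in range(n_per_group)]
        arm_b = [rng.gauss(0, sd) for _ in range(n_per_group)]
        z = (fmean(arm_a) - fmean(arm_b)) / se
        if two_sided_p(z) < 0.05:
            significant += 1
    return significant / n_trials


rate = simulate_null_trials()
print(f"Share of no-effect trials with P < 0.05: {rate:.3f}")
```

The printed share varies a little with the seed and the number of simulated trials, but it stays close to 0.05, which is what a threshold of 1 in 20 means.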
However, statistical significance may not be the same thing as clinical significance. This is because statistical significance also depends on the size of the population studied: a small difference can be statistically significant if the population is large enough, whereas a large difference is required if the population is small.
For example, a 2 point difference on a 60 point depression rating scale was found to be statistically significant in comparing escitalopram (Cipralex) to citalopram (Cipramil). In a clinical setting, a 2 point difference of this kind would be virtually impossible to detect.
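The effect of population size on a fixed 2 point difference can be sketched numerically. The figures below are assumptions for illustration only and do not come from the escitalopram comparison: the standard deviation of 10 points and the group sizes are hypothetical, and a simple two-sample z-test stands in for whatever analysis a real trial would use.

```python
import math


def p_value_for_difference(diff, sd, n_per_group):
    """Two-sided P value for a difference in means between two equal groups.

    Uses a z-test with an assumed known standard deviation.
    """
    se = sd * math.sqrt(2 / n_per_group)  # standard error of the difference
    z = diff / se
    return math.erfc(abs(z) / math.sqrt(2))


# Same 2 point difference, same assumed SD, increasing group sizes.
for n in (50, 200, 800):
    p = p_value_for_difference(diff=2.0, sd=10.0, n_per_group=n)
    print(f"n per group = {n:4d}: P = {p:.4f}")
```

Under these assumptions the same 2 point difference gives a P value of about 0.32 with 50 patients per group, about 0.046 with 200, and well below 0.001 with 800. The difference itself, and its clinical relevance, has not changed at all.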
Clinical Trials sometimes use surrogate endpoints as the measure of the primary outcome of the study. Surrogate endpoints are a proxy for what would ideally be measured but for limitations imposed by time, cost or ethical considerations.
However, these surrogate endpoints may not be an accurate predictor for the outcome we would ideally like to assess. This can have an impact upon the relevance and credibility of the study. Additionally, it may become clear later that, while the surrogate endpoint strongly and statistically indicated a benefit, the impact on a clinically important endpoint is not beneficial.
This happened with doxazosin in the ALLHAT study. Patients treated with doxazosin were found to be at higher risk of cardiovascular events, particularly heart failure, than those treated with the comparator diuretic chlorthalidone, despite effective blood pressure lowering in this and other studies.
For example, there are many studies demonstrating the efficacy of rosuvastatin in cholesterol related outcomes such as reductions in low-density lipoprotein cholesterol and triglycerides. However, there are currently no hard outcome data that show reductions in heart attacks, strokes or deaths. These data are available for other statins, such as simvastatin.
Focussing on secondary endpoints has already been covered, but what do trial investigators do if both the primary and secondary endpoints are unconvincing?
Data dredging may produce a statistically more convincing result. Data dredging is the process of analysing all the trial data looking for outcomes that are statistically significant; it produces post-hoc or tertiary endpoints.
These post-hoc endpoints must be treated with caution as the study was not set up to collect, examine and answer questions relating to them. The conventionally accepted level for a result occurring by chance is 1 in 20. This means that if we conduct 20 post-hoc analyses, we would expect about one of them to be statistically significant purely by chance.
For example, a post-hoc analysis of the ISIS-2 study [1] found that aspirin therapy was associated with increased harms in patients born under the Gemini or Libra star signs. Clearly this is nonsense, and it shows how such analyses can confound our conclusions.
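The arithmetic behind this warning is simple enough to sketch. The snippet below assumes that the analyses are independent and that no true effect exists in any of them, which is a simplification since real subgroup analyses usually overlap. The function names are illustrative.

```python
def expected_false_positives(n_analyses, alpha=0.05):
    """Average number of analyses that reach significance by chance alone."""
    return n_analyses * alpha


def chance_of_any_false_positive(n_analyses, alpha=0.05):
    """Probability that at least one analysis is significant by chance.

    Assumes the analyses are independent of one another.
    """
    return 1 - (1 - alpha) ** n_analyses


for k in (1, 5, 20):
    print(
        f"{k:2d} analyses: expect {expected_false_positives(k):.2f} false positives, "
        f"chance of at least one = {chance_of_any_false_positive(k):.0%}"
    )
```

With 20 analyses at the 0.05 level, one false positive is expected on average, and the chance of finding at least one is about 64%.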
A more recent study that made use of post-hoc analysis was ASCOT-BPLA [2]. As already discussed, the conclusions reached in this trial should be treated with caution.
1. ISIS-2 Collaborative Group. Randomized trial of IV streptokinase, oral aspirin, both, or neither among 17,187 cases of suspected acute myocardial infarction. Lancet 1988;2:349-360.
2. ASCOT Investigators. Prevention of cardiovascular events with an antihypertensive regimen of amlodipine adding perindopril as required versus atenolol adding bendroflumethiazide as required, in the Anglo-Scandinavian Cardiac Outcomes Trial-Blood Pressure Lowering Arm (ASCOT-BPLA): a multicentre randomised controlled trial. Lancet 2005;366:895-906.
All clinical trials are established to answer a clinical question; the outcome measure chosen to answer it is called the primary endpoint. It is to be hoped that the primary endpoint is demonstrated to be statistically significant at the end of the study, but this is not always the case.
When a primary endpoint is not significant there is often a tendency to focus upon other endpoints that were investigated during the study, known as secondary endpoints. Consideration should be given to the fact that the study may not have been designed to answer clinical questions based upon the secondary endpoints, and therefore these results should be viewed more cautiously.
For example, in the PROactive study, recently covered here, the primary endpoint was not statistically significant while the secondary endpoint of death, non-fatal myocardial infarction and non-fatal stroke was significant. However, this study was designed to assess the efficacy of pioglitazone in secondary prevention based upon a composite of death, non-fatal MI, stroke, acute coronary syndrome, leg amputation, coronary revascularisation and revascularisation of the leg. Forming conclusions based on the secondary endpoint may not be valid.