Critical appraisal of a randomized controlled trial
Although nonrandomized controlled trials, observational cohort studies, and abstracts are also important sources of clinical information, nonrandomized studies were not included in this search strategy because of their potential for introducing biased estimates of results. It was necessary to include a variety of temporomandibular search terms broad enough to include both joint and muscle disorders since most past studies have not differentiated between these different conditions.

Not only was there a lack of diagnostic differentiation, but occasionally there was also ambiguity as to the diagnostic terms employed.

However, in one review of stabilization splint therapy,4 the criteria listed for a diagnosis of myofascial pain included not only muscle pain but also TMJ pain on palpation. This matters because muscle and joint conditions in TMJD can have different outcomes for a given intervention. In addition, a diagnosis of tension-type headache (TTHA) was also included in this review for the following reason: the International Headache Society diagnostic criteria for TTHA with pericranial muscle tenderness and the AAOP diagnostic criteria for temporalis myofascial pain overlap sufficiently to suggest that they are the same condition.

The only exceptions to this would be items 1, 2, 5, 19, 20, 21, and 22. Since this study focused on quality of methods, these reporting criteria are important for understanding the rationale of a study, but generalizability was not assessed. Item 1 relates to the title and abstract; item 2 describes the background and rationale for the study; item 5 reports the objectives of the study; item 19 reports adverse events; item 20 includes the discussion of results in light of the study hypotheses and potential systematic and random error; item 21 discusses generalizability of results; and item 22, the discussion of results in light of the current evidence.

For simplicity of reviewing and presenting the criteria, items 8, 9, and 10 were combined as part of the randomization process and items 12, 17, and 18 were combined as part of the statistical analysis process.

Discussing the generalizability of a study is an important quality criterion that depends on the population and study design characteristics. Since it is difficult to score independently, it was not scored as part of the quality score. There are four internal validity design criteria, as defined by the CONSORT criteria, that have been shown to provide a unique differentiation between well-designed and poorly designed studies and are considered Level I criteria.

They include the randomization process (items 8 to 10), blinding of outcome methods (item 11), comparable groups (item 15), and handling of withdrawals or dropouts in the data analysis. These same criteria are also the most commonly recommended in quality assessment scales and are the same as those of Jadad et al35 and Moher et al. This recommendation is also consistent with the content of Sections 6. For RCTs of TMJD, specific criteria for outcomes, subjects, subject flow, and statistical analysis are unique and need to be defined a priori.

For this reason, a description of how the criteria were used to score whether a study met or did not meet each criterion is included. Two raters of study quality (one an expert in TMJD and orofacial pain, the other a PhD candidate in Health Services Research) first discussed and arrived at a common understanding of the definition and the threshold for meeting each of the quality criteria above. A series of 15 articles was then selected from a broad range of treatment modalities.

In order to compare the scoring of these raters to scores previously published, the selection of articles for reliability testing included 14 of the articles that were reviewed by Forssell and Kalso.

The quality assessment criteria were applied to each RCT in a two-step process. First to be reviewed were the Level I criteria for minimizing systematic bias, and a notation was made as to whether these four criteria had been met.

Second, each study was evaluated as to whether it met the 11 remaining criteria. Finally, a quality assessment score was calculated to reflect the percentage of all 15 criteria that were met for each study, thus permitting an overall estimate of the quality of the evidence base for the treatment of TMJD.
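The scoring arithmetic described above is simple enough to sketch; the function name and item labels below are hypothetical, not from the study:

```python
def quality_score(criteria_met):
    """Fraction of checklist items scored as met (1) rather than not met (0)."""
    return sum(criteria_met.values()) / len(criteria_met)

# A hypothetical study meeting 9 of the 15 criteria scores 0.6:
study = {f"item_{i}": 1 for i in range(1, 10)}
study.update({f"item_{i}": 0 for i in range(10, 16)})
print(quality_score(study))  # 0.6
```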

In order to assess improvement in trial methodology over time, a scatter plot analysis was performed for the quality scores of the TMJ studies that qualified as published RCTs. Figure 1 presents the flow of studies that were included in the meta-analysis.

Three hundred ninety-six studies were reviewed. Following the exclusion of studies that did not meet the definition of a randomized trial, the remaining studies were used for quality assessment.

The mean ICC for interrater reliability was 0. This level of reliability was seen as adequate evidence that this proposed set of criteria could be consistently interpreted and applied for systematic reviews.
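The article does not give the ICC computation; one standard formulation for a two-rater design is the two-way random-effects, single-measure ICC(2,1), computed from the usual ANOVA mean squares. This is a simplified illustration with invented ratings, not the article's analysis:

```python
def icc_2_1(ratings):
    """ICC(2,1): ratings is one row per study, one column per rater."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ssr = k * sum((m - grand) ** 2 for m in row_means)   # between-studies
    ssc = n * sum((m - grand) ** 2 for m in col_means)   # between-raters
    sst = sum((x - grand) ** 2 for row in ratings for x in row)
    msr = ssr / (n - 1)
    msc = ssc / (k - 1)
    mse = (sst - ssr - ssc) / ((n - 1) * (k - 1))        # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Two raters scoring four studies in perfect agreement: ICC close to 1.0.
scores = [[0.4, 0.4], [0.6, 0.6], [0.8, 0.8], [0.5, 0.5]]
print(icc_2_1(scores))
```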

The confidence intervals ranged from ... Table 3 presents the means and ranges for the percentage of criteria met by the RCTs, grouped according to treatment type. Table 4 presents the individual review criteria with the percentage of studies that met them.

Figure 2 represents a scatter plot of quality scores according to the year in which the RCT was published. The overall mean quality score for all the RCTs was 0. A regression line was fitted to these data and had a slope of 0.

This indicates that for each one-year increase on the X-axis, the average quality score increased by 0.

Fig 2. Scatter plot of RCT quality versus year of publication.

The 11 study design criteria for this quality assessment, scored as present (1) or absent (0), were determined to be the most common sources of differential systematic and nondifferential random error in the study designs of TMJD RCTs.
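The fitted trend can be reproduced in miniature with an ordinary least-squares slope; the years and scores below are invented for illustration, not the article's data:

```python
def ols_slope(years, scores):
    """Least-squares slope of score on year: sum of cross-deviations / sum of squared x-deviations."""
    n = len(years)
    xbar = sum(years) / n
    ybar = sum(scores) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(years, scores))
    sxx = sum((x - xbar) ** 2 for x in years)
    return sxy / sxx

years = [1980, 1985, 1990, 1995, 2000]
scores = [0.35, 0.42, 0.50, 0.55, 0.63]
slope = ols_slope(years, scores)  # average quality-score gain per year
```

A positive slope on such data corresponds to the improvement over time discussed above.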

Since each study was evaluated only on the published report, it is possible that specific criteria may have been met but not reported, and thus scored as not met. Furthermore, the problem of publication bias has not been addressed in this study. Publication bias arises from the tendency of researchers and editors to publish experimental results that are positive, while results that are negative or inconclusive are left out or unpublished.

This contributes to the overwhelming percentage of published articles that demonstrate positive outcomes; thus, systematic reviews may not give a true indication of the efficacy of a specific treatment. Regardless, quality reviews can still be useful to help investigators design and publish RCTs with biasing factors considered.

The results of this study suggest that many of the universally accepted criteria for clinical trials are often not applied in RCTs of TMJD. Since , the quality of RCTs in TMJD, or the reporting of such, has improved significantly, thus lending more validity to more recent studies (Fig 2). Such inadequacies would predictably have contributed to bias in the study designs, thereby resulting in the heterogeneity of the observed results.

Of the four essential Level I criteria, the most common problem was criterion 8, through lack of a defined and concealed randomization process to minimize selection bias. Concealment implies that both the investigator and the subject are blind to, and unable to influence, the treatment assignment and thus the treatment results as well. Since a defined and concealed randomization process is a well-known requirement, some studies may have met this criterion but did not report it. In these studies, it is best to control all possible variables and consider that some subject bias toward better efficacy may be present.
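As an illustration of what a defined and concealed process can look like, here is a sketch of permuted-block randomization in which the full allocation list is generated up front (for example, by a statistician not involved in enrollment), so neither investigator nor subject can predict the next assignment. The block size and arm names are assumptions for the example, not the reviewed trials' methods:

```python
import random

def block_randomization(n_subjects, block_size=4, arms=("treatment", "control"),
                        seed=None):
    """Generate a balanced, unpredictable allocation list in permuted blocks."""
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    schedule = []
    while len(schedule) < n_subjects:
        block = [arm for arm in arms for _ in range(per_arm)]
        rng.shuffle(block)        # each block is balanced but its order is random
        schedule.extend(block)
    return schedule[:n_subjects]  # kept sealed until each subject is enrolled

allocation = block_randomization(20, seed=42)
```

Because every block is balanced, the arms stay equal in size even if enrollment stops mid-study.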

These groups should be comparable with regard to clinician contact, medication use, and time of follow-up. It is difficult for the reader to know these details unless they are specifically reported. Ninety-three percent of studies attempted to minimize comparison group bias by controlling for baseline differences in prognostic factors.

As noted earlier, randomization cannot guarantee the absence of chance-related baseline imbalances between treatment groups that influence results, especially with sample sizes of 40 or less. It is important to measure at baseline whether groups are comparable with regard to known prognostic factors, such as gender, duration of pain, and depression, and to take between-group differences into account for the analysis.

Such protocol deviations, or subjects lost to follow-up, may distort the estimated treatment effects. An intent-to-treat analysis needs to include all randomized subjects, independent of any protocol violation, so that the statistical assumptions inherent in the randomized treatment allocation remain valid. Many of the other quality criteria were also not met by most studies. An inadequate sample size may be a contributing factor to Type II error (a false negative) in studies that would otherwise have shown an effect.

The impression may be that this error did not have a biasing effect and cannot be considered a problem in those studies, but this assumption is untrue. Studies with a small sample size are much more likely to have low power (inflated Type II error, finding no difference when one does exist), but the opposite can also be true. However, without taking into account the typical variation in the study factor, it is difficult to know whether a small sample size can accurately represent the target population.
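The link between sample size and power can be made concrete with the usual normal-approximation formula for comparing two means, n per group = 2((z_{1-a/2} + z_{1-b}) * sigma / delta)^2. This is a generic sketch, not a calculation from any of the reviewed trials:

```python
import math
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate per-arm sample size to detect a mean difference delta."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_b = NormalDist().inv_cdf(power)          # desired power
    return math.ceil(2 * ((z_a + z_b) * sigma / delta) ** 2)

# A medium standardized effect (delta/sigma = 0.5) at 80% power needs
# about 63 subjects per arm, far more than samples of 40 or less.
print(n_per_group(delta=0.5, sigma=1.0))  # 63
```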

Patient compliance with treatment, particularly when the patient plays a role in the active effects of the treatment such as using a splint or performing an exercise, also contributes to significant variability of results.

In reviewing criterion 6, few studies considered the ceiling or floor effect by selecting subjects whose baseline symptoms were sufficiently severe to detect an active effect of treatment. Parallel to this matter of symptom severity are temporal characteristics such as the frequency and duration of signs or symptoms and the need for clinically relevant outcome measures.

These temporal clinical characteristics may change sooner and become more clinically relevant than pain intensity with some interventions for TMJD.

Without these measurements, important changes in symptoms may remain undetected, resulting again in Type II error. Quality scores can be used in systematic reviews in a variety of ways, including weighting higher-quality studies, applying scores as a threshold for inclusion of a study in a review, and analyzing and comparing results with other reviews.
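The first of these uses, weighting studies by quality, can be sketched as a quality-weighted mean of per-study effect estimates; all numbers here are invented for illustration:

```python
def quality_weighted_mean(effects, weights):
    """Pool effect estimates, giving higher-quality studies more influence."""
    return sum(e * w for e, w in zip(effects, weights)) / sum(weights)

effects = [0.30, 0.50, 0.20]   # hypothetical per-study effect estimates
quality = [0.9, 0.5, 0.6]      # quality scores used as weights
pooled = quality_weighted_mean(effects, quality)
```

In practice meta-analytic weights also reflect study precision; quality weighting is one option among the uses named above.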

For example, although weighted composite scores were not used in this review, a good example of the composite scoring approach is presented by Antczak and colleagues. Their proposed quality score included three separate sections: (1) basic identification of the paper for classification purposes, (2) quality of the study protocol, and (3) data analysis and presentation of the paper.

Splints and occlusal adjustments were the two types of occlusal treatments that they examined. These authors determined that the overall quality of these RCT studies was fairly low and the results were equivocal. Although Antczak and colleagues held that meta-analysis was justifiable as the next step after their narrative synthesis of the periodontal treatment evidence, Forssell and Kalso did not, due to the heterogeneity of the TMJD studies that they reviewed.

When the present results are compared to those of Forssell and Kalso, 1 scoring agreement showed a mean ICC of 0. Although the present study included more well-defined criteria, many of the weaknesses of the RCTs found by Forssell and Kalso were also consistent with those identified in the present study. For example, few studies had appropriate randomization, many did not have blinded measurement of outcome, few measured adherence to treatment, and some did not consider the issues of sample size requirements, attention to dropouts, or the use of co-interventions not defined for the study protocol.

This difference between review results may be due to the Forssell and Kalso review being limited to occlusal treatments, whereas the present results were based on a review of six types of treatments and RCTs. Another design concern found by both reviews was the lack of a run-in period relative to prior treatments, self-care, and medications, with few studies satisfying this quality criterion.

Any extraneous treatment, such as analgesic medications not defined as part of the experimental or control interventions, may influence outcomes and confound the treatment effects. Such co-interventions need to be matched between groups, eliminated before the study begins during the run-in period, or measured and controlled for in the statistical analysis.

There are several limitations to this quality review study. However, as noted above, the use of the criteria of Antczak et al 22 to compare the present findings with those of Forssell and Kalso 1 showed good agreement between both studies.

Second, the searches used in the present study identified RCTs published in the English language but excluded studies in other languages. Thus, while this study attempted to capture the majority of the published literature, it missed some literature that would have had relevance to this review. Thus, their contribution relative to clinical treatment guidelines and recommendations is questionable. Studies are needed to test TMJD interventions against both placebo groups and other treatments to determine their true relative efficacy.

Funding agencies need to insist on standardized methodologies in the review process and ensure that funds are sufficient to conduct high-quality studies. Emerging information systems involving national registries may be appropriate for standardizing design and data collection for multicenter RCTs.

More emphasis should be placed on multicenter studies to ensure adequate sample sizes and broad generalizability of the results. Inclusion of the subject flow diagram will provide a description of the progress of participants throughout the study, from the number of individuals potentially eligible for inclusion in the trial to the number of trial participants in each treatment group who complete the trial.

Editors of journals need to require quality standards in their review processes. This not only encourages investigators to report their methods clearly, but also helps reviewers to assess bias in the study designs accurately. If appropriate design criteria are not met, the investigators should be prepared to justify why they were not applied. Many measures have been already developed and are being used across studies.

More research needs to be conducted on effective tools to improve the quality and ease of conducting RCTs. Most of the best evidence stems from research, but EBP goes beyond research use and includes clinical expertise as well as patient preferences and values. The use of EBP takes into consideration that sometimes the best evidence is that of opinion leaders and experts, even though no definitive knowledge from research results exists.

The Evidence-Based Practice process has five steps (Conner). In such cases, the most important consideration is whether the improvement in outcome is worth the higher cost. When a conflict of interest exists, the validity of an RCT should be questioned, independent of the behavior of the investigator.

Conflict of interest can arise with sponsors such as pharmaceutical companies or contract research organizations, or at multiple levels. Nowadays, most trials are blinded, so it is exceedingly difficult for an investigator to manipulate the data and thus the result. But it is possible for data to be altered, unintentionally or knowingly, at the level of data analysis by the data management team. It is important to check at this level, as most investigators would not even know if results were altered by a data analyst.

In a simple way, conflict of interest can be divided into non-financial and financial types. Another classification is negative conflict of interest versus positive conflict of interest.

More commonly, we are concerned about positive conflict of interest, but negative conflict of interest is also worth observing. Is bias present in a randomized controlled trial? Bias is defined as a systematic error in the results of individual studies or their synthesis. It is worth noticing that financial conflict of interest is not part of this definition, but it can be a motive behind bias. Is a randomized controlled trial peer reviewed or not?

Another important consideration for article publication and reliability is whether peer review was done. Peer review is the assessment of an article by qualified people before publication. Peer review helps to improve the quality of an article by adding suggestions, and it rejects unacceptably poor-quality articles.

Most reputable journals have made their own policies about peer review. Peer review is not free of bias: sometimes the quality of the process depends on the selected qualified reviewers and their preferences regarding the article. In a nutshell, critical analysis of an RCT is all about balancing the strong and weak points of a trial by analyzing its main domains: the right question, right population, right study design, right data, and right interpretation.

It is also important to note that these demarcations are immensely simplified, and they are interconnected by many paths.

Indian J Crit Care Med.

Abstract: In the era of evidence-based medicine, healthcare professionals are bombarded with plenty of trials and articles, of which the randomized controlled trial is considered the epitome in terms of level of evidence.

Keywords: Critical analysis, Evidence based medicine, Randomized control trial.

Flowchart 1.

Table 1. Factors that help to formulate a sound question.1,6

Right Study Design: An experimental design is considered better than an observational design, as it has a better grip on variables and a cause-effect hypothesis can be established. Factors in selecting a statistical method include (1) the data to compare, (2) the number of samples, and (3) the degree of association between variables (parametric method: Pearson correlation; non-parametric method: Spearman rank correlation coefficient).

Table 4. Common effect size indices.26 For continuous data: uses the mean value and standard deviation of both groups. Odds ratio (OR): ratio of two odds. NNT can be used for a binary outcome; it does not consider the magnitude of the baseline mortality rate and should be interpreted with its comparison arm and depending on context.

Conclusion: In a nutshell, critical analysis of an RCT is all about balancing the strong and weak points of a trial by analyzing its main domains: the right question, right population, right study design, right data, and right interpretation.
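The binary-outcome measures mentioned in Table 4, the odds ratio and the number needed to treat, can be sketched from a 2x2 table; the counts below are invented for illustration:

```python
def odds_ratio(a, b, c, d):
    """a/b: events/non-events in the treatment arm; c/d: in the control arm."""
    return (a / b) / (c / d)

def nnt(a, b, c, d):
    """Number needed to treat = 1 / absolute risk reduction."""
    arr = c / (c + d) - a / (a + b)  # control risk minus treatment risk
    return 1 / arr

# 10/100 events on treatment vs 20/100 on control:
print(round(odds_ratio(10, 90, 20, 80), 2))  # 0.44
print(round(nnt(10, 90, 20, 80)))            # 10
```

As the text notes, the NNT of 10 only has meaning relative to its comparison arm and the clinical context.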

Footnotes: Source of support: Nil. Conflict of interest: None.

References

- Aslam S, Emmanuel P. Formulating a researchable question: a critical step for facilitating good clinical research. Indian J Sex Transm Dis.
- Is mortality a useful primary end point for critical care trials? Chest. Elsevier Inc.
- Clinical trial endpoints. Hum Reprod.
- Emerging themes in epidemiology: the Bradford Hill considerations on causality: a counterfactual perspective.
- Internal and external validity: can you apply research study results to your patients?
- Bhalerao S, Kadam P. Sample size calculation. Int J Ayurveda Res.
- An introduction to power and sample size estimation.
- Selection of control, randomization, blinding, and allocation concealment.
- Selection of appropriate statistical methods for research results processing. Int Electron J Math Educ.
- Selection of appropriate statistical methods for data analysis.
- Subgroup analyses in confirmatory clinical trials: time to be specific about their purposes.
- Post hoc analyses: after the facts. Transplantation. Lippincott Williams and Wilkins.
- Three simple rules to ensure reasonably credible subgroup analyses.
- Goodman S. A dirty dozen: twelve P-value misconceptions. Semin Hematol.
- Kim J, Bang H. Three common misuses of P values [internet]. Dental Hypotheses, Vol. Medknow Publications.
- Dahiru T. P-value, a true test of statistical significance? Ann Ibadan Postgrad Med.
- Jarosz AF, Wiley J. What are the odds? A practical guide to computing and reporting Bayes factors.


