The use of expert consensus opinion as strong evidence in the absence of evidence; is this sound methodology?
A discussion, using the example of the Australian Resuscitation Council (ARC) in the formulation of recommendations and guidelines relating to resuscitation.
Author: S. Gould 2016
The purpose of this paper is to discuss the issues resulting from the use of, and reliance on, expert consensus opinion (referred to in Australian Resuscitation Council or ARC literature as Level of Evidence IV or LOE IV). Further, this paper will discuss errors in methodology and interpretation that may, and do, result in less than ideal recommendations in resuscitation practice, especially at BLS level; recommendations that are then implemented verbatim by the end-user in the mistaken belief that they all represent “best-practice”. This can result in a lack of innovation and a stagnancy or degradation in positive outcomes, without appropriate accountability.
The Australian Resuscitation Council (ARC) is an independent, private, voluntary, non-profit organisation that has representative members from key resuscitation-related organisations. These members use a methodology that includes a reliance on Level of Evidence IV – Expert Consensus Opinion (both internal and from the International Liaison Committee on Resuscitation, ILCOR). While some of this expert opinion is harmonious, some is out of step with current international practice and expert opinion. The constitution of the organisation states that no responsibility is taken for any harm caused by the use of the consensus guidelines and that any recommendation does not mean that any other techniques are ineffective.
ARC LOE IV consensus opinion comes from at least two sources: LOE IV derived from international experts, and LOE IV based on a re-analysis by local experts, reflecting a local interpretation of the evidence. These two positions may be substantially different, and may even be contradictory, but are presented as being equally rigorous. This raises an important question about the status and methodology of a consensus recommendation based on opinion, and about how two rigorous methodologies can result in differing end-points. For the individual or group seeking the best advice on which to base their guidelines, this situation can be confusing, anomalous and illogical. Classes of LOE IV reflect degrees of confidence in the opinion, but all are presented as “best practice”. The methodology relied upon is that LOE IV is ranked by the evidence that influenced it and is therefore of varying veracity. In actuality, the outcome is the same, i.e. it is still a somewhat subjective opinion based on a review of selective research, including the decision to include or reject evidence and the personal opinions of those who make up the expert group. Therefore the LOE IV cannot take on the rigour of the evidence considered. In circumstances where the evidence is weaker, the subjectivity increases and the associated rigour and veracity decrease. This variation is not distinguishable at the end-user/implementer level, and one would naturally assume that all recommendations are equal in rigour.
Underlying expert consensus opinion that uses a specific methodology, there must be an innate belief that anyone using the same methodology would arrive at the same recommendation. In a general sense this is what one should expect. Whilst this may be true where there is a wide variety of high-level research and evidence, the same cannot be said at the other end of the continuum, where evidence is weak or inconclusive and where a range of recommendations may reflect and suitably address the evidence. The assumption that is therefore made is that to reach any other conclusion is not scientific, and so the alternative cannot contain any intrinsic value and cannot and will not be considered. This flaw, based in attitude and a belief in the infallibility of process, judgement and methodology, is central to much of the latency in the ARC BLS recommendations.
Much of resuscitation practice (particularly BLS) has very little targeted and definitive research behind it, and so recommendations are determined (for the large part) on low, poor, weak or no direct evidence. Rarely is there such overwhelmingly strong evidence that one approach is clearly to be used to the exclusion of all others. In this vacuum, LOE IV (expert consensus opinion) is used as a substitute for evidence and/or as a subjective interpretation of the conflicting evidence. However, an error in methodology occurs when change or an alternative opinion is proposed, in that those providing the consensus opinion (LOE IV) now consider that strong evidence (i.e. higher than LOE IV) is required before they will alter their position. This flaw in process means that LOE IV has been inappropriately elevated in the evidence hierarchy to an unjustified superior position, presumably based on the level of evidence used to influence the opinion. However, even if influenced by higher levels of evidence, i.e. based on strong evidence, LOE IV still only represents a subjective interpretation at a point in time; it cannot assume the status of the underlying influence. It also means that the power of veto over the consideration of any evidence is apportioned to the creators of the LOE IV (consensus opinion). This does seem, on any level, to be a significant flaw in the use of research methodology. Expert opinion can, in some instances, be based on the “discussion” section of a research paper rather than on the “conclusion” section; however, the two are not equal as evidence. Long periods between reviews can also confound the ability of opinion to keep pace with, or consider, the outcomes from changes in practice recommendations. In fact, proof of efficacy in practice is rarely (if ever) considered or studied as part of this methodology; it is not considered necessary, and until it is studied at a formal, high level (e.g. an RCT) it is not taken into account.
The history of resuscitation practice is littered with techniques, founded on expert opinion drawn from “studies”, that were obviously ineffective (and sometimes dangerous) long before expert opinion was changed. As one observer rightly pointed out, there are no RCTs to support the use of parachutes when jumping from a plane, and therefore we are totally reliant on informal observation to confirm the benefit over not using a parachute.
In reality, LOE IV is only a substitute for a collection of other evidence (weak and strong) that can be interpreted and/or result in opinion-based implementation strategies. It has little intrinsic merit over any other opinion or approach resulting from a consideration of the available evidence.
LOE IV “evidence” is not at this level in all research hierarchies. For example, if we look at the Joanna Briggs Institute Levels of Evidence (http://joannabriggs.org/assets/docs/approach/JBI-Levels-of-evidence_2014.pdf), we see that consensus opinion by a single expert or a group of experts appropriately sits at Level 5 in the hierarchy, i.e. the lowest level, sitting behind any other evidence levels. Likewise, in the realms of Level 5 evidence it is only the self-assessed status of those who provide the opinion that guides those seeking direction. Regardless of the source of influence of the consensus opinion, it does not change its position in the hierarchy.
A confounding issue resulting from this self-assessed “expert” status (used in the formulation of a consensus view) is that each “expert group” can and does claim its own consensus opinion as an absolute and fundamental truth, and tends to defend the ownership of this opinion against any contrary observations or ideas that threaten the authority and status perceived to be ascribed to self-assessed “expert opinion”. This is, of course, not true of all expert groups; however, it is certainly the case in Australia. Although this behaviour can be and frequently is denied, actions are a better test of character. The close observer will see, as a result, contradictory actions and statements that stem from this duplicity of motivation. An example from the ARC is the organisational slogan “Any attempt at resuscitation is better than no resuscitation”. This statement would appear to encourage an individual (particularly the untrained bystander) to utilise any means they believe is appropriate in an attempt to save a life, i.e. “at least one is trying”. The reality of the meaning of this statement is somewhat different and is clearly demonstrated in defensive actions when alternative recommendations are suggested. The actual meaning is “Any resuscitation that follows the ARC recommendations, including its own expert consensus opinion, should be attempted; if unable, then no resuscitation is better”. This contradiction between statement and attitude is supported only by the flawed interpretation of LOE IV status which we are discussing in this document. Interestingly, the disclaimer used on each recommendation obviates any responsibility for harm or for failure of the recommendations to provide the expected outcome, and recognises that other methods may be of equal or better efficacy. These two statements are essentially and fundamentally contradictory.
The second statement made by the ARC is a disclaimer to the effect that they (the ARC) are not saying that their recommendations are relevant for all circumstances, and that individuals should seek “specific advice” in deciding on methods that are applicable to their specific circumstances. Logically, this “specific advice” cannot be from the ARC, as it has already (by this statement) referred readers away from the guidelines for advice. Therefore, presumably the statement refers to other sources and/or one’s own review of evidence and recommendations, i.e. an opinion of relevance. So why do we see such a strong defense of ARC consensus opinion if readers are encouraged to form an opinion independently of the ARC? This gap between statement and action is also illogical and contradictory in the defense of a low level of evidence, and demonstrates a misuse of expert opinion.
The other weakness of dogma in local, consensus opinion based recommendations is that the individual agenda of those constituting the expert group can more easily become part of the consensus without the need for the same rigour expected of external potential contributors to the collective opinion. Let us review an example that has been in question for more than a decade. The example from ARC literature is contained in Section 4 – Airway, where recommendations are listed for the management of upper airway obstruction (UAO). Of particular interest is the specific management of an upper airway obstruction (UAO) in the conscious patient. Firstly let us paraphrase the consensus opinion of the ARC and compare this with that of ILCOR and the international resuscitation expert community which includes bodies such as the ERC and AHA.
1. ARC – measures for the relief of UAO include firstly back blows and then “chest thrusts” followed by CPR if unconscious. The recommendation excludes the use of abdominal thrusts and refers to a small, single, yet unrelated postmortem study comparing CPR in a supine position with abdominal thrusts in a living patient.
2. The international resuscitation community – measures for the relief of UAO include firstly back blows and then chest and/or abdominal thrusts, followed by CPR if unconscious. Abdominal thrusts are considered to be the most effective technique after the failure of back blows.
Before examining how consensus opinion can have these fundamental differences, it is important to note that the “chest thrusts” mentioned in the two consensus statements are not equivalent. The technique described by the international resuscitation community is a modified abdominal thrust method, suitable when a patient is too large (including in pregnancy) for abdominal thrusts to be attempted. The technique is performed standing behind and against the back of the patient, using two hands and pulling toward the centre of the patient’s chest. This technique, as with abdominal thrusts, is supported by respiratory studies that show the pressure changes induced in the airway as a result of utilising it. By contrast, the “chest thrusts” described in ARC literature are a modified, single-handed CPR compression technique that has no supporting respiratory studies or clinical trials.
In terms of our discussion around the use of LOE IV evidence, how did we get to this position, which has resulted in a recommendation that differs not only from that of the international resuscitation community but from practice across the world and which, more importantly, has after more than a decade produced no documented cases of success, i.e. no proof of efficacy? There are several factors that have contributed to this disparate, flawed and unchangeable position.
The ability of expert opinion (LOE IV) to be re-interpreted by other expert opinion without a need for stronger evidence to support a different outcome. Stronger evidence is a test that consensus opinion itself does not have to meet.
The inappropriate direct substitution of unlike terms (such as “chest thrusts”) to support a personal view/opinion, thus taking advantage of the outcome and reputation of evidence relating to the original technique without the need to support an experimental method with evidence.
The positioning of LOE IV evidence in an artificially elevated position so that strong evidence is required for change, even though the consensus opinion is based on weak, no or poor evidence or borrows rigour from an unrelated finding. This is true regardless of the source influence.
A deliberate dismissal of abdominal thrusts, based on an incomplete consideration of the evidence and a disproportionate emphasis on a misrepresented “unacceptable risk” (a risk mitigated appropriately and considered against benefit by the international resuscitation community, including ILCOR). The only reason for this position therefore has to come primarily from a personal view of the experts who have contributed to the local expert opinion.
The vehement defense against alternative (but equally or better evidenced) opinions, on the basis that LOE IV evidence has a status above its logical and justified level in research methodology.
A lack of an experimental framework and associated methodology when recommendations are clearly speculative.
What is the result of this difference in consensus opinion? Does it make a difference at the end-user level, i.e. to the patient? And what feedback loops are in place to provide proof that one opinion turned out to be better than another? If we review the case of UAO as an example of a divergent view by experts on measures to address the same problem, we can see the issues. In Australia, the recommended measure is an experimental modified chest compression technique; in the rest of the world, it is abdominal thrusts and/or modified abdominal thrusts on the chest. Since their introduction in 1974, abdominal thrusts in the US have been credited anecdotally by the Washington Post with more than 100,000 saves, whilst in Australia, after more than a decade using a different regime, there are no documented cases of success. In a recent example in Western Australia, a 2½-year-old child died after choking on a bubble gum ball he found in his stroller. This was a tragic outcome for the family and those involved, and it is uncertain whether any measures on the day would have relieved this obstruction. However, what is clear is that, in compliance with local expert consensus opinion, three separate rescuers (these included a pharmacist, St John Ambulance Volunteers and St John Ambulance – State Ambulance) were restricted from using abdominal thrusts and were only able to utilise an experimental technique with little if any supporting evidence (in either clinical trials or field application). The question that should be addressed first is where, in an “all care and no responsibility” methodology, responsibility for review lies. It is doubtful that, unlike in an organisation in the general health sector, an ARC review (or RCA) of the circumstances has taken or will take place, or would be seen as necessary for accountability. Differences in recommendations are not restricted to UAO, and we see high-level guidelines (like BLS Resuscitation) remaining focused on a sentimental but uncommon form of cardiac arrest, i.e. the hypoxic arrest.
This strategic direction, protected by an assumed rigorous methodology, cannot demonstrate improvements in outcome over BLS guidelines centred on sudden cardiac arrest (SCA), yet it cannot be challenged.
Finally, there may be another reason why expert opinion can be at odds with “best practice” or other expert opinion and this relates to ideological and political pressures being brought to bear in the process. Political pressure can arise from dependent organisations, which have an interest in resisting change and the associated costs and imposition of implementing change i.e. the opinion to change can be delayed or moderated as a compromise. Ideological pressures can come from an unwillingness to “confuse” a long term message with change. This assumes that the lowest common denominator (e.g. BLS) is incapable of grasping a new focus or deciding between management options in an emergency. Whilst these pressures can be real considerations in recommendation and guideline development they only gain a controlling power at the LOE IV level where they can build legacy characteristics into recommendations and support sub-optimal guidelines.
The reality in the field of resuscitation is that, as we learn more, there will always be better methods being developed than the accepted status quo. These can rest on an equal level of opinion and evidence base. The misapplication and defense of any low-ranked LOE IV can significantly hamper progress and innovation in guideline development. More significantly, it can lead to an unfortunate regression in positive outcomes when this flawed research methodology is used in resuscitation practice. Recommendations for improvement can be summarised as follows:
1. The adoption of a more appropriate evidence hierarchy system, e.g. the JBI Levels of Evidence, would more appropriately relegate and classify consensus opinion. Additionally, nomenclature must clearly differentiate between opinion status and experimental recommendations.
2. Bodies utilising an evidence hierarchy as the basis for any recommendations must exercise appropriate diligence to ensure individual opinion does not unduly affect the consensus view.
3. Ethical and procedural practices need to demonstrate the appropriate and honest representation of research and evidence. The practice of “stolen rigour” used to substitute evidence from unrelated sources must cease immediately, to ensure the maintenance of reputation of expert panels.
4. All consensus opinion based guidelines must be appropriately open to general external scrutiny and peer review, with an opportunity for input into development at any time during the revision cycle. Consultation and comment must be more than nominal achievements; they must be an important and necessary pathway to excellence.
5. Effective and transparent mechanisms need to be in place to ensure ethical and scientific principles are central to the process, i.e. that processes are not inappropriately biased toward or away from recommendations for reasons other than a “balance of evidence”.
6. Internal policies that attribute inappropriate status to expert consensus opinion, i.e. that demand unnecessarily high levels of evidence to effect change, are unscientific and are impediments to innovation and improvement.
7. The implementation of an experimental framework and methodology to appropriately manage speculative opinion based recommendations. This framework must necessarily include processes for the evaluation of efficacy to inform change, termination or progression. The use of LOE IV rather than an experimental methodology cannot obviate all responsibility. An experimental framework could allow for the appropriate assessment of regimes or procedures that are currently experimental in nature (due to being significantly different to international consensus opinion) and provide a platform for innovation in the area of resuscitation.
8. Expert consensus opinion should be constructed in consultation with a wider and sector specific group rather than cascading down from higher clinical levels under the assumption that there is an innate understanding of lower levels and their specific implementation/ interpretation challenges. This assumes expert status by inference rather than direct and relevant experience.
9. Any defense of consensus opinion should be done through appropriate debate and dialogue. Strategies such as “refusal to engage” and the “discrediting of individuals” are not appropriate strategies in the consideration of research/ evidence and the formulation of widely utilised guidelines and demonstrate a dedication to power rather than outcome.
10. Those assuming responsibility for guideline development (particularly guidelines based on local consensus opinion) must accept some level of responsibility for the monitoring of efficacy, particularly if unwilling to commit to continuous development. This applies especially where an instrument assumes or claims the status of “the authority”, with the result that others are forced into compliance (regardless of methodology or appropriate evidence).
11. There is a need for the broader acceptance of multiple solutions and approaches to guidelines based on common non-specific evidence. Otherwise guidelines and opinion are presumptively rules with some basis in law. Logically, if no responsibility is accepted then there can be no objection to a contrary opinion that relies on the same available evidence. The elevation of LOE IV to evidence requiring an unjustifiable burden of proof results in the situation we see in this example.
Whilst the ARC (using the UK and ILCOR practices in ALS), has structured ALS in Australia to be an effective area of emergency management, the same cannot be said for BLS. The level of immediate care (driven by largely hospital-based clinicians) has been weakened by a flawed methodology and process in the use of its own expert consensus opinion. Unlike in higher levels of care, involving clinical professionals, there is no accountability or meaningful scrutiny for decisions made as a result of expert opinion on BLS.
The use of LOE IV should be approached with caution, and LOE IV should be recognised as the lowest form of “evidence” on any hierarchy. There are several pitfalls in the methodology when LOE IV (expert consensus opinion) is used as a substitute for higher levels of evidence, even if it is influenced in part by stronger evidence. LOE IV cannot be protected by a requirement for a higher level of evidence, as it is a subjective, momentary interpretation. Protecting this interpretation is weak methodology and impedes innovation and the pursuit of excellence. Ethical and transparent processes must be in place in all considerations of research and evidence, especially when developing widely circulated guidelines. In the absence of proper processes, bias and personal opinion can easily influence group opinion, which then tends to be self-protective by establishing an unquestionable position. Due to its heavy reliance on LOE IV, the ARC is a good example of where this methodology can result in sub-optimal guidelines and therefore sub-optimal outcomes in resuscitation. With no accountability for poor methodology and/or the application of the expert opinion, no effective external review, and the assumption of absolute authority in opinion, the example of the ARC demonstrates where evidence methodology can be used as a barrier to genuine consensus and best practice. The question of accountability has still not been resolved and needs to be addressed with a more formal and rigorous approach. Obviating responsibility for outcomes in application is, at one level, appropriate, as there are no guarantees in any situation that any measure will result in a positive outcome. However, where the recommendation itself is flawed and/or constitutes disparate expert opinion, responsibility must be addressed, formally acknowledged and supported by appropriate quality improvement systems to quickly address identified issues.