Original Article

Clinical trials, ototoxicity grading scales and the audiologist’s role in therapeutic decision making

Pages S19-S28 | Received 07 Sep 2017, Accepted 11 Dec 2017, Published online: 23 Dec 2017

Abstract

Objectives: Define clinical trials and adverse event (AE) monitoring from the perspective of the audiologist. Rationalise the importance of audiology’s involvement before, during and after monitoring. Identify strengths and weaknesses in toxicity grading scales, and discuss factors that may influence these. Design: Literature involving commonly cited grading scales used to capture ototoxicity is reviewed. Current regulations and language associated with clinical trial implementation and AE monitoring are described. Personal observations based on a variety of clinical populations are drawn from years of experience developing and employing ototoxicity monitoring protocols in a complex medical setting. Results: Six commonly used grading scales for ototoxicity are systematically reviewed for strengths and weaknesses. Necessary considerations that inform selection of grading scales are presented. A review of and historical context for clinical trial development and AE monitoring is provided. Conclusions: The audiologist’s role in therapeutic decision making goes beyond collection of the audiogram. Clear communication to stakeholders in ototoxicity monitoring is paramount, and toxicity grading scales are one tool to facilitate this exchange. Various factors should be considered in advance of selecting the most appropriate scale to capture hearing loss, and no scale is without limitation.

Introduction

The ability of the audiologist to meaningfully communicate their results to the necessary stakeholders, in the context of global and personalised therapeutic decision making, is central to the purpose of ototoxicity monitoring. One tool available for this purpose is the grading scale, which aims to operationally capture when ototoxicity occurs and, in many cases, the degree of impairment. These scales provide a measure of objectivity and consistency in the interpretation of data and typically use a metric that is more approachable to the non-audiologist than the varied and numerous data points captured by the audiogram.

On an individual level, ototoxicity grading scales allow a referring medical team (when guided by the audiologist) to easily and quickly evaluate whether a change in hearing has occurred, whether that change is related to the intervention in question, and whether that change is likely to impact daily living, requires therapeutic referral or both. Without the consultation of an audiologist to assess the validity of the data and interpret it in the context of the patient’s profile, these scales in isolation are far less meaningful. Nonetheless, use of grading scales offers uniformity in how ototoxicity is reported across clinicians and departments.

Globally, there is an unmet need for consistency in reporting ototoxic effects in large clinical cohorts so that data can be effectively synthesised to improve clinical care (Chang and Chinosornvatana 2010; Neuwelt and Brock 2010). Literature on many ototoxic agents reveals a wide-ranging incidence of associated otopathology. For example, the incidence of ototoxicity from cisplatin-based chemotherapies for head and neck cancers ranges from 17 to 88%, depending, in part, on how hearing loss is defined (Schmitt and Page 2017). This ambiguous rate of occurrence limits the ability to prognosticate risk for patients, and a lack of clear and consistently defined outcomes across cohorts hampers efforts to determine the efficacy of potentially otoprotective interventions.

Why monitor? Going beyond the audiogram

Much of the conversation surrounding ototoxicity monitoring involves the how. There are a number of excellent existing resources that address the current state of evidence-based monitoring, including several in this supplemental edition (Brooks and Knight 2017; Garinis et al. 2017; Konrad-Martin et al. 2017). The aim herein, however, is a recapitulation of the why. The primary aim of an ototoxicity monitoring programme (OMP) is to ensure the early identification of hearing loss (Konrad-Martin et al. 2014; Brooks and Knight 2017). This information can, at times, prevent functional hearing loss by allowing for alternative therapies or by influencing drug prescribing procedures; specifically, smaller or less frequent doses, or interruption or suspension of treatment altogether. Monitoring for ototoxicity can also lead to the provision of care and support for the patient and the family (Konrad-Martin et al. 2014). In this role, audiologists counsel regarding the signs and symptoms of hearing loss, recommend re/habilitation when necessary and allow for informed therapeutic decision making. This latter purpose is critical, and yet often overlooked. For a patient to meaningfully participate in their own care and make informed decisions about treatment, they must have an understanding of what their hearing loss means in the context of their current lives and the lives they hope to return to at the completion of therapy. The role of the audiologist to inform and care for patients and families is necessary whether or not an alternative therapy exists. Finally, monitoring takes place in order to evaluate drug safety and sometimes efficacy, particularly in the domain of clinical trials.

The commonality amongst all of these goals is communication. Whether explaining to the patient the need for monitoring, which may improve compliance and allow for early detection, or capturing for the referring physician the difference between a 30 dB decline in hearing at 8 kHz versus 2 kHz, or outlining the initial ototoxic profile of a new drug for regulatory agencies, these scenarios involve conveying information that is meaningful to various stakeholders. This requires a kind of professional code switching, shaping clinical data into a language that can best be consumed by the recipient. Jargon should be avoided and the presentation should be contextualised to the unique individual or situation.

The remainder of this report will focus on audiology’s role in clinical trial development and implementation, emphasising the use of grading scales as the main metric for communication. The principles and a priori considerations discussed should be applicable to most OMPs, whether grading scales are used or not. The authors’ experience with clinical trial development and implementation is based almost exclusively in the United States (US) and, therefore, applicable US regulatory institutions and procedures will be highlighted. All case examples and data presented were collected at the National Institutes of Health Clinical Centre in Bethesda, MD, under protocols approved by an institutional review board (IRB) and following ascertainment of informed consent or assent (when applicable).

Clinical trials

Studies designed to evaluate the safety and efficacy of new medical interventions in humans are designated clinical trials. Typically, clinical trials involve interventions aimed at improving detection, diagnosis, management, treatment or prevention of disease. They represent the initial efforts to study the effect in question in humans. They are often preceded by work, sometimes spanning years, using in vitro and in vivo models in the laboratory. Once a proof-of-concept is established during preclinical work, the most promising studies move to human cohorts in the form of clinical trials. In the US, the Food and Drug Administration (FDA) determines if there is sufficient evidence to justify trial of a new medical intervention in humans. This is achieved through a process in which the developer applies to the FDA for an investigational new drug (IND) designation. The FDA is responsible for regulating clinical trials, and works to inform and protect patients who choose to participate in them. Ultimately, it is the FDA that determines whether a new medical intervention is safe and effective to use, and that benefits outweigh potential risks. Counterparts to the FDA exist around the world and function in a similar capacity (e.g. Australian Therapeutic Goods Administration, Health Canada, European Medicines Agency, Japanese Pharmaceutical and Food Safety Bureau, Saudi Food and Drug Authority).

Clinical trials are conducted in different phases that vary by scale and scope. Phase 1 studies are generally first-to-human trials in which a new intervention is initially examined. Recruitment is intentionally kept small and the focus of these studies is to establish the safety profile of the intervention and gather early information regarding the appropriate therapeutic window, whereby the maximum benefit is achieved with the least degree of toxicity or side effects. The information learned during a phase 1 clinical trial lays the groundwork for developing later phases. Phase 2 studies are an extension of phase 1 work: they are typically not large enough to determine if the intervention is working, but they further determine what side effects may exist and help to guide researchers in refining their experimental questions to design future experiments. Phase 3 clinical trials are large scale (e.g. hundreds to thousands of participants) and intended to determine efficacy and monitor for adverse events (AEs). These studies are also longer in duration than earlier phases and are better suited to identify side effects that may have gone undetected, or that may only occur after extended exposure. The final phase, phase 4, occurs after a drug or device has been approved by the FDA for use, and surveils after-market safety (Food and Drug Administration 2017).

Adverse events

When an unintended or undesirable experience (e.g. sign, symptom, abnormal laboratory test, disease) occurs in a patient exposed to a medical product, it is labelled an AE. Such events can be expected or unexpected, temporary or permanent and can range from mild to fatal (National Cancer Institute 2010).

Government agencies (e.g. the FDA) rely on a common language that can identify these occurrences uniformly across clinical trials, protocols and study sites to determine the safety of new products. Human subjects research requires approval and oversight by an IRB, the members of which need to understand the impact of AEs on multiple organ systems. Similarly, consistent language needs to be used to document AEs within medical records and to facilitate accurate designation of AEs in scientific research. Reporting of such events is mandated through federal regulations.

In 1982, the National Cancer Institute (NCI) developed the common toxicity criteria (CTC) (National Cancer Institute 1982), later named the common terminology criteria for adverse events (CTCAE), for use in reporting and summarising treatment-related AEs across studies and IND reports to the FDA, and for use in publications. The CTCAE became the worldwide standard dictionary for reporting acute AEs in cancer clinical trials and has since been translated into several languages. The most recent version, CTCAE version 4.03 (National Cancer Institute 2010), improved alignment of standardised terminology with the international- and clinically-validated Medical Dictionary for Regulatory Activities, known as MedDRA. In addition to its use in clinical trials, the CTCAE also serves as standardised terminology to document the occurrence and seriousness of AEs in the medical record and scientific reports. While the CTCAE covers multiple organ systems, a similar overarching scheme is used to assign grades based on predicted or observed impact to the patient. These grades range from 1, assigned to mild AEs for which intervention is not indicated, to 4, assigned to AEs with life-threatening consequences; grade 5 documents an AE-associated death. Application of the CTCAE to audiologic data is covered in the subsequent section.

Defining ototoxic change

Ototoxicity grading scales have been developed largely as instruments for consistent and accessible communication of audiometric test results. While it would be ideal for professional stakeholders to become audiologically literate, the value of simple and categorical assessments to convey information across stakeholders regardless of prior familiarity with hearing data cannot be overstated.

Despite the inherent usefulness and availability of a number of ototoxicity grading scales, most clinicians do not use (or do not consistently use) these scales in clinical OMPs. Consider the audiology report that describes a 10–15 dB decline in hearing at 6 and 8 kHz bilaterally. What does this mean to the managing medical team? Such a statement is probably not useful on its own. Consider, also, the variability in reported incidence of ototoxicity across studies (Konrad-Martin et al. 2017; Schmitt and Page 2017). Some of this heterogeneity is attributed to variations in disease, dosing and treatment schemes, methods of administration, co-administration of concurrent ototoxic agents or agents that potentiate ototoxicity (e.g. radiation), patient age and other patient-related variables. How ototoxicity is captured and defined, however, remains a significant and troubling contributor to the inconsistencies between preclinical and clinical data and across patient cohorts.

Ideally, identification of ototoxicity includes determination of pre-treatment hearing thresholds. Knowing that a hearing loss exists prior to treatment provides data necessary to determine whether post-treatment hearing status reflects a treatment-related decline or pre-existing hearing loss. Reliance on a subjective report of change in hearing or the use of age- and sex-matched normative data is insufficient to accurately determine if an ototoxic change occurred. For example, only 10% of the ears shown in Figure 1A had hearing thresholds worse than those predicted by the 95th percentile for their age and sex on a post-treatment audiogram. When these data are re-examined in the context of a baseline hearing test (Figure 1B), clinically-significant changes in hearing occurred in more than twice as many ears. This would have gone unrecognised in the absence of a baseline hearing test. Importantly, the presence of significant hearing loss at a pre-treatment baseline may impact counselling and help contextualise risk for the patient and managing medical team.

Figure 1. Hearing sensitivity in females being treated with the aminoglycoside, amikacin, most commonly for mycobacterium infection or cystic fibrosis. Circles represent ear-specific thresholds at 4 kHz. Lines represent sex and age-matched normative data (ISO 2000); light grey is the 95th percentile, dashed dark grey is the 50th percentile, and black is the 5th percentile. Left panel (A): thresholds obtained at the end of audiometric monitoring reveal that 10% of ears fall outside the normative range of hearing. However, when change in hearing over time is considered, right panel (B), over twice as many ears showed change (>10 dB) in hearing. Over half of these cases would not have been identified as having ototoxic change if normative ranges alone were used.

ASHA criteria for ototoxicity

Before the severity of a decline in hearing can be qualified, it is necessary to determine what constitutes a significant change in hearing. One widely-used set of rules developed specifically for ototoxicity monitoring was established by an ad hoc committee of the American Speech-Language-Hearing Association (ASHA) (ASHA 1994). These criteria define any of the following as a significant change in hearing: a 20 dB or greater decline at any single test frequency, a 10 dB or greater decline at two adjacent frequencies, or loss of response at maximum audiometer outputs for three consecutive frequencies where there was previously measurable hearing. Additionally, these changes need to be confirmed on a follow-up test (ASHA 1994). The ASHA guideline stresses the importance of including extended high-frequencies in an identification paradigm in order to facilitate early identification of ototoxicity. These criteria are binary and conservative. As such, they are an excellent starting point but they quickly exhaust their utility to quantitatively describe the degree of toxicity, as shown in Figure 2.
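The three ASHA rules described above can be expressed as a short check. The following is a minimal sketch, not a validated clinical tool. The function name, the dictionary encoding of the audiogram, the use of `None` for "no response at maximum output" and the example thresholds are all assumptions made for illustration. "Adjacent" is taken to mean neighbouring frequencies among those tested, and the required confirmation on a follow-up test is not modelled.

```python
from typing import Dict, List, Optional

# Thresholds in dB HL keyed by frequency in Hz. None is a hypothetical
# encoding for "no response at the maximum audiometer output".
Audiogram = Dict[int, Optional[float]]


def asha_significant_change(baseline: Audiogram, follow_up: Audiogram) -> bool:
    """Return True if any of the three ASHA (1994) change rules is met."""
    freqs = sorted(set(baseline) & set(follow_up))
    shifts: List[Optional[float]] = []  # follow-up minus baseline, per frequency
    lost: List[bool] = []               # measurable at baseline, no response now
    for f in freqs:
        before, after = baseline[f], follow_up[f]
        lost.append(before is not None and after is None)
        if before is None or after is None:
            shifts.append(None)
        else:
            shifts.append(after - before)

    # Rule 1: decline of 20 dB or more at any single test frequency.
    if any(s is not None and s >= 20 for s in shifts):
        return True
    # Rule 2: decline of 10 dB or more at two adjacent test frequencies.
    for a, b in zip(shifts, shifts[1:]):
        if a is not None and b is not None and a >= 10 and b >= 10:
            return True
    # Rule 3: loss of response at three consecutive test frequencies.
    return any(all(lost[i:i + 3]) for i in range(len(lost) - 2))


# Hypothetical thresholds, for illustration only.
baseline = {4000: 10, 6000: 15, 8000: 20}
follow_up = {4000: 10, 6000: 25, 8000: 30}
print(asha_significant_change(baseline, follow_up))  # True (10 dB at 6 and 8 kHz)
```

Because the function returns only a yes or no answer, it illustrates the limitation noted above: very different audiometric changes yield the same result.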

Figure 2. Two case examples of decline in hearing sensitivity from ototoxicity. Panel A shows decline in hearing one year after cisplatin chemotherapy, and panel B shows decline one year after exposure to the aminoglycoside, amikacin. Baseline pre-exposure hearing levels are represented by grey circles and black circles represent thresholds following therapy. The amount of change and range of frequencies affected is notably different between the two cases, and yet ASHA criteria for ototoxicity treats both cases the same; affirming, yes, ototoxicity occurred but making no other distinction. In both cases the change in hearing was sensorineural (bone conduction data not plotted) and bilateral, however, only a single ear from each patient is shown.

Ototoxicity scales

A number of ototoxicity grading scales were developed to distinguish nominal changes in hearing with a minor functional impact from substantial decline necessitating intervention(s). Of the currently available ototoxicity scales, some require a baseline in order to calculate change and others consider functional hearing status only. Still others were established specifically for paediatric populations. Here, several of the more commonly employed scales are highlighted, although the list is not exhaustive. The intention is to provide a sample of existing scales that vary in their approach to capturing ototoxicity and highlight some benefits and limitations of each. The reader is referred to Crundwell, Gomersal, and Baguley (2016) for a comprehensive and detailed review of 13 ototoxicity classification systems employing pure tone thresholds as the outcome measure. These authors address the strengths and weaknesses of scales that use absolute thresholds as compared to those based on changes from baseline thresholds, intended patient populations, functional significance, and application of the scales in clinical settings.

Common terminology criteria for adverse events (CTCAE)

The initial CTC included grading of hearing loss in combination with tinnitus into categories based on broad terminology that lacked specificity and objectivity (National Cancer Institute 1982). Subsequent versions, including the most recent CTCAE version 4.03 (Table 1) (National Cancer Institute 2010), have benefited from input from audiologists and otologists in establishing criteria for ear-related changes. The resulting grading schema includes criteria for adults enrolled in an OMP, adults not enrolled in a monitoring programme (i.e. absent baseline examination), and paediatric patients. When a baseline is available, definitions for grade changes consider the amount of pure tone shift, number of involved frequencies (adults) or the lowest frequency at which change was observed (paediatric), and a frequency range for testing up to 8000 Hz. The CTCAE does not include provisions for changes in the extended high-frequency range. Although the steps between grades 1, 2 and 3 are finely distinguished, grade 3 represents a broad range of potential hearing threshold shift, which can limit sensitivity to further functional change (see Figure 3).

Figure 3. Two audiograms documenting ototoxic change in the same individual. Panel A shows an early and clinically significant change from an ototoxic agent; Panel B shows later change in hearing in the same person after continued exposure. Both audiograms meet criteria for a CTCAE version 4.03 grade 3, despite the fact that one (B) represents significantly more change in hearing and a predicted increase in functional severity with the inclusion of 2 kHz compared to the other (A). Baseline pre-exposure hearing levels are represented by grey circles and black circles represent thresholds during the course of therapy. In both examples, the change in hearing was sensorineural (bone conduction data not plotted) and bilateral, although only a single ear is shown.

Table 1. Ototoxicity classifications and grading scales.

In lieu of absolute threshold shift, this scale also considers subjective complaints as well as the need for hearing aids and cochlear implants. While this addresses a functional impact of hearing loss, it adds a subjective element to grading that may not be consistent from clinician to clinician (Gurney and Bass 2012). How is hearing aid candidacy defined? How is the patient who was a hearing aid candidate at baseline treated when they have minimal threshold shifts, equivalent to a grade 1 or 2, that intensify their need for therapeutic intervention? The reader is invited to consider as a pre-treatment audiogram, and how a 15–25 dB decline in hearing might impact this individual.

Figure 5. Baseline (grey circles) and follow up (black circles) audiogram from an adolescent female undergoing high dose therapy with the loop diuretic, furosemide (Lasix). Ototoxic grading scales that emphasise high-frequency change in hearing (e.g. CTCAE version 4.03 paediatric version) would not be sensitive to capturing this significant decline that occurred early in the course of treatment. The change in hearing was sensorineural (bone conduction data not shown) and bilateral, although data from only a single ear is shown.

The CTCAE is a descriptive scale meant to capture AEs associated with the use of a medical treatment or procedure. Patients with pre-existing hearing loss who are enrolled in OMPs are at risk for having a change in hearing go undocumented, as is the case with scales focussed on specific high-frequency changes, or underappreciated by referring physicians who may not recognise that minimal changes in the face of pre-existing disease can be a tipping point into a functional deficit. Is the scale being used to objectively quantify cases of ototoxicity related to a given intervention? Or is the purpose to communicate change in function at early and clinically significant stages in order to guide informed therapeutic decision making? For many clinicians and researchers employing these scales, the answer winds up being both. And in such cases, which purpose trumps the other? These questions and their answers create an inherent ambiguity in many of these scales when applied to patients with pre-existing disease. The CTCAE has been developed, seemingly, with both of these purposes in mind, rendering it a flexible but imperfect tool.

Adult ototoxicity scales

TUNE Scale: Theunissen et al. (2014) designed the TUNE scale for use with adult populations in an effort to develop an ototoxicity grading scale with greater applicability to everyday life, including speech understanding and sound quality (e.g. nature and music appreciation). This scale considers patient complaint, threshold shift, absolute threshold and thresholds for the extended high-frequencies of 8, 10 and 12.5 kHz. Grades 1 and 2 are determined by threshold shifts from a baseline of ≥10 and ≥20 dB, respectively, for the pure tone average (PTA) of 8, 10 and 12.5 kHz (1a, 2a) and the PTA of 1, 2 and 4 kHz (1b, 2b). Additionally, to acknowledge the significance of tinnitus or difficulty hearing in the absence of a threshold shift, subjective complaints are assigned grade 1a. In contrast, grades 3 and 4 are assigned based on absolute thresholds for the 1, 2, 4 kHz PTAs of ≥35 and ≥70 dB HL, respectively, when these hearing levels occur as a de novo finding. The cut-point of 35 dB HL was selected as an indicator of the level at which there would be a 50% loss of speech intelligibility at conversational levels based on the count-the-dots version of the articulation index (Mueller and Killion 1990). Consequently, grades 3 and 4 could be useful for providing an indication of when aural rehabilitation may be indicated, whereas grades 1 and 2 are more aligned with early detection of ototoxic changes. It remains unclear how to grade the patient with a pre-existing PTA of 35 dB HL that progresses to 50 dB HL on a post-treatment test. Is this a grade 1b? It meets the change criteria. Is this a grade 3? It is not a de novo hearing loss as specified by the grade 3 category; hence, it does not technically meet the stated criteria.
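The TUNE rules as summarised above, and the ambiguity they leave for pre-existing hearing loss, can be made concrete with a small sketch. This is an illustration built only from the description in this section, not the published scale itself. The function names, the grade "0" label for no gradable change, the order in which competing grades are checked, and the reading of "de novo" as "baseline PTA below the cut-point for that grade" are all assumptions.

```python
from statistics import mean
from typing import Dict, Iterable

SPEECH_FREQS = (1000, 2000, 4000)    # PTA 1-2-4 kHz
HIGH_FREQS = (8000, 10000, 12500)    # PTA 8-10-12.5 kHz


def pta(audiogram: Dict[int, float], freqs: Iterable[int]) -> float:
    """Pure tone average (dB HL) over the given frequencies (Hz)."""
    return mean(audiogram[f] for f in freqs)


def tune_grade(baseline: Dict[int, float], follow_up: Dict[int, float],
               complaint: bool = False) -> str:
    speech_before = pta(baseline, SPEECH_FREQS)
    speech_after = pta(follow_up, SPEECH_FREQS)
    speech_shift = speech_after - speech_before
    high_shift = pta(follow_up, HIGH_FREQS) - pta(baseline, HIGH_FREQS)

    # Grades 3 and 4: absolute PTA 1-2-4 kHz, counted only when de novo
    # (assumed here to mean the baseline PTA was below the same cut-point).
    if speech_after >= 70 and speech_before < 70:
        return "4"
    if speech_after >= 35 and speech_before < 35:
        return "3"
    # Grades 1 and 2: shift from baseline (check order is an assumption).
    if speech_shift >= 20:
        return "2b"
    if high_shift >= 20:
        return "2a"
    if speech_shift >= 10:
        return "1b"
    if high_shift >= 10 or complaint:
        return "1a"
    return "0"


# The ambiguous case from the text: PTA 1-2-4 kHz of 35 dB HL at baseline
# progressing to 50 dB HL (hypothetical per-frequency values).
before = {1000: 35, 2000: 35, 4000: 35, 8000: 50, 10000: 50, 12500: 50}
after = {1000: 50, 2000: 50, 4000: 50, 8000: 50, 10000: 50, 12500: 50}
print(tune_grade(before, after))  # 1b under the strict de novo reading
```

Under this strict reading the patient is graded 1b despite a post-treatment PTA well above the 35 dB HL cut-point. Relaxing the de novo condition would return grade 3 instead.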

Paediatric ototoxicity scales

Development of grading scales specific to the paediatric population was largely motivated by concern for the unique listening needs of children. While an adult may be able to tolerate a mild high-frequency hearing loss, this is not the case for children who are actively developing speech, language and social skills, and expected to function in acoustically challenging classroom settings (Knight, Kraemer, and Neuwelt 2005; Brooks and Knight 2017). Use of a scale that does not take frequency or age into account may underestimate the functional impact of hearing loss on paediatric patients (Knight, Kraemer, and Neuwelt 2005). All three of the paediatric scales described below consider the functional impact of hearing loss, do not require a baseline audiogram, and do not provide guidance for grading ototoxic effects when there is a known pre-existing hearing loss.

Brock Scale: The first and most widely-used paediatric-specific ototoxicity scale was designed by Penelope Brock, a paediatric oncologist, and colleagues (Brock et al. 1991). As the scale was developed, considerations included the practical difficulties in obtaining a full audiogram at all frequencies in a child who may be too ill or fatigued to fully cooperate and the potential for fluctuant middle ear disease that primarily affects the low-frequencies. This scale is based on absolute hearing thresholds and not change from a baseline. It has four grades and uses 40 dB HL as a boundary level differentiating significant from non-significant changes. It considers the frequencies involved, giving more weight to hearing loss at the mid-frequencies than the high-frequencies, such that a 40 dB HL threshold limited to 8 kHz is classified as grade 1, and a 40 dB HL threshold at 2 kHz and above is grade 3 (Table 1).
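The stepwise logic of the Brock scale can be sketched as follows. Only the grade 1 and grade 3 anchors are stated in the text above. The intermediate steps used here (grade 2 for 4 kHz and above, grade 4 for 1 kHz and above), the grade 0 label, and the use of "40 dB HL or worse" as the boundary are assumptions inferred from that pattern, and the function name and example thresholds are hypothetical.

```python
from typing import Dict

# Frequencies (Hz) checked from high to low; each additional contiguous
# frequency at or above the cut-off raises the grade by one.
BROCK_FREQS_HIGH_TO_LOW = (8000, 4000, 2000, 1000)


def brock_grade(thresholds: Dict[int, float], cutoff: float = 40.0) -> int:
    """Grade from absolute thresholds (dB HL); no baseline is needed."""
    grade = 0
    for freq in BROCK_FREQS_HIGH_TO_LOW:
        if thresholds[freq] >= cutoff:
            grade += 1   # loss extends down to this frequency
        else:
            break        # loss no longer contiguous from 8 kHz downward
    return grade


# Hypothetical thresholds, for illustration only.
print(brock_grade({1000: 10, 2000: 15, 4000: 20, 8000: 40}))  # 1
print(brock_grade({1000: 10, 2000: 40, 4000: 55, 8000: 70}))  # 3
```

Because the grade depends only on which frequencies reach the cut-off, a 35 dB HL loss across all frequencies is graded 0 in this sketch.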

Chang Scale: Chang and Chinosornvatana (2010) noted the deleterious impact of minimal hearing loss for children and the need for a paediatric scale capable of capturing the functional significance of ototoxicity. They modified the Brock scale to include both 20 and 40 dB HL cut-offs, added the interoctave frequencies of 3 and 6 kHz to achieve greater alignment with clinical interpretations, and included 12 kHz to increase sensitivity for identifying early hearing changes (Table 1). This corresponds with the frequencies at which ototoxic hearing loss most often appears in its initial stages. This scale added sub-grades (1a & 1b and 2a & 2b) in recognition that a 25 dB hearing loss in the mid-frequencies may be more disadvantageous than a 45 dB hearing loss above 4 kHz. Chang stressed the need to measure bone conduction when the tympanogram is abnormal or when there has been a change in hearing to ensure that middle ear dysfunction is not a confounding factor (Chang 2011). While the finer detail of the Chang scale may increase sensitivity, it is complicated to apply and requires additional threshold data that may be difficult to obtain in an ill or uncooperative patient.

Boston SIOP Scale: The International Society of Paediatric Oncology (SIOP) grading system was developed by a working group of international stakeholders with expertise in ototoxicity and was initially presented at the 2010 Congress of the SIOP (Brock et al. 2012). The developers adapted the concepts of previous paediatric scales to achieve a grading system that is simple to understand and apply, sensitive to ototoxic changes with a focus on the high-frequencies, and functionally relevant. It takes into account the possibility of fluctuating middle ear disease common in children and requires bone conduction thresholds when there is abnormal tympanometry or a clinical suspicion of a conductive component to a hearing loss. The scale is based on absolute thresholds and also uses cut-offs of 20 and 40 dB HL with more weight, and higher ototoxicity grades, given to hearing loss in the mid-frequencies than the high-frequencies (Table 1). It is designed to be applied at the end of a treatment trial for the purpose of identifying and comparing incidence and severity of hearing loss across clinical trials.

Sensitivity, reliability and validity of ototoxicity grading scales

The success and utility of any ototoxicity grading scale depend on the scale’s sensitivity, validity and reliability. The ASHA definition of ototoxicity is inherently designed to capture small changes in hearing that just exceed clinically-accepted test-retest variability (5–10 dB). Scales that include subjective complaints and extended high-frequency thresholds are more likely to result in a classification of ototoxicity than those that consider standard frequency pure tone thresholds only. Conversely, scales that use absolute thresholds of 40 dB HL as the cut-point for ototoxicity identification will identify fewer cases as having ototoxic hearing loss than a scale that uses 20 dB HL as the defining level.

Sensitivity of paediatric grading scales in detecting any ototoxicity was initially addressed by Knight, Kraemer, and Neuwelt (2005) in a comparison of the ASHA definition with the CTCAE version 3 and Brock scales in a group of children treated with cisplatin. ASHA and CTCAE version 3 had similar sensitivity to any hearing loss (both 61%), while the Brock scale was less sensitive (40%). Subsequently, Landier et al. (2014) observed a similar prevalence of any hearing loss detection across ASHA, Brock, CTCAE version 3, and Chang scales in a group of 333 children and young adults with neuroblastoma after treatment with cisplatin only or a combination of cisplatin and carboplatin following one (64–71%) or two (86–90%) exposures. In another paediatric cohort of 37 children with medulloblastoma who were treated with craniospinal radiation and cisplatin, the SIOP scale was more sensitive than the Chang scale to any change in hearing, identifying 74 and 66%, respectively (Bass et al. 2014). More recently, Knight et al. (2017) compared the ASHA, Brock, CTCAE version 3 and SIOP scales in a large, multinational cohort of 284 children and young adults treated for the first time with a cisplatin-containing regimen. Sensitivity in detecting any ototoxicity was comparable for SIOP (55%), ASHA (56%) and CTCAE version 3 (51%), while it was slightly lower for Brock (40%).

In a comparison between outcomes of four ototoxicity scales in 319 adult patients treated with chemo-radiation or radiation therapy alone for head and neck cancer, the prevalence of ototoxicity was rank ordered, lowest to highest, as: CTCAE version 4.0, ASHA up to 8 kHz, TUNE and ASHA up to 12.5 kHz (Theunissen et al. 2014). As expected, scales that included high-frequency testing above 8 kHz (TUNE and ASHA up to 12.5 kHz) were the most sensitive to identification of ototoxicity.

To evaluate the validity of a grading scale, it is necessary to consider the sensitivity of the scale to a hearing loss likely to create a communication handicap. When considering only a clinically significant hearing loss of grade 3 or worse in a group of children, the CTCAE version 3 was more sensitive (25%) than the Brock scale (19%) (Knight, Kraemer, and Neuwelt 2005). In another paediatric cohort, the SIOP and CTCAE version 3 were comparable in their rates of assigning ototoxicity grade 3 and above (22 and 18%, respectively), whereas the rate for the Brock scale was 8% (Knight et al. 2017). In a group of children exposed to cisplatin whose hearing loss warranted hearing aid referral, the Brock scale graded only 49% as severe, whereas the Chang and CTCAE version 3 graded 91 and 100%, respectively, in the severe category (Landier et al. 2014). The Chang scale was more specific in identifying and differentiating among those children whom audiologists referred for hearing aid evaluation and FM systems than the Brock and CTCAE version 3 scales (Chang 2011), whereas the SIOP and Chang scales were equally sensitive (35%) in identifying those with hearing loss sufficient to warrant hearing aid use (Bass et al. 2014).

While a sensitive scale is desirable, sensitivity must be balanced against the need for specificity to avoid false positive test results. Theunissen et al. (2014) defined a false positive as a higher ototoxicity grade at the time of the last treatment than at follow-up testing several weeks after completing treatment. In their group of adult patients, false positive rates were 12% for the TUNE scale, 11% for CTCAE, 3% for ASHA up to 12.5 kHz and 0% for ASHA up to 8 kHz. Similarly, in a paediatric group, false positive findings, defined as identification of ototoxicity at one time point during the course of monitoring followed by no ototoxicity on a subsequent evaluation, occurred at rates of 7.4% for ASHA, 6.7% for SIOP, 4.6% for CTCAE version 3 and 2.1% for Brock (Knight et al. 2017). These findings highlight the need for a confirmatory test following first detection of ototoxic changes.
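The paediatric false positive definition and the value of a confirmatory test can be sketched as follows. This is a simplified illustration that assumes each monitoring visit has already been reduced to a yes/no ototoxicity flag; the function names and the "repeat at the next visit" confirmation rule are assumptions, not the procedure used in the cited studies.

```python
from typing import Sequence


def has_unconfirmed_detection(flags: Sequence[bool]) -> bool:
    """True if ototoxicity is flagged at one visit but not at the following visit."""
    return any(current and not following for current, following in zip(flags, flags[1:]))


def confirmed_ototoxicity(flags: Sequence[bool]) -> bool:
    """True only if a flagged visit is immediately followed by another flagged visit."""
    return any(current and following for current, following in zip(flags, flags[1:]))


if __name__ == "__main__":
    # Hypothetical visit histories, in chronological order.
    transient = [False, True, False]   # detection not repeated on re-test
    persistent = [False, True, True]   # detection repeated on re-test

    print(has_unconfirmed_detection(transient), confirmed_ototoxicity(transient))
    print(has_unconfirmed_detection(persistent), confirmed_ototoxicity(persistent))
```

Under this assumed rule, a single flagged visit is treated as provisional until the re-test, so a transient finding would not be counted as confirmed ototoxicity.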

Multi-institutional clinical trials depend on consistent interpretation of data across settings and providers. Knight et al. (2017) compared inter-rater reliability in a large clinical trial between examining audiologists at test sites and two centrally located audiologists. Agreement between the examining and centrally located audiologist in detecting any ototoxicity was 91% for the Brock scale, 87% for CTCAE version 3 and 84% for the ASHA criteria. When identification of ototoxicity severity was compared, agreement between reviewers was 85% for the Brock scale as compared to 69% for CTCAE version 3 (Knight et al. 2017).
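As an illustration of how agreement on presence and agreement on severity can differ, the sketch below computes simple percent agreement between two raters. It assumes plain percent agreement on hypothetical grades; the cited study may have used a different statistic, and the data and function names here are invented for the example.

```python
from typing import Sequence


def percent_agreement(site: Sequence[object], central: Sequence[object]) -> float:
    """Percentage of cases in which two raters assigned the same value."""
    if len(site) != len(central):
        raise ValueError("Both raters must grade the same number of cases.")
    if not site:
        raise ValueError("At least one case is required.")
    matches = sum(a == b for a, b in zip(site, central))
    return 100.0 * matches / len(site)


if __name__ == "__main__":
    # Hypothetical ototoxicity grades for five patients (0 = no ototoxicity).
    site_grades = [0, 1, 2, 0, 3]
    central_grades = [0, 1, 1, 0, 3]

    # Agreement on severity compares the grades themselves.
    severity = percent_agreement(site_grades, central_grades)

    # Agreement on "any ototoxicity" compares only grade > 0 versus grade 0.
    any_site = [g > 0 for g in site_grades]
    any_central = [g > 0 for g in central_grades]
    presence = percent_agreement(any_site, any_central)

    print(f"severity agreement: {severity:.0f}%, any-ototoxicity agreement: {presence:.0f}%")
```

In this made-up example the raters disagree on one grade but still agree that ototoxicity is present, so agreement on presence is higher than agreement on severity.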

Other pitfalls encountered in ototoxicity monitoring in multi-institutional clinical trials may lead to variability in the quality and completeness of data submitted to a central reviewing agency (Landier et al. 2014). These include failure to obtain a baseline audiogram when the scale requires one, and missing data for scales requiring specific frequencies. Such audiograms are considered “unevaluable” and do not effectively contribute to establishing a safety profile. Notably, these pitfalls are not unique to the clinical trial setting, and are also common barriers to meaningful monitoring in a clinical setting. It is necessary to engage frontline clinical care providers, and to do so early on, to ensure timely and accurate collection of necessary data in both clinical trial and clinical care settings.

Selecting or developing an ototoxicity scale

In selecting or developing a grading scale for a particular population or application, several factors should be considered a priori, whether for an individual patient, a patient population or a clinical trial under development.

  • Is the scale sensitive to the predicted ototoxic hearing loss? The majority of ototoxic agents cause hearing changes in the high-frequencies, and these changes may appear first in the extended high-frequency range above 8 kHz. Scales that include extended high-frequencies, or allow for specific weighting or focus on the high-frequencies, may be more sensitive to early indications of ototoxicity. Conversely, if the change does not follow the typical pattern of high-frequency loss, will the scale be effective?

  • Are grading criteria clear or is there ambiguity in the definition, and how might that impact therapeutic decision making? For example, describing a hearing loss sufficient to “indicate amplification” is open to interpretation, and will change as amplification technology evolves. If grading criteria are not clearly defined, there is opportunity for inconsistent application and poor inter-rater reliability. Moreover, clinical trials are often developed with stopping criteria, for either an individual or the trial, if an AE becomes too serious or occurs too frequently. Is the protocol written in such a way that a patient’s continued participation is contingent on an ambiguous definition of toxicity? In the case of life-threatening disease, would this decision impact access to a potentially effective intervention?

  • Is it preferable for the scale to specify change from the pre-exposure baseline, or to emphasise absolute threshold and functional status? Scales based on change from baseline require a pre-treatment hearing test and may not, in and of themselves, address the functional needs of the patient. Scales based on absolute hearing thresholds do not differentiate change in hearing from pre-existing hearing loss. Rather, they focus on the functional status of the patient at any given time, ignoring the amount of treatment-related hearing change. The reader is directed to Figure 4 for an illustration of this scenario.

  • How does pre-existing hearing loss impact use of the scale? Scales that confine the change in hearing to specific frequencies may be less useful in a population with pre-existing hearing loss. This may vary by protocol depending on whether the goals of monitoring are focussed on identifying functional needs versus quantifying toxicity.

  • Are the guidelines intended for adults, children or both? Should paediatric scales be sub-divided into those applying to children who are still developing speech and language skills and those aimed towards older post-lingual children (Chang 2011)? The impact of minimal hearing loss on a child who has emerging speech and language and who functions and learns in acoustically challenging environments such as a classroom is greater than the impact of a minimal hearing loss on an adult (Littman, Magruder, and Strother 1998; Brooks and Knight 2017).

  • Does the scale include provisions for grading when there is incomplete or suprathreshold data? For example, a paediatric or very ill adult patient may not provide a full audiogram, give true threshold responses, or tolerate earphones, necessitating reliance on minimum response levels obtained during sound field testing at just two frequencies (Brooks and Knight 2017). How can limited information be incorporated into a grading scale, or drive test strategy to ensure that the most important data are collected first?

  • Should there be guidelines for grading ototoxicity based on otoacoustic emissions (OAEs) or auditory brainstem response (ABR) derived thresholds? While pure tone thresholds are the current gold standard for ototoxicity monitoring, it may not be possible to obtain these data at each visit due to health status or other factors affecting ability to cooperate. Is it legitimate to substitute ABR thresholds for behavioural thresholds in grading ototoxicity? The authors developed an ABR-derived AE scale (Table 2), modelled after the CTCAE, meant to capture and segregate minimal change in hearing from functionally significant change, for use in populations who require ABR threshold assessment and in whom AE monitoring is necessary. This scale has yet to be validated. To date, it has been used to monitor hearing safety across multiple phases in a clinical trial. OAEs may prove more challenging in that they do not estimate threshold, but they do afford an opportunity to document high-frequency change and may be more sensitive to early identification of ototoxicity (Littman, Magruder, and Strother 1998; Brooks and Knight 2017). The absence of OAEs in the setting of transient or permanent changes in middle ear function further complicates their consistent application and contribution as a monitoring tool.

  • How should conductive hearing loss be factored into ototoxicity grading? On the one hand, middle ear effusion unrelated to treatment may cause conductive hearing loss and a higher ototoxicity grade. Should bone conduction thresholds supplant air conduction thresholds in this case? On the other hand, cranial radiation in conjunction with cisplatin (chemo-radiation) is a common therapeutic regimen for some cancers. The effects of radiation on hearing are varied, and there may be a resultant conductive or mixed hearing loss (Gurney and Bass 2012). In this scenario, should the conductive component be ignored or factored into grading?

  • Above and beyond use in clinical trials, how should ototoxicity grading drive clinical decision making? While clinical trials standardly identify stopping criteria based on toxicity scales, the same is not true in routine clinical practice. Application of these scales in a clinical setting may be a useful way to communicate hearing changes and assist the conversation regarding the significance of the hearing loss to the patient and managing physician. Ultimately, the decision to continue or change treatment is based on a number of factors, including available treatment options and overall patient health status. Nonetheless, the use of ototoxicity scales can make hearing data more accessible and facilitate therapeutic decision making.

  • Do studies of putative otoprotectants need stricter monitoring criteria than those provided by grading scales? Keeping in mind that some ototoxicity scales have grades that span wide ranges, have a subjective element, reduce data to ordinal numbers, and do not include the extended high-frequencies, it is possible that grading systems may miss or obscure effects of otoprotectants. In this case, more finely tuned analysis (e.g. high-frequency pure tone thresholds or pure tone averages) will better capture protective effects, and ototoxicity grading scales may be used for supplemental analyses.
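The trade-off between change-from-baseline and absolute-threshold criteria raised in the list above can be made concrete with a short sketch. The two rules, their default values of 20 dB and the example audiograms are hypothetical placeholders chosen for illustration; they are not the criteria of ASHA, SIOP, Brock or any other published scale.

```python
from typing import Dict

Audiogram = Dict[float, int]  # frequency in kHz -> threshold in dB HL


def flagged_by_absolute(current: Audiogram, cutoff_db: int = 20) -> bool:
    """Assumed absolute rule: any current threshold poorer than the cut-off."""
    return any(threshold > cutoff_db for threshold in current.values())


def flagged_by_change(baseline: Audiogram, current: Audiogram, min_shift_db: int = 20) -> bool:
    """Assumed change rule: any threshold shift from baseline of at least min_shift_db."""
    shared = baseline.keys() & current.keys()  # compare only frequencies tested at both visits
    return any(current[f] - baseline[f] >= min_shift_db for f in shared)


if __name__ == "__main__":
    # Pre-existing high-frequency loss that is essentially stable on treatment.
    pre_existing_baseline = {1: 10, 2: 15, 4: 45, 8: 60}
    pre_existing_follow_up = {1: 10, 2: 15, 4: 45, 8: 65}

    # Normal baseline with a 20 dB shift at 8 kHz that stays within 20 dB HL.
    normal_baseline = {1: 5, 2: 5, 4: 5, 8: 0}
    normal_follow_up = {1: 5, 2: 5, 4: 5, 8: 20}

    print(flagged_by_absolute(pre_existing_follow_up),
          flagged_by_change(pre_existing_baseline, pre_existing_follow_up))
    print(flagged_by_absolute(normal_follow_up),
          flagged_by_change(normal_baseline, normal_follow_up))
```

In the first hypothetical case the absolute rule flags hearing loss although little has changed during treatment; in the second, the change rule flags a shift that the absolute rule misses. Neither rule alone captures both functional status and treatment-related change.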

Figure 4. Baseline audiogram representing air conduction hearing thresholds from one ear of an adolescent female prior to exposure to a potentially ototoxic medication. Neither the SIOP nor the Brock scale accounts for pre-existing hearing loss; this audiogram would be graded a 3 on both scales prior to any ototoxic exposure.

Table 2. Proposed ABR-derived adverse event schema. This application requires data collected using air-conducted tone burst stimuli from 0.5 to 4 kHz and assumes normal middle ear function.

Conclusions

Audiologic monitoring for ototoxicity is not a routine session in which hearing thresholds are established and reported in isolation; it is a purpose-driven consultation with multiple goals and stakeholders. These goals include early identification of hearing changes, communication with the patient and family, prevention or mitigation of functional hearing loss, and the establishment and monitoring of drug safety and efficacy.

Clinical trials are the vehicle by which we translate basic science into human applications in order to improve health and reduce disease. They inform clinical practice on the front lines of medicine, in part, by establishing the balance of toxicity and benefit for new therapeutic interventions. The severity of disease and the availability of alternative therapies drive how we tolerate the exchange of safety for efficacy. Grading scales serve a critical need in the successful fulfilment of a clinical trial as a tool to uniformly monitor AEs. There are, however, distinct advantages to more widespread use of standardised grading scales beyond their application in clinical trials. These include consistency in the interpretation of data and greater simplicity of the metric relative to the entire audiogram. Each of the scales, perhaps inherently, offers benefits and limitations, which can vary by population and setting. Ultimately, grading scales applied in isolation do not carry sufficient meaning about progression, clinical impact or clear candidacy for re/habilitation.

Hearing data are complicated: they involve a wide range of test frequencies, multiple transducers and techniques, and stem from a bilateral, heterogeneous sensory system. Furthermore, the seasoned clinician is well aware that identical audiograms from two patients can impact individual lives in widely different ways. Capturing and contextualising risk and toxicity for an individual or a cohort is therefore challenging, which may be the reason that a uniform system to convey this information has remained elusive. Moreover, while the current emphasis in defining toxicity relies almost exclusively on pure-tone hearing thresholds, additional effects on hearing, such as speech in noise, remain largely unexamined and, potentially, overlooked.

Audiology falls at the intersection of scientific evidence and clinical circumstance in the process of therapeutic decision making. Audiologists are uniquely suited to inform patients as they establish their own preferences to guide these decisions. The audiologist’s role before, during and after ototoxic intervention is dynamic and important.

Abbreviations
  • AE = Adverse Event
  • ASHA = American Speech-Language-Hearing Association
  • ABR = Auditory Brainstem Response
  • CTCAE = Common Terminology Criteria for Adverse Events
  • FDA = Food and Drug Administration
  • IND = Investigational New Drug
  • IRB = Institutional Review Board
  • MedDRA = Medical Dictionary for Regulatory Activities
  • NCI = National Cancer Institute
  • OAEs = Otoacoustic Emissions
  • OMP = Ototoxicity Monitoring Program
  • U.S. = United States

Declaration of interest: No potential conflict of interest was reported by the authors.

This work was supported by the Intramural Research Program of the National Institute on Deafness and Other Communication Disorders, National Institutes of Health and Department of Health and Human Services [NIH intramural grant DC000064 to CCB].

Acknowledgements

The authors are grateful to Marilyn Dille, Dawn Konrad Martin, Katharine Fernandez and Nicole Schmitt for their careful review and feedback.

References

  • American Speech-Language-Hearing Association (ASHA). 1994. “Guidelines for the Audiologic Management of Individuals Receiving Cochleotoxic Drug Therapy.” American Speech-Language-Hearing Association 36: 1–19. doi:10.1044/policy.gl1994-00003.
  • Bass, J. K., J. Huang, A. Onar-Thomas, K. W. Chang, S. P. Bhaga, M. Chintagumpala, U. Bartels, et al. 2014. “Concordance between the Chang and the International Society of Pediatric Oncology (SIOP) Ototoxicity Grading Scales in Patients Treated with Cisplatin for Medulloblastoma.” Pediatric Blood & Cancer 61: 601–605. doi:10.1002/pbc.24830.
  • Brock, P. R., S. C. Bellman, E. C. Yeomans, C. R. Pinkerton, and J. Pritchard. 1991. “Cisplatin Ototoxicity in Children: A Practical Grading System.” Medical and Pediatric Oncology 19 (4): 295–300. doi:10.1002/mpo.2950190415.
  • Brock, P. R., K. R. Knight, D. R. Freyer, K. C. Campbell, P. S. Steyger, B. W. Blakley, S. R. Rassekh, et al. 2012. “Platinum-induced Ototoxicity in Children: A Consensus Review on Mechanisms, Predisposition, and Protection, Including a New International Society of Pediatric Oncology Boston Ototoxicity Scale.” Journal of Clinical Oncology 30 (19): 2408–2417. doi:10.1200/JCO.2011.39.1110.
  • Brooks, B., and K. Knight. 2017. “Ototoxicity Monitoring in Children Treated with Platinum Chemotherapy.” International Journal of Audiology 24: 1–7. doi:10.1080/14992027.2017.1355570.
  • Chang, K. W. 2011. “Clinically Accurate Assessment and Grading of Ototoxicity.” The Laryngoscope 121 (12): 2649–2657. doi:10.1002/lary.22376.
  • Chang, K. W., and N. Chinosornvatana. 2010. “Practical Grading System for Evaluating Cisplatin Ototoxicity in Children.” Journal of Clinical Oncology 28 (10): 1788–1795. doi:10.1200/JCO.2009.24.4228.
  • Crundwell, G., P. Gomersal, and D. M. Baguley. 2016. “Ototoxicity (Cochleotoxicity) Classifications: A Review.” International Journal of Audiology 55 (2): 65–74. doi:10.3109/14992027.2015.1094188.
  • Food and Drug Administration (FDA). 2017. “The Drug Development Process: Clinical Research.” Accessed 1 July 2017. https://www.fda.gov/ForPatients/Approvals/Drugs/ucm405622.htm
  • Garinis, A., A. Kemph, A. M. Tharpe, J. H. Weitkamp, C. McEvoy, and P. Steyger. 2017. “Monitoring Neonates for Ototoxicity.” International Journal of Audiology 22: 1–8. doi:10.1080/14992027.2017.1339130.
  • Gurney, J. G., and J. K. Bass. 2012. “New International Society of Pediatric Oncology Boston Ototoxicity Grading Scale for Pediatric Oncology: Still Room for Improvement.” Journal of Clinical Oncology 30 (19): 2303–2306. doi:10.1200/JCO.2011.41.3187.
  • ISO. 2000. ISO 7029-1, Acoustics–Statistical Distribution of Hearing Thresholds as a Function of Age. Geneva: International Organization of Standardization.
  • Knight, K. R., L. Chen, D. Freyer, R. Aplenc, M. Bancroft, B. Bliss, H. Dang, et al. 2017. “Group-wide, Prospective Study of Ototoxicity Assessment in Children Receiving Cisplatin Chemotherapy (ACCL05C1): A Report from the Children’s Oncology Group.” Journal of Clinical Oncology 35 (4): 440–445. doi:10.1200/JCO.2016.69.2319.
  • Knight, K. R., D. F. Kraemer, and E. A. Neuwelt. 2005. “Ototoxicity in Children Receiving Platinum Chemotherapy: Underestimating a Commonly Occurring Toxicity That May Influence Academic and Social Development.” Journal of Clinical Oncology 23 (34): 8588–8596. doi:10.1200/JCO.2004.00.5355.
  • Konrad-Martin, D., G. Poling, A. Garinis, C. Ortiz, J. Hopper, K. O'Connell-Bennett, and M. Dille. 2017. “Applying U.S. National Guidelines for Ototoxicity Monitoring in Adult Patients: Perspectives on Patient Populations, Service Gaps, Barriers and Solutions.” International Journal of Audiology 20: 1–16. doi:10.1080/14992027.2017.1398421.
  • Konrad-Martin, D., K. M. Reavis, G. McMillan, W. J. Helt, and M. Dille. 2014. “Proposed Comprehensive Ototoxicity Monitoring Program for VA Healthcare (COMP-VA).” Journal of Rehabilitation Research and Development 51 (1): 81–100. doi:10.1682/JRRD.2013.04.0092.
  • Landier, W., K. Knight, F. L. Wong, J. Lee, O. Thomas, H. Kim, S. G. Kreissman, et al. 2014. “Ototoxicity in Children with High-risk Neuroblastoma: Prevalence, Risk Factors, and Concordance of Grading Scales–A Report from the Children’s Oncology Group.” Journal of Clinical Oncology 32 (6): 527–534. doi:10.1200/JCO.2013.51.2038.
  • Littman, T. A., A. Magruder, and D. R. Strother. 1998. “Monitoring and Predicting Ototoxic Damage Using Distortion-product Otoacoustic Emissions: Pediatric Case Study.” Journal of the American Academy of Audiology 9: 257–262.
  • Mueller, H., and M. C. Killion. 1990. “An Easy Method for Calculating the Articulation Index.” Hearing Journal 43: 1–4.
  • National Cancer Institute (NCI). 1982. “Common Toxicity Criteria.” Accessed 21 August 2017. https://www.ucdmc.ucdavis.edu/clinicaltrials/StudyTools/Documents/NCI_Toxicity_Table.pdf
  • National Cancer Institute (NCI). 2010. “Common Terminology Criteria for Adverse Events (CTCAE, v 4.03).” NCI, National Institutes of Health, Department of Health and Human Services. Accessed 21 August 2017. https://evs.nci.nih.gov/ftp1/CTCAE/CTCAE_4.03_2010-06-4_QuickReference_8.5x11.pdf
  • Neuwelt, E., and P. Brock. 2010. “Critical Need for International Consensus on Ototoxicity Assessment Criteria.” Journal of Clinical Oncology: Official Journal of the American Society of Clinical Oncology 28 (10): 1630–1632. doi:10.1200/JCO.2009.26.7872.
  • Schmitt, N. C., and B. R. Page. 2017. “Chemoradiation-induced Hearing Loss Remains a Major Concern for Head and Neck Cancer Patients.” International Journal of Audiology 20: 1–6. doi:10.1080/14992027.2017.1353710.
  • Theunissen, E. A., W. A. Dreschler, M. N. Latenstein, C. R. Rasch, S. van der Baan, J. P. de Boer, A. J. Balm, et al. 2014. “A New Grading System for Ototoxicity in Adults.” Annals of Otology, Rhinology & Laryngology 123 (10): 711–718. doi:10.1177/0003489414534010.