Meeting 5 – December 1, 2020

 

Meeting Date: 12/1/2020

Meeting Time: 10:00am-1:00pm PST

Meeting Location: Virtual Conference via Zoom

Approval Date: 12/18/2020

Recorded by: UCSF Team

MEETING MINUTES: 

Project Overview:

The Centers for Medicare & Medicaid Services (CMS) has granted an award to the University of California San Francisco (UCSF) to develop a measure of computed tomography (CT) image quality and radiation safety. The project is a part of CMS’s Medicare Access & CHIP Reauthorization Act (MACRA)/Measure Development for the Quality Payment Program. The project title is “DR CTQS: Defining and Rewarding Computed Tomography Quality and Safety”. The Cooperative Agreement number is 1V1CMS331638-02-00. As part of its measure development process, UCSF convened groups of stakeholders and experts who contributed direction and thoughtful input to the measure developer during measure development and maintenance.

Project Objectives:

The goal of the project is to create a quality measure for CT to ensure image quality standards are preserved and harmful effects of radiation used to perform the tests are minimized. Radiation doses delivered by CT are far higher than those of conventional radiographs (x-rays), the doses are in the range known to be carcinogenic, and there is a significant performance gap across health care organizations and clinicians that has consequences for patients. The goal of the measure is to provide a framework in which health care organizations and clinicians can assess their doses, compare them to benchmarks, and take corrective action to lower them while preserving the quality of images so that they are useful to support clinical practice. The measure will be electronically specified using procedural and diagnostic codes in billing data as well as image and electronic data stored with CT scans, typically stored within the Picture Archiving and Communication Systems (PACS) – the computerized systems for reviewing and storing imaging data – or Radiology Information Systems (RIS).

TEP Objectives:

In its role as measure developer, the University of California San Francisco is obtaining input from a broad group of stakeholders to develop a set of recommendations for a radiology quality and safety measure. The proposed measure will be developed in close collaboration with the leadership of diverse medical societies as well as payers, health care organizations, experts in safety and accreditation, and patient advocates. A well-balanced representation of stakeholders on the TEP is intended to ensure the consideration of key perspectives and obtain balanced input.

Scope of Responsibilities:

The TEP’s role is to provide input and advice to the measure developer (University of California San Francisco) related to a series of planned steps throughout the 3-year project. The specific steps will include developing and testing a risk-adjusted measure which can be used to monitor CT image quality in the context of minimizing radiation doses while maintaining acceptable image quality. The TEP will assist UCSF in conceptualizing the measure and any appropriate risk adjustment of it. The TEP will assist UCSF with identifying barriers to implementing the proposed measure and test sites in which the developer can assess the feasibility and performance of its use. The TEP will assist UCSF with interpreting results obtained from the test sites and in suggesting modifications of the measure. The TEP will provide input and advice to UCSF to ensure that the measure is valuable for a wide range of stakeholders and CMS.

Guiding Principles:
Participation on the TEP is voluntary. Individuals participating on the TEP understand that their input will be recorded in the meeting minutes. Proceedings of the TEP will be summarized in a report that may be disclosed to the general public. If a participant has disclosed private, personal data by his or her own choice, then that material and those communications are not deemed to be covered by patient-provider confidentiality. Questions about confidentiality will be answered by the TEP organizers.

All TEP members must disclose any significant financial interest or other relationships that may influence their perceptions or judgment. It is unethical to conceal (or fail to disclose) conflicts of interest. However, the disclosure requirement is not intended to prevent individuals with particular perspectives or strong points of view from serving on the TEP. The intent of full disclosure is to inform the TEP organizers, other TEP members and CMS about the source of TEP members’ perspectives and how that might affect discussions or recommendations.

All TEP members should be able to commit to the anticipated time frame needed to perform the functions of the TEP.

Estimated Number and Frequency of Meetings:

The TEP is expected to meet three times per year, either in person or via webinar.

This meeting was originally set to occur in-person but was changed to a virtual meeting as mandated by federal social distancing measures and state-wide Shelter-in-Place orders.

Table 1. TEP Member Name, Title, and Affiliation

| Name | Title | Organization |
| --- | --- | --- |
| Attendees | | |
| Mythreyi Bhargavan Chatfield, PhD | Executive Vice President | American College of Radiology |
| Niall Brennan, MPP | CEO | Health Care Cost Institute |
| Helen Burstin, MD, MPH, FACP | Executive Vice President | Council of Medical Specialty Societies |
| Melissa “Missy” Danforth | Vice President of Health Care Ratings | The Leapfrog Group |
| Tricia Elliot, MBA, CPHQ | Director, Quality Measurement | Joint Commission |
| Jeph Herrin, PhD | Adjunct Assistant Professor | Yale University |
| Hedvig Hricak, MD, PhD | Radiology Chair | Memorial Sloan Kettering Cancer Center |
| Jay Leonard “Len” Lichtenfeld, MD, MACP | Independent Consultant | Formerly Deputy Chief Medical Officer, American Cancer Society, Inc. |
| Leelakrishna “Krishna” Nallamshetty, MD | Associate Chief Medical Officer | Radiology Partners |
| Matthew Nielsen, MD, MS | Professor and Chair of Urology | UNC Gillings School of Global Public Health |
| Debra Ritzwoller, PhD | Patient Advocate and Health Economist | Patient Representative |
| James Anthony “Tony” Seibert, PhD | Professor | University of California, Davis |
| Arjun Venkatesh, MD, MBA, MHS | Associate Professor, Emergency Medicine | Yale School of Medicine |
| Todd Villines, MD, FSCCT | Professor and Director of Cardiovascular Research and Cardiac CT Programs | University of Virginia |
| Kenneth “Ken” Wang, MD, PhD | Adjunct Assistant Professor, Radiology | University of Maryland, Baltimore |
| Not in Attendance | | |
| Lewis “Lew” Sandy, MD | Executive Vice President, Clinical Advancement | UnitedHealth Group |
| Mary Suzanne “Suz” Schrandt, JD | Patient Advocate | Patient Representative |
| Ex Officio TEP | | |
| Amy Berrington de Gonzalez, DPhil | Branch Chief & Senior Investigator | National Cancer Institute; Division of Cancer Epidemiology & Genetics, Radiation Epidemiology Branch |
| Mary White, ScD | Chief, Epidemiology and Applied Research Branch | Centers for Disease Control and Prevention |
| CMS & MACRA/CATA Representatives | | |
| Marie Hall | CATA Team | Health Services Advisory Group |
| Minet Javellana | Measure Development Specialist | RELI Group Inc. |
| Janis Grady | Project Officer | Centers for Medicare & Medicaid Services |
| UC Team | | |
| Rebecca Smith-Bindman, MD | Principal Investigator | University of California, San Francisco |
| Patrick Romano, MD, MPH | Co-Investigator | University of California, Davis |
| Carly Stewart | Lead Project Manager | University of California, San Francisco |
| Sophronia Yu | Data Analyst | University of California, San Francisco |
| Susanna McIntyre | Research Assistant | University of California, San Francisco |
| Not in Attendance | | |
| Andrew Bindman, MD | Advisor | Kaiser Permanente, former Co-Investigator with the University of California, San Francisco |

 

Technical Expert Panel Meeting

Prior to the meeting, TEP members received a copy of the agenda, the presentation slides, a link to the DR-CTQS study website (which contains minutes from prior TEP meetings), honorarium documentation, and a conflict of interest form. The meeting was conducted with the use of PowerPoint slides and Zoom video conference.

10:00 AM:      Call meeting to order by TEP Chair                                  Dr. Helen Burstin

Dr. Helen Burstin called the meeting to order. She noted that the meeting would last three hours and would include a discussion period after each presentation.

10:05 AM:      Roll Call and Updated Conflicts                                                     Dr. Burstin

TEP members’ and ex officio members’ attendance is listed above.

A conflict of interest was defined as whether you, your spouse, your registered domestic partner, and/or your dependent children:

1. Received income or payment as an employee, consultant or in some other role for services or activities related to diagnostic imaging

2. Currently own, or have held in the past 12 months, an equity interest in any health care related company which includes diagnostic imaging as a part of its business

3. Hold a patent, copyright, license or other intellectual property interest related to diagnostic imaging

4. Hold a management or leadership position (i.e., Board of Directors, Scientific Advisory Board, officer, partner, trustee, etc.) in an entity with an interest in diagnostic imaging

5. Received any cash or non-cash gifts from organizations or entities with an interest in diagnostic imaging

6. Received any loans from organizations or entities with an interest in diagnostic imaging

7. Received any paid or reimbursed travel from organizations or entities with an interest in diagnostic imaging

COIs were disclosed to UCSF prior to this TEP meeting via paperwork. No members had financial conflicts that precluded their participation. TEP members were also asked to verbally disclose any COIs when introducing themselves for the purpose of group transparency. TEP members re-stated their affiliations and any existing conflicts.

  • Dr. Helen Burstin stated her affiliation as the CEO of the Council of Medical Specialty Societies and had no new or existing conflicts of interest.
  • Dr. Mythreyi Chatfield stated she is Executive Vice President for Quality and Safety at the American College of Radiology and had no new or existing conflicts.
  • Niall Brennan stated he is CEO of Health Care Cost Institute and had no new or existing conflicts.
  • Dr. Krishna Nallamshetty, a new TEP member, is replacing Dr. Jay Bronner who stepped down from the panel upon retirement. Like Dr. Bronner, Dr. Nallamshetty works at Radiology Partners where he serves as Associate Chief Medical Officer and chair of the patient safety committee. He is associate faculty at the University of South Florida in radiology and cardiology. Dr. Nallamshetty’s conflicts include: he is employed by Radiology Partners, the largest radiology practice in the US; he is faculty at the University of South Florida; he is equity owner in Tower Radiology, an outpatient imaging center network in Florida; he is equity owner in Radiology Partners; and he is equity owner and on the advisory board of Hyperfine MRI.
  • Missy Danforth stated she is Vice President for healthcare ratings at the Leapfrog Group and had no new or existing conflicts.
  • Tricia Elliot stated her role as Director of Quality Measurement at The Joint Commission and had no new or existing conflicts.
  • Dr. Jeph Herrin stated his affiliation with Yale University and no new or existing conflicts.
  • Dr. Leonard Lichtenfeld stated he is an independent consultant, formerly Deputy Chief Medical Officer at the American Cancer Society. He had no new conflicts to report but noted he is no longer with the ACS.
  • Dr. Matthew Nielsen reported he is Chief of Urology at University of North Carolina. He had no new conflicts but mentioned his existing relationship with the American Urological Association and American College of Physicians as a consultant.
  • Dr. Debra Ritzwoller stated she is an economist with Kaiser Permanente Colorado and is serving in the capacity of a patient advocate. She restated her role as MPI on an NCI-funded lung cancer screening research center, and her involvement on other PCORI and NCI-funded lung cancer screening grants. She had no new conflicts to disclose.
  • Dr. Anthony Seibert stated his role as a medical physicist at UC Davis Health and had no conflicts to declare.
  • Dr. Arjun Venkatesh stated he is an emergency physician on faculty at Yale University, where he is chief for his administrative section, and a scientist at the Center for Outcomes Research and Evaluation. He disclosed he serves as a consultant for the American College of Radiology and leads quality measure development work with the American College of Emergency Physicians.
  • Dr. Todd Villines reported he is a cardiologist in the area of multimodality imaging at the University of Virginia. He had no new conflicts but restated prior conflicts, including his role as past president of the Society of Cardiovascular CT, through which he serves as a non-voting member of the Board of Directors. He is also current Editor-in-Chief of the Journal of Cardiovascular CT.
  • Dr. Kenneth Wang stated he is a musculoskeletal radiologist at the University of Maryland in Baltimore. He had no new conflicts.
  • Dr. Amy Berrington introduced herself as Branch Chief of Radiation Epidemiology at NCI and had no conflicts.
  • Dr. Mary White stated she is Chief of the Epidemiology and Applied Research Branch in the Cancer Division at the Centers for Disease Control & Prevention and had no new conflicts.

TEP member Dr. Hedvig Hricak joined the call after Roll Call.

10:15 AM:      TEP Goals, Personnel Updates, CMS Updates                   Dr. Smith-Bindman

Dr. Rebecca Smith-Bindman briefly stated the objectives of the meeting, including reviewing risk-adjusted radiation dose thresholds, image quality thresholds, and early beta testing results.

She introduced Carly Stewart, the new UCSF project manager. She shared that Dr. Andy Bindman – who could not be on the call – had retired from UCSF and started a new position elsewhere but would continue as an advisor on this project. Lastly, she shared Dr. Jay Bronner’s retirement from Radiology Partners and from the TEP, and welcomed Dr. Nallamshetty in his place.

Next, she restated CMS’s continued interest in developing a measure for the hospital programs (Inpatient Quality Reporting, Outpatient Quality Reporting, and Critical Access Hospital) corresponding to the MIPS measure. She reiterated the challenge, mentioned at several prior TEP meetings, of physicians using data controlled by hospitals to report in the MIPS program. She mentioned we have come up against this firsthand in our testing, where physicians sometimes were unable to access the data at the sites where they worked because the data were owned by hospitals. Data ownership aside, practically speaking, CT scans are controlled and run jointly by hospitals and physicians. The technologists who operate the machines are usually paid by the hospitals, so having this corresponding measure in place would align incentives between physicians and hospitals and maximize quality improvement.

The final update Dr. Smith-Bindman shared was the recent advisement from CMS that the agency is moving away from adopting new qualified clinical data registry (QCDR) measures in the MIPS program. We have thus far been developing our candidate measure as a QCDR measure and must now pivot to developing an electronic clinical quality measure (eCQM). She explained the measurements themselves are not so different in a QCDR vs. an eCQM, but how data are collected and reported is very different. eCQMs are computer-automated measures that use data extracted electronically from the electronic health record to measure healthcare quality. Data elements used in eCQMs must map to a value set, which is a set of allowable codes and descriptors sanctioned and maintained by the National Library of Medicine’s Value Set Authority Center (VSAC). Our hope is that an eCQM measure(s) would be applied both in the MIPS and hospital-based programs.
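To make the value-set mechanics concrete, here is a minimal sketch of how an eCQM data element might be checked against a value set. The OID and the codes below are hypothetical placeholders, not the project’s actual VSAC value sets.

```python
# Hypothetical sketch: checking a coded data element against a value set.
# The OID and codes below are illustrative placeholders only.

VALUE_SETS = {
    "2.16.840.1.113883.x.y.z": {"70450", "71250", "74150"},  # CT procedure codes
}

def element_in_value_set(code: str, value_set_oid: str) -> bool:
    """Return True if the coded data element maps to the given value set."""
    return code in VALUE_SETS.get(value_set_oid, set())

print(element_in_value_set("71250", "2.16.840.1.113883.x.y.z"))  # True
```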

She shared several challenges in creating an eCQM (slide 15). First, our primary measurement, radiation dose, uses data formatted in the DICOM standard; however, these data elements are not currently included in existing value sets that are available for writing eCQMs. Thus, we are working with CMS and NLM to get value sets created for DICOM variables.

The second challenge to using an eCQM has to do with the image quality component, which is not derived from the same kind of digitized data. Our method is designed to assess pixel data on CT images, and there appears to be no way to add these to value sets. This issue threatens the inclusion of image quality assessment in the measure. We are exploring several workarounds that we think will be successful, but if not, an alternative is to not include the image quality part of our measure, at least in the initial rollout.

Dr. Smith-Bindman explained changes to the timeline (slide 16). We're currently 26 months into the 36-month project. Moving towards an eCQM will require UCSF to accelerate the timeline, because we would have to submit the measure to the Measures Under Consideration (MUC) List in May 2021. CMS will provide us with an extension and allow us to submit our final testing data as late as the fall of 2021. We believe we can complete most MUC list requirements by May, with final testing data submitted around September 2021, as permitted by CMS. Our timeframe for submitting the measure for NQF endorsement remains unchanged.

10:25 AM:      Discussion: TEP Goals, Personnel Updates, CMS Updates             Dr. Burstin

Dr. Burstin introduced the discussion questions:

  • Do you see a benefit of having a parallel hospital measure to the proposed MIPS measure?
  • Do you support our transition to an eCQM even if it requires that we pause on the image quality assessment?
  • If we need to pause on the image quality assessment, would that change how low we should set the upper radiation dose threshold? 

Dr. Nielsen said he thinks the pivot towards an eCQM makes sense. He’s heard similar messages from colleagues about moving away from medical societies having their own niche QCDR measures and towards measures in the MIPS Value Pathways (MVP) framework. He feels the hospital-based measure will be valuable for many organizations, and an eCQM makes sense in this framework. Dr. Venkatesh seconded support for aligned MIPS and hospital-based measures, citing increased likelihood of data-sharing. As an Emergency Medicine physician (a hospital-based specialty similar to radiology), he thinks alignment between hospital measures and physician measures is desirable, as there are a large number of hospital-employed radiologists in the US. "So, to me, anytime you have a chance to have a very similar measure with identical incentives living in parallel programs, that's how you make it possible for hospitals to say, ‘Share data with clinicians so they can report it in MIPS,’ and the hospitals get the benefit of taking on some of that burden of data collection reporting on the hospital side. To me, it seems like only a win, and I think that it'll actually help the MIPS measure to be successful by the fact that there's a concurrent hospital measure."

The panel discussed the complexity of eCQMs, particularly measuring at the individual clinician level. Dr. Burstin suggested focusing on the hospital-based measure first, setting aside the MIPS measure. Dr. Chatfield expanded on the complexity: we can create the value sets and build the machinery, so to speak, but making sure the data needed for measure calculation migrate properly will be a challenge (though further beta testing should shed light on data availability and usability). Any amount of automated data would be helpful. We must also consider the receptiveness of hospitals to ingesting the software, which is sure to be affected by the transparency and understandability of the model.

Dr. Burstin asked for clarification on whether data will come to UCSF. Dr. Smith-Bindman explained that a new eCQM would be created, and that the details are not yet final but that the registry model will be abandoned. Dr. Burstin noted that the process of vetting the measure through the MUC list and NQF process would raise issues of the appropriateness of the measure’s use in different settings.  

On the basis of this feedback from the TEP, UCSF plans to continue to plan, build, and test the eCQM measure for the MIPS program. As soon as we have official notice of award from CMS, we will begin to build and test the hospital-based eCQM in parallel.

10:40 AM:      Setting and Risk-Adjusting Upper Dose Thresholds        Dr. Smith-Bindman

Dr. Smith-Bindman reiterated from previous presentations the goal of setting an upper radiation dose threshold as low as possible to support patient safety, but not so low that it compromises image quality (slide 19). Dose thresholds are adjusted for patient size (as larger patients require higher doses), and thresholds are specific to CT category. Different anatomic regions require different doses (e.g., brain vs. abdomen), and different clinical indications in the same region require different doses (e.g., lung cancer screening vs. characterizing lung cancer). Patient characteristics (age and sex) and CT machine make and model contribute very little to dose variation and are not factored into the measure.

Dr. Smith-Bindman summarized the logic for setting the upper dose threshold. It is the radiation dose at which >90% of physicians in the Image Quality Study rate image quality as acceptable (slide 20). Doses above this level do not contribute to greater image quality and therefore contribute to unnecessary cancer risk from excess radiation. For categories in which >90% of physicians were satisfied with quality at every observed dose, we set the threshold at the median of the category, derived from the UCSF CT International Dose Registry (hereafter referred to as “UCSF Registry”), as previously recommended by the TEP.
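As a rough illustration of this threshold logic, the sketch below derives a per-category upper dose threshold from physician-graded exams, falling back to the registry median when quality is acceptable at every observed dose. The binning scheme and function names are illustrative assumptions, not the project’s actual algorithm.

```python
import numpy as np

def upper_dose_threshold(doses, frac_acceptable, registry_median,
                         cutoff=0.90, n_bins=20):
    """Lowest dose at which >90% of physician ratings are acceptable.

    doses           -- size-adjusted doses for graded exams in one CT category
    frac_acceptable -- per-exam fraction of physicians rating quality acceptable
    registry_median -- fallback when quality is acceptable at every observed dose
    """
    order = np.argsort(doses)
    d = np.asarray(doses, dtype=float)[order]
    f = np.asarray(frac_acceptable, dtype=float)[order]
    chunks = np.array_split(np.arange(len(d)), n_bins)   # equal-count dose bins
    bin_dose = np.array([d[c].mean() for c in chunks])
    bin_acc = np.array([f[c].mean() for c in chunks])
    if (bin_acc > cutoff).all():        # acceptable at every observed dose:
        return registry_median          # use the category median instead
    above = np.flatnonzero(bin_acc > cutoff)
    return bin_dose[above[0]] if above.size else bin_dose[-1]
```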

Using this approach, the proportion of exams that are out-of-range will differ across categories (slide 21). For example, 5% of routine head CTs will be out-of-range, whereas 50% of routine dose abdomen CTs will be out-of-range. More cases will be rated out-of-range in the CT categories in which radiologists are satisfied with image quality at every dose, which suggests that in these categories, CT operators are probably using doses that are higher than needed. The routine dose head category has the fewest exams judged out-of-range, suggesting those doses are currently about right.

The project team has only analyzed out-of-range rates in the UCSF Registry thus far, and only at the facility level (not the physician or physician group level). In the UCSF Registry, there was a strong correlation at the facility level in the proportion of scans that are out-of-range across the different CT categories (slide 22). That is to say, facilities that have out-of-range values in one category are likely to be out-of-range in other categories as well. As a caveat, the UCSF Registry does not contain any facilities that focus on a single CT category, though this could certainly occur in actual practice. For example, a urology practice may only see high-dose abdomen CT scans. If a physician sees a limited distribution of patients, it may be easier or harder to have CT examinations within range, depending on the category.
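For instance, the facility-level correlation described above could be computed along these lines (a sketch with hypothetical column names and toy data, not the project’s analysis code):

```python
import pandas as pd

# Hypothetical per-exam data: facility, CT category, out-of-range flag.
scans = pd.DataFrame({
    "facility":     ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "ct_category":  ["head", "chest", "abdomen"] * 3,
    "out_of_range": [0, 1, 1, 0, 0, 0, 1, 1, 1],
})

# Out-of-range rate per facility and category, then cross-category correlation.
rates = (scans.groupby(["facility", "ct_category"])["out_of_range"]
              .mean()
              .unstack("ct_category"))
print(rates.corr(method="spearman"))  # high correlations mean facilities that are
                                      # out-of-range in one category tend to be
                                      # out-of-range in others
```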

Risk-Adjustment

A mid-scan measurement of diameter, measured on either the axial or scout image, is used to calculate patient size; the adjustment used for the size-adjusted dose measurement is log-linear (slides 23-25). Our method compares size-adjusted dose against the threshold for that category. Without size adjustment, the likelihood of a CT scan being out-of-range would be driven by patient size. If size is appropriately adjusted for, there will be a similar proportion of out-of-range values across the different size categories. Dr. Smith-Bindman showed an example of this based on about 1,000,000 routine abdomen scans from the UCSF Registry, stratified by size decile. Without adjustment, 21% of smaller patients had out-of-range doses versus 94% of the largest patients. With adjustment, approximately 50% of CT scans were out-of-range at every size.
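A minimal sketch of this log-linear size adjustment on synthetic data (illustrative only; the project’s actual model and coefficients are not shown here):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic routine-abdomen exams: mid-scan diameter (cm) and log radiation dose.
diameter = rng.normal(28, 4, 1000)
log_dose = 1.0 + 0.09 * diameter + rng.normal(0, 0.3, 1000)

# Log-linear fit of dose on patient size, then size-adjusted (residual) dose.
slope, intercept = np.polyfit(diameter, log_dose, 1)
size_adjusted = log_dose - (intercept + slope * diameter)

# Compare against a category threshold (here the median, for illustration).
threshold = np.quantile(size_adjusted, 0.5)
out_of_range = size_adjusted > threshold

# If the adjustment is adequate, the out-of-range rate is similar at every size.
small, large = diameter < 24, diameter > 32
print(out_of_range[small].mean(), out_of_range[large].mean())  # both near 0.5
```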

11:00 AM:      Discussion: Risk-Adjusting Upper Dose Thresholds                        Dr. Burstin

Dr. Burstin introduced the discussion questions:

  • For categories in which image quality is excellent or adequate at every dose level: 
    • Should we continue to use the median (50th percentile) as the upper dose threshold, or
    • Should we align out-of-range rates across categories by setting the dose threshold at the average out-of-range rate (e.g., 27%)?
  • Does the TEP endorse the risk-adjustment approach based on patient size?

Dr. Venkatesh asked why the measure focuses on categories where performance is already optimal (routine dose head has 5% out-of-range, for example) and suggested UCSF ought to target categories where the opportunity to reduce dose is greatest. To this, Dr. Smith-Bindman explained that there is opportunity to reduce dose in all 19 categories, and when our thresholds are applied to all categories combined, the average dose reduction is 38%. Dr. Venkatesh countered that including routine head in the denominator is a bit unfair; for example, a facility with a high proportion of head CTs would be at an advantage, and this affects the ability to compare facilities against one another. Dr. Burstin summarized the point: if there is no improvement gap in head CTs, why include them in the measure?

Dr. Smith-Bindman explained that she had presented the two extreme values for how many CTs are out-of-range across the 19 categories. The doses for routine dose head demonstrate the lowest number of out-of-range values (5%) and the smallest quality gap across the 19 categories; it was selected to demonstrate the broad range. The other head categories have far greater out-of-range values. The low dose head category has 39% of CT scans judged as out-of-range, while the high dose head category has 50% out-of-range. Thus, it is important to include head CT scans in the measure. She also noted that the average does not give a complete sense of the quality gap. Many facilities are extreme outliers in dosing for head CT, including the routine dose head category, so while on average you won’t make a big impact, for individual practices you could achieve a large reduction in this category.

Dr. Seibert asked to clarify whether UCSF is using water-equivalent diameter. Dr. Smith-Bindman confirmed UCSF is not, because it is missing most of the time in our data. UCSF instead calculates average diameter derived from patient circumference measured on the CT scans. Dr. Smith-Bindman pointed out that since all CTs are compared within a single body region, adjustment for water-equivalent diameter should give the same results as adjusting for diameter.

Next the panel addressed the first discussion question: whether to set the dose maximum at the median or at the average out-of-range rate from the other categories (27%). The essential question is: do we want a more aggressive or a more lenient approach to dose reduction in the categories where dose is currently adequate for image quality at all levels? Dr. Herrin wanted to know how representative the 27% mark was, and whether it would change if tested on a larger group of patients or hospitals. Dr. Smith-Bindman responded that the 27% is based on risk-adjusted analyses of a sample of about 8 million records from the UCSF Registry, which is likely very similar to national averages of patient size and dose.

Dr. Venkatesh explained what he viewed as a tension between implementation and scientific acceptability. The 50th percentile approach would be easier to explain, easier for clinicians to understand, and potentially lead to greater improvements. On the other hand, he argued a more lenient threshold based on the 27% rate would be more acceptable to clinicians, many of whom may feel penalized for making dose decisions appropriate to the clinical indication. Dr. Wang later seconded this opinion. Dr. Venkatesh explained, in light of this tension, the historical precedent has been to choose the more modest approach that is more likely to have higher clinical face validity. He further noted that UCSF must consider, plan, and communicate how the measure benchmarks will be updated over time, referencing how this has been challenging with measure OP-9 (mammography follow-up rates) in the outpatient reporting program. Will UCSF calculate benchmarks every year? If benchmarks are based on the UCSF Registry, is that widely accepted within the radiology community? Are there resources to maintain the UCSF Registry for this purpose in the long-term, as this is not something CMS will take on?

Dr. Chatfield seconded the point of having transparency in our benchmarks, for example, publishing them each year so that clinicians know what to aim for. She offered to calculate the measure using ACR registry data to lend support and credibility to our proposed benchmarks.

Dr. Romano favored the median approach but stressed the need to reassess quality at some designated point in the future, as the median would likely move lower over time (“moving the goalposts”). NQF will require UCSF be upfront about its long-term plans to assess and maintain the thresholds. Dr. Smith-Bindman acknowledged the need to re-assess image quality over time, particularly as machines get better. Dr. Burstin agreed this could be dealt with as a three- or five-year update.

Dr. Smith-Bindman clarified that the benchmarks would be created and set, and not modified each year based on that same year's data. In other words, the thresholds would not be a moving target.

Dr. Burstin steered the discussion to the second question: does the TEP agree with the risk-adjustment approach using patient size for adjustment? Dr. Hricak strongly backed the proposed size-adjustment approach and was particularly supportive of an approach that protects against over-dosing cachectic (smallest) patients. Dr. Chatfield agreed the size-adjusted approach makes sense, but emphasized the need for transparency in how we are using size: e.g. how is size defined; how is it calculated; how does size vary; how will size be applied for adjustment?

Dr. Villines asked about inclusion of age in risk-adjustment. Dr. Smith-Bindman explained that while age does have a small association with dose, there is no reason that it should. Dr. Romano noted that doses should not be based on age, so we have not accounted for it in the risk-adjustment model; the same goes for patient’s sex. Dr. Nallamshetty suggested that age does matter to a mild degree – for example, in some preoperative planning for TAVR, you’d use a higher dose in an 80+ year old individual to get better visualization of smaller structures, though he acknowledged this is a small population of patients. Dr. Villines added that clinicians are more sensitive about dose and aggressive about dose reduction in younger people than older, and suggested perhaps dose thresholds should be lower in younger patients. Dr. Smith-Bindman responded that our CT-categories are designed to accommodate these dosing needs; for example, TAVR imaging falls into the high dose cardiac category. The adjustment is intended to equalize for factors that are important predictors of dose, not to adjust for situations where physicians may be more sensitive for the need to optimize dose. Ideally dose would be optimized for all patients.

On the basis of this feedback from the TEP, for CT categories where image quality was rated as acceptable (excellent or adequate) by >90% of physicians at every observed dose, UCSF plans to set the dose maximum at the average out-of-range rate from the other categories. While this threshold is more difficult to explain, it is more lenient and would be more acceptable to clinicians. UCSF will also use the size-adjustment approach as described during the meeting.

11:15 AM:      Break                                                                                                                         

11:25 AM:      Approach to Assessing Image Quality                                Dr. Smith-Bindman

Dr. Smith-Bindman re-introduced the rationale for including image quality in the measure: the purpose is not to maximize image quality, but to protect against untoward effects of lowering radiation dose. It is included as a balancing measure (slide 29).

Dr. Smith-Bindman again summarized the Image Quality Study that produced 25,000 physician-graded CT studies, which were used to establish upper dose and minimum image quality thresholds (slide 30). Our automated approach to identifying exams of inadequate quality sets the threshold at the level where 25% or more physicians rate the exam as poor or marginally acceptable (slide 31). However, these poor or marginally acceptable cases are rare; in fact we observed no CT category where 50% of physicians rated images as unacceptable. Thus, while we would like to set a higher threshold (i.e. greater than 25% of physicians rating images as unacceptable), this would result in a very small number of studies failing on image quality. 

Dr. Smith-Bindman then presented the categories where the project team found images of unacceptable quality using this rule of >25% of physicians rating exams as poor or marginally acceptable: low dose head; low dose chest; routine dose chest; low dose abdomen; and spine (slide 32). Collectively these account for approximately 40% of all CTs performed. For all head, chest, and abdomen categories where the project team did not observe unacceptable images, the plan is to apply the minimum quality threshold from the corresponding low dose categories. For example, the established minimum quality requirement for low dose abdomen CT would also be used as the minimum quality requirement for routine and high dose abdomen CT.

Dr. Smith-Bindman described how, in developing the automated approach, the project team tested four candidate measurements of image quality: noise, noise texture, resolution, and contrast (slide 33). The project team found noise alone was as good as all four measures combined in predicting image quality, and simpler to describe. Thus, the project team is pursuing an approach that relies on noise, derived from image pixel data, to identify images as inadequate based on the 25% satisfaction threshold. Area under the curve (AUC) analyses for using noise to predict unacceptability ranged from 0.90 to 0.95. The AUC summarizes the receiver operating characteristic (ROC) curve, which plots sensitivity against false positive rate; the higher the AUC, the better the test’s ability to discriminate cases from non-cases. AUC values in the range of 0.8-0.9 are considered excellent and above 0.9 are considered outstanding.

Our approach using a false positive rate of 5% (meaning, 5% of truly adequate cases are flagged as unacceptable) gives a sensitivity ranging from 28-38% across the anatomical areas.
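A hedged sketch of how such an operating point might be chosen, on simulated data (the noise values and labels below are synthetic, not Image Quality Study data):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(1)

# Simulated exams: a noise metric per exam, and a label marking "unacceptable"
# (i.e., >25% of physicians rated the exam poor or marginally acceptable).
noise = np.r_[rng.normal(10, 2, 950), rng.normal(16, 3, 50)]
unacceptable = np.r_[np.zeros(950), np.ones(50)]

print(roc_auc_score(unacceptable, noise))   # discrimination of noise alone

# Operating point: the largest noise cutoff keeping false positive rate <= 5%.
fpr, tpr, thresholds = roc_curve(unacceptable, noise)
i = np.searchsorted(fpr, 0.05, side="right") - 1
print(thresholds[i], tpr[i])                # noise cutoff and its sensitivity
```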

Though the image quality assessment has not yet been tested in our beta testing sites, Dr. Smith-Bindman shared results of testing on a sample of 3,759 cases from UCSF Health System (slide 35). The project team was able to successfully run the code on 99% of images, and about 6% were judged out-of-range.

11:45 AM:      Discussion: Approach to Assessing Image Quality                           Dr. Burstin

Dr. Burstin introduced the discussion questions:

  • How important is it to retain image quality as a component of the radiology measure?
  • Is an image quality threshold based on >25% of radiologists rating images as poor or marginally acceptable a sensible cut, considering out-of-range cases are rare with a higher threshold?
  • Are you satisfied with using noise as the basis for judging image quality?
  • Is a 5% false positive rate acceptable?

Missy Danforth shared her opinion that it’s important to retain some assessment of image quality as a balancing measure. She said the pediatric CT dose measures on the Leapfrog surveys take a similar approach to scoring and setting performance thresholds. She feels the lack of image quality assessment has been a shortcoming of Leapfrog’s work and, if available, would lend validity to dose scores. Leapfrog has worked with ACR to come up with a proxy measure of quality, but having it integrated into the UCSF candidate measure would make it stronger and more successful in NQF review. Dr. Seibert seconded this opinion, noting the importance of catching scans of unacceptable quality, which pose a greater risk to patient safety than the radiation dose itself. Dr. Nallamshetty was in favor of keeping the image quality assessment, noting many radiologists work out of multiple locations where they do not have direct control over radiation decisions. He noted radiologists tend to adapt to reading various levels of quality. He worries that, without a dose floor, we’ll incentivize practitioners to focus on lowering doses and radiologists will make do with the scans, potentially risking diagnostic accuracy. Dr. Hricak stressed the importance of including image quality in the measure in light of potential malpractice suits.

Dr. Smith-Bindman seconded this idea that radiologists adjust to image quality, citing evidence from the Image Quality Study, which found radiologists scored only 11% of images unacceptable. She noted further that we considered and tested using low radiation dose as a proxy for noise in assessing image quality, but that using the measurement of noise was twice as accurate as dose. She explained we are using a noise measurement developed by a recognized expert, Dr. Ehsan Samei at Duke University, who is allowing the measurement to be used free-of-charge for the quality measure.

Dr. Chatfield and Dr. Herrin voiced the opposite preference, for eliminating the image quality assessment. They argued that since image quality was factored into the dose thresholds, it is already inherently built into the model. Their larger point, though, concerned the complexity, practicality, and validity of folding image quality assessment into the paired measure. Dr. Chatfield cautioned that there might not be consensus around Dr. Samei’s noise calculation being used as the standard, as others might want their own measures of quality included.

On the question of 5% false positive rate, Dr. Hricak agreed this was acceptable. Dr. Seibert agreed.

On the question about setting the image quality threshold at >25% of physicians dissatisfied, Dr. Romano proposed a different line of thought: if we think radiologists are too permissive with quality, perhaps we should set a lower image quality threshold (10-20% dissatisfied as opposed to 25%). Dr. Smith-Bindman emphasized the difficulty of finding the right balance: 50% seems an attractive threshold (as it would potentially identify the worst quality CT cases), but there are almost no unacceptable cases at this level. Going lower than 25% (e.g., 10%) would mean failing exams that the vast majority of physicians (90%) deemed acceptable. Dr. Romano also raised the idea of treating noise as a continuous measure, such that too much noise is considered to reflect poor quality.

Dr. Wang asked Dr. Smith-Bindman to elaborate on how image quality assessment can be incorporated into the eCQM. She said the plan is not final, but described a plan to build a free, external website or software tool to calculate image quality, with the eCQM built to receive that calculation as an input. She cited other CMS programs, such as the use of decision support under PAMA, that require use of an external website and rely on inputs from outside the eCQM. Additionally, there are examples of eCQMs that use calculations too complicated to include inside the eCQM itself. Ongoing work on how to do this will be shared at a future TEP meeting.

Dr. Burstin noted the complexity of this approach and floated the idea of pursuing image quality as a separate measure, rather than a paired measure. She proposed the idea of replacing the paired image quality assessment with an audit function, for example, a 5% national audit to understand whether image quality is dropping. Dr. Smith-Bindman explained that CMS was insistent that image quality is a very important component of the measure.

In sum, the panel was strongly aligned on measuring image quality to protect against lowering doses too much. Some panelists expressed concern over the complexity of including it in the eCQM and were satisfied knowing that it was factored into the dose thresholds, while other panelists felt it was essential to include in the paired model.

On the basis of this feedback from the TEP, UCSF plans to include a measure of image quality in the final measure specification but will continue to explore additional ways to identify CT examinations that are out-of-range on quality. The different approaches will be explored in beta testing and presented at the next TEP meeting.

12:05 PM:      Beta Testing Results                                                             Dr. Smith-Bindman

Dr. Smith-Bindman summarized the aims of beta testing (slide 38). In beta 1, the project team is assessing whether it can: assemble the required data from testing sites; apply inclusion/exclusion criteria; determine CT category based on procedure and diagnostic codes; calculate size-adjusted dose; and determine out-of-range rates at the physician and physician group levels. In beta 2, the project team will add image quality assessment to the testing. Dr. Smith-Bindman gave an overview of how UCSF software is installed on local servers, collects data from a one-month period (around 5,000 CT studies), and exports de-identified data to UCSF for analysis (slide 39). Data are derived from three sources: PACS (RDSR, image pixel data, additional variables on why the CT was performed, and linkage variables); EHR (diagnostic codes); and billing claims (procedure codes). The six testing sites use diverse EHR and PACS systems; their data are in various stages of completeness, and what was presented today is a preliminary analysis of beta 1 data from five of the six sites. Dr. Smith-Bindman explained that the plan to build, test, and modify the software will proceed in an iterative fashion with successive testing rounds.

Dr. Smith-Bindman shared the following preliminary results:

One of the major lessons learned is the unavailability of RDSR data, despite federal law requiring CT manufacturers to generate this data, which UCSF uses to calculate dose (slides 43-44). The project team has found it is generated but not saved for most CT scans; 83% to 95% of CT scans were missing the RDSR across our testing sites. The UCSF team has workarounds in place to get RDSRs for the testing sites, but this is a concern affecting future implementation.
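For context, the sketch below shows one way a total dose-length product could be pulled from an RDSR with pydicom; the concept-name matching is simplified relative to a production RDSR parser, and the file path is hypothetical.

```python
import pydicom

def find_dlp_total(dataset):
    """Walk an RDSR content tree and return the total DLP, if present."""
    for item in dataset.get("ContentSequence", []):
        name = ""
        if "ConceptNameCodeSequence" in item:
            name = item.ConceptNameCodeSequence[0].CodeMeaning
        if "Dose Length Product" in name and "MeasuredValueSequence" in item:
            return float(item.MeasuredValueSequence[0].NumericValue)
        nested = find_dlp_total(item)       # recurse into nested containers
        if nested is not None:
            return nested
    return None

ds = pydicom.dcmread("rdsr.dcm")            # hypothetical RDSR file path
print(find_dlp_total(ds))                   # DLP in mGy*cm, or None if absent
```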

Likewise, the project team found a quarter of the sites’ claims data were missing CPT codes (slides 45-46). The project team is working with sites to understand why they are missing. The following analyses and results did not include scans with missing CPT codes.

The project team was able to identify and exclude CT procedures that met exclusion criteria, for example: scans in children; scans in conjunction with radiation oncology; scans for biopsies; etc. (slide 47). The project team identified 3% of scans across the sites meeting the exclusion criteria. In the next round of testing, the project team will validate its success in identifying all such scans. 

To study assignment of CT scans to CT category based on billing data, the project team compared the distribution of CTs within CT categories obtained from the beta testing sites to what has been observed in the UCSF Registry, expecting the distributions to be similar (slides 48-52). This is indeed what was found: CTs categorized based only on body region (e.g., extremity, spine) were characterized accurately, and their distribution at the beta testing sites matched that of the Registry. Head, chest, and abdomen accounted for the bulk of scans at the testing sites (84% collectively) and in the Registry (84% collectively). The project team also expected routine-dose head, chest, abdomen, and cardiac scans to be more common than high- or low-dose, and this was true both at our testing sites and in the Registry. Finally, the project team measured sensitivity of the CT category assignments using a composite referent standard (based on gold standard manual chart review) that the project team has previously shown to be 91% accurate in UCSF Registry data. The goal was to maximize accuracy for high- and low-dose categories to avoid penalizing facilities for appropriately high radiation doses and low image quality. Comparisons of the sensitivity of beta 1 data for head, chest, and abdomen high- and low-dose categories (ranging from 70-98%) with the UCSF Registry (ranging from 79-100%) were satisfactory but will be reassessed when full beta 1 data are compiled.
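As an illustration of how category assignment and its sensitivity check might look in code (the code-to-category mappings and data formats are illustrative assumptions, not the project’s actual specification):

```python
# Illustrative mappings only; the real measure uses full CPT/ICD value sets.
CPT_TO_REGION = {"70450": "head", "71250": "chest", "74150": "abdomen"}
LOW_DOSE_CHEST_DX = {"Z87.891"}   # e.g., a lung cancer screening indication

def assign_category(cpt: str, dx_codes: set) -> str:
    """Assign a CT category from a procedure code plus diagnostic codes."""
    region = CPT_TO_REGION.get(cpt, "other")
    if region == "chest" and dx_codes & LOW_DOSE_CHEST_DX:
        return "low dose chest"
    return f"routine dose {region}"

def sensitivity(assigned, referent, category):
    """Share of referent-standard scans in `category` assigned to it."""
    hits = [a == category for a, r in zip(assigned, referent) if r == category]
    return sum(hits) / len(hits) if hits else float("nan")

assigned = [assign_category("71250", {"Z87.891"}), assign_category("70450", set())]
print(assigned)
print(sensitivity(assigned, ["low dose chest", "routine dose head"],
                  "low dose chest"))
```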

Dr. Smith-Bindman reviewed next steps in beta testing: analyzing size-adjusted dose and out-of-range rates; measuring image quality; and conducting analyses at the physician and physician group levels.

12:35PM:       Discussion: Beta Testing Results                                                         Dr. Burstin

Dr. Burstin introduced the discussion questions:

  • Do you have any recommendations as UCSF explores missing data?
  • What analyses of beta testing data would improve your confidence in the measure?

Dr. Nallamshetty asked how difficult it will be to obtain RDSRs. Dr. Smith-Bindman explained that in general it should not be difficult. Most of the testing sites did not know they were not saving RDSRs, confusing the RDSR with the dose sheet, which was saved. The technical solution is usually a simple matter of reprogramming the machines to save the RDSR in the PACS. If sites utilize a DICOM router, it is very easy. This could be more difficult for some sites if they need to make the change on each CT scanner or involve the machine vendors. Dr. Seibert shared, for example, that at UC Davis they learned GE would have to come on site to set this up on their older machines. He cautioned that vendors may charge for this service. He agreed that the RDSR should be saved by default.

Dr. Chatfield said the ACR has grappled with this problem in their own dose index registry, and they have been working with facilities to transition to saving them. She asked when UCSF would be ready to share our measure specifications so that they can try running analyses in their registry. Dr. Smith-Bindman responded that the measure is still under development and would likely not be finalized until the MUC List submission in May.

Dr. Nallamshetty asked how we’re handling mis-categorization of CT studies and how we avoid penalizing a clinician based on a misclassified scan. Dr. Smith-Bindman proposed this would likely be a manual process: ideally, physicians or physician groups would be able to see their scored data and adjudicate out-of-range scans. UCSF needs to learn how to make this possible in an eCQM framework.

Niall Brennan inquired about the scalability and replicability of our measure, given the limited number of testing sites. Dr. Smith-Bindman responded that most of the testing sites are large health systems. Mt. Sinai and Henry Ford are each large systems comprised of multiple hospitals, outpatient facilities, and practice groups, including some with different EHRs. The UCs are likewise complex health systems. Despite struggles, beta testing is proceeding successfully. eCQM testing will look different, and while it is hard to predict implementation while the approach is still under development, UCSF expects scalability won’t be a major issue because the eCQM has stricter requirements than the previous model. Dr. Burstin added that a core feature of eCQMs is the availability of structured data, and encouraged UCSF to work with sites as well as with ACR to ensure RDSRs are saved. She encouraged UCSF to pilot the measure as much as possible and suggested UCSF explore a regulatory solution to capturing the RDSRs.

Dr. Wang asked Dr. Smith-Bindman to describe how we are treating combined categories, for example, head, neck, chest. She responded that UCSF has created three combination categories: chest, abdomen, pelvis; head and neck; and thoracic and lumbar spine. These combinations are relatively common, while other “oddball” combinations that are extremely infrequent are excluded from evaluation. Because they are quite nuanced and not very common, thresholds for dose and image quality are generous in the combination categories. She explained, moving forward, UCSF will explore combination scans more extensively in our testing data and present more information at the next TEP meeting. 

On the basis of this feedback from the TEP, UCSF plans to explore the ease with which health systems can store the RDSR, as well as any factors that might hinder its storage. Further, UCSF will reach out to the FDA to understand the requirements for hospitals to generate and store the RDSR.

12:55PM:       Wrap Up and Next Steps                                                     Dr. Smith-Bindman

Dr. Smith-Bindman thanked the panel for “kicking the tires” and sharing invaluable insights. The major next step for UCSF is to begin developing the eCQM software. She mentioned she would invite the panelists to participate in a TEP for the hospital-based measure when that is cleared by CMS.

Dr. Burstin wrapped up by encouraging the UCSF team to test the measure as much as possible and to get early input from stakeholders and through public commenting.

1:00PM:         Adjourn                                                                                                 Dr. Burstin