Risk Based Monitoring: The Role of Statistical Programming

By Carey Smoak, DataCeutics, Inc. & Parag Shiralkar, Eliassen Group

Two senior-level statistical programmers provide their perspectives on a programmer’s role in Risk-Based Monitoring (RBM) of clinical data.

Carey Smoak recently moderated a panel discussion on RBM. The panel mainly comprised clinical people and therefore focused on the processes of risk-based monitoring. What follows is Carey’s perspective on the tasks in which a programmer is involved in the process.

One interesting point from the panel discussion was that technology and tools are not necessary to implement RBM at a company. It truly is possible to perform RBM through processes and policies alone.

RBM usually involves cross-functional input from Project Management, Clinical Operations, Safety (Pharmacovigilance), Data Management, Biostatistics, and Statistical Programming. From a Statistical Programming perspective (Carey’s profession), one piece of RBM is a monitoring plan for the data collected in a clinical trial that defines what is critical to monitor for the statistical analysis; not all data are equally important. Such a plan helps ensure that the statistical programmer’s work is of high quality with respect to the data to be analyzed.

Statistical programming certainly can be an integral part of RBM. Programming monitoring reports is only one piece of what statistical programmers can do to help in the RBM process. For example, Carey helped set up a real-time data monitoring system; more information is available at http://www.lexjansen.com/wuss/2014/110_Final_Paper_PDF.pdf.

Briefly, the real-time data monitoring system extracted data nightly (at midnight) from two sources:

  • Blood screening lab instruments (screening blood samples for HIV, Hepatitis B and Hepatitis C)
  • Clinical e-CRF database

The extracted data was processed automatically on the SAS server to create SAS datasets. In the morning, SAS programmers did the final data processing to create the SAS datasets used by clinical staff (e.g., CRAs) to run monitoring reports.
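
The original system was built in SAS; as a language-neutral illustration, the nightly consolidation step might look like the following Python sketch. The function name, file layout, and column names are hypothetical, and it assumes, for simplicity, that both extracts share the same columns:

```python
import csv
from datetime import date

def merge_nightly_extracts(lab_file, ecrf_file, out_file):
    """Combine the two nightly extracts (lab instruments + e-CRF),
    tagging each record with its source and the extract date so the
    morning processing step knows where each row came from."""
    extract_date = date.today().isoformat()
    rows = []
    for source, path in (("LAB", lab_file), ("ECRF", ecrf_file)):
        with open(path, newline="") as fh:
            for record in csv.DictReader(fh):
                record["SOURCE"] = source
                record["EXTRACT_DT"] = extract_date
                rows.append(record)
    if rows:
        with open(out_file, "w", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
    return len(rows)  # number of records consolidated
```

In the real system this step ran unattended on the SAS server at midnight; the sketch shows only the consolidation logic, not the scheduling.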

The other part of this real-time data monitoring process is to set up a multi-tier server environment (http://www.lexjansen.com/pharmasug/2014/AD/PharmaSUG-2014-AD08.pdf). Briefly, a multi-tier server platform means two servers are involved:

  • A physical SAS server to process data and create reports using SAS.
  • A virtual metadata server to set up users, roles, and permissions.

On the metadata server, the SAS programmer role gets full permissions (read-write access) to the SAS server, and the non-SAS programmer role (e.g., clinical staff, CRAs) gets limited permissions (read-only access) to parts of the SAS server.
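
The read-write vs. read-only split can be summarized with a small sketch. The role names, permission table, and `check_access` function below are hypothetical Python illustrations of the idea, not the actual SAS metadata configuration:

```python
# Hypothetical role definitions mirroring the metadata-server setup:
# programmers get read-write access; clinical users get read-only access.
ROLE_PERMISSIONS = {
    "sas_programmer": {"read", "write"},
    "clinical_user": {"read"},  # e.g., CRAs running monitoring reports
}

def check_access(role, action):
    """Return True if the given role is allowed to perform the action;
    unknown roles get no access at all."""
    return action in ROLE_PERMISSIONS.get(role, set())
```

In the actual environment, these roles and permissions are administered on the metadata server itself rather than in application code.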

The key product here is the SAS Add-In for Microsoft Office (AMO). The SAS AMO adds a SAS button to the ribbon of MS Office products (Excel, Word, Outlook, etc.). SAS programmers write monitoring reports in SAS, and a server administrator then makes each monitoring report available to the CRAs.

Therefore, in Excel, a CRA can click the SAS button, see the monitoring reports available to them, and run those reports in Excel. The CRA can also open SAS datasets (read-only) in Excel and, once the data is in Excel, use Excel’s features to monitor the data.

A programmer’s responsibilities for this real-time data monitoring system include:

  • Contacting SAS about licensing of the SAS AMO;
  • Overseeing the validation of the multi-tier server platform;
  • Overseeing the system that performed the nightly extracts and processed the data (this part of the project can be global, involving personnel in all parts of the world);
  • Overseeing the SAS programmers who programmed the monitoring reports;
  • Attending SAS administration classes; and
  • Hiring a SAS administrator.

This type of project goes beyond what SAS programmers typically do in terms of programming monitoring reports, but more and more sponsor companies are looking to these types of solutions to help with data monitoring.

From Parag Shiralkar’s perspective, statistical programmers play an important role in developing representations of analyzed data. Such representations and analyses are used for statistical interpretation and for major decisions about the continuity and outcomes of a clinical trial.

As the pharmaceutical industry has experienced more and more standardization of clinical data and data reporting processes, statistical programmers have begun assessing data quality more thoroughly before analyzing the data.

Before the data gets into the database, the process of ‘source data verification’ (SDV) at sites is specifically designed to monitor data quality and eliminate any risks that may appear at that stage. According to a survey conducted by TransCelerate BioPharma member companies, 78% of on-site data monitoring activity is attributed to source data verification. The same research also pointed out that only 2.4% of the queries initiated by the data management function result in the identification of critical data issues.

As regulatory bodies and sponsors focus more and more on ensuring patient safety by minimizing data quality risks, risk-based monitoring truly becomes a cross-functional responsibility of all business functions involved in data operations.

From his literature review, research, and experience, Parag has found that data trends and observations such as adverse event rates, dose allocation changes, inclusion/exclusion deviations, and therapy-specific assessment scores are among the key risk indicators that statistical programmers can easily keep an eye on.
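
As an illustration, one of these key risk indicators (site-level adverse event rates) could be screened with a simple z-score check like the Python sketch below. The function name, input format, and threshold are assumptions for illustration, not a prescribed method:

```python
from statistics import mean, stdev

def flag_outlier_sites(ae_counts, subject_counts, z_threshold=2.0):
    """Flag sites whose AEs-per-subject rate deviates from the
    across-site average by more than z_threshold standard deviations.
    A very high rate may signal a safety issue; a very low rate may
    signal under-reporting."""
    rates = {site: ae_counts[site] / subject_counts[site]
             for site in ae_counts}
    values = list(rates.values())
    if len(values) < 2:
        return []  # not enough sites to compare
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []  # all sites identical; nothing to flag
    return sorted(site for site, rate in rates.items()
                  if abs(rate - mu) / sigma > z_threshold)
```

A flagged site is a prompt for investigation, not a conclusion; the threshold and statistic would be chosen with the study biostatistician.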

Statistical programmers must do a thorough review of the case report forms (CRFs) and the clinical protocol before they dive into the data and the statistical analysis plan. In RBM terminology, protocol compliance, trend analysis, CRF adherence, and data visualization are regarded as ‘source data review.’ Seen this way, statistical programmers are indirectly involved in these processes before conducting their actual programming tasks.

As Carey outlined earlier, RBM efforts do not necessarily require the implementation of tools or systems. Sponsors must have a vision for orienting their data operations processes to assess data anomalies and risks in a timely manner. This requires training and mentoring the staff of each function to identify risks and issues pertaining to data quality, to escalate such issues through appropriate channels, and to address them in a timely manner.

While functions such as site monitoring, EDC, biostatistics, and data management are heavily involved in RBM efforts, statistical programmers can take on a major share of the responsibilities through supportive remediation. This can be realized through their responsibilities pertaining to:

  • Data analysis and modeling;
  • Report generation process;
  • Statistical surveillance and programming for trend analysis; and
  • Edit check programming.
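
The last item, edit check programming, amounts to encoding data quality rules as executable checks. A minimal Python sketch follows; the record layout, variable names (loosely CDISC-style, e.g., AESTDT/AEENDT), and query messages are all hypothetical:

```python
def run_edit_checks(records):
    """Apply two simple edit checks to adverse-event records and
    return (subject, message) tuples for data management queries."""
    queries = []
    for rec in records:
        subj = rec.get("SUBJID", "?")
        # Check 1: required field is present and non-empty
        if not rec.get("AETERM"):
            queries.append((subj, "Adverse event term is missing"))
        # Check 2: chronology (ISO-8601 date strings compare correctly)
        start, end = rec.get("AESTDT"), rec.get("AEENDT")
        if start and end and end < start:
            queries.append((subj, "AE end date precedes start date"))
    return queries
```

Real edit check specifications are defined by data management; the value the programmer adds is turning each specification into a check that runs automatically on every data transfer.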

From Parag’s perspective of supporting larger programming operations, the critical aspect of supporting RBM efforts is to train and orient staff to identify and report data anomalies and risks through their routine programming operations. This approach requires programmers to develop the ability to scrutinize the data by understanding how the data structure and trends comply with what is ‘expected’ per the CRF and protocol. One such case is described here:

A statistical programmer found that fewer patients were reported in table production than had been randomized and treated. Before escalating to the cross-functional teams, the programmer performed the following key data assessments to investigate the issue:

  • A CRF data review confirmed that fewer patients were reported than had been randomized and treated.
  • A closer look at the CRF design showed that the drug was dispensed on visit 1, day 1, while the findings were gathered on visit 1, day 2.
  • The statistical programmer escalated the matter to the cross-functional teams, which triggered further investigation.
  • After further assessment of the AE log and other data, it was found that some patients had been discontinued after day 1 but had received the dispensed drug; those patients did not report on-site on day 2.
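
The first comparison in this investigation (randomized-and-treated vs. reported) is straightforward to automate. The following is a hypothetical Python sketch of that reconciliation step; the function name and input format are assumptions:

```python
def reconcile_populations(randomized, reported):
    """Compare the randomized-and-treated subject list against the
    subjects appearing in the reporting datasets.  The 'missing'
    subjects correspond to the discrepancy in the example above."""
    randomized, reported = set(randomized), set(reported)
    return {
        "missing": sorted(randomized - reported),     # randomized, not reported
        "unexpected": sorted(reported - randomized),  # reported, never randomized
    }
```

Running such a reconciliation on every data transfer surfaces this class of discrepancy early, before table production.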

This example makes it evident that statistical programmers can monitor risks pertaining to data quality through their assigned duties. It is crucial that sponsors and CROs make the necessary effort to train and orient programming staff to contribute to RBM.