The Importance of Collecting Real-World Patient Data for Trial Design, Patient Selection and Site Recruitment

According to Novotech CRO, Real-World Data (RWD) is data collected from a number of sources that relates to outcomes across a diverse range of participants in real-world settings. Analysis of RWD generates insights that can improve interventional pathways, as well as the clinical and economic impact on participants and the healthcare system.

Understanding the participant perspective is crucial to delivering the best patient-centered care. In the research industry, social media has become an increasingly popular way to capture the patient voice. Publicly available social media content can be accessed and analysed with relative ease, which avoids many of the logistical issues associated with traditional approaches to collecting data and allows for a more streamlined data collection process.

Experience suggests that Real-World Data collected from social media platforms can offer credible insight into a participant’s experience during a clinical trial. These platforms also give patients and their carers opportunities to create and exchange health-related information. Participants use social media to gain knowledge, to get support, to exchange advice and to improve self-care and physician-patient communication. The result is a rich, but analytically challenging, source of Real-World Data. Over the last several years, advanced analytics have made it easier to organise this data for medical research purposes. Natural language processing and machine learning can organise the data extracted from social media, and machine learning algorithms have been developed that can automatically identify features of posted content such as adverse events (AEs).

Trial Design 

In the past, data collected from randomised clinical trials has been considered the “gold standard for drug development”. Over the last decade, however, regulators have begun to recognise that Real-World Data can also be a valuable way to inform drug approval and label expansion decisions. RWD can be either retrospective (dealing with past events and situations) or prospective (dealing with future events and occurrences). When a clinical trial includes prospective randomisation, it is described as having a ‘pragmatic trial design’. Pragmatic clinical trials evaluate the effectiveness of interventions in real-life practice conditions and produce results that can be applied in routine practice settings.

In the early clinical trial phase, Real-World Data can be used to identify participants with unmet clinical needs and a greater chance of benefitting from a new therapy. This data may help refine the trial’s inclusion/exclusion criteria to improve the capture of target patients. RWD can also help identify the best clinical trial sites and enable more streamlined recruitment and better retention. Together, these measures can shorten the trial while increasing its statistical power.

Excessive data collection has been blamed for causing clinical trials to run late and over budget. Real-World Data can help reduce these problems by giving trial designers a clearer idea of which variables are most often used clinically, which are informative and which are redundant.

Newer trial designs used in the clinical setting, such as adaptive platform trials, allow for dynamic evaluation of multiple interventions and can be a valuable source of Real-World Data, with the potential to increase the trial’s efficiency and reduce costs.

Large pragmatic trials are increasingly becoming a Real-World Data source. They’re designed to show the real-world effectiveness of an intervention in a large participant group. 

  • They incorporate a prospective, randomised design and collect data from a wide range of health outcomes in a diverse participant population. 
  • Pragmatic clinical trials are run in routine practice settings, have a participant population that is relevant for the intervention and a control group who are treated with an acceptable standard of care, or a placebo. Outcomes that are meaningful to the participant population in question should be discussed. 
  • They may focus on a specific type of participant or treatment. Study coordinators may select participants, Sponsors, Investigators and clinical trial sites that will increase validity. 
  • Pragmatic trials are able to provide data on a range of clinically relevant real-world considerations, including various treatments, participant-physician treatment algorithms and cost-effectiveness, which may even resolve policy-relevant issues. The trial focuses on the outcomes that are most important and takes into account how real-world treatment adherence and compliance affect the direct impact of a medication or treatment.

Well-designed interventional pragmatic trials have the potential to overcome many of the limitations of observational and retrospective real-world studies. However, there are still some challenges that may arise with a pragmatic trial design:

  • Few restrictions may be placed on participants and clinicians, which could result in inconsistent or missing data.
  • In demonstrating how effective an intervention is in real-world practice, some clinical trials may trade aspects of internal validity for higher external validity.

Patient Recruitment 

Real-World Data is increasingly used to reach out personally to potential clinical trial participants, which improves the efficiency and effectiveness of the recruitment process. Real-World Data is collected from multiple sources, including social media, patient health records, claims data, patient registries, and face-to-face interaction between potential participants and clinical trial staff.

Using Real-World Data to apply eligibility criteria means that the data needs to be evaluated against the needs of the clinical trial. Factors to consider include whether the eligibility criteria can be identified in the RWD and how the generalizability of the available data (how useful the results are for a large group of people) can affect the clinical trial.

Recruitment strategies based on Real-World Data (for example, email campaigns and the use of electronic health records) can increase the effectiveness of recruitment. Combining RWD recruitment with traditional recruitment measures is often considered best practice.

Some things to consider when using Real-World Data for patient selection include: 

  • Being able to contact potential participants while keeping their information confidential and giving them privacy. 
  • Assessing which eligibility criteria are necessary and feasible.
  • Expecting a certain degree of error (patient information isn’t always going to be a perfect match to the inclusion and exclusion criteria). 
  • Having access to thousands of potential participants means that the Sponsor needs to work with stakeholders, institutions and Institutional Review Boards (IRBs) to plan what approaches are most effective, while still respecting the privacy of the potential patients. 
  • Participants, stakeholders, institutions and IRBs should work together to set appropriate communication channels (how much communication is too much or not enough?).
  • Referring to previous trials that implemented similar recruitment approaches is encouraged. 
  • Clinical trial Sponsors should start gathering information from RWD before Phase I trials so they can get a good idea of the patient populations and understand unmet needs. This can facilitate data-driven decisions without delaying start-up timelines. 
  • A cross-functional team (clinical, operations, biostatistics, data science and informatics) will ensure that the information collected from RWD will be used to plan effective recruitment strategies. 
  • Sponsors should use RWD for trials which may face recruitment challenges (small target population, short timelines, etc.).
  • Low or slow recruitment caused by unrealistic eligibility criteria can still be a problem, even when Real-World Data is used.
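
As a rough illustration of two of the points above (screening records against eligibility criteria, and expecting imperfect matches), the following Python sketch sorts pseudonymised records into "likely", "review" and "excluded". Everything in it is a hypothetical assumption made for illustration: the record fields, the diagnosis code, the age range and the lab threshold do not come from any real protocol or data source.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class PatientRecord:
    """A pseudonymised RWD record (hypothetical fields, no direct identifiers)."""
    patient_id: str
    age: Optional[int]            # may be missing in real-world sources
    diagnosis_codes: FrozenSet[str]
    egfr: Optional[float]         # example lab value; may be missing


def screen(record: PatientRecord,
           target_code: str = "E11",   # illustrative diagnosis code
           min_age: int = 18,
           max_age: int = 75,
           min_egfr: float = 45.0) -> str:
    """Classify a record as 'likely', 'review' or 'excluded'.

    Missing values do not exclude a patient outright; they send the
    record to manual review, reflecting the expected degree of error.
    """
    if target_code not in record.diagnosis_codes:
        return "excluded"
    if record.age is not None and not (min_age <= record.age <= max_age):
        return "excluded"
    if record.egfr is not None and record.egfr < min_egfr:
        return "excluded"
    if record.age is None or record.egfr is None:
        return "review"
    return "likely"


# Hypothetical sample records, invented purely for this sketch.
records = [
    PatientRecord("P001", 54, frozenset({"E11", "I10"}), 72.0),
    PatientRecord("P002", 81, frozenset({"E11"}), 60.0),
    PatientRecord("P003", 47, frozenset({"E11"}), None),
    PatientRecord("P004", 39, frozenset({"J45"}), 90.0),
]

results = {r.patient_id: screen(r) for r in records}
for patient_id, status in results.items():
    print(patient_id, status)
```

Keeping a separate "review" category is one possible design choice: it avoids silently dropping patients whose records are incomplete, at the cost of some manual checking by trial staff.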

Site Selection 

One of the earliest and most important decisions to be made before a clinical trial gets underway is selecting a site to conduct the trial. Factors including the location, evidence of Good Clinical Practice (GCP), the staff’s experience and the available equipment are all taken into account before making a final decision. 

Real-World Data can be useful in selecting a clinical trial site, as Sponsors and Principal Investigators can access information including diagnosis codes, laboratory tests, and histologies to predict the number of participants that can be enrolled at each potential site and to identify locations where clinical trials can be held.
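
A minimal sketch of this kind of site feasibility estimate is shown below. It counts distinct patients per site who carry a target diagnosis code and have the required laboratory test on record, then ranks the sites. The site names, record layout and the diagnosis code are illustrative assumptions, not a description of any particular database or of how a given Sponsor performs feasibility analysis.

```python
from collections import defaultdict

# Hypothetical encounter-level records; one patient may appear more than once.
encounters = [
    {"site": "Site A", "patient": "p1", "dx": "C50", "has_lab": True},
    {"site": "Site A", "patient": "p2", "dx": "C50", "has_lab": True},
    {"site": "Site A", "patient": "p2", "dx": "C50", "has_lab": True},   # repeat visit
    {"site": "Site A", "patient": "p3", "dx": "C50", "has_lab": False},  # lab missing
    {"site": "Site B", "patient": "p4", "dx": "C50", "has_lab": True},
    {"site": "Site B", "patient": "p5", "dx": "C34", "has_lab": True},   # other diagnosis
    {"site": "Site C", "patient": "p6", "dx": "C50", "has_lab": True},
    {"site": "Site C", "patient": "p7", "dx": "C50", "has_lab": True},
    {"site": "Site C", "patient": "p8", "dx": "C50", "has_lab": True},
]


def estimate_eligible_by_site(rows, dx_code):
    """Return (site, distinct candidate count) pairs, largest count first."""
    patients_by_site = defaultdict(set)
    for row in rows:
        if row["dx"] == dx_code and row["has_lab"]:
            # A set de-duplicates patients with several matching encounters.
            patients_by_site[row["site"]].add(row["patient"])
    counts = [(site, len(patients)) for site, patients in patients_by_site.items()]
    # Sort by descending count, then by site name for a stable order.
    return sorted(counts, key=lambda item: (-item[1], item[0]))


ranking = estimate_eligible_by_site(encounters, "C50")
for site, count in ranking:
    print(f"{site}: {count} potential participants")
```

Counts like these are only an upper bound on enrollment, since they say nothing about consent, competing trials or the remaining eligibility criteria.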

Having access to a database of information about the patient population, together with observations made by physicians, allows patient populations at potential clinical trial sites to be analysed and matched based on their characteristics. The data can also be used as a verification step before obtaining a participant’s consent and a physician’s work-up. Using Real-World Data in this way can mean that fewer participants are brought into a work-up while a higher proportion of them go on to enroll.


Nishi Singh is a professional journalist and editor in New Delhi. She studied Mass Communication at the National Institute of Mass Communication.