
DATA BASICS

A PUBLICATION SUPPORTED BY AND FOR THE MEMBERS OF THE SOCIETY FOR CLINICAL DATA MANAGEMENT, INC

TO ADVANCE EXCELLENCE IN THE MANAGEMENT OF CLINICAL DATA

Volume 24 | Issue 3 / 2018 Fall

This Issue

2 Letter from the Chair

3 Letter from the Editors

4 Machine Learning, Artificial Intelligence and Analytics in Clinical Research By Namita Rajput

11 Big Data, Too Big? - Effective Strategies to Get to the Answers You Really Need from your Large Datasets By Steve Shevel

14 Opinion: Artificial Intelligence – the Popeye of Clinical Trials By Charan Kumar

18 Connecting to New Technologies to Improve Data Review Efficiency By Sravankumar Basani

22 The Clinical Data Management Skills Trinity By Maria Fernanda Posada, Antonio Rivas

2 DATA BASICS 2018 Fall

Letter From the Chair

    Shannon Labout

    Dear SCDM Community,

In just a few days our community will gather in Seattle-Bellevue, Washington for the 2018 SCDM Annual Conference. This year, the Annual Conference Committee is working hard to bring us an innovative, informative and exciting program. In the Annual Conference program, you will find leading-edge topics, such as:

• Direct to Patient / Virtual clinical trials

• eSource and eConsent

• Internet of Things, Artificial Intelligence and Machine Learning

Our ground-breaking line-up comes alongside plenty of opportunities to learn practical information about risk-based data management and monitoring, cloud-based technologies, and working in a new environment of privacy.

    We will also hear from not just one, but two excellent keynote speakers this year.

Our opening keynote on Monday will be Dr. Steven E. Kern from the Bill and Melinda Gates Foundation, who will talk about Global Health Clinical Trials: Challenging Work in Challenging Situations.

Our closing keynote on Wednesday will be Kay Fendt, who will speak about Clinical Data Integrity: The Central Role of the Data Manager.

    From panel sessions to posters, pre-conference tutorials to networking events, there is something for everyone at the SCDM Annual Conference.

Volunteer opportunities: There are several ways to get involved and contribute to the work of SCDM before the end of the year. A new eSource Consortium is forming and will soon begin working together. You can join our online course committee as a content developer or instructor. Sign up to present a webinar. Volunteer to help research and write the next GCDMP chapter. Help us develop the next certification exam as a beta tester. Throughout the year, there are many ways you can learn from and network with your peers, contribute your knowledge and experience, and work together to advance the professions within the data science disciplines.

As data science continues to evolve, SCDM will continue to lead the way in information sharing, developing evidence-based best practices, delivering accredited world-class education, offering opportunities for learning and professional growth, and demonstrating mastery of CDM competencies through certification. Everything SCDM does is powered by volunteers and we need you - won’t you join us?

    Warm regards and I hope to see you at the conference!

    Shannon Labout CCDM 2018 SCDM Board Chair

    Shannon Labout Chair Interim Chief Standards Officer & Vice President of Education CDISC

    Linda King Vice Chair

    Jaime Baldner Past Chair Manager, Clinical Data Management Genentech

    Jonathan R. Andrus Treasurer COO & CDO Clinical Ink

    Jennifer Price Secretary Director, Clinical Solutions BioClinica

    Carrie Zhang Trustee CEO eClinwise, Panacea Technology Co.

    Michael Goedde Trustee Vice President Clinical Database and Statistical Programming PAREXEL International

    Arshad Mohammed Trustee Senior Director, CDM IQVIA

    Peter Stokman Trustee Global Clinical Data Sciences Lead Bayer

    Reza Rostami Trustee Assistant Director, Quality Assurance and Regulatory Compliance Duke Clinical Research Institute

    Sanjay Bhardwaj Trustee Global Head, Data & Analytics Management Biogen

    Deepak Kallubundi Trustee Clinical Functional Service Provider and Analytics – Associate Director Chiltern

    2018 SCDM Board of Trustees

  • 3 DATA BASICS 2018 Fall

Letter from the Editors

    Wouldn’t it be great, Dear Readers, if a clinical database could build itself and manage its own queries?

Although it may sound like sci-fi, many aspects of Artificial Intelligence (AI) have already been integrated into the modern life of a Clinical Data Manager.

Living in a digital age, where the speed of advances in robotics outpaces personal education, one may wonder if AI will ever reach a tipping point where it surpasses human intelligence. This Fall Issue of Data Basics explores the topic of “Human versus Machine in Clinical Data Management”. We invite you to derive your own conclusions by reading the great selection of articles gathered here.

“I may be synthetic, but I’m not stupid.” ~ Bishop (AI character from the movie “Aliens” (1986)).

    Discover concepts of Machine Learning (ML), Deep Learning (DL) and Data Analytics in the article “Machine Learning, Artificial Intelligence and Analytics in Clinical Research” by author Namita Rajput.

    “Artificial intelligence has the same relation to intelligence as artificial flowers have to flowers.” ~ David Parnas, software engineer

In his opinion article “Artificial Intelligence – the Popeye of Clinical Trials”, the author, Charan Kumar, proposes that every living being have a digital record of their body’s performance, managed by AI.

    “Artificial Intelligence is a tool, not a threat.” ~ Rodney Brooks, robotics entrepreneur

    Author Sravankumar Basani explores data centric technologies in his article “Connecting to New Technologies to Improve Data Review Efficiency.”

    “You can have data without information, but you cannot have information without data.” ~ Daniel Keys Moran, computer programmer and science fiction writer

Some may find that AI could come in handy when managing big data, or “lots of data”.

Author Steve Shevel provides recommendations for applying Big Data concepts in clinical trials in his article “Big Data, Too Big? - Effective Strategies to Get to the Answers You Really Need from your Large Datasets.”

    “Someday a computer will give a wrong answer to spare someone’s feelings, and man will have invented artificial intelligence.” ~ Robert Breault, musical artist.

Soft skills (“people skills”) will remain uniquely human. Authors Maria Fernanda Posada and Antonio Rivas highlight courage as one of the three major skills of a Clinical Data Manager in their article “The CDM Skills Trinity”.

    Artificial Intelligence holds promise to improve efficiency of CDM activities and provide support to the humans who identify themselves as Clinical Data Managers. We hope you enjoy discovering the perspectives presented within this Fall Issue.

    “In the event of a water landing, I have been designed to serve as a flotation device.” ~ Data (AI character from “Star Trek: Insurrection”).

    Claudine Moore and Margarita Strand Data Basics Editors


    Editorial Board

Claudine Moore, CCDM
Fall edition Co-Editor
[email protected]

Margarita Strand, CCDM
Fall edition Co-Editor
Gilead Sciences, Inc.
[email protected]

Sanet Olivier, CCDM
Publication Committee
[email protected]

Stacie T. Grinnon, MS
[email protected]

Lynda L. Hunter, CCDM
PRA Health Sciences
[email protected]

Elizabeth [email protected]

Nadia [email protected]

Arshad Mohammed
Publications Committee Board
[email protected]

Michelle Nusser-Meany, CCDM
Mutare Life
[email protected]

Derek Petersen, CCDM
[email protected]

Janet Welsh, CCDM
[email protected]


  • 4 DATA BASICS 2018 Fall

    Machine Learning, Artificial Intelligence and Analytics in Clinical Research By Namita Rajput

The world is progressing at an increasingly faster pace, and so is clinical research. In collaboration with Data Analytics, Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL) are transforming the clinical trial process more rapidly than anyone can predict. AI is an opportunity to personalize and optimize health care. We make use of AI almost every day without even knowing it; for example, Amazon’s Echo and Apple’s Siri use machine learning for speech recognition. AI attempts to mimic human intelligence or behaviours. ML, on the other hand, attempts to analyse, map and associate ‘patterns’ and ‘behaviours’ in multiple data sets. Hence, we can say that AI can add another arrow to the quiver of clinical decision making and help boost human intelligence to answer Sponsor companies’ long-term concerns of high research and development costs, slow delivery of new treatments and high drug prices [1].

Below is a graphical representation of how Data Analytics, AI, ML and DL collaborate in clinical trials:

    Figure 1 - Different fields of Artificial Intelligence and each of their functions.


    Figure 2 - Collaboration of Artificial Intelligence, Machine Learning and Data Analytics

There are many benefits of combining AI and Data Analytics tools. Several of them are outlined in Table 1, broken down by category.

    Table 1 – Benefits of Artificial Intelligence and Data Analytics Tools Collaboration

    Category Benefits

    Trial Quality

• Can improve trial quality and reduce trial times.

• Incorporates user-friendly efficiencies and leverages data.

• Analyzes historical data.

• Can reduce operational costs for trials and increase patient adherence.

• Provides ‘analytical insights’ that help enable querying the data and running ML algorithms securely and at scale on the data collected.

    Genomic Data (Precision Medicine)

    • Looks for patterns in genomic data to find causal relationships with specific / rare diseases. [2]

• Uses fuzzy matching to detect patients with undiagnosed symptoms and characteristics, such as cognitive decline, chronic pain, etc. This AI technique helps sponsors testing new drugs for such disorders find symptomatic, but undiagnosed, patients and offer them a chance to enroll in a trial.


    Patient Recruitment

• Boosts eligible patient recruitment in clinical trials and predicts site performance by using algorithms based on both historical and incoming data.

    • Helps in personalized treatment / behavioral modification.

    • Keeps a check on protocol compliance with regards to drug intake.

    • Helps in patient retention, especially for long duration trials, by using visualizations, speech and patient engagement tools.

    • Trial administrators and CROs can track adherence in real-time, intervening to encourage compliance with treatment protocols.

    • Clinical teams can monitor for adverse / serious events and intervene to help improve care.

    Natural Language Processing (NLP) [3]

    • Can read volumes of text (from source notes) in seconds.

    • Can aggregate the data and devise hypotheses beyond human ability.

• Provides graphical information and matches appropriate patients with clinical trial criteria, potentially reducing recruitment time from years to just days or weeks.

    • Analyzes and extracts meaning from narrative text or other unstructured data sources such as “big data”.

    • Answers unique free-text queries that require the synthesis of multiple data sources.

    • Conducts speech recognition to allow users to dictate clinical notes or other information that can then be turned into text.

    Predictive Analytics

    • Makes predictions in trials to detect trends and outcomes [2].

    • Can measure responses to drugs.

    • Predicts whether taking a drug will result in a positive or negative outcome and finally if a trial will be successful.

    • Helps in disease identification/diagnosis.

Rules

• AI does not argue with strict ideas. It does not drive personal theories or have judgmental thoughts.

    Bring your own device (BYOD), Wearable Technology.

    • Provides real-time access to patient data.

• BYOD can be integrated with edit checks. As soon as patients input their data, the online checks assist them by prompting for the required data.

• Data can then be integrated with the EDC database and server, making data processing faster.

    • Patients can get feedback on certain symptoms and manage their medication intake, as well as track and share other information in accordance with the study protocol.

• For patients who live far away from major medical trial centers, electronic reminders and check-ins can help a lot. These not only help patients avoid long-distance travel, but also provide the opportunity to participate in trials virtually.

    Risk Based Monitoring (RBM)

• Reduces trial costs incurred from frequent monitoring visits by focusing only on specific sites.

• Clinical Data Management and Risk Based Monitoring can work in collaboration for a better quality trial.

• Data Managers, drawing on their experience handling queries from sites, can help predict trends that can occur in clinical trials and eventually avoid frequent site visits for issue resolution.

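The fuzzy matching mentioned in Table 1 can be illustrated with a small sketch. This is a hypothetical Python example (the patient IDs, symptom phrases and function names are invented for illustration), using the standard library's similarity ratio rather than any specific clinical NLP product:

```python
from difflib import SequenceMatcher

# Hypothetical symptom phrases extracted from unstructured patient records.
RECORDS = {
    "patient-001": "progressive cognitive decline, memory loss",
    "patient-002": "chronic lower back pain, fatigue",
    "patient-003": "cognitve decline and mild memory lapses",  # note the typo
}

def fuzzy_match(query: str, text: str) -> float:
    """Return a 0..1 similarity ratio between a trial criterion and a record."""
    return SequenceMatcher(None, query.lower(), text.lower()).ratio()

def candidates(query: str, threshold: float = 0.5):
    """List patients whose records loosely match a trial eligibility phrase."""
    return [pid for pid, text in RECORDS.items()
            if fuzzy_match(query, text) >= threshold]

print(candidates("cognitive decline, memory loss"))
```

Note that the misspelled record still scores well above an exact-match approach, which is precisely why fuzzy matching can surface symptomatic but undiagnosed patients that keyword search would miss.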

We have seen how AI has the potential to reduce cost and deliver fresh insights for faster drug development; however, there are still many challenges in achieving that outcome, especially where AI and wearable technology for data analytics meet. Challenges and potential solutions are presented in Table 2.

    Table 2 – Challenges and Potential Solutions where AI and Wearable Technology Collide

    Challenges Potential Solutions

Internet availability: There are many locations in the world where internet availability is sparse or nil, and enrolling potential patients in a technology-driven trial there is challenging. When a patient unexpectedly loses internet or Wi-Fi access after data has been entered in the device, a question of data storage and data loss arises.

The device used in the trial should store data locally whether or not the internet is available. Once the patient is at a location where internet or Wi-Fi is available, the device should automatically sync the data to the trial server. AI and ML are not expected, at this time, to question the rules they are given, so this solution is feasible.

Use of Wi-Fi: Certain Sponsors allow usage of Wi-Fi available in public areas. However, this poses a data security risk.

The device used in the trial should automatically check the risk of using public Wi-Fi for trial-related information. If possible, the device should only allow connecting to Wi-Fi available at medical centers.

Critical data fields: Many types of data are collected traditionally, without a review of whether their collection is necessary or critical for trial analysis and outcome.

Sponsors need to conduct a focused review of data and implement only critical/key data fields in the case report form.

Complex case report form design: For trials with a complex case report form (CRF) design, efficient data collection in the device is a challenge. Additionally, data that needs to be cross-checked against another data set in real time requires efficient cross-domain checks within the device. Often, as more data is collected in the device, too many data checks are formulated. Sponsors need to keep effective data collection in mind without frustrating patients with too many checks and data fields. There is a risk of patients simply recording data just to dodge the checks, and this poses a major risk to trial data integrity.

The devices used in clinical trials should be able to handle data complexity without compromising simplified navigation or data collection fields. Pre-programmed checks can be formulated continuously and stored in a library to be used and re-used as needed. AI and ML will need to continuously evolve and can help in real-time data analysis.

Data collection: Sponsors should use simplified language / data dictionary options for accurate data recording in the device. If required, provide free-text fields (minimally, though) for questions where there is a chance of inappropriate data collection. But again, if using AI and ML, sponsors are advised to keep in mind the limitations of Natural Language Processing (NLP) tools; for example, spelling mistakes can confuse the tool, and data might then be interpreted inappropriately.

AI and ML need to constantly evolve and be upgraded to stay in sync with trial needs. This will take a lot of effort from device companies, but it will be well worth it.


Bring-Your-Own-Device (BYOD) or wearable technology: BYOD and wearable devices are progressively emerging in clinical trials and are popular among tech-savvy patients. The challenge is that trial-related apps might at times be deleted by patients. Alternatively, patients may download too many other apps, which might hamper the efficiency of the trial-related apps and even risk overriding the actual purpose of the device provided.

AI and ML included in BYOD or wearable devices used in trials can have features ensuring that trial-related information is secure and there is no data loss in any instance. Additionally, sponsors may want to consider validating open-source tools (e.g., Hadoop) and languages (e.g., R, Python), as these can be more helpful than traditionally used devices, software or platforms.

Data governance: Data governance is one of the most pressing issues to address at present. Medical data is personal and needs to be protected from inappropriate access. It seems logical to assume that much of the public is wary of releasing data because of privacy concerns.

Devices used in clinical trials should be compliant with applicable regulatory guidelines and policies. Under no condition should the personal data of patients be used for purposes that patients have not been informed of. This will help build patient confidence, which in turn will help more patients participate in these types of trials. Sponsors can take a leaf from other industries where the latest technology (e.g., blockchain) is already in use and working well.

Clinical notes: Clinical notes are often fragmented, consisting of ‘bullet point’ telegraphic phrases with limited context, and lack complete sentences. They frequently use many acronyms and abbreviations, making them highly ambiguous to interpret. Hence, physician intervention is often needed.

NLP tools need to be updated and kept in sync with the many languages in use around the world. Device companies need to remember that recorded data must also be translated accurately and appropriately for correct interpretation and analysis.

Limitations for certain patient groups: For trials requiring older patients, it can be difficult to help them get acquainted with technology. Many find it too stressful and, due to certain medical conditions, also find it difficult to operate the device.

AI and ML can play a large part in such trials. Introducing these patients to the technology slowly may be required; however, once they become accustomed to it, the technology can help keep the patients engaged with expected tasks.

For all the challenges discussed above, the solutions provided may or may not work; they are experimental. One way to confirm the feasibility of any solution is to test it in small-scale pilot trials, which can play a pivotal role in changing the clinical trial landscape.

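The store-and-forward behavior proposed for offline devices can be sketched in a few lines. This is a minimal, hypothetical Python sketch, not any real device API: the `OfflineBuffer` class, the `uploader` callback and the `connectivity_check` probe are all assumed names for illustration.

```python
import json
from collections import deque

class OfflineBuffer:
    """Minimal store-and-forward queue: entries persist locally until
    connectivity returns, then sync to the trial server in order."""

    def __init__(self, uploader, connectivity_check):
        self._queue = deque()
        self._upload = uploader               # e.g. an HTTPS POST to the trial server
        self._is_online = connectivity_check  # e.g. a reachability probe

    def record(self, entry: dict) -> None:
        # Always store locally first, so nothing is lost while offline.
        self._queue.append(json.dumps(entry))
        self.flush()

    def flush(self) -> int:
        """Upload queued entries while online; return how many were sent."""
        sent = 0
        while self._queue and self._is_online():
            self._upload(self._queue.popleft())
            sent += 1
        return sent

# Simulated usage: the patient records data offline, then connectivity returns.
sent_log = []
online = {"up": False}
buf = OfflineBuffer(uploader=sent_log.append,
                    connectivity_check=lambda: online["up"])
buf.record({"patient": "P-001", "pain_score": 4})  # stored locally, not sent
online["up"] = True
buf.flush()                                        # backlog now syncs in order
```

Keeping the local queue as the single source of truth until upload succeeds is the design choice that addresses the data-loss concern: an entry leaves the device only after the uploader has accepted it.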


    CONCLUSION

    So are we ready to embrace this change? Can we take a leaf from other industries? Can machines take precedence over humans?

We really don’t know. But what we do know is this: if we continue to follow the redundant processes we have followed for a long time, we will never see a better future. To make things happen and to make lives better, it will be imperative to work in collaboration and take control of things well ahead of time. This technology challenges the traditional, reactive approach to healthcare. In fact, it is the exact opposite: predictive, proactive and preventative, with life-saving qualities that make it a critically essential capability in every health system.

    REFERENCES

    [1] http://www.clinicalinformaticsnews.com/2017/09/29/the-intelligent-trial-ai-comes-to-clinical-trials.aspx

    [2] https://www.techemergence.com/machine-learning-in-pharma-medicine/

    [3] https://healthitanalytics.com/features/what-is-the-role-of-natural-language-processing-in-healthcare

    ABOUT THE AUTHOR

Namita Rajput

I am presently associated with Syneos Health as a Senior Clinical Data Manager II. I hold a Bachelor’s Degree in Microbiology, a PG in Medical Lab Technology from Mumbai University, and a PG Diploma in Clinical Research and Patents. I have been associated with SCDM/ISCR in the following ways:

• Presented on ‘Wearables-Mobility-Data Explosion’ at the SCDM Hyderabad Conference 2016.

• My article ‘Emerging Trends in Clinical Data Capture and Clinical Research Technologies’ was published in the SCDM Winter 2017 Data Basics issue.

• Presented on ‘e-PRO’ at the ISCR Bangalore Conference 2018.

• Currently a member of the SCDM Innovation Committee.

In my free time, I love reading books by Robin Cook (one reason I was inspired to work in Clinical Research), Wilbur Smith, Jackie Collins, Sidney Sheldon, and the list is endless. I have been in Clinical Data Management for 14 years. It is a wonderful, exciting world that holds a great future for the improvement of patient health.


  • pfizercareers.com

    Career Opportunities

    There’s no end to the work we do. That’s what keeps us going. Pfizer owns endurance.

    At Pfizer, you can join the world-class scientists and leaders in all fields of healthcare and business who are dedicated to bringing therapies that will significantly improve patients’ lives. We are globally known for excellence, philanthropy and diversity.

    In Data Monitoring & Management our mission is to provide best-in-class delivery of high quality clinical data to enable the timely Clinical Development decisions that positively impact patients’ lives.

    We’re growing rapidly and are currently recruiting for:

    • Clinical Data Scientist (Manager) ref: 4695522

    • Data Management Reporting Analyst (Manager) ref: 4707736

    • Data Management Reporting Analyst (Sr. Associate) ref: 4707730

    • Data Manager ref: 4695121

    • Senior Data Manager ref: 4695124

    In return, we offer competitive compensation, medical, dental and vision and prescription drug coverage, life insurance, 401K match, educational assistance and much more.

    To apply, visit www.pfizer.com/careers and search by the job reference. Research and development is only one part of a medicine’s journey. Get the full story at pfizer.com/discover



    INTRODUCTION

The term “Big Data” has been coursing through the veins of the clinical research industry in recent years and is a common topic at many industry conferences as well as in related publications. Over and over we hear the merits of Big Data being touted, along with mobile health (mHealth) and Artificial Intelligence (AI). We are told they will transform the way clinical research operates and will allow for possibilities that were previously thought unachievable.

But let’s take a step back, focus on the term “Big Data” itself, and simplify the meaning into layman’s terms. Big Data means collecting vast amounts of data which are directly, indirectly or tangentially related to whatever the topic of interest is. These topics of interest (for our purposes) are typically the safety and efficacy of compounds, biologics and devices in development, but they can and do extend beyond that to the myriad operational and developmental procedures and practices that make up the vast scope of clinical development and beyond. What has changed from years past to facilitate the emergence and use of Big Data? Mostly, innovation in and adoption of new information technologies. We now have simpler ways of collecting and analyzing vast amounts of data, thanks to exponential advances in computing power and the emergence of powerful data analysis tools that are affordable and simple to use. That is the processing side, but we also have innovation on the input side, with a plethora of wearable devices that can transmit vast swaths of data in seconds.

This is all good news, right? Collect all the data we can, and from there extract those nuggets of value that will propel drug development into the next era of efficiency and success. Except that we are missing two crucial aspects of the term Big Data when we view it through this lens of optimism: 1) the context of the term Big Data within the investigational portion of clinical development and, 2) the overwhelming volume of data and the associated “noise” it produces.

    BIG DATA OR LOTS OF DATA?

    One issue we are struggling with in the industry is using the term Big Data without taking the context into consideration. This generalization of the term has led to some confusion about both the definition and the application of the term. The term Big Data in health research typically refers to the collection of a vast number of datapoints across many individuals with no formally defined target in mind, and with that data sharing three common artifacts, namely Volume, Velocity and Variability [2]. In a regulated, defined, clinical trial setting this definition does not hold true as none of these three criteria are met in a meaningful way. Therefore, we should begin to think about the term differently depending on the context in which we use it. For example, the term Big Data may very well be applicable in the exploration and observation stages of research, when we are looking for new targeted areas for treatment, approaches to personalized medicine, or mechanisms of action. This data would help us develop a hypothesis that we can then test in a controlled clinical trial environment that is the basis of the scientific method (more on this later). Most of the work of data management is within this controlled clinical trial environment and so maybe a better term to use than Big Data would be “Lots of Data.” A focused approach to analyzing “Lots of Data” is what we should be targeting, and a practice that will yield beneficial results in both clinical data for efficacy and safety, as well as a research organization’s operational effectiveness. So, for the purposes of this article we will refer to the collection of large amounts of data in a clinical trial environment as “Lots of Data,” since that is the arena in which we assume our readers are operating.

Big Data, Too Big? - Effective Strategies to Get to the Answers You Really Need from your Large Datasets
By Steve Shevel


    THE “LOTS OF DATA” TEMPTATION: STARTING WITH THE DATA

While we now have the ability to gather and analyze vast amounts of data with the help of technology, this collection and analysis, though more efficient than in prior years, is still expensive, not only in terms of collection but also in terms of “meaningful” analysis. There is a key aspect that seems pervasive in the way many companies approach the concept of “Lots of Data”. I will refer to it as the “filter” approach: we collect vast amounts of data that may or may not be related to our primary target. Once the data is collected, we begin to ask ourselves questions to which we think the answers are important (the valuable data). We then take our very large dataset and begin to parse and filter it down to try to find the answers to those relevant questions. Sometimes we are successful and sometimes we are not, but in either case a significant amount of effort is expended in getting to the answer, or lack thereof, for one simple reason: our dataset is vast and unfocused.

The philosophy of “collect it just because we can” is one approach, but an inefficient and expensive one. This method does not scale as datasets expand, even with the advances in computing and analytical power, because our lack of focus may have us mining data in the wrong area. In addition, this approach may lead us down the long and expensive road of false positives. Statisticians will confirm that the more data you look at, the higher the chance you will detect some sort of pattern that may be real or may just be an artifact. If we do not adjust these observations for multiplicity and incorporate other sound statistical methods, any of the observations we see may be questioned.
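The multiplicity concern above can be made concrete with a small sketch. This is an illustrative Python example of a Bonferroni adjustment, one standard correction statisticians apply when many patterns are tested at once; the p-values are made up for illustration.

```python
# Bonferroni adjustment: with m hypothesis tests, compare each p-value
# to alpha/m instead of alpha, so the chance of at least one false
# positive across the whole family stays at or below alpha.

def bonferroni(p_values, alpha=0.05):
    """Return which hypotheses survive a Bonferroni correction."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

# Hypothetical p-values from mining one large dataset for many patterns.
p_values = [0.001, 0.04, 0.20, 0.012]

# Naive screening at alpha = 0.05 flags three "findings"...
naive = [p <= 0.05 for p in p_values]   # [True, True, False, True]

# ...but after correcting for testing 4 hypotheses (alpha/m = 0.0125),
# only the two strongest survive.
adjusted = bonferroni(p_values)         # [True, False, False, True]
```

In a real analysis the number of patterns examined in a large, unfocused dataset can run into the thousands, so the corrected threshold becomes very strict, which is exactly why unfocused mining tends to produce findings that do not hold up.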

    CHANGE THE MINDSET: STARTING WITH THE ANSWER

A different approach to “Lots of Data” that can help us address the excessive effort and cost is to change our mindset: instead of beginning with the question, start with the answer. This may seem counterintuitive, but the concept is rooted in science, specifically the scientific method itself (as referred to in the first paragraph of this article). In the scientific method we begin with the answer, then develop a hypothesis to test whether that answer is true, and finally conduct the experiment, collecting the data that either proves or disproves our hypothesis. The “starting with the answer” approach borrows from the scientific method in that we have an idea of the answer we are seeking and then tailor our data collection methodology and analysis accordingly. We look primarily at the data surrounding our foci, as well as a small percentage of ancillary but related data, to allow for discovery. In this way, we do not limit ourselves to only the data of focus but allow some leeway for opportunistic discovery.

    Louis Pasteur’s iconic quote “Chance favors the prepared mind” is an apt illustration of this method. If you are looking to identify a new species of butterfly in the rainforest, then you focus your research, data gathering and exploration on butterflies and on the areas of the rainforest where butterfly populations tend to be highest. Since you are looking for a new species, you may be tempted instead to survey the entire Amazon rainforest and sift through all of the animal kingdom’s phyla, species, etc., hoping to find new ones. And with enough time and effort you may eventually recognize a new butterfly species, along with other animal and insect species. Except you would have expended vast amounts of time, energy and cost to get these answers while forgetting one simple but important fact: you are only in the butterfly business.

    This is the blessing and curse of “Lots of Data”: the temptation to collect as much as you can because you are already spending a lot of time and money to collect some data. This practice propels us down the road of diminishing returns and is not sustainable as we grow. The “Starting with the Data” approach is like searching the entire Amazon rainforest, while the “Starting with the Answer” approach is more akin to the focused butterfly habitat example (refer to Figure 1). Sure, you may miss out on a potentially important discovery by not collecting and analyzing every piece of data you can, but one has to weigh the cost and effort of pursuing that potential opportunity against the focused opportunity.

    This focused approach to large, technology-leveraged datasets is one that regulatory agencies are watching closely. FDA representatives are cautioning companies about the pitfalls of collecting too much unfocused data: “A dazzling new array of wearable and other mobile technologies can provide a more complete clinical trial picture, increase efficiency, and reduce the burden on patients, says Ken Skodacek, part of the U.S. Food and Drug Administration’s (FDA) Clinical Trials Program and Payer Communication Task Force at the Center for Devices and Radiological Health (CDRH). However, he also cautioned collecting too much data or flawed data will undermine all those potential gains.”[1]

    IMPLICATIONS

    Big Data is a valuable concept that will serve our industry now and into the future as long as we recognize effective ways to apply it and the contexts in which to apply it. In terms of clinical trials, Big Data in its truest form does not apply; rather, the term “Lots of Data” is more apt. “Lots of Data”, however, still involves a significant volume of datapoints, and if we are not careful in how we approach its application, it can, and will, become too big to be of value. Data Managers are perfectly suited to be stewards of a focused approach to “Lots of Data” given their inherent knowledge of how data works and how much effort it takes to collect, organize and analyze that data effectively. In fact, this application can extend beyond clinical data into the collection and analysis of data to support operational effectiveness and efficiency in the conduct of those clinical trials, where again data management expertise is indispensable.

    BioPharma companies stand at the threshold of a new opportunity in discovery and operational improvements utilizing the concept of “Lots of Data.” The question is, will we jump into a vast ocean or pick a more manageable pond?

    REFERENCES

    [1] Causey, Michael (2018/07/16). Don’t Let Technology Bury Your Clinical Trials with Frivolous Data. Retrieved from www.acrp.org/2018/07/16

    [2] Volume, Velocity, Variety: What You Need to Know about Big Data. O’Reilly Media. https://www.forbes.com/sites/oreillymedia/2012/01/19/volume-velocity-variety-what-you-need-to-know-about-big-data/

    “Starting with the Data” example: collect everything in the circle. “Starting with the Answer” example: target data collection to the blue circle and some data surrounding that focus to allow for secondary discovery.

    ABOUT THE AUTHOR

    Steve Shevel Mr. Shevel has been working in the biopharma industry for over 20 years. He is currently a Senior Associate with Waife & Associates, a biopharma consultancy, where he has been working for the past 10 years. Mr. Shevel holds a BS in Microbiology with an emphasis in genetic engineering from the University of California, Santa Barbara, and an MBA from the University of California, Irvine. Prior to joining Waife & Associates, he held various operational and leadership roles at Quintiles, Eli Lilly, Pfizer, Allergan and other biopharma companies. Mr. Shevel has spent the last few years helping a long list of pharmaceutical and biotech companies realize efficiencies and growth through process improvement, technology adoption and strategic analysis. He has been a consistent presenter at industry conferences including DIA, ACRP, SCDM, BioIT and others, and has authored numerous articles on clinical research practices, challenges and solutions.

    Figure 1 - Approach to Data Collection.



    Opinion: Artificial Intelligence – the Popeye of Clinical Trials By Charan Kumar

    PREFACE

    This planet is wonderful – each living being here has a will to fight, to fight for its life, be it a microscopic or a giant creature. The struggle to live was, is and will be on forever. We all look at the big fights – major wars, the battles behind Darwin’s theory, Waterloo, the World Wars, etc. Let’s just sit back and ponder life within. Apart from the mental battles we fight, our body’s cells too have their own minds, their own armies and their own fights; they constantly learn, unlearn and wage wars. Against whom? It could be invaders, internal issue creators (e.g., autoimmune disorders, where cells wage battle against their own counterparts) or unfortunate instances – in all these cases one is pushed to fight for life.

    What is a human body? An immensely complex organism involving extraordinary levels of coordination, command, control, learning and collaboration, all essential for the functioning of any small tissue, any organ and the entire body. Internally, the cells, tissues and organs have their own sets of war and love. And the body fails – one last time for everyone – and that is death.

    This is life at the micro or nano level. At the macro level, we humans are constantly experimenting to make ourselves better equipped to fight the different sets of challenges we face as living beings on this planet; to prove our supremacy; to prove that we are the intelligent beings on the surface; to ensure we don’t go extinct. The fight will continue.

    The ammunition to fight this war is varied; we started off with natural means – plants, leaves, fruits. Then gradually, with scientific discoveries and the advent of machines and other technologies, we became armed with more advanced means of fighting. Artificial Intelligence is an especially promising means on the horizon.

    BIOLOGICAL INTELLIGENCE (BI)

    We talk a lot about the buzzword ‘ARTIFICIAL INTELLIGENCE’... Interestingly, each living being is blessed with its own ‘BIOLOGICAL INTELLIGENCE’. Every cell reads the intruder, conveys relevant information and fights back to overpower the foreign body. If the fight is lost, the body or cell doesn’t accept the loss; instead, it reorganizes itself and fights on. This intelligence of a cell/tissue/organ – to read the mechanism, analyze, reorganize and fight back – can be termed ‘BIOLOGICAL INTELLIGENCE (BI)’. The term ARTIFICIAL INTELLIGENCE (AI) probably came into being by deriving inspiration from living organisms’ intelligence, i.e., BI.

    ARTIFICIAL INTELLIGENCE (AI)

    AI is the ability of a computer or machine to simulate human intelligence in tasks such as visual perception, speech recognition, decision-making, and translation between languages. It depends upon three elements: massive amounts of data, sophisticated algorithms, and high-performance parallel processors. Machine learning (ML) is an aspect of AI that can rapidly assess multiple texts, graphs, and other data sources simultaneously.

    In our pursuit to make lives better, we have been relying on advances in technology, applying them in various phases of our research and trials to ensure the best-in-class drug/vaccine is the outcome. AI is one such methodology that might be a game-changer for the clinical research domain. Arming domain experts with the right data and information regarding patients and diseases helps industry experts arrive at the right decisions. This aspect of AI application can be compared to Popeye.

    Popeye – a cartoon character created by Elzie Segar – is a tough, hard-fighting sailor with a heart of gold. Popeye will often say “that’s all I can stands and I can’t stands no more!” before he is forced to fight. He eats spinach to gain super-strength, and after Popeye eats spinach, his enemies do not stand a chance. Analogously, AI’s spinach is data from all relevant sources, and AI plays the role of the sailor, guiding the clinical research ship in the right direction.


    AI technology aided by big data could improve trial efficiency through better protocol design, improved patient enrolment and retention, better trial quality, and shorter trial times. AI also has great potential to help identify biomarkers that cause diseases. Let us look at the possible areas of application and their impact.

    DRUG DISCOVERY AIDED BY AI

    In today’s world, each living being can be considered a database – a home for huge amounts of data about its body.

    Proposing the creation of digital records

    Each living being (here, each human) can have a digital record of his/her body’s performance. Each calorie eaten and each activity performed can be transformed and shown in the form of digital records. The genetic makeup of each living organism, especially human beings, can be charted. The data related to the body’s biomarkers and genes can be hosted on a personal secure cloud with the ability to grant access to relevant research teams.

    There could be two sets of digital records:

    BioRECORDs with collated information available to date (i.e., all data from hospitals, medical institutions, surveys, clinical trials, death data, birth data, etc.), which can be termed the P-BioRECORD (‘P’ here standing for Past); and

    All current living humans’ data being mapped, which can be termed the C-BioRECORD (‘C’ here standing for Current).

    AI-based deep learning and machine learning methodologies can run algorithms on the data available in both the P-BioRECORD and the C-BioRECORD – learning from the past data and triggering alarms related to the present data.

    These alarms are then further charted against the disease database or other relevant records to arrive at the possible areas of concern related to cell/tissue/organ functionality and to look for the nearest solution. This will likely have a profound impact on ‘patient-centric’ drug design and will also impact the methodology of clinical trials, as well as the cost and time involved in drug research.
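
The P-BioRECORD/C-BioRECORD proposal can be pictured with a toy sketch: learn a biomarker's normal range from historical records and raise an alarm when a current record falls far outside it. All names, values and the simple z-score rule here are hypothetical illustrations of the idea, not a prescribed implementation:

```python
import statistics

# Toy sketch of the proposed P-BioRECORD / C-BioRECORD idea: learn a
# biomarker's normal range from historical records (P-BioRECORD) and
# raise an alarm when a current record (C-BioRECORD) falls far outside it.
# All names and values below are hypothetical.

p_biorecord = {  # historical fasting glucose values (mg/dL) from past subjects
    "glucose": [88, 92, 95, 90, 85, 99, 94, 91, 87, 93],
}

c_biorecord = {  # current subjects' latest readings
    "subj-001": {"glucose": 91},
    "subj-002": {"glucose": 160},  # far outside the historical range
}

def alarms(history, current, z_threshold=3.0):
    """Flag current readings more than z_threshold SDs from the historical mean."""
    flagged = []
    for marker, past_values in history.items():
        mean = statistics.fmean(past_values)
        sd = statistics.stdev(past_values)
        for subject, readings in current.items():
            z = (readings[marker] - mean) / sd
            if abs(z) > z_threshold:
                flagged.append((subject, marker, round(z, 1)))
    return flagged

print(alarms(p_biorecord, c_biorecord))
```

A real system would of course use clinically validated reference ranges and far richer models; the point is only that past data defines "normal" and present data is screened against it.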

    Creation of generic drugs

    Any drug solving a problem is recorded. Drug research has the potential to become a catalogue wherein a plethora of combinations of chemical formulas is recorded. This will lead to a repository – a ‘drug database’ – which can then be used for personalization using AI. Data here plays a vital role in assessing and sharing possible outcomes. Managing this data will be a humongous effort, requiring computational skills coupled with knowledge of biology. The permutations, combinations and patterns of this chemical-entity database are run against the ‘erroneous’ genetic map to identify a chemical entity that could solve the issue.

    Once the possible solution(s) are predicted using AI-based tools, in-vitro tests can be performed in the lab on cells extracted from the individual’s body. The results will pave the way for the drug/vaccine to be administered or the procedure to be followed. This patient-centric line of thought could lead to a subject-based trial design methodology considering factors such as dosage and pharmacokinetic (PK) and pharmacodynamic (PD) information, and could even eliminate the need for comparison-based PK/PD trials. New diseases, too, might be addressed with the help of AI’s learning and deciphering abilities.

    The magnitude of accuracy in providing opportunities for right decision making could take the ‘drug discovery’ industry to a new level. With precision medicine as the future, the array of possibilities AI and its tools have shown is amazing and pathbreaking. Doctors and medical professionals would find this technology extremely helpful in reasoning and providing solutions with less time lag, thus aiding early diagnosis with greater accuracy. “CRISPR” (pronounced “crisper”) stands for Clustered Regularly Interspaced Short Palindromic Repeats, the hallmark of a bacterial defense system that forms the basis of the CRISPR-Cas9 genome editing technology; CRISPR, along with AI, might lead to new frontiers in the disease management portfolio.

    The above concept being at the drug design level, we are also aware of the research related to designer babies using in-vitro techniques, especially with a focus on removing faulty genes in embryos based on family history or ethnic genetic make-up. Like designer babies, the outcome of the application of AI in clinical research is not without controversy and raises several ethical questions:

    Will AI lead to immortality? It could, if it enables solutions for all health problems. This is an unanswered question and a controversial one too; however, one’s lifespan and way of living could be made longer and better thanks to AI.

    Will AI make it possible for human diseases to be cured in the initial stages (i.e., in the embryo itself), thus leading to healthier humans? If so, what further ramifications will this entail?

    CHALLENGES

    BI sometimes fails: invasion by foreign bodies can lead to cell death and infection, and, in extreme cases, to the end of life.

    The question that plagues AI is the same – will it fail, or will it surpass normal human intelligence? These questions do not currently have accurate answers. However, ethical considerations, geographical attributes, regulations and laws will be binding and guiding factors as AI continues to develop and be applied to clinical research. And once the methodology becomes a reality, the strategy for handling this scenario will lie with us, the most intelligent beings on the planet.

    CONCLUSION

    The arms of AI are far-reaching and can have a thunderous impact in the life sciences domain, especially in the clinical research field. In the ship called “Clinical Trial”, AI is the sailor donning the role of Popeye and providing guidance, while data is the spinach, providing power for the journey. Guided by Artificial Intelligence and powered by data – the new spinach – the clinical industry could see remarkable results. The goal is to improve lives and bring smiles to faces, which can be achieved in an artistic way coupled with research and technology.

    ABOUT THE AUTHOR

    Charan Kumar Kuyyamudira Janardhana is currently serving Ephicacy Lifescience Analytics Private Limited as Director for Strategic Initiatives and Quality. He is based in Bangalore, India. In his current organization he is responsible for managing information security initiatives, quality, and ISO certifications at Ephicacy. He has 13 years of domain experience, having served in companies such as Novartis, Quintiles (now IQVIA), and Accenture Services. He has authored a few papers, the major ones being PharmaSUG conference submissions related to Industry 4.0 and metrics. He holds a master’s degree in biotechnology from Bangalore University, India, and is an alumnus of the Haas School of Business, University of California, Berkeley, USA. He has successfully led companies through ISO certifications, both QMS and ISMS. He has expertise in Service Management and Risk Management implementation. His areas of interest include AI, automation, robotics, green energy and their implications in operations.


    Connecting to New Technologies to Improve Data Review Efficiency By Sravankumar Basani

    Adapting to new technologies in data sciences has turned out to be a necessity rather than an alternative to the conventional approach. Major sponsors and CROs are investing in new data review technologies for analyzing high volumes of data from multiple trial sources to enable speedy, accurate and reliable clinical decisions. The emergence of multiple technologies has helped the data science discipline improve data review effectiveness in less time. This allows organizations to find new relationships and complexities that were hard or impossible to detect earlier, like drug-drug interactions or potentially problematic sites [1]. It also helps researchers identify and analyze data subsets for planning trial designs and/or deriving valuable medical or product information.

    In the past, clinical trials used only structured, clinically-sourced data, which was relatively easy to organize and mine. But with the advent of the trial in the cloud, connected mHealth devices for remote monitoring of trial participants, and advanced technologies to analyze very large amounts of data quickly, things have changed [2]. Utilization of these technologies helps study teams analyze trends in data inflow and error patterns. Further, these technologies help researchers identify critical factors affecting the quality of the trial and make quicker, informed decisions on appropriate corrective actions. Many of these technologies – such as Spotfire, big data platforms, artificial intelligence, advanced EDCs, business intelligence and the Internet of Things (IoT) – are designed to be “data-centric” and can be applied to enable various clinical research processes and methodologies (e.g., Risk Based Monitoring, Data Surveillance, Metadata repository…).

    APPLICABILITY AND ADVANTAGES

    Ending error-prone repetitive tasks

    When data scientists or reviewers are forced to do the same tasks in a repetitive manner, the process becomes prone to errors, expensive, and burdened by re-work and time constraints. New technologies are becoming available that can automate these tiresome processes. Instead of having to manually repeat the same processes, data reviewers can utilize modern insight platforms to track or review data and to help them draw inferences. [3] These technologies can identify changes in the underlying data and notify users automatically when data changes. Data reviewers can be notified in real time when they have a new patient to review, when a site comes online, or whatever else they are interested in knowing.
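
The change-notification idea can be sketched as follows. This is a generic illustration of detecting what changed between data snapshots, not the API of any particular insight platform; each record is fingerprinted and only the differences between review cycles are surfaced to reviewers:

```python
import hashlib
import json

# Illustrative sketch (not any specific platform): detect changes in
# underlying trial data between review cycles by fingerprinting each
# record, then notify reviewers only about what is new or modified.

def fingerprint(record: dict) -> str:
    """Stable hash of a record's content."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

def detect_changes(previous: dict, current: dict) -> dict:
    """Compare snapshots keyed by patient ID; return what a reviewer should see."""
    prev_hashes = {pid: fingerprint(r) for pid, r in previous.items()}
    return {
        "new_patients": [p for p in current if p not in previous],
        "modified": [
            p for p, r in current.items()
            if p in previous and fingerprint(r) != prev_hashes[p]
        ],
    }

snapshot_monday = {"P001": {"sbp": 120}, "P002": {"sbp": 135}}
snapshot_tuesday = {"P001": {"sbp": 120}, "P002": {"sbp": 141}, "P003": {"sbp": 118}}

print(detect_changes(snapshot_monday, snapshot_tuesday))
# A real platform would push these as real-time notifications instead of printing.
```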

    Building efficiency from the start

    A well-designed database at the beginning of the trial plays a significant role in promoting efficient data review over the course of the trial and helps with decision making about customizing or modifying the trial accordingly. An eClinical CDISC Metadata Repository (MDR) can be used to coordinate and continually update data transformations and mappings between different terminologies and data sources, even when the data sources change over time. Use of CDISC data standards along with an MDR allows sponsors to design a trial effectively and in a standard format that is easy to analyze and trace. A well-designed MDR typically contains data far beyond simple definitions of the various data structures. Typical repositories store dozens to hundreds of separate pieces of information about each data structure [4]. Of note, smaller sponsors can collaborate with major CROs with greater technological advances to help simplify the journey of the clinical trial from the beginning.
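
The kind of per-variable record an MDR might hold can be pictured as below. The fields shown are a small hypothetical subset of the dozens a production repository would store, and the mapping text is invented for illustration:

```python
from dataclasses import dataclass

# Hypothetical sketch of the kind of per-variable record a metadata
# repository (MDR) might hold: far more than a simple definition.
@dataclass
class VariableMetadata:
    name: str              # e.g., an SDTM-style variable name
    label: str
    data_type: str
    codelist: tuple = ()   # controlled terminology, if any
    origin: str = ""       # where the value comes from (CRF, derived, ...)
    mapping: str = ""      # transformation from the collection source

mdr = {
    "VSORRES": VariableMetadata(
        name="VSORRES",
        label="Result or Finding in Original Units",
        data_type="text",
        origin="CRF",
        mapping="copied from the EDC vital-signs form field 'result'",
    ),
    "SEX": VariableMetadata(
        name="SEX",
        label="Sex",
        data_type="text",
        codelist=("M", "F", "U"),
        origin="CRF",
    ),
}

# When a data source changes, only the mapping entry needs updating;
# downstream transformations read it from the repository.
print(mdr["SEX"].codelist)
```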

    RISK BASED MONITORING

    Risk Based Monitoring (RBM) is a clinical trial-monitoring technique that fulfils regulatory requirements but moves away from 100% source data verification (SDV) of patient data. It employs various tools, platforms and dashboards to identify signals which indicate potential issues with (for example) trial conduct, safety, data integrity, compliance and enrolment. This allows the study team to concentrate on high-value tasks and focuses resources on specific trial-related matters. RBM strategies come in multiple approaches, such as statistical, centralized, remote, reduced and triggered monitoring. [5]

    Centralized data monitoring and RBM can much more readily detect anomalies, including fraud or potential fabrication of data. There are straightforward tools available that can be applied to identify potential fraud. Such tools can look for a non-random distribution of data, or for reduced data variability at a particular site, which could indicate that data is being copied or reproduced with few modifications. By comparing the distribution of data collected at one site against data collected at other sites, data reviewers can target potential sites that have issues. [6]
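
The reduced-variability check is straightforward to sketch. Using simulated blood pressure readings (illustrative values and an illustrative threshold, not a validated fraud-detection rule), a site whose measurements vary far less than the pooled variability is flagged for closer review:

```python
import statistics

# Sketch of the reduced-variability check: a site whose measurements vary
# far less than everyone else's may be copying or reproducing values.
# Data below is simulated for illustration.
site_sbp = {  # systolic blood pressure readings per site
    "Site A": [118, 131, 125, 142, 110, 128, 137, 121],
    "Site B": [122, 119, 134, 127, 140, 115, 129, 133],
    "Site C": [124, 124, 125, 124, 124, 125, 124, 124],  # suspiciously flat
}

pooled_sd = statistics.stdev([v for vals in site_sbp.values() for v in vals])

suspicious = [
    site
    for site, vals in site_sbp.items()
    # flag sites with less than a quarter of the pooled variability
    if statistics.stdev(vals) < 0.25 * pooled_sd
]

print(suspicious)
```

In practice the threshold would be set statistically and combined with other signals (digit preference, non-random distributions) before any site is questioned.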

    HOLISTIC DATA REVIEW VERSUS JUST DATA EXPLORATION

    A key part of data review is being able to efficiently identify data quality issues. Data review can be accelerated using dynamic visualizations, either starting with a summary view and generating statistical tables and reports from it, or using it as a tool throughout the ongoing review [7]. Statistics cannot be ignored in the initial stages of clinical trials: statistically driven analyses provide an aggregate view and then allow a user to drill down to understand all the data, at either the patient level or the trial level, in terms of evaluating safety and efficacy but also detecting trends or anomalies in the data. Mere exploration of the data without standardization (e.g., CDISC standards) can miss key signals of clinical significance, corrupted data, unformatted data, fabricated data and other potential data fraud [8].

    Pinnacle 21 software can be utilized for managing data quality issues and CDISC compliance. It is a web-based, cloud-hosted application that encompasses six quality dimensions, as pictured in Figure 1. This application helps identify gaps in the data review so that appropriate corrections can be made for a quality data review, expediting the process of FDA submission readiness.

    Figure 1 - Data Fitness Scorecard


    The tool compares the latest validation of the data against an older validation and clearly identifies changes that have occurred since the data was last validated [5]. This simplifies the process enormously when review is performed on large volumes of data. The more data you have, typically the greater the complexity, and identifying the key signals is a challenge.
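
Conceptually, identifying what changed between two validation runs is a set difference over issue records. The sketch below is a hypothetical illustration of that idea (invented issue records; this is not Pinnacle 21's actual implementation or output format):

```python
# Conceptual sketch of comparing two validation runs. The issue records
# below are hypothetical examples, keyed by (rule ID, domain, message).
old_run = {
    ("SD0002", "DM", "NULL value in variable marked as Required"),
    ("SD1117", "AE", "Value of AESTDTC is after AEENDTC"),
}
new_run = {
    ("SD1117", "AE", "Value of AESTDTC is after AEENDTC"),
    ("CT2002", "VS", "VSORRESU value not found in controlled terminology"),
}

resolved = old_run - new_run    # issues fixed since the last validation
introduced = new_run - old_run  # issues that appeared since the last validation

for rule, domain, message in sorted(introduced):
    print(f"NEW   [{rule}] {domain}: {message}")
for rule, domain, message in sorted(resolved):
    print(f"FIXED [{rule}] {domain}: {message}")
```

Surfacing only the delta is what makes repeated validation of large datasets tractable for reviewers.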

    BIG DATA CONCEPT

    Today, the cloud allows companies to include terabytes of unstructured data from many different real-world data sources (e.g., EMRs, genetic profiles, phenotypic data, and mHealth devices). Technology exists to collect this unstructured, real-world data from a myriad of systems, organize it into comparable formats, analyze it, and visualize the results. Technology enables reviewers to more easily explore the data for unexpected patterns. These unexpected patterns can lead to new hypotheses that can be validated by the trial data [2]. As big data analytics systems are integrated with multiple technologies like electronic health records (EHRs), radiation oncology information systems (ROISs) and treatment planning systems (TPSs), and applied to all patients treated, they become strong resources for improving trial design [10]. Successfully carrying out these integrations requires coordination with multiple disciplines and stakeholders to attain appropriate access and to implement standardization.

    INTERACTIVE CLINICAL DATA VISUALIZATIONS

    Nowadays, vast amounts of data are collected during any clinical trial, and it is essential for pharmaceutical sponsors to understand these data in detail to make accurate decisions. Business intelligence (BI) reporting tools, like Spotfire, are available to help visualize trial progress with data-driven dashboard features. Data reviewers and researchers can see and understand the trial progress and status in near real time. Clinical data visualizations (like Anscombe’s Quartet in Figure 2) allow us to investigate and see relationships and patterns in data, while tables can be used as a supportive tool to describe summary patterns. Spotfire and other BI tools can visualize domains that are often not presented graphically, e.g. disposition data. They are powerful tools to investigate data and share detailed insights in an efficient way. They also allow slicing and drilling through the data and interactively changing the level of detail one wants to see. This allows users to thoroughly investigate and understand Study Data Tabulation Model (SDTM) and Analysis Data Model (ADaM) datasets in an efficient manner and to easily identify potential sources of data inconsistencies. [7] [8]

    Figure 2: Anscombe’s Quartet.
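
The point of Anscombe's Quartet is that four datasets with visibly different shapes share nearly identical summary statistics, which is why tables alone cannot substitute for visualization. The classic published values can be checked directly with the standard library:

```python
import statistics

# Anscombe's Quartet: four x/y datasets with near-identical summary
# statistics but completely different shapes when plotted.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = [
    (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
]

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

for i, (xs, ys) in enumerate(quartet, 1):
    print(
        f"set {i}: mean(x)={statistics.fmean(xs):.2f} "
        f"mean(y)={statistics.fmean(ys):.2f} r={pearson(xs, ys):.3f}"
    )
# All four sets print mean(x)=9.00, mean(y)=7.50 and r close to 0.816,
# yet a scatter plot reveals a line, a curve, and two outlier patterns.
```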


    CHALLENGES

    The emergence of technologies that can be successfully applied to clinical trials has not been without challenges. Data privacy and data confidentiality are critical factors that need to be considered when implementing new technologies. Further, “Big Data” solutions have high costs – financial, technical, and resource allocation – and require significant process changes and political capital to implement. The support needed for implementing new technologies also depends on addressing cost versus benefit to PQI (practice quality improvement) and clinical translational research efforts [9]. Validation is also a significant undertaking when implementing a new technology: the validation of new technologies must be thoroughly reviewed and confirmed for regulatory compliance.

    CONCLUSION

    The recent surge of new technology initiatives for reviewing data for analysis and medical decisions is expected to have a positive impact on clinical trials. Increased standardization of common data elements and terminology can assist in streamlined trial design and exchange of data. Standardization between trials will allow easier multi-study analysis. Standardization and quality improvement efforts go hand in hand with a maturing big data infrastructure, providing collateral benefits to data curation for randomized trials [10]. The quality and power of clinical research studies have the potential to increase tremendously with the use of new technological approaches.

    REFERENCES

    1. Morrison, Rick. (2013). The Impact of IT Innovation on Clinical Data Management: SCDM-Data-Basics_Summer-Issue-2013

    2. James, Streeter: Role of Big data in Clinical Trials: http://www.appliedclinicaltrialsonline.com/role-big-data-clinical-trials

    3. Risk Based Monitoring: https://www.quanticate.com/risk-based-monitoring

    4. Metadata Repository: https://en.wikipedia.org/wiki/Metadata_repository

    5. Pinnacle 21: https://www.pinnacle21.com/products/validation

    6. Miclaus, Kelci. Efficient Data reviews and Quality in Clinical trials: https://www.quanticate.com/blog/bid/106754/efficient-data-reviews-and-quality-in-clinical-trials-video

    7. Anscombe, F. J. (1973). Graphs in Statistical Analysis: https://www.quanticate.com/blog/interactive-clinical-data-visualizations

    8. Chatterjee, S.; Firat, A. (2007): Generating Data with Identical Statistics but Dissimilar Graphics: A Follow up to the Anscombe Dataset: https://www.quanticate.com/blog/interactive-clinical-data-visualizations

    9. Mayo CS, Kessler ML, Eisbruch A, Weyburne G, Feng M, Hayman JA, et al. The big data effort in radiation oncology: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5514231/

    10. Big Data in Designing Clinical Trials: Opportunities and Challenges: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5583160/#B24

    ABOUT THE AUTHOR

    Sravankumar Basani I completed a master’s degree in Formulation Sciences from the University of Greenwich, UK, and a bachelor’s degree in Pharmacy from Kakatiya University, India. I am currently working at PAREXEL International India Pvt. Ltd. as a Clinical Data Analyst III. I have nine years of experience in CDM and have worked on multiple therapeutic areas of clinical trials, most significantly oncology studies. I am experienced in multiple CDMS platforms: DataLabs, RAVE, InForm, OC-RDC and Argus. I am involved in all three phases of the trial (set-up, conduct and lock), and have worked extensively on study design, vendor reconciliation activities and coordination with multiple functional groups.


    The Clinical Data Management Skills Trinity By Maria Fernanda Posada, Antonio Rivas

    When searching for a new job or a different learning opportunity, the data management world is a bit of a mystery. It is difficult to find a clear explanation of the overall occupation and, above all, the impact it will have on your professional and personal life.

    It is important to communicate the most valuable abilities that will be developed once you decide to become part of the data management world. Understanding these will hopefully engage the younger workforce, which is full of talent and willing to work with passion to bring creativity and new energy to the business.

Although much of the work is done on a computer, a clinical data manager cannot be replaced by programs or applications, given the challenges this job presents. Each piece of information must be analyzed and reviewed in detail in order to take the right actions and obtain reliable results. Computers can perform many of these activities in a sophisticated manner; however, when we deal with clinical data from patients receiving treatments, the results will not always be as expected, and there can be findings or unexpected trends that only a human eye can identify.

Pursuing a career in Clinical Data Management (CDM) not only builds technical abilities, but also encourages the development of soft skills by challenging individuals to step out of their comfort zone, generate innovative ideas and think outside the box. Challenging the status quo encourages growth both professionally and personally.

There are human skills that cannot be programmed into a computer, and they represent the key difference between technology and a human being. A person building a career in data management will develop many skills, but three in particular will be required more often than the others at every step of the development process, from the very beginning of a CDM career. They will also ensure success and growth in the company, in the role and in yourself: Effort, Versatility and Courage.

Effort: Being part of the data management world (especially as a first job) is challenging and requires a great deal of effort. You must learn concepts that are completely new to you and master the language of data management.

This profession requires daily effort: first in learning the job role and the importance of our work, then in staying up to date with current responsibilities. This fosters a persistent and coherent workforce, focused on minimizing risk and ensuring quality.

Versatility: A career in data management also cultivates versatility through the nature of a person’s assignments. CDM maturity is achieved by facing situations that demand adaptation and flexibility for success to follow. Knowledge is paramount, but not enough on its own; versatility is critical when strategic ideas are required and, conversely, when changes are not possible.

As their career advances, data managers confront challenges, internalize the principles of the work, become more detail oriented to reduce critical errors and focus on process improvements. These skills allow them to lead teams more effectively.

As the data manager evolves and adapts, they accumulate best practices, sometimes by learning from their own mistakes and sometimes by sharing in others’ experiences. Facing failure is difficult and losing is painful, but adapting and growing is enriching.

CDMs must also adapt to ever-changing business demands; what you know today may no longer apply tomorrow.

The Clinical Data Management Skills Trinity
By Maria Fernanda Posada and Antonio Rivas



Courage: Rounding out the trinity is the courage to speak up. It can be difficult to speak in a large multifunctional team setting, since the level of understanding of various topics differs. There will be times when you are not an expert or have no previous experience with a specific topic, but you must find the bravest version of yourself and speak up.

While learning the job responsibilities, you may think you cannot continue, especially when facing challenges for which you feel unprepared. In time, you will learn to negotiate with more experienced colleagues, to defend your actions, and to accept your errors with the knowledge to mitigate them.

This is courage: facing difficult moments, having challenging conversations, being able to say no when you are certain of the correct path, and proposing innovative ideas to those who may not want to challenge the status quo. Acting with courage forces us to be better and allows the business to evolve. The data manager is innovative and creative, capable of transmitting their ideas to others through the leadership of their team.

Professional achievements accumulate through experiences over time (e.g., overcoming challenges through problem solving or developing a new or updated process) rather than being accomplished day to day.

Data managers are not born as data managers; in fact, most universities do not teach this profession, even though opportunities are available in many countries around the world. Many successful people in data management stumbled into the field by chance, perhaps without knowing the myriad of opportunities available.

Within the CDM role, it is quite possible to start out managing queries and, two years later, to be designing a complex database, simplifying work processes and delivering higher quality in your job. The skills trinity will help you achieve those objectives. Now that you understand the purpose of this role better, remember: Effort, Versatility and Courage are the key.

    ABOUT THE AUTHOR

Maria Fernanda Posada has a BS in Veterinary Medicine from the National University of Colombia, a Magister Scientiae degree in Tropical Medicine from the National University of Costa Rica, and a degree in International Business from the Universidad de Salamanca, Spain. She joined Merck DM in 2014 and has performed different roles in DM with increasing levels of responsibility: CDM, Senior CDM and Lead CDM for oncology trials. She is now a Manager at MSD Colombia, responsible for leading a group of oncology DM staff. She also leads global training initiatives in the Oncology Therapeutic Area, as well as strategic projects.
Calle 127A #53A – 45, Torre 3, 8th floor
[email protected]
+57-3103025374

Antonio Rivas has a BS in Biomedical Engineering with a minor in Mathematics from the Universidad de los Andes. Joining Merck DM in 2014, he has rapidly progressed through multiple roles with increasing levels of responsibility: CDM and Senior CDM for oncology trials across several indications. He is now a Lead CDM at MSD Colombia, responsible for several oncology trials, and is part of the Oncology Standards team. He is also active in improvements to processes, data standards and cross-functional projects.
Calle 127A #53A – 45, Torre 3, 8th floor
[email protected]
+573222182294

  • Get better data faster. Reduce clinical trial costs. Streamline trial execution.

    veeva.com

    Run the Trial You Want

MODERN | ADAPTIVE | FAST

With Veeva Vault EDC, design and build studies in days, not weeks, with user-friendly features like drag-and-drop functionality and configurable review and approval workflows. Real-time edit checks and personalized dashboards give you cleaner data faster. And a modern cloud platform means mid-study changes happen with no downtime.



    Submission Requirements


    PUBLICATION POLICY

    We welcome submission of materials for publication in Data Basics. Materials should preferably be submitted in electronic form (MS Word). Acceptance of materials for publication will be at the sole discretion of the Editorial Board. The decision will be based primarily upon professional merit and suitability. Publications may be edited at the discretion of the Editorial Board.

    Neither SCDM nor the Data Basics Editorial Board endorses any commercial vendors or systems mentioned or discussed in any materials published in Data Basics.

    ADVERTISING POLICY

AD RATES** | x1          | x2                    | x3                    | x4
FULL Page  | $1,064 each | $1,008 each ($2,016)  | $960 each ($2,880)    | $906 each ($3,624)
HALF Page  | $740 each   | $700 each ($1,400)    | $670 each ($2,010)    | $630 each ($2,520)
QTR Page   | $450 each   | $425 each ($850)      | $402 each ($1,206)    | $378 each ($1,512)

    **Ads are net, non-commissionable.

    Advertisers purchasing multiple ad packages will have the option of placing those ads anytime within the 12-month period following receipt of payment by SCDM.

Quarter Page = 3 5/8 inches x 4 7/8 inches
Half Page-Vertical = 3 5/8 inches x 10 inches
Half Page-Horizontal = 7 1/2 inches x 4 7/8 inches
Full Page = 7 1/2 inches x 10 inches

MECHANICAL REQUIREMENTS: Do not send logos/photos/images from word processing software, presentation software or websites. Files should be saved in the native application/file format in which they were created, at a resolution of 300 dpi or higher. Acceptable file formats include AI, EPS and high-resolution PSD, JPEG, TIF and PDF.

    PAYMENT: Payment must be received with advertisement. Space reservations cannot be made by telephone. There is NO agency discount. All ads must be paid in full.

CANCELLATIONS: Cancellations or changes in advertising requests made by the advertiser or its agency five days or more after the submission deadline will not be accepted.

GENERAL INFORMATION: All ads must be pre-paid. Publisher is not liable for advertisements printed from faulty ad materials. Advertiser agrees to hold SCDM harmless from any and all claims or suits arising out of publication of any advertising. SCDM assumes no liability, including but not limited to compensatory or consequential damages, for any errors or omissions in connection with any ad. SCDM does not guarantee placement in specific locations or in a given issue. SCDM reserves the right to refuse or pull ads for space or content.

    Authors:

    For each article published, authors receive 0.2 CEUs.

    Disclaimer:

    The opinions expressed in this publication are those of the authors. They do not reflect the opinions of SCDM or its members. SCDM does not endorse any products, authors or procedures mentioned in this publication.

    Please submit all forms, artwork, and payments to any of the addresses below:

Global Headquarters
Society for Clinical Data Management, Inc
Boulevard du Souverain, 280
B-1160 Brussels, Belgium
Tel: +32-2-740.22.37
Fax: [email protected]

North America Office
Society for Clinical Data Management, Inc
7918 Jones Branch Drive, Suite 300
McLean, VA 22102, USA
Tel: +1-703-506-3260
Fax: [email protected]

India Office
Society for Clinical Data Management, Inc
203, Wing B, Citipoint (Near Hotel Kohinoor Continental)
J. B. Nagar, Andheri-Kurla Road
Andheri (East), Mumbai – 400059, India
Tel: +91-22-61432600
Fax: [email protected]

China
[email protected]
