Organizations make strategic and operational decisions based on data. Poor quality data can negatively influence how a company
is perceived in the marketplace; therefore, it is critical to ensure data quality is given the highest priority. With data
standards development in the biopharmaceutical industry, there are now more opportunities for businesses to ensure data quality.
Using standards can increase process efficiency and effectiveness, saving resources across the clinical trial data lifecycle
and improving compliance. Implementing standards, while a goal and a trend in clinical development, presents some key
challenges, including how to leverage different standards across the development lifecycle. Genzyme has implemented or is
in the process of implementing CDISC standards end-to-end, including PROTOCOL, CDASH, LAB, SDTM, ADaM, and Controlled Terminology
as well as utilizing BRIDG as its underlying information model. To leverage the standards, we are building a Metadata Repository
(MDR) to govern data collection, data processing, and data submission, and to coordinate the use of different standards enterprise-wide.
To increase efficiency and effectiveness, a data validation tool is needed to improve data quality and to ensure that
data provided by one of Genzyme's many partners or by internal teams meets all specified requirements, supporting "quality
by design."
Importance of Data Quality
Data quality isn't just about the data. It is about people's understanding of what it is, what it means, and how it should
be used. Poor quality data will:
» Increase costs through wasted resources, the need to correct and deal with reported errors, and the inability to optimize
business processes; and
» Reduce revenue through customer dissatisfaction, lowered employee morale, and poorer decision-making.
This is an event-driven, process-oriented world, and quality data will be essential to success.
Data Quality Principles
Data quality plays an important role in our business world as well as in daily life. Data quality is not linear and has many
dimensions such as accuracy, completeness, consistency, timeliness, and auditability. Having data quality on only one or
two dimensions is little better than having none. Many factors influence data quality, including data design, data
processes, data governance, and data validation.
Data design is about discovering and completely defining your application's data characteristics and processes. It is a process
of gradual refinement, from the coarse ("What data does your application require?") to the precise data structures and processes
that provide it. With a good data design, your application's data access is fast, easily maintained, and can gracefully accept
future data enhancements. The process of data design includes identifying the data, defining specific data types and storage
mechanisms, and ensuring data integrity by using business rules and other run-time enforcement mechanisms. A good data design
defines data availability, manageability, performance, reliability, scalability, and security.
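As a minimal sketch of the last two steps (defining specific data types and enforcing integrity through business rules at run time), consider the following. The record type, its field names, and the plausible-range limits are all hypothetical, chosen only for illustration; they are not taken from any standard or from Genzyme's systems.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class VitalSignsRecord:
    """Hypothetical record; field names and limits are illustrative assumptions."""

    subject_id: str
    visit_date: date
    systolic_bp_mmhg: int

    def __post_init__(self) -> None:
        # Business rules are checked when the record is created, so an
        # invalid record never enters downstream processing.
        if not self.subject_id.strip():
            raise ValueError("subject_id must not be empty")
        if self.visit_date > date.today():
            raise ValueError("visit_date must not be in the future")
        if not 50 <= self.systolic_bp_mmhg <= 300:
            raise ValueError("systolic_bp_mmhg is outside the assumed range 50-300")
```

Declaring the types documents the design, and placing the rules in the constructor means every code path that builds a record is subject to the same checks.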
Data governance can be defined as:
» A set of processes that ensures that important data assets are formally managed throughout the enterprise;
» A system of decision rights and accountabilities for information-related processes, executed according to agreed-upon
models which describe who can take what actions with what information, and when, under what circumstances, using what methods;
» A quality control discipline for assessing, managing, using, improving, monitoring, maintaining, and protecting organizational
information;
» Putting people in charge of fixing and preventing issues with data so that the enterprise can become more efficient;
and
» Using technology, in whatever form is necessary, to aid the process.
Data governance describes an evolutionary process for a company, altering the company's way of thinking and setting up the
processes to handle information so that it may be utilized by the entire organization. It ensures that data can be trusted
and that people can be made accountable for any business impact of low data quality. To ensure data quality, data governance
processes need to be developed.
Data validation comprises the processes and technologies involved in ensuring that data values conform to business requirements
and acceptance criteria. It uses routines, often called "validation rules" or "check routines," that check the correctness,
meaningfulness, and security of data entered into the system. The rules may be implemented through the automated facilities
of a data dictionary, or by the inclusion of explicit application program validation logic.
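The idea of validation rules as explicit application logic can be sketched as a list of named checks applied to each record. This is an illustrative sketch only: the variable names are loosely modeled on demographics data, and the permitted values and ranges are assumptions, not rules from any published standard or from the tool described here.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Rule = Tuple[str, Callable[[Record], bool]]

# Hypothetical rules: each pairs a message with a check that returns True on success.
RULES: List[Rule] = [
    ("USUBJID must be a non-empty string",
     lambda r: isinstance(r.get("USUBJID"), str) and r["USUBJID"] != ""),
    ("SEX must be one of M, F, U",
     lambda r: r.get("SEX") in {"M", "F", "U"}),
    ("AGE must be an integer from 0 to 120",
     lambda r: isinstance(r.get("AGE"), int) and 0 <= r["AGE"] <= 120),
]


def validate(record: Record, rules: List[Rule] = RULES) -> List[str]:
    """Return the messages of all rules the record fails (empty list if valid)."""
    return [message for message, check in rules if not check(record)]


if __name__ == "__main__":
    print(validate({"USUBJID": "S-001", "SEX": "M", "AGE": 34}))
    print(validate({"USUBJID": "", "SEX": "X", "AGE": 34}))
```

Keeping the rules as data rather than scattered conditionals makes it straightforward to load them from a data dictionary or metadata repository instead of hard-coding them.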
What can help us implement these data quality principles appropriately in the business process? We think it is important
and useful to draw on industry consensus by implementing CDISC standards as a basis for data design and validation.