1. Know your audience
A critical step in judging whether a data set is fit for purpose is to consider stakeholder expectations, including what they want to learn and how the outputs of the research will be used. This includes assessing whether the analysis is for internal use only or will be included in a regulatory submission.
Regulatory bodies such as the FDA have high standards for real-world data and evidence used to support new products and indications. An analysis used in a regulatory submission must meet transparency requirements for data traceability, auditing and sharing, so it is essential to understand whether a given data asset can support those needs.
Knowing the final audience for your work well in advance, and what that audience requires in terms of data quality, documentation and transparency, gives you a solid foundation before you dive into the specifics of your study.
2. Understand the research question, hypothesis or business issue
Having a comprehensive understanding of the question you’re asking, the hypothesis to be tested, or the business issue at hand is inarguably one of the most important steps in determining which data asset to employ.
Skipping this step, even in part, can set you on a path to a time-consuming, costly failure. Because the question and hypothesis are so critical, you should invest heavily in designing, defining and validating the question(s) being asked and the answer(s) being sought.
Iterating on such questions can ensure the question(s) meet the appropriate level of specificity well in advance of selecting a data asset.
For example, consider an organization that wants to investigate bariatric surgery outcomes. The following two formulations of the research goal show how the level of specificity makes one a better starting point than the other.
- Example 1: “Examine bariatric surgery to understand the outcomes.”
- Example 2: “What is the rate of success of bariatric surgery over time and what percentage of patients do not see success? What does that lack of success translate to in terms of cost?”
The second example is a better place to start because it makes clear which data elements are necessary to answer the underlying questions. From it, you can see that you need a data set that identifies patients who have undergone bariatric surgery, captures their outcomes over time and includes cost metrics. These are simple but useful considerations in selecting a fit-for-purpose data set.
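The data elements implied by the second example can be written down as a checklist and compared against the fields a candidate data set offers. The sketch below is illustrative only: the element names (`bariatric_procedure_flag`, `outcome_measure`, `followup_date`, `cost_amount`) and the candidate field list are hypothetical, and a real data asset would use its own data dictionary.

```python
# Hypothetical data elements implied by the second research question.
# The names are illustrative assumptions, not fields from any real data set.
REQUIRED_ELEMENTS = {
    "bariatric_procedure_flag": "identifies patients who have undergone the surgery",
    "outcome_measure": "patient outcomes used to define success",
    "followup_date": "lets outcomes be tracked over time",
    "cost_amount": "cost associated with a lack of success",
}


def missing_elements(available_fields, required=REQUIRED_ELEMENTS):
    """Return the required elements that the candidate data set lacks."""
    # Normalize field names so the comparison ignores case and stray spaces.
    available = {name.strip().lower() for name in available_fields}
    return sorted(name for name in required if name not in available)


# Hypothetical field list taken from a candidate data set's data dictionary.
candidate_fields = [
    "patient_id",
    "bariatric_procedure_flag",
    "cost_amount",
    "Outcome_Measure",
]

gaps = missing_elements(candidate_fields)
for name in gaps:
    print(f"Missing: {name} ({REQUIRED_ELEMENTS[name]})")
```

An empty result would suggest the candidate covers the elements the question needs. A non-empty result shows where the data set falls short before any analysis begins. A vaguer question such as Example 1 gives no basis for building such a checklist at all.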