A Data Quality Framework to Assess Healthcare Data in Saudi Arabia: An Automated Approach
Speaker(s)
Al-Jafar R1, Abuharb A2, Alsaawi FA1, Alzeer A3
1Lean Business Services, Riyadh, Riyadh Region, Saudi Arabia, 2Lean Business Services, Riyadh, 01, Saudi Arabia, 3Lean Business Services, al-Riyad, Saudi Arabia
OBJECTIVES: This study aims to assess the quality of healthcare data among Ministry of Health facilities in Saudi Arabia. Toward this end, a data quality engine will be built and validated, and a business rules list for data quality will be established.
METHODS: The study used a probability sampling technique to draw a representative sample from a large dataset gathered from three main health information systems in Saudi Arabia. Within this sample, we checked the quality of 25 data elements for outpatient data and 22 for inpatient data. The process consisted of three phases: (i) column identification and data cleansing phase; (ii) measurements and assessment phase; and (iii) analysis and improvement phase. The measurements and assessment phase was based on five criteria: uniqueness (the number of duplicates of a patient ID number), completeness (the ratio of completed values to the total number of values in the dataset), validity (whether a value conforms to the syntax [format, type, range] of its definition in the Saudi Health Data Dictionary), consistency (the ratio of values matching those in the source of truth), and timeliness (the degree to which data are up to date relative to a specific point in time).
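As a rough illustration of how four of the criteria defined above could be computed, the following is a minimal Python sketch, not the authors' engine. The function names, the sample records, the date pattern, and the choice to score validity over non-missing values only are all assumptions made for this example; the actual rules come from the Saudi Health Data Dictionary, which is not reproduced here. Timeliness is left out because the abstract does not state the reference point in time.

```python
import re


def _is_missing(value):
    """Treat None and blank strings as missing (an assumption for this sketch)."""
    return value is None or str(value).strip() == ""


def duplicate_count(ids):
    """Uniqueness: number of records whose ID repeats an ID already seen."""
    return len(ids) - len(set(ids))


def completeness(values):
    """Completeness: ratio of completed values to the total number of values."""
    if not values:
        return 0.0
    return sum(1 for v in values if not _is_missing(v)) / len(values)


def validity(values, pattern):
    """Validity: share of non-missing values that match a format rule.

    `pattern` is a hypothetical regex standing in for a dictionary definition.
    """
    present = [str(v) for v in values if not _is_missing(v)]
    if not present:
        return 0.0
    return sum(1 for v in present if re.fullmatch(pattern, v)) / len(present)


def consistency(values, reference):
    """Consistency: share of values equal to the source-of-truth values."""
    pairs = list(zip(values, reference))
    if not pairs:
        return 0.0
    return sum(1 for v, r in pairs if v == r) / len(pairs)


# Hypothetical sample records, invented purely for illustration.
records = [
    {"patient_id": "1000000001", "gender": "M", "visit_date": "2022-01-05"},
    {"patient_id": "1000000002", "gender": "", "visit_date": "2022-01-06"},
    {"patient_id": "1000000002", "gender": "F", "visit_date": "06/01/2022"},
    {"patient_id": "1000000004", "gender": "F", "visit_date": "2022-01-07"},
]
source_of_truth_gender = ["M", "F", "F", "F"]  # hypothetical reference values

ids = [r["patient_id"] for r in records]
genders = [r["gender"] for r in records]
dates = [r["visit_date"] for r in records]

report = {
    "duplicates": duplicate_count(ids),
    "completeness_gender": completeness(genders),
    "validity_visit_date": validity(dates, r"\d{4}-\d{2}-\d{2}"),
    "consistency_gender": consistency(genders, source_of_truth_gender),
}
print(report)
```

In this sketch each criterion is a small pure function over one column, so a rules list could plausibly map each data element to the checks that apply to it.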
RESULTS: The study data sample comprised 5,716,953 records. The proposed approach detected most column types with high confidence (80%). Within the measurements and assessment phase, the uniqueness, completeness, validity, and consistency percentages were 100, 83, 72, and 94, respectively.
CONCLUSIONS: The proposed data quality framework indicated that health data in Saudi Arabia could be improved and highlighted the areas to be targeted. Our automated approach can be applied to real-world data in other health systems to enhance data quality assessment.
Code
RWD175
Topic
Study Approaches
Topic Subcategory
Electronic Medical & Health Records
Disease
No Additional Disease & Conditions/Specialized Treatment Areas