2022 World Conference on Lung Cancer (ePosters)
EP07.01-003. Metadata-Oriented Data Engineering Approach for Integration of Medical Data: A Pilot Study
PDF Summary
This pilot study explores whether a metadata-oriented data engineering approach can be applied to a small medical dataset. The researchers prepared 43 data tables covering 262 patients diagnosed with thymoma and treated at Ankara University Faculty of Medicine. Using the Python programming language with the NumPy and Pandas libraries, they built a single holistic data frame that integrates all of the tables while checking the data for accuracy and consistency. The software developed in the study detects missing and incorrect data, performs outlier analysis, and merges and deduplicates records.

Successful integration of data from multiple sources is crucial for stakeholders in the healthcare system, but traditional data processing methods are insufficient for big data processing and data mining. The software aims to address these challenges by producing metadata sets that can be used in AI-assisted data analytics studies. The researchers believe that the software can benefit multidisciplinary studies and be valuable to other research groups.

The next steps are to evaluate the software on larger datasets, especially in lung cancer research. The researchers also encourage others to test the software for further validation. The study references previous research on data processing methods and emphasizes the importance of accurate and reliable data for data mining applications.

Table 1 provides examples of erroneous data and outliers that the software detected. These include incorrect survival status entries, different gender labels for the same patient, and incorrect age entries. Identifying and correcting such errors improves the quality and reliability of the integrated dataset.
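The checks described in the summary (merging tables, deduplication, missing-data detection, conflicting labels for the same patient, and implausible values) can be illustrated with a minimal Pandas sketch. This is not the authors' software: the table contents, the column names (`patient_id`, `gender`, `age`, `survival_status`), the helper functions, and the 0 to 120 age range are all assumptions made for illustration, and a simple range check stands in for whatever outlier analysis the study actually used.

```python
import pandas as pd

# Hypothetical toy tables; all names and values are invented for illustration.
demographics = pd.DataFrame({
    "patient_id": [1, 2, 2, 3, 4],
    "gender": ["F", "M", "F", "M", "F"],   # patient 2 has two different labels
    "age": [54, 61, 61, 430, 47],          # 430 is an implausible age
})
follow_up = pd.DataFrame({
    "patient_id": [1, 2, 3, 3, 4],
    "survival_status": ["alive", "dead", "alive", "alive", None],  # patient 3 is duplicated
})


def integrate(tables, key="patient_id"):
    """Drop exact duplicate rows in each table, then outer-join all tables on the key."""
    merged = None
    for table in tables:
        table = table.drop_duplicates()
        merged = table if merged is None else merged.merge(table, on=key, how="outer")
    return merged


def conflicting_ids(frame, column, key="patient_id"):
    """Return keys that carry more than one distinct non-null value in a column."""
    counts = frame.groupby(key)[column].nunique()
    return sorted(counts[counts > 1].index.tolist())


def out_of_range(frame, column, low, high):
    """Return rows whose value lies outside an assumed plausible [low, high] range."""
    return frame[(frame[column] < low) | (frame[column] > high)]


combined = integrate([demographics, follow_up])
missing_per_column = combined.isna().sum()          # missing values per column
gender_conflicts = conflicting_ids(combined, "gender")
implausible_ages = out_of_range(combined, "age", 0, 120)
```

In this toy data, the outer join keeps every patient that appears in either table, so gaps show up as missing values rather than silently dropped rows. Flagged records are only reported, not corrected; deciding which gender label or age is right would still require going back to the source records.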
Speaker
Yusuf Kahya
Topic
Mesothelioma, Thymoma, and Other Thoracic Malignancies - Clinical
Keywords
metadata-oriented data engineering
small medical dataset
data tables
thymoma patients
Ankara University Faculty of Medicine
Python programming language
NumPy library
Pandas library
data accuracy
data consistency