Building Cancer Knowledge Graph for Clinical Decision Support Applications

Author(s)

Hong N¹, Lin W², Li X³, Zhang Q³, Yang Y³, Guo Q²
¹Digital Health China Technologies Co. Ltd., Beijing, China, ²National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China, ³Digital Health China Technologies Co. Ltd., Beijing, China

OBJECTIVES: This study aimed to build a Chinese cancer knowledge graph that will facilitate integrated data analysis and clinical decision support for the care of patients with cancer in China.

METHODS: The entities and relations of the knowledge graph were collected from existing Chinese terminologies and knowledge sources, including ICD-O-3 Chinese Version, ICD-9-PC Chinese Version, officially issued commonly used Chinese clinical terms, Chinese guidelines for the diagnosis and treatment of cancer, and Chinese medical books and literature. A total of 209 unstructured clinical guidelines and clinical study articles were collected for automatic knowledge graph construction. A bidirectional LSTM was used for named entity recognition, and a dual attention mechanism and multi-entity packaging training were adopted for relation extraction. The entities and relations extracted from multiple sources were aligned and merged into a single knowledge graph, which was managed in the graph database Neo4j. The performance of the natural language processing components was evaluated by average precision, recall, and F1-score. The usability of the knowledge graph was evaluated through a question-answering-based clinical decision application.
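The evaluation metrics named above can be illustrated with a minimal sketch. It assumes exact-match scoring of entity spans and a plain per-class (macro) average; the abstract does not state which matching or averaging scheme was used, and the function names and toy annotations below are hypothetical, not taken from the study.

```python
def per_class_prf(gold, predicted):
    """Score exact-match entities per class.

    gold, predicted: sets of (doc_id, start, end, entity_class) tuples.
    Returns {entity_class: (precision, recall, f1)}.
    """
    classes = {e[3] for e in gold} | {e[3] for e in predicted}
    scores = {}
    for cls in classes:
        g = {e for e in gold if e[3] == cls}
        p = {e for e in predicted if e[3] == cls}
        tp = len(g & p)  # true positives: identical span and class
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[cls] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of precision, recall, and F1 over classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Toy annotations (hypothetical, for illustration only).
gold = {
    ("d1", 0, 2, "Disease"),
    ("d1", 5, 8, "Drug"),
    ("d2", 0, 3, "Disease"),
}
predicted = {
    ("d1", 0, 2, "Disease"),
    ("d1", 5, 8, "Drug"),
    ("d2", 1, 3, "Disease"),  # wrong span boundary, counts as an error
    ("d2", 4, 6, "Drug"),     # spurious entity
}

scores = per_class_prf(gold, predicted)
avg_precision, avg_recall, avg_f1 = macro_average(scores)
```

Under per-class averaging, the averaged F1-score need not equal the harmonic mean of the averaged precision and recall, because F1 is computed within each class before averaging.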

RESULTS: A total of 9442 entities in 7 classes, 6559 relations in 10 types, and 7 attributes were built into the knowledge graph. Initial evaluation of the natural language processing results showed that the average precision, recall, and F1-score of entity extraction were 85.91%, 84.17%, and 84.30%, respectively. The corresponding averages for relation extraction were 78.33%, 78.58%, and 78.19%. Iterative manual review and quality assurance of the medical knowledge were conducted to generate an updated Chinese knowledge graph of cancer. Furthermore, a question-answering module, a retrieval module, and a reasoning module were developed on top of the knowledge graph to provide decision-making support for cancer care.
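To make the idea of question answering over a knowledge graph concrete, the following is a minimal sketch only. It uses an in-memory triple store as a stand-in for Neo4j and a keyword lookup as a stand-in for the question-answering module; the class, the relation names (`treated_by`, `has_symptom`), the keyword table, and the toy triples are all hypothetical and do not come from the study's graph or implementation.

```python
from collections import defaultdict


class TripleStore:
    """Tiny in-memory (head, relation, tail) store; a stand-in for a graph database."""

    def __init__(self):
        self._tails = defaultdict(set)

    def add(self, head, relation, tail):
        self._tails[(head, relation)].add(tail)

    def query(self, head, relation):
        # .get avoids creating empty entries for unknown keys
        return sorted(self._tails.get((head, relation), set()))


# Hypothetical mapping from question keywords to relation types.
QUESTION_PATTERNS = {
    "treat": "treated_by",
    "symptom": "has_symptom",
}


def answer(store, question, known_entities):
    """Find one known entity and one relation keyword in the question, then look up tails."""
    q = question.lower()
    entity = next((e for e in known_entities if e.lower() in q), None)
    relation = next((r for kw, r in QUESTION_PATTERNS.items() if kw in q), None)
    if entity is None or relation is None:
        return []  # question not understood by this simple matcher
    return store.query(entity, relation)


# Toy triples for illustration only.
store = TripleStore()
store.add("lung cancer", "treated_by", "chemotherapy")
store.add("lung cancer", "treated_by", "surgery")
store.add("lung cancer", "has_symptom", "cough")

entities = ["lung cancer"]
treatments = answer(store, "How is lung cancer treated?", entities)
unknown = answer(store, "What is the weather today?", entities)
```

A production system would replace the keyword table with the trained entity and relation extractors and would issue the lookup as a graph query; the sketch only shows how a question reduces to an (entity, relation) lookup.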

CONCLUSIONS: This study demonstrated the initial results of building a Chinese cancer knowledge graph and its potential applications. As medical knowledge is updated, new entities and relations will be added continually, and the decision-making support applications will be further improved.

Conference/Value in Health Info

2021-05, ISPOR 2021, Montreal, Canada

Value in Health, Volume 24, Issue 5, S1 (May 2021)

Code

PCN187

Topic

Health Technology Assessment, Medical Technologies, Methodological & Statistical Research, Organizational Practices

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Best Research Practices, Digital Health, Systems & Structure

Disease

Oncology
