Technologies for Complex Intelligent Clinical Data Analysis

Cover Page
  • Authors: Baranov A.A.1, Namazova-Baranova L.S.1, Smirnov I.V.2, Devyatkin D.A.2, Shelmanov A.O.2, Vishneva E.A.1, Antonova E.V.1, Smirnov V.I.1
  • Affiliations:
    1. Scientific Center of Children’s Health, Moscow
    2. Institute for Systems Analysis, Federal Research Center «Computer Science and Control» of Russian Academy of Sciences, Moscow
  • Issue: Vol 71, No 2 (2016)
  • Pages: 160-171
  • Section: STATE OF MEDICAL SCIENCES
  • URL: https://vestnikramn.spr-journal.ru/jour/article/view/663
  • DOI: https://doi.org/10.15690/vramn663

Abstract


The paper presents a system for intelligent analysis of clinical information. The authors describe the methods implemented in the system for clinical information retrieval, intelligent diagnosis of chronic diseases, estimation of the importance of patient features, and detection of hidden dependencies between features. Results of an experimental evaluation of these methods are also presented.

Background: Healthcare facilities generate a large flow of both structured and unstructured data that contain important information about patients. Test results are usually stored as structured data, but some data are stored as natural language texts (medical history, the results of physical examination, and the results of other examinations, such as ultrasound, ECG, or X-ray studies). Many tasks arising in clinical practice can be automated by applying methods for intelligent analysis of the accumulated structured and unstructured data, which improves the quality of healthcare.

Aims: To create a comprehensive system for intelligent data analysis in a multidisciplinary pediatric center.

Materials and methods: The authors propose methods for information extraction from clinical texts in Russian. The methods are based on deep linguistic analysis. They extract mentions of diseases, symptoms, body areas, and drugs, and can recognize additional attributes such as «negation» (the disease is absent), «no patient» (the disease refers to a member of the patient’s family rather than to the patient), «severity of illness», «disease course», and «body region to which the disease refers». To extract this information, the authors use a set of hand-crafted templates and several machine learning techniques together with a medical thesaurus. The extracted information is used to solve the problem of automatic diagnosis of chronic diseases. A machine learning method for classifying patients with similar nosology and a method for determining the most informative patient features are also proposed.
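
As a rough illustration of the template-based part of this step, the sketch below tags disease mentions with the «negation» and «no patient» attributes. It is only a minimal sketch under stated assumptions: the English toy phrases, the term list DISEASE_TERMS, the cue patterns, and the function extract_mentions are all hypothetical, and the system described here works on Russian texts with deep linguistic analysis and a medical thesaurus rather than with simple regular expressions.

```python
import re
from dataclasses import dataclass, field

# Hypothetical toy lexicon; a real system would use a medical thesaurus.
DISEASE_TERMS = ["asthma", "eczema", "arthritis"]

# Hypothetical hand-crafted cue templates (illustrative only).
NEGATION_CUES = re.compile(r"\b(no|not|denies|without)\b", re.IGNORECASE)
FAMILY_CUES = re.compile(
    r"\b(mother|father|brother|sister|grandmother|grandfather)\b",
    re.IGNORECASE,
)


@dataclass
class Mention:
    term: str
    attributes: set = field(default_factory=set)


def extract_mentions(text):
    """Find disease terms sentence by sentence and attach attributes
    based on cue words that precede the term in the same sentence."""
    mentions = []
    for sentence in re.split(r"[.;!?]\s*", text):
        for term in DISEASE_TERMS:
            match = re.search(r"\b" + re.escape(term) + r"\b",
                              sentence, re.IGNORECASE)
            if not match:
                continue
            left_context = sentence[:match.start()]
            mention = Mention(term)
            if NEGATION_CUES.search(left_context):
                mention.attributes.add("negation")
            if FAMILY_CUES.search(left_context):
                mention.attributes.add("no patient")
            mentions.append(mention)
    return mentions


if __name__ == "__main__":
    note = "Patient denies asthma. Mother has eczema. Arthritis since 2010."
    for m in extract_mentions(note):
        print(m.term, sorted(m.attributes))
```

Looking only at the left context within one sentence is a deliberate simplification; attributes such as severity or disease course would need richer templates or a learned model.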

Results: The authors processed anonymized health records from the pediatric center to evaluate the proposed methods. The results show that the information extracted from the texts is applicable to practical problems. The records of patients with allergic, glomerular, and rheumatic diseases were used for the experimental assessment of the automatic diagnosis method. The authors also determined the most appropriate machine learning methods for classifying patients in each group of diseases, as well as the most informative disease signs. Using additional information extracted from clinical texts together with structured data was found to improve the quality of diagnosis of chronic diseases. The authors also obtained patterns of co-occurring disease signs.
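
One common way to look for combinations of signs that occur together is association-rule mining, which scores a candidate rule by its support and confidence. The abstract does not say which algorithm the system uses, so the sketch below is an assumption made for illustration only; the patient records, sign names, and the functions support, confidence, and frequent_pairs are hypothetical.

```python
from itertools import combinations


def support(itemset, records):
    """Fraction of records that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= record for record in records) / len(records)


def confidence(antecedent, consequent, records):
    """Estimated P(consequent | antecedent) over the records."""
    base = support(antecedent, records)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), records) / base


def frequent_pairs(records, min_support):
    """All pairs of signs whose support reaches the threshold."""
    items = sorted(set().union(*records))
    result = {}
    for pair in combinations(items, 2):
        s = support(pair, records)
        if s >= min_support:
            result[pair] = s
    return result


if __name__ == "__main__":
    # Hypothetical toy data: each record is the set of signs of one patient.
    records = [
        {"cough", "wheezing", "eosinophilia"},
        {"cough", "wheezing"},
        {"cough", "fever"},
        {"wheezing", "eosinophilia"},
    ]
    print(frequent_pairs(records, min_support=0.5))
    print(confidence({"wheezing"}, {"eosinophilia"}, records))
```

Enumerating all pairs is only practical for small sign vocabularies; level-wise algorithms such as Apriori prune the candidates when the number of signs is large.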

Conclusions: The proposed methods have been implemented in an intelligent data processing system for a multidisciplinary pediatric center. The experimental results indicate that the system can help improve the quality of pediatric healthcare.


A. A. Baranov

Scientific Center of Children’s Health, Moscow

Email: baranov@nczd.ru

Russian Federation

MD, PhD, Professor, Academician of RAS, Director

L. S. Namazova-Baranova

Scientific Center of Children’s Health, Moscow

Email: namazova@nczd.ru

Russian Federation

MD, PhD, Professor, Corresponding Member of RAS, Deputy Director

I. V. Smirnov

Institute for Systems Analysis, Federal Research Center «Computer Science and Control» of Russian Academy of Sciences, Moscow

Email: ivs@isa.ru

Russian Federation

PhD in Physics and Mathematics, Associate Professor, Head of Laboratory

D. A. Devyatkin

Institute for Systems Analysis, Federal Research Center «Computer Science and Control» of Russian Academy of Sciences, Moscow

Email: devyatkin@isa.ru

Russian Federation

Junior Researcher

A. O. Shelmanov

Institute for Systems Analysis, Federal Research Center «Computer Science and Control» of Russian Academy of Sciences, Moscow

Email: shelmanov@isa.ru

Russian Federation

PhD in Engineering, Junior Researcher

E. A. Vishneva

Scientific Center of Children’s Health, Moscow

Email: vishneva@nczd.ru

Russian Federation

MD, PhD, Head of Department

E. V. Antonova

Scientific Center of Children’s Health, Moscow

Author for correspondence.
Email: antonova@nczd.ru

Russian Federation

MD, PhD, Professor, Head of Department

V. I. Smirnov

Scientific Center of Children’s Health, Moscow

Email: support@nczd.ru

Russian Federation

PhD in Economics, Deputy Director


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
