Background and Motivation

Electronic Clinical Narratives (ECNs) have become the standard means of storing the information a practitioner finds relevant to describe and evaluate a patient's clinical episode or evolution. These documents contain descriptions of previous pathologies, procedures the patient has undergone, the evolution of a given disease, and prescribed treatments. Secondary uses of ECNs cover diverse tasks, including identifying rare medical events, predicting hospital readmissions, and public health surveillance, among others.

Identifying medical sections in the patient narratives documented in ECNs is a crucial task for higher-level applications. Section identification consists of dividing the text into semantic segments and categorizing each with one of a set of predefined labels. It adds context to the entities mentioned in a note, whose interpretation can differ completely depending on the section in which they occur: a pathology referenced in the patient's medical history could be used to predict future conditions and risks of illness, whereas symptoms reported in the Evolution section could indicate adverse reactions to a given treatment.

Shared Task Summary

The ClinAIS task, presented at IberLEF 2023 [1], addresses the automatic identification of sections in unstructured Spanish clinical documents. The task focuses on identifying seven predefined medical sections in ECNs, mainly progress notes: Present Illness, Derived from/to, Past Medical History, Family History, Exploration, Treatment, and Evolution.
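As a rough illustration of what a section-identification output might look like, the sketch below represents the seven sections as an enumeration and a segmented note as a list of labeled character spans. The names `Section`, `Segment`, and `split_by_boundaries` are hypothetical, the enum identifiers are not necessarily the label strings used in the corpus, and the example note is invented (and written in English for readability, whereas the task data is in Spanish).

```python
from dataclasses import dataclass
from enum import Enum


class Section(Enum):
    # Hypothetical identifiers for the seven sections named in the task
    # description; the corpus may use different label strings.
    PRESENT_ILLNESS = "Present Illness"
    DERIVED_FROM_TO = "Derived from/to"
    PAST_MEDICAL_HISTORY = "Past Medical History"
    FAMILY_HISTORY = "Family History"
    EXPLORATION = "Exploration"
    TREATMENT = "Treatment"
    EVOLUTION = "Evolution"


@dataclass(frozen=True)
class Segment:
    start: int      # character offset, inclusive
    end: int        # character offset, exclusive
    label: Section


def split_by_boundaries(text, boundaries):
    """Turn sorted (start_offset, Section) pairs into contiguous segments.

    Each section is assumed to run from its start offset up to the start
    of the next section, or to the end of the text for the last one.
    """
    segments = []
    for i, (start, label) in enumerate(boundaries):
        end = boundaries[i + 1][0] if i + 1 < len(boundaries) else len(text)
        segments.append(Segment(start, end, label))
    return segments


# Invented toy note, not taken from the corpus.
text = "Patient reports chest pain. Father had diabetes. Started aspirin."
boundaries = [
    (0, Section.PRESENT_ILLNESS),
    (text.index("Father"), Section.FAMILY_HISTORY),
    (text.index("Started"), Section.TREATMENT),
]
segments = split_by_boundaries(text, boundaries)

for seg in segments:
    print(f"{seg.label.value}: {text[seg.start:seg.end].strip()!r}")
```

Storing character offsets rather than copies of the text keeps the original note intact, so downstream tools (for example, an entity recognizer) can look up which section a given mention falls in.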

Solving this task successfully will improve higher-level applications that extract valuable, actionable information from clinical documents, such as medical entity recognition, patient cohort retrieval, and temporal relation extraction. This, in turn, can ultimately improve patient care and clinical decision-making.

Citation

Shared Task Overview

I. de la Iglesia, M. Vivó, P. Chocrón, G. de Maeztu, K. Gojenola, A. Atutxa, Overview of ClinAIS at IberLEF 2023: Automatic Identification of Sections in Clinical Documents in Spanish, Procesamiento del Lenguaje Natural 71 (2023) 289–299. URL: http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6560.

@article{PLN6560,
    author = {Iker de la Iglesia and  Maria Viv{\'{o}} and Paula Chocr{\'{o}}n and Gabriel de Maeztu and Koldo Gojenola and Aitziber Atutxa},
    title = {{Overview of ClinAIS at IberLEF 2023: Automatic Identification of Sections in Clinical Documents in Spanish}},
    journal = {{Procesamiento del Lenguaje Natural}},
    volume = {71},
    year = {2023},
    issn = {1989-7553},
    url = {http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6560},
    pages = {289--299}
}

Dataset and Evaluation Metric

I. de la Iglesia, M. Vivó, P. Chocrón, G. de Maeztu, K. Gojenola, A. Atutxa, An Open Source Corpus and Automatic Tool for Section Identification in Spanish Health Records, Journal of Biomedical Informatics 145 (2023) 104461. URL: https://www.sciencedirect.com/science/article/pii/S153204642300182X. doi:https://doi.org/10.1016/j.jbi.2023.104461.

@article{delaiglesia2023104461,
  author = {Iker de la Iglesia and  Maria Viv{\'{o}} and Paula Chocr{\'{o}}n and Gabriel de Maeztu and Koldo Gojenola and Aitziber Atutxa},
  title = {{A}n {O}pen {S}ource {C}orpus and {A}utomatic {T}ool for {S}ection {I}dentification in {S}panish {H}ealth {R}ecords},
  journal = {Journal of Biomedical Informatics},
  volume = {145},
  pages = {104461},
  year = {2023},
  issn = {1532-0464},
  doi = {https://doi.org/10.1016/j.jbi.2023.104461},
  url = {https://www.sciencedirect.com/science/article/pii/S153204642300182X}
}

IberLEF 2023