CheXpert Project Review

automate chest radiograph (x-ray) interpretation
using large public dataset to train deep learning models to achieve expert-level performance for chest radiograph interpretation
- consisting of 224,316 chest radiographs (images) of 65,240 patients
each patient’s report: chest radiographic examinations (chest x-rays image) + their associated radiology reports

automated chest radiograph interpretation at the level of practicing radiologists could provide substantial benefit in many medical settings:
- improved workflow prioritization
- clinical decision support
- large-scale screening
- global population health initiatives

each imaging study can pertain to one or more images, but most often are associated with two images:
- a frontal view, and
- a lateral view
images are provided with 14 labels derived from a natural language processing (NLP) tool applied to the corresponding free-text radiology reports

The following keywords are the ‘observations’ (medical diagnosis of) sought after in each radiology report:

Each report was processed by an NLP labeler, and the associated x-rays were given the above listed 14 observations with an assigned weight:
- positive (observation exists in x-ray image),
- negative (observation does not exist in x-ray image), or
- uncertain
An automated rule-based labeler (Natural Language Processor) extracted observations from the radiology reports to be used as structured labels for the chest radiographs (x-ray images)
The NLP labeler is set up in three distinct stages:
- mention extraction:
  - the labeler extracts mentions of above listed observations from the impression section of radiology reports
  - summarizes the key findings in the radiographic study
- mention classification:
  - mentions of observations are classified as negative, uncertain, or positive
- mention aggregation:
  - we use the classification for each mention of observations to arrive at a final label for the 14 observations
    - blank for unmentioned, 0 (negative), 1 (positive), or u (uncertain).

NLP process-flow

Model Input and Output

Input: single-view chest radiograph
Output: probability of each of the 14 observations
When more than one view is available, the models output the maximum probability of the observations across the views
The training labels in the dataset for each observation are either 0 (negative), 1 (positive), or u (uncertain)
For the uncertain labels, different approaches are explored during the model training:
- U-Ignore: We ignore the uncertain labels during training
- U-Zeroes: We map all instances of the uncertain label to 0
- U-Ones: We map all instances of the uncertain label to 1
- U-SelfTrained: We first train a model using the U-Ignore approach to convergence, and then use the model to make predictions that re-label each of the uncertainty labels with the probability prediction outputted by the model
- U-MultiClass: We treat the uncertainty label as its own class
Baseline model has been selected based on the best performing approach on each competition tasks on the validation set:
- U-Ones for Atelectasis and Edema,
- U-MultiClass for Cardiomegaly and Pleural Effusion, and
- U-SelfTrained for Consolidation
The model output looks like the following table:

NLP process-flow

Submission Goals:

competition for automated chest x-ray interpretation:
- features uncertainty labels and
- radiologist-labeled evaluation sets as standard reference
uses a hidden test set for official evaluation of models (see codalab submission tutorials)
focus on the evaluation of 5 observations which are the competition tasks, selected based of clinical importance and prevalence:
1. Atelectasis,
2. Cardiomegaly,
3. Consolidation,
4. Edema, and
5. Pleural Effusion
64740 reports for training and the rest for validation

pathology: The anatomic or functional manifestations of a disease: the pathology of cancer