Sample 140

Evaluation Instructions

Task Overview: AI models read clinical trial protocol documents and were asked to extract evidence supporting specific modifications to the eligibility criteria. You are evaluating whether the AI extracted the RIGHT evidence, that is, evidence that justifies the final (post-modification) eligibility criteria.

Workflow: AI reads protocol → AI extracts evidence → Ground truth is the final modified criteria → You evaluate whether the AI's evidence actually supports those final criteria

Important: Some outputs may be low quality and should be scored accordingly.

Your task: Compare the model-generated prediction (right panel) against the ground truth criteria (left panel).

Evaluation scale (0-4):

Consider both content accuracy and completeness. Some predictions may be technically "correct" but incomplete or out of order.

Ground Truth

INCLUSION CRITERIA

  1. Patients with symptomatic unilateral diaphragmatic paralysis persisting for ≥1 year
  2. Patients are scheduled for RATS at thoracic surgery clinics in five Swedish university hospitals (Gothenburg, Linköping, Lund, Umeå, and Uppsala).

EXCLUSION CRITERIA

  1. Inability to complete forms in Swedish
  2. Neuromuscular disease as the cause of diaphragmatic paralysis
  3. Other significant causes of dyspnea
  4. Impaired physical capacity from other causes than diaphragmatic paralysis.

Prediction

INCLUSION CRITERIA

  1. Patients with symptomatic unilateral diaphragmatic paralysis persisting for ≥1 year, scheduled for robot-assisted thoracic surgery (RATS) at thoracic surgery clinics in five Swedish university hospitals (Gothenburg, Linköping, Lund, Umeå, and Uppsala).

EXCLUSION CRITERIA

  1. Inability to complete forms in Swedish.
  2. Neuromuscular disease as the cause of diaphragmatic paralysis.
  3. Other significant causes of dyspnea and impaired physical capacity than diaphragmatic paralysis.
  4. Use of long-term oxygen therapy and/or mechanical ventilation.