Complete Workflow: Patient Data Extraction and Filtering

Extract structured fields from unstructured medical records, filter by criteria, and export a refined dataset.

Overview

Starting from a Parquet file with more than 150,000 rows of unstructured patient records, use LLM-based extraction to create structured columns, filter the dataset by age, diagnosis, and symptom criteria, and export a subset with selected columns.

[Demo: patient data extraction and filtering in Hyperparam]

Steps

  1. Load the dataset

    Open Asclepius-Synthetic-Clinical-Notes/0000

  2. Extract structured fields using chat

    Open any cell from the 'note' column containing patient information

    > Note: Scroll down to view the full unstructured text of an individual chart.

    Use chat to request extraction: "extract age, diagnosis, symptoms, comorbidities, treatments, outcome in separate columns from 'note' column"

    > Note: Hyperparam will create 6 new columns and populate them with the extractions.

    Columns appear as: age, diagnosis, symptoms, comorbidities, treatments, outcome

    > Note: Scroll down to watch Hyperparam populate the remaining rows.

  3. Open sample

    Select either the first rows or a random sample to view 100 rows of the dataset. The sample can then be sorted by the generated columns.

  4. Apply filters

    Add a filter on age: age > 50

    Add a filter on diagnosis: diagnosis contains "respiratory"

    Add a filter on symptoms: symptoms contains "fever"

    Only matching patients are shown

  5. Export filtered dataset

    Click export

    Select specific columns: subject_id, age, diagnosis, symptoms, comorbidities, treatments, outcome

    Enable "Apply current table filters"

    Set output filename: filtered_patients.parquet

    Click export to process the full dataset with the filters applied
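
For readers who want to reason about what steps 4 and 5 produce, the filter and column selection can be approximated in plain Python. This is a minimal sketch and not Hyperparam's implementation. It assumes the three filters combine with AND, that `contains` is a case-insensitive substring match, and that the extracted age is numeric. The sample rows and the helper names (`contains`, `matches`, `filter_and_project`) are invented for illustration.

```python
# Column list taken from the export step above.
EXPORT_COLUMNS = [
    "subject_id", "age", "diagnosis", "symptoms",
    "comorbidities", "treatments", "outcome",
]

# Invented rows standing in for the dataset after extraction.
rows = [
    {"subject_id": 1, "note": "67-year-old with fever and cough", "age": 67,
     "diagnosis": "Acute respiratory failure", "symptoms": "fever, cough",
     "comorbidities": "COPD", "treatments": "oxygen therapy", "outcome": "discharged"},
    {"subject_id": 2, "note": "45-year-old with fever", "age": 45,
     "diagnosis": "Respiratory infection", "symptoms": "fever",
     "comorbidities": "none", "treatments": "antibiotics", "outcome": "recovered"},
    {"subject_id": 3, "note": "72-year-old after a fall", "age": 72,
     "diagnosis": "Hip fracture", "symptoms": "pain",
     "comorbidities": "osteoporosis", "treatments": "surgery", "outcome": "rehabilitation"},
    {"subject_id": 4, "note": "58-year-old with cough", "age": 58,
     "diagnosis": "Upper respiratory infection", "symptoms": "cough",
     "comorbidities": "asthma", "treatments": "inhaler", "outcome": "recovered"},
]


def contains(value, needle):
    """Case-insensitive substring match (assumed); missing values never match."""
    return value is not None and needle.lower() in str(value).lower()


def matches(row):
    """Apply the three filters from step 4, combined with AND (assumed)."""
    age = row.get("age")
    return (
        isinstance(age, (int, float))
        and age > 50
        and contains(row.get("diagnosis"), "respiratory")
        and contains(row.get("symptoms"), "fever")
    )


def filter_and_project(rows, columns=EXPORT_COLUMNS):
    """Keep matching rows, then keep only the selected export columns."""
    return [{c: r.get(c) for c in columns} for r in rows if matches(r)]


filtered = filter_and_project(rows)
```

With the invented rows above, only the patient with subject_id 1 passes all three filters, and the 'note' column is dropped because it is not among the selected columns. Writing the result to filtered_patients.parquet would need a Parquet library and is left out of this sketch.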

Expected Results

  • Extracted columns: Structured fields parsed from the unstructured patient text.
  • Final export: The exported file contains only patients who match the filter criteria and only the selected columns.
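
To make the shape of the extracted columns concrete, the sketch below models one extracted row as a typed record. It is illustrative only: this page does not describe Hyperparam's internal output format, so the JSON input, the `ExtractedFields` class, and the `parse_extraction` helper are all hypothetical. The one point it demonstrates is that a numeric filter such as age > 50 needs age stored as a number, so text values are coerced and unparseable ones become missing.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedFields:
    """Hypothetical record mirroring the six columns created in step 2."""
    age: Optional[int]
    diagnosis: str
    symptoms: str
    comorbidities: str
    treatments: str
    outcome: str


def parse_extraction(raw: str) -> ExtractedFields:
    """Turn a JSON object (assumed format) into a typed record."""
    data = json.loads(raw)

    # Coerce age to an integer; anything unparseable is treated as missing.
    age = data.get("age")
    try:
        age = int(age) if age is not None else None
    except (TypeError, ValueError):
        age = None

    def text(key: str) -> str:
        # Missing text fields default to an empty string.
        return str(data.get(key, ""))

    return ExtractedFields(
        age=age,
        diagnosis=text("diagnosis"),
        symptoms=text("symptoms"),
        comorbidities=text("comorbidities"),
        treatments=text("treatments"),
        outcome=text("outcome"),
    )


# Invented example input, not taken from the dataset.
record = parse_extraction(
    '{"age": "67", "diagnosis": "Pneumonia", "symptoms": "fever, cough", '
    '"comorbidities": "COPD", "treatments": "antibiotics", "outcome": "discharged"}'
)
```

Here `record.age` is the integer 67 rather than the string "67", which is what allows a comparison like age > 50 to behave as expected.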
