Table Classes
Patient
clifpy.tables.patient.Patient
Bases: BaseTable
Patient table wrapper inheriting from BaseTable.
This class handles patient-specific data and validations while leveraging the common functionality provided by BaseTable.
Initialize the patient table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str | Directory for saving output files and logs | None |
| data | DataFrame | Pre-loaded data to use instead of loading from file | None |
Source code in clifpy/tables/patient.py
ADT (Admission, Discharge, Transfer)
clifpy.tables.adt.Adt
Bases: BaseTable
ADT (Admission/Discharge/Transfer) table wrapper inheriting from BaseTable.
This class handles ADT-specific data and validations while leveraging the common functionality provided by BaseTable.
Initialize the ADT table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str | Directory for saving output files and logs | None |
| data | DataFrame | Pre-loaded data to use instead of loading from file | None |
Source code in clifpy/tables/adt.py
check_overlapping_admissions
Check for overlapping admissions within the same hospitalization.
Identifies cases where a patient has overlapping stays in different locations within the same hospitalization (i.e., the out_dttm of one location is after the in_dttm of the next location).
Parameters:
- save_overlaps (bool): If True, save detailed overlap information to CSV. Default is False.
- overlaps_output_directory (str, optional): Directory for saving the overlaps CSV file. If None, uses the output_directory provided at initialization.
Returns:
- int: Count of unique hospitalizations that have overlapping admissions
Raises:
- RuntimeError: If an error occurs during processing
Source code in clifpy/tables/adt.py
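The overlap rule described above (an out_dttm later than the in_dttm of the next location within the same hospitalization) can be illustrated with a minimal standard-library sketch. This is not the clifpy implementation: the function name `count_overlapping_hospitalizations` and the list-of-dicts input are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import datetime


def count_overlapping_hospitalizations(rows):
    """Count hospitalizations where one location stay ends after the next begins.

    `rows` is a list of dicts with hospitalization_id, in_dttm and out_dttm keys
    (a hypothetical stand-in for the ADT table).
    """
    stays_by_hosp = defaultdict(list)
    for row in rows:
        stays_by_hosp[row["hospitalization_id"]].append(row)

    overlapping = set()
    for hosp_id, stays in stays_by_hosp.items():
        # Order the stays by entry time, then compare each stay with the next one.
        stays.sort(key=lambda s: s["in_dttm"])
        for current, nxt in zip(stays, stays[1:]):
            if current["out_dttm"] > nxt["in_dttm"]:
                overlapping.add(hosp_id)
                break
    return len(overlapping)


adt_rows = [
    {"hospitalization_id": "H1", "in_dttm": datetime(2024, 1, 1, 8), "out_dttm": datetime(2024, 1, 2, 10)},
    {"hospitalization_id": "H1", "in_dttm": datetime(2024, 1, 2, 9), "out_dttm": datetime(2024, 1, 3, 12)},
    {"hospitalization_id": "H2", "in_dttm": datetime(2024, 1, 5, 8), "out_dttm": datetime(2024, 1, 6, 8)},
    {"hospitalization_id": "H2", "in_dttm": datetime(2024, 1, 6, 8), "out_dttm": datetime(2024, 1, 7, 8)},
]
overlap_count = count_overlapping_hospitalizations(adt_rows)
print(overlap_count)  # 1: only H1 has a stay that ends after the next one starts
```

In this sketch a stay that ends exactly when the next begins (H2) is not counted as an overlap, which matches the "is after" wording above.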
Hospitalization
clifpy.tables.hospitalization.Hospitalization
Hospitalization(data_directory=None, filetype=None, timezone='UTC', output_directory=None, data=None)
Bases: BaseTable
Hospitalization table wrapper inheriting from BaseTable.
This class handles hospitalization-specific data and validations while leveraging the common functionality provided by BaseTable.
Initialize the hospitalization table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str | Directory for saving output files and logs | None |
| data | DataFrame | Pre-loaded data to use instead of loading from file | None |
Source code in clifpy/tables/hospitalization.py
calculate_length_of_stay
Calculate length of stay for each hospitalization and return DataFrame with LOS column.
Source code in clifpy/tables/hospitalization.py
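As a rough illustration of a length-of-stay calculation, the sketch below subtracts an admission timestamp from a discharge timestamp and expresses the result in days. The column names `admission_dttm` and `discharge_dttm`, the output field `length_of_stay_days`, and the function name are assumptions; the library's actual column names, units, and return type may differ.

```python
from datetime import datetime


def add_length_of_stay(rows):
    """Return new rows with a length_of_stay_days field added (inputs untouched)."""
    result = []
    for row in rows:
        delta = row["discharge_dttm"] - row["admission_dttm"]
        # Convert the timedelta to fractional days.
        result.append({**row, "length_of_stay_days": delta.total_seconds() / 86400})
    return result


hospitalizations = [
    {
        "hospitalization_id": "H1",
        "admission_dttm": datetime(2024, 1, 1, 12),
        "discharge_dttm": datetime(2024, 1, 4, 0),
    },
]
with_los = add_length_of_stay(hospitalizations)
print(with_los[0]["length_of_stay_days"])  # 2.5
```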
get_mortality_rate
Calculate in-hospital mortality rate.
Source code in clifpy/tables/hospitalization.py
get_patient_hospitalization_counts
Return DataFrame with hospitalization counts per patient.
Source code in clifpy/tables/hospitalization.py
get_summary_stats
Return comprehensive summary statistics for hospitalization data.
Source code in clifpy/tables/hospitalization.py
Labs
clifpy.tables.labs.Labs
Bases: BaseTable
Labs table wrapper inheriting from BaseTable.
This class handles laboratory data and validations including reference unit validation while leveraging the common functionality provided by BaseTable.
Initialize the labs table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str | Directory for saving output files and logs | None |
| data | DataFrame | Pre-loaded data to use instead of loading from file | None |
Source code in clifpy/tables/labs.py
lab_reference_units
property
Get the lab reference units mapping from the schema.
get_lab_category_stats
Return summary statistics for each lab category, including missingness and unique hospitalization_id counts.
Source code in clifpy/tables/labs.py
get_lab_specimen_stats
Return summary statistics for each lab specimen category, including missingness and unique hospitalization_id counts.
Source code in clifpy/tables/labs.py
Microbiology Culture
clifpy.tables.microbiology_culture.MicrobiologyCulture
MicrobiologyCulture(data_directory=None, filetype=None, timezone='UTC', output_directory=None, data=None)
Bases: BaseTable
Microbiology Culture table wrapper inheriting from BaseTable.
This class handles microbiology culture-specific data and validations including organism identification validation and culture method validation.
Initialize the microbiology culture table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str, optional | Directory for saving output files and logs | None |
| data | pd.DataFrame, optional | Pre-loaded data to use instead of loading from file | None |
Source code in clifpy/tables/microbiology_culture.py
cat_vs_name_map
staticmethod
cat_vs_name_map(df, category_col, name_col, *, group_col=None, dropna=True, sort='freq_then_alpha', max_names_per_cat=None, include_counts=False)
Build mappings from category→names (2-level) or group→category→names (3-level).
Returns:
- If group_col is None: { category: [names...] } or { category: [{"name":..., "n":...}, ...] }
- If group_col is provided: { group: { category: [names...] } } or { group: { category: [{"name":..., "n":...}, ...] } }
Notes:
- Names are unique per (category[, group]) and sorted by frequency descending, then alphabetically (default), or alphabetically only if sort="alpha".
- Set include_counts=True to return [{"name":..., "n":...}] instead of plain strings.
- Set max_names_per_cat to truncate long lists per category.
Source code in clifpy/tables/microbiology_culture.py
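The 2-level (category→names) behaviour described above can be sketched with the standard library. This is an illustrative re-implementation over a list of dicts, not the library's code: the function name `category_to_names`, the sample column `organism_name`, and the sample values are assumptions, and the 3-level group_col variant is omitted.

```python
from collections import Counter, defaultdict


def category_to_names(rows, category_col, name_col, sort="freq_then_alpha",
                      max_names_per_cat=None, include_counts=False):
    """Build {category: [names...]} from a list of dicts (hypothetical helper)."""
    counts = defaultdict(Counter)
    for row in rows:
        category, name = row.get(category_col), row.get(name_col)
        if category is None or name is None:
            continue  # skip missing values, similar in spirit to dropna=True
        counts[category][name] += 1

    mapping = {}
    for category, counter in counts.items():
        if sort == "alpha":
            ordered = sorted(counter.items(), key=lambda item: item[0])
        else:
            # Frequency descending, then alphabetical to break ties.
            ordered = sorted(counter.items(), key=lambda item: (-item[1], item[0]))
        if max_names_per_cat is not None:
            ordered = ordered[:max_names_per_cat]
        if include_counts:
            mapping[category] = [{"name": name, "n": n} for name, n in ordered]
        else:
            mapping[category] = [name for name, _ in ordered]
    return mapping


culture_rows = [
    {"organism_category": "escherichia_coli", "organism_name": "E. coli"},
    {"organism_category": "escherichia_coli", "organism_name": "Escherichia coli"},
    {"organism_category": "escherichia_coli", "organism_name": "Escherichia coli"},
    {"organism_category": "staphylococcus_aureus", "organism_name": "MRSA"},
    {"organism_category": "staphylococcus_aureus", "organism_name": None},
]
plain = category_to_names(culture_rows, "organism_category", "organism_name")
counted = category_to_names(culture_rows, "organism_category", "organism_name",
                            include_counts=True)
print(plain)
print(counted["escherichia_coli"][0])
```

Each name appears once per category because the counter collapses duplicates before sorting.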
isvalid
top_fluid_org_outliers
Identify top positive and negative outliers in fluid_category vs organism_group or organism_category.
Parameters:
- level (str): "organism_group" or "organism_category" (non-standard)
- min_count (int): Minimum observed count to consider
- top_k (int): Number of top positive and negative outliers to return
Returns:
- Dict with keys "top_positive" and "top_negative", each containing a DataFrame of outliers.
Source code in clifpy/tables/microbiology_culture.py
validate_timestamp_order
Check that order_dttm ≤ collect_dttm ≤ result_dttm.
- Resets self.time_order_validation_errors
- Adds one entry per violated rule
- Extends self.errors and logs: 'Found {len(self.time_order_validation_errors)} time order validation errors'
Returns a DataFrame of all violating rows (union of both rules), or None if OK.
Source code in clifpy/tables/microbiology_culture.py
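The two ordering rules (order_dttm ≤ collect_dttm and collect_dttm ≤ result_dttm) can be sketched as follows. This is a simplified illustration over a list of dicts rather than the library's DataFrame-based method: the function name, the shape of each error entry, and the assumption that no timestamp is missing are all hypothetical.

```python
from datetime import datetime


def find_time_order_violations(rows):
    """Return (errors, violating_rows) for order_dttm <= collect_dttm <= result_dttm."""
    rules = [
        ("order_dttm", "collect_dttm"),
        ("collect_dttm", "result_dttm"),
    ]
    errors = []
    violating_indexes = set()
    for earlier, later in rules:
        bad = [i for i, row in enumerate(rows) if row[earlier] > row[later]]
        if bad:
            # One entry per violated rule, as in the description above.
            errors.append({"rule": f"{earlier} <= {later}", "n_violations": len(bad)})
            violating_indexes.update(bad)
    # Union of rows violating either rule, or None when everything is in order.
    violating_rows = [rows[i] for i in sorted(violating_indexes)] or None
    return errors, violating_rows


culture_times = [
    {"order_dttm": datetime(2024, 1, 1, 8), "collect_dttm": datetime(2024, 1, 1, 9),
     "result_dttm": datetime(2024, 1, 2, 9)},
    {"order_dttm": datetime(2024, 1, 1, 10), "collect_dttm": datetime(2024, 1, 1, 9),
     "result_dttm": datetime(2024, 1, 1, 12)},
]
errors, violations = find_time_order_violations(culture_times)
print(f"Found {len(errors)} time order validation errors")
```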
Vitals
clifpy.tables.vitals.Vitals
Bases: BaseTable
Vitals table wrapper inheriting from BaseTable.
This class handles vitals-specific data and validations including range validation for vital signs.
Initialize the vitals table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str | Directory for saving output files and logs | None |
| data | DataFrame | Pre-loaded data to use instead of loading from file | None |
Source code in clifpy/tables/vitals.py
filter_by_vital_category
Return all records for a specific vital category (e.g., 'heart_rate', 'temp_c').
Source code in clifpy/tables/vitals.py
get_vital_summary_stats
Return summary statistics for each vital category.
Source code in clifpy/tables/vitals.py
Respiratory Support
clifpy.tables.respiratory_support.RespiratorySupport
RespiratorySupport(data_directory=None, filetype=None, timezone='UTC', output_directory=None, data=None)
Bases: BaseTable
Respiratory support table wrapper inheriting from BaseTable.
This class handles respiratory support data and validations while leveraging the common functionality provided by BaseTable.
Initialize the respiratory_support table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str | Directory for saving output files and logs | None |
| data | DataFrame | Pre-loaded data to use instead of loading from file | None |
Source code in clifpy/tables/respiratory_support.py
waterfall
Clean + waterfall-fill the respiratory_support table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| id_col | str | Encounter-level identifier column | 'hospitalization_id' |
| bfill | bool | If True, numeric setters are back-filled after forward-fill | False |
| verbose | bool | Print progress messages | True |
| return_dataframe | bool | If True, returns DataFrame instead of RespiratorySupport instance | False |
Returns:
| Type | Description |
|---|---|
| RespiratorySupport | New instance with processed data (or DataFrame if return_dataframe=True) |
Notes
The waterfall function expects data in UTC timezone. If your data is in a different timezone, it will be converted to UTC for processing, then converted back to the original timezone on return. The original object is not modified.
Source code in clifpy/tables/respiratory_support.py
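To illustrate only the fill step controlled by bfill, the sketch below forward-fills one column within each encounter and optionally back-fills what remains. It is a deliberately simplified stand-in and does not reproduce the cleaning logic of waterfall: the function name, the `fio2_set` column, and the list-of-dicts input (assumed already sorted by time) are assumptions.

```python
def fill_within_encounter(rows, id_col="hospitalization_id", value_col="fio2_set",
                          bfill=False):
    """Forward-fill (and optionally back-fill) one column within each encounter.

    Rows are assumed to be sorted by time already. The input is not modified.
    """
    filled = [dict(row) for row in rows]  # copy so the original stays untouched

    # Forward pass: carry the last observed value forward per encounter.
    last_seen = {}
    for row in filled:
        key = row[id_col]
        if row[value_col] is None:
            row[value_col] = last_seen.get(key)
        else:
            last_seen[key] = row[value_col]

    if bfill:
        # Backward pass: fill leading gaps with the next observed value.
        next_seen = {}
        for row in reversed(filled):
            key = row[id_col]
            if row[value_col] is None:
                row[value_col] = next_seen.get(key)
            else:
                next_seen[key] = row[value_col]
    return filled


resp_rows = [
    {"hospitalization_id": "H1", "fio2_set": None},
    {"hospitalization_id": "H1", "fio2_set": 0.4},
    {"hospitalization_id": "H1", "fio2_set": None},
    {"hospitalization_id": "H2", "fio2_set": None},
]
forward_only = [r["fio2_set"] for r in fill_within_encounter(resp_rows)]
with_backfill = [r["fio2_set"] for r in fill_within_encounter(resp_rows, bfill=True)]
print(forward_only)   # [None, 0.4, 0.4, None]
print(with_backfill)  # [0.4, 0.4, 0.4, None]
```

Values never cross encounter boundaries in this sketch, so H2 stays empty even with back-fill enabled.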
Medication Administration (Continuous)
clifpy.tables.medication_admin_continuous.MedicationAdminContinuous
MedicationAdminContinuous(data_directory=None, filetype=None, timezone='UTC', output_directory=None, data=None)
Bases: BaseTable
Medication administration continuous table wrapper inheriting from BaseTable.
This class handles medication administration continuous data and validations while leveraging the common functionality provided by BaseTable.
Initialize the MedicationAdminContinuous table.
This class handles continuous medication administration data, including validation, dose unit standardization, and unit conversion capabilities.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files. If None and data is provided, defaults to current directory. | None |
| filetype | str | Type of data file (csv, parquet, etc.). If None and data is provided, defaults to 'parquet'. | None |
| timezone | str | Timezone for datetime columns. Used for proper timestamp handling. | "UTC" |
| output_directory | str | Directory for saving output files and logs. If not specified, outputs are saved to the current working directory. | None |
| data | DataFrame | Pre-loaded DataFrame to use instead of loading from file. Supports backward compatibility with direct DataFrame initialization. | None |
Notes
The class supports two initialization patterns:
1. Loading from file: provide data_directory and filetype
2. Direct DataFrame: provide the data parameter (legacy support)
Upon initialization, the class loads medication schema data including category-to-group mappings from the YAML schema.
Source code in clifpy/tables/medication_admin_continuous.py
med_category_to_group_mapping
property
Get the medication category to group mapping from the schema.
Returns:
| Type | Description |
|---|---|
| Dict[str, str] | A dictionary mapping medication categories to their therapeutic groups. Returns a copy to prevent external modification of the internal mapping. Returns an empty dict if no mappings are loaded. |
Patient Assessments
clifpy.tables.patient_assessments.PatientAssessments
PatientAssessments(data_directory=None, filetype=None, timezone='UTC', output_directory=None, data=None)
Bases: BaseTable
Patient assessments table wrapper inheriting from BaseTable.
This class handles patient assessment data and validations while leveraging the common functionality provided by BaseTable.
Initialize the patient_assessments table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str | Directory for saving output files and logs | None |
| data | DataFrame | Pre-loaded data to use instead of loading from file | None |
Source code in clifpy/tables/patient_assessments.py
assessment_category_to_group_mapping
property
Get the assessment category to group mapping from the schema.
Position
clifpy.tables.position.Position
Bases: BaseTable
Position table wrapper inheriting from BaseTable.
This class handles patient position data and validations while leveraging the common functionality provided by BaseTable.
Initialize the position table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str | Directory for saving output files and logs | None |
| data | DataFrame | Pre-loaded data to use instead of loading from file | None |
Source code in clifpy/tables/position.py
get_position_category_stats
Return summary statistics for each position category, including missingness and unique patient counts. Expects columns: 'position_category', 'position_name', and optionally 'hospitalization_id'.
Source code in clifpy/tables/position.py
Medication Administration (Intermittent)
clifpy.tables.medication_admin_intermittent.MedicationAdminIntermittent
Bases: BaseTable
Medication administration intermittent table wrapper inheriting from BaseTable.
This class handles medication administration intermittent data and validations while leveraging the common functionality provided by BaseTable.
Source code in clifpy/tables/base_table.py
Hospital Diagnosis
clifpy.tables.hospital_diagnosis.HospitalDiagnosis
HospitalDiagnosis(data_directory=None, filetype=None, timezone='UTC', output_directory=None, data=None)
Bases: BaseTable
Hospital diagnosis table wrapper inheriting from BaseTable.
This class handles hospital diagnosis-specific data and validations while leveraging the common functionality provided by BaseTable. Hospital diagnosis codes are finalized billing diagnosis codes used for hospital reimbursement. They are appropriate for calculating comorbidity scores but should not be used as input features to a model that predicts an inpatient event.
Initialize the hospital diagnosis table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str | Directory for saving output files and logs | None |
| data | DataFrame | Pre-loaded data to use instead of loading from file | None |
Source code in clifpy/tables/hospital_diagnosis.py
get_diagnosis_by_format
Group diagnoses by format (ICD9/ICD10) and return summary statistics.
Source code in clifpy/tables/hospital_diagnosis.py
get_diagnosis_summary
Return comprehensive summary statistics for hospital diagnosis data.
Source code in clifpy/tables/hospital_diagnosis.py
get_hospitalization_diagnosis_counts
Return DataFrame with diagnosis counts per hospitalization.
Source code in clifpy/tables/hospital_diagnosis.py
get_poa_statistics
Calculate present on admission statistics by diagnosis type.
Source code in clifpy/tables/hospital_diagnosis.py
get_primary_diagnosis_counts
Return DataFrame with counts of primary diagnoses by diagnosis code.
Source code in clifpy/tables/hospital_diagnosis.py
load_table
Load hospital diagnosis table data from the configured data directory.
Source code in clifpy/tables/hospital_diagnosis.py
CRRT Therapy
clifpy.tables.crrt_therapy.CrrtTherapy
Bases: BaseTable
CRRT (Continuous Renal Replacement Therapy) table wrapper inheriting from BaseTable.
This class handles CRRT therapy data including dialysis modes, flow rates, and ultrafiltration parameters while leveraging the common functionality provided by BaseTable.
Initialize the CRRT therapy table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str | Directory for saving output files and logs | None |
| data | DataFrame | Pre-loaded data to use instead of loading from file | None |
Source code in clifpy/tables/crrt_therapy.py
Patient Procedures
clifpy.tables.patient_procedures.PatientProcedures
PatientProcedures(data_directory=None, filetype=None, timezone='UTC', output_directory=None, data=None)
Bases: BaseTable
Patient procedures table wrapper inheriting from BaseTable.
This class handles patient procedure data including CPT, ICD10PCS, and HCPCS codes while leveraging the common functionality provided by BaseTable.
Initialize the patient procedures table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str | Directory for saving output files and logs | None |
| data | DataFrame | Pre-loaded data to use instead of loading from file | None |
Source code in clifpy/tables/patient_procedures.py
Microbiology Susceptibility
clifpy.tables.microbiology_susceptibility.MicrobiologySusceptibility
MicrobiologySusceptibility(data_directory=None, filetype=None, timezone='UTC', output_directory=None, data=None)
Bases: BaseTable
Microbiology susceptibility table wrapper inheriting from BaseTable.
This class handles antimicrobial susceptibility testing data including antimicrobial categories and susceptibility results while leveraging the common functionality provided by BaseTable.
Initialize the microbiology susceptibility table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str | Directory for saving output files and logs | None |
| data | DataFrame | Pre-loaded data to use instead of loading from file | None |
Source code in clifpy/tables/microbiology_susceptibility.py
ECMO MCS
clifpy.tables.ecmo_mcs.EcmoMcs
Bases: BaseTable
ECMO (Extracorporeal Membrane Oxygenation) and MCS (Mechanical Circulatory Support) table wrapper inheriting from BaseTable.
This class handles ECMO/MCS device data including device types, flow rates, and support parameters while leveraging the common functionality provided by BaseTable.
Initialize the ECMO/MCS table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str | Directory for saving output files and logs | None |
| data | DataFrame | Pre-loaded data to use instead of loading from file | None |
Source code in clifpy/tables/ecmo_mcs.py
Microbiology Non-Culture
clifpy.tables.microbiology_nonculture.MicrobiologyNonculture
MicrobiologyNonculture(data_directory=None, filetype=None, timezone='UTC', output_directory=None, data=None)
Bases: BaseTable
Microbiology non-culture table wrapper inheriting from BaseTable.
This class handles microbiology non-culture test data including PCR and other molecular diagnostic results while leveraging the common functionality provided by BaseTable.
Initialize the microbiology non-culture table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str | Directory for saving output files and logs | None |
| data | DataFrame | Pre-loaded data to use instead of loading from file | None |
Source code in clifpy/tables/microbiology_nonculture.py
Code Status
clifpy.tables.code_status.CodeStatus
Bases: BaseTable
Code status table wrapper inheriting from BaseTable.
This class handles patient code status data including DNR, DNAR, DNR/DNI, Full Code, and other resuscitation preferences while leveraging the common functionality provided by BaseTable.
Initialize the code status table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_directory | str | Path to the directory containing data files | None |
| filetype | str | Type of data file (csv, parquet, etc.) | None |
| timezone | str | Timezone for datetime columns | 'UTC' |
| output_directory | str | Directory for saving output files and logs | None |
| data | DataFrame | Pre-loaded data to use instead of loading from file | None |