BaseTable
clifpy.tables.base_table.BaseTable
Base class for all pyCLIF table classes.
Provides common functionality for loading data, running validations, and generating reports. All table-specific classes should inherit from this.
Attributes:
Name | Type | Description |
---|---|---|
`data_directory` | `str` | Path to the directory containing data files |
`filetype` | `str` | Type of data file (csv, parquet, etc.) |
`timezone` | `str` | Timezone for datetime columns |
`output_directory` | `str` | Directory for saving output files and logs |
`table_name` | `str` | Name of the table (from class name) |
`df` | `DataFrame` | The loaded data |
`schema` | `dict` | The YAML schema for this table |
`errors` | `List[dict]` | Validation errors from last validation run |
`logger` | `Logger` | Logger for this table |
Initialize the BaseTable.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`data_directory` | `str` | Path to the directory containing data files | required |
`filetype` | `str` | Type of data file (csv, parquet, etc.) | required |
`timezone` | `str` | Timezone for datetime columns | required |
`output_directory` | `str` | Directory for saving output files and logs. If not provided, creates an 'output' directory in the current working directory. | `None` |
`data` | `DataFrame` | Pre-loaded data to use instead of loading from file | `None` |
Source code in clifpy/tables/base_table.py
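As a rough illustration of the `data` parameter, the sketch below wraps data that is already in memory in a table object. `table_from_memory` is a hypothetical helper, not part of clifpy, and `table_cls` stands for any concrete `BaseTable` subclass. It assumes the constructor accepts the documented parameters as keywords. The `data_directory` and `filetype` values are placeholders, passed only because both are listed as required; whether they are consulted when `data` is supplied is not stated here.

```python
from typing import Any, Optional


def table_from_memory(
    table_cls: type,
    data: Any,
    timezone: str = "UTC",
    output_directory: Optional[str] = None,
) -> Any:
    """Wrap pre-loaded data in a table instance (hypothetical helper)."""
    return table_cls(
        data_directory=".",  # placeholder: documented as required
        filetype="parquet",  # placeholder: documented as required
        timezone=timezone,
        output_directory=output_directory,  # None leaves the documented default in effect
        data=data,  # documented as used instead of loading from file
    )
```

Passing the class in as an argument keeps the helper independent of any particular table type.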
from_file (classmethod)
from_file(
data_directory,
filetype,
timezone="UTC",
output_directory=None,
sample_size=None,
columns=None,
filters=None,
)
Load data from file and create a table instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`data_directory` | `str` | Path to the directory containing data files | required |
`filetype` | `str` | Type of data file (csv, parquet, etc.) | required |
`timezone` | `str` | Timezone for datetime columns (default: UTC) | `'UTC'` |
`output_directory` | `str` | Directory for saving output files and logs | `None` |
`sample_size` | `int` | Number of rows to load | `None` |
`columns` | `List[str]` | Specific columns to load | `None` |
`filters` | `Dict` | Filters to apply when loading | `None` |
Returns:
Type | Description |
---|---|
Instance of the table class with loaded data |
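A minimal sketch of calling `from_file` to load a small preview, using only the keyword arguments from the signature above. `load_preview` and its `n_rows` parameter are hypothetical names introduced for illustration, and `table_cls` stands for any concrete `BaseTable` subclass. The `filters` argument is left out because its expected structure is not described here.

```python
from typing import Any, List, Optional


def load_preview(
    table_cls: type,
    data_directory: str,
    filetype: str = "parquet",
    n_rows: int = 1000,
    columns: Optional[List[str]] = None,
) -> Any:
    """Load at most n_rows rows, optionally restricted to some columns."""
    return table_cls.from_file(
        data_directory=data_directory,
        filetype=filetype,
        timezone="UTC",  # matches the documented default
        sample_size=n_rows,  # documented as the number of rows to load
        columns=columns,  # None means no column restriction is requested
    )
```

Limiting rows and columns at load time may be useful for quick checks on large files before committing to a full load.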
get_summary
Get a summary of the table data.
Returns:
Name | Type | Description |
---|---|---|
`dict` | `Dict[str, Any]` | Summary statistics and information about the table |
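Since the summary is returned as a plain dictionary, one way to inspect it is to render it as JSON text. The sketch below is an assumption-laden illustration: `summary_as_json` is a hypothetical helper, the keys of the summary are not documented here, and `default=str` is used as a fallback for any values that `json` cannot encode natively.

```python
import json
from typing import Any


def summary_as_json(table: Any) -> str:
    """Render table.get_summary() as indented JSON text (hypothetical helper)."""
    summary = table.get_summary()
    # default=str stringifies values json cannot encode (e.g. dates, timestamps)
    return json.dumps(summary, indent=2, default=str)
```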
isvalid
Check if the data is valid based on the last validation run.
Returns:
Name | Type | Description |
---|---|---|
`bool` | `bool` | `True` if validation has been run and no errors were found; `False` if validation found errors or has not been run yet |
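The return contract has three cases (never validated, validated with errors, validated cleanly), and only the last yields `True`. The stand-in class below models that contract in isolation. It is not clifpy code: `ValidationState`, `record`, and the `_validated` flag are hypothetical names, and the real class may track this state differently.

```python
from typing import List


class ValidationState:
    """Stand-in that mimics the documented isvalid() contract."""

    def __init__(self) -> None:
        self.errors: List[dict] = []
        self._validated = False  # assumption: a flag records that a run happened

    def record(self, errors: List[dict]) -> None:
        """Store the outcome of a validation run."""
        self.errors = list(errors)
        self._validated = True

    def isvalid(self) -> bool:
        # True only when a run has happened AND it produced no errors
        return self._validated and not self.errors
```

The practical consequence is that an empty `errors` list alone does not imply valid data; a validation run has to have taken place first.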
save_summary
Save table summary to a JSON file.
validate
Run comprehensive validation on the data.
This method runs all validation checks, including:
- Schema validation (required columns, data types, categories)
- Missing data analysis
- Duplicate checking
- Statistical analysis
- Table-specific validations (if overridden in child class)
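A common pattern is to run validation and stop early if anything was flagged. The sketch below combines `validate()`, `isvalid()`, and the `errors` attribute documented on this page. `validate_or_raise` is a hypothetical helper, and it assumes `validate()` can be called without arguments, which this page does not confirm.

```python
from typing import Any


def validate_or_raise(table: Any) -> Any:
    """Run validation and raise ValueError if any errors were recorded."""
    table.validate()  # assumption: callable with no arguments
    if not table.isvalid():
        # Fall back to the class name if table_name is unavailable
        name = getattr(table, "table_name", type(table).__name__)
        raise ValueError(f"{name}: {len(table.errors)} validation error(s)")
    return table
```

Returning the table on success lets the call be chained directly after loading.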