validate_codelist_ids()

Validate that all values in coded columns are within the allowed codelist IDs.

Usage

validate_codelist_ids(
    df,
    codelist_ids,
    max_errors=1000,
)

Reports violations across all coded columns in a single error, capped at max_errors entries.

Parameters

df: pd.DataFrame

The DataFrame to validate.

codelist_ids: dict[str, list[str]]

Mapping of column name to list of allowed code IDs.

max_errors: int = 1000

Maximum number of invalid-value entries to include in the error message across all columns. Defaults to 1000.

Raises

ValueError

Raised if any value in a coded column is not among that column's allowed IDs. The message lists the offending values, each with its column, up to max_errors entries.
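
Example

The behavior described above can be illustrated with a minimal sketch. This is not the library's implementation: validate_codelist_ids_sketch is a hypothetical stand-in written only from the description on this page, and the message format, the handling of missing values, and the skipping of columns absent from df are all assumptions.

```python
import pandas as pd


def validate_codelist_ids_sketch(df, codelist_ids, max_errors=1000):
    """Hypothetical stand-in for validate_codelist_ids (illustration only)."""
    violations = []
    for column, allowed in codelist_ids.items():
        if column not in df.columns:
            # Assumption: columns named in codelist_ids but absent from df are skipped.
            continue
        allowed_set = set(allowed)
        # Assumption: missing values (NaN/None) are not reported as violations.
        distinct_values = df[column].dropna().drop_duplicates().tolist()
        for value in distinct_values:
            if value not in allowed_set:
                violations.append(f"{column}: {value!r}")

    if violations:
        # Collect everything first, then raise once, capped at max_errors entries.
        shown = violations[:max_errors]
        header = f"{len(violations)} invalid value(s) found, showing {len(shown)}:"
        raise ValueError("\n".join([header, *shown]))


# Example data: "X" and "ZZ" are not in their columns' allowed IDs.
df = pd.DataFrame({"sex": ["M", "F", "X"], "country": ["DE", "FR", "ZZ"]})
codelist_ids = {"sex": ["M", "F"], "country": ["DE", "FR", "IT"]}

try:
    validate_codelist_ids_sketch(df, codelist_ids, max_errors=1)
except ValueError as exc:
    error_text = str(exc)
    print(error_text)
```

With max_errors=1, only the first offending entry is listed even though two columns contain invalid values; raising max_errors would include the entry for the second column in the same error.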