standardize_indicator_id()

Normalize the INDICATOR column so that every value is uppercase and prefixed with the dataset ID.

Usage

standardize_indicator_id(df)

Ensures all values in the INDICATOR column are uppercase, are prefixed with the dataset identifier, and have dots replaced with underscores. As the example shows, the prefix is normalized the same way: "WB.DATA360" becomes "WB_DATA360".

Parameters

df: pd.DataFrame
The DataFrame to modify. Must contain an INDICATOR column and either a DATABASE_ID or DATASET_ID column.

Returns

pd.DataFrame
The modified DataFrame with corrected INDICATOR values.

Raises

ValueError
If the DATABASE_ID or DATASET_ID column contains more than one unique value.

Examples

>>> df = pd.DataFrame({
...     "DATABASE_ID": ["WB.DATA360", "WB.DATA360"],
...     "INDICATOR": ["indicator.one", "indicator.two"],
... })
>>> result = standardize_indicator_id(df)
>>> list(result["INDICATOR"])
['WB_DATA360_INDICATOR_ONE', 'WB_DATA360_INDICATOR_TWO']
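
The behaviour described above can be approximated with the sketch below. It is not the library's actual implementation: the name standardize_indicator_id_sketch is hypothetical, and several details are assumptions made for illustration, namely that DATABASE_ID takes precedence when both ID columns exist, that values already carrying the prefix are not prefixed again, that missing values are left untouched, and that a copy is returned rather than modifying the input in place.

```python
import pandas as pd


def standardize_indicator_id_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical re-implementation, for illustration only."""
    # Assumption: DATABASE_ID wins if both ID columns are present.
    id_col = next((c for c in ("DATABASE_ID", "DATASET_ID") if c in df.columns), None)
    if id_col is None or "INDICATOR" not in df.columns:
        raise KeyError("expected INDICATOR plus DATABASE_ID or DATASET_ID")

    unique_ids = df[id_col].dropna().unique()
    if len(unique_ids) > 1:
        # Documented behaviour: more than one dataset ID is an error.
        raise ValueError(f"{id_col} contains {len(unique_ids)} unique values")
    if len(unique_ids) == 0:
        # Assumption: with no usable ID there is nothing to prefix.
        return df.copy()

    # Normalize the prefix the same way as the indicator values.
    prefix = str(unique_ids[0]).upper().replace(".", "_")

    def normalize(value: object) -> str:
        cleaned = str(value).upper().replace(".", "_")
        # Assumption: an existing prefix is kept as-is, so the call is idempotent.
        return cleaned if cleaned.startswith(prefix + "_") else f"{prefix}_{cleaned}"

    out = df.copy()
    # na_action="ignore" leaves missing INDICATOR values as they are.
    out["INDICATOR"] = out["INDICATOR"].map(normalize, na_action="ignore")
    return out
```

Under these assumptions, calling the sketch on the DataFrame from the example above yields the same INDICATOR values, and calling it a second time on its own output leaves them unchanged.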