standardize_indicator_id()
Normalize the INDICATOR column so its values are uppercase and prefixed with the dataset ID.
Usage
standardize_indicator_id(df)
Ensures all values in the INDICATOR column are uppercase, prefixed with the dataset identifier, and have dots replaced with underscores.
Parameters
df: pd.DataFrame - The DataFrame to modify. Must contain an INDICATOR column and either a DATABASE_ID or DATASET_ID column.
Returns
pd.DataFrame - The modified DataFrame with corrected INDICATOR values.
Raises
ValueError - If the database/dataset ID column contains more than one unique value.
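The single-ID requirement can be illustrated with a small stand-alone check. This is only a sketch: `single_dataset_id` is a hypothetical helper, not part of the documented API, and raising on an empty input is an assumption (the documentation only mentions the case of more than one unique value).

```python
from typing import Iterable


def single_dataset_id(ids: Iterable[str]) -> str:
    """Hypothetical helper: return the one ID shared by all rows, or raise."""
    unique = sorted(set(ids))
    # Assumption: an empty column is rejected too; the documented error
    # only covers the "more than one unique value" case.
    if len(unique) != 1:
        raise ValueError(
            f"Expected exactly one dataset ID, found {len(unique)}: {unique}"
        )
    return unique[0]
```

For a DataFrame, the values of the DATABASE_ID (or DATASET_ID) column would be passed in, for example `single_dataset_id(df["DATABASE_ID"])`.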
Examples
>>> import pandas as pd
>>> df = pd.DataFrame({
...     "DATABASE_ID": ["WB.DATA360", "WB.DATA360"],
...     "INDICATOR": ["indicator.one", "indicator.two"],
... })
>>> result = standardize_indicator_id(df)
>>> list(result["INDICATOR"])
['WB_DATA360_INDICATOR_ONE', 'WB_DATA360_INDICATOR_TWO']
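The per-value transformation implied by this example might look roughly like the sketch below. `normalize_indicator` is a hypothetical name used for illustration, not the library's actual implementation. The sketch assumes that the dataset ID is uppercased and has its dots replaced in the same way as the indicator, that the two parts are joined with an underscore, and that a value already carrying the prefix is left unchanged.

```python
def normalize_indicator(value: str, dataset_id: str) -> str:
    """Hypothetical sketch of the per-value normalization."""
    # Uppercase both parts and replace dots with underscores.
    prefix = dataset_id.upper().replace(".", "_")
    code = value.upper().replace(".", "_")
    # Assumption: already-prefixed values are not prefixed a second time,
    # so applying the function twice gives the same result.
    if code.startswith(prefix + "_"):
        return code
    return f"{prefix}_{code}"


once = normalize_indicator("indicator.one", "WB.DATA360")
twice = normalize_indicator(once, "WB.DATA360")
print(once)   # WB_DATA360_INDICATOR_ONE
print(twice)  # WB_DATA360_INDICATOR_ONE
```

Skipping values that already carry the prefix keeps repeated calls from stacking prefixes, which matches the "ensures" wording in the description. Whether the real function behaves this way is not confirmed by the documentation.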