Re-using columns in read_wide_csv_file_if while reading data #83

Open
rgieseke opened this issue Nov 17, 2021 · 1 comment
Labels: enhancement; priority: medium

Comments

@rgieseke

Describe the bug
Following up on the discussion in #82: read_wide_csv_file_if doesn't support re-using a source column. From that discussion:

Unfortunately, read_wide_csv_file_if doesn't support re-using a source column multiple times in coords_cols because behind the scenes it just does a renaming. Probably something it should support, so maybe worth opening a bug report - but I can't commit to when I will have time to fix it.

Example code:

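import primap2 as pm2

# map_variables (used below) is defined elsewhere and not shown in this issue.
file = "rcmip-emissions-annual-means-v5-1-0.csv"
coords_cols = {
    "unit": "Unit",
    "area": "Region",
    "model": "Model",
    "scenario": "Scenario",
    "entity": "Variable",
    "category": "Variable",  # same source column as "entity"
}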
coords_defaults = {
    "source": "RCMIP",
}
coords_terminologies = {
    "area": "RCMIP",
    "category": "RCMIP",
}
coords_value_mapping = {
    "entity": map_variables
}
meta_data = {
    "rights": "CC BY 4.0 International",
}
data_if = pm2.pm2io.read_wide_csv_file_if(
    file,
    coords_cols=coords_cols,
    coords_defaults=coords_defaults,
    coords_terminologies=coords_terminologies,
    coords_value_mapping=coords_value_mapping,
    meta_data=meta_data,
    filter_keep={"f1": {
        "Model": "CEDS/UVA/GCP/PRIMAP",
    }}
)
data_if

Fails with KeyError.
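
A simplified illustration of how a rename-based approach can lose a dimension when one source column is listed twice. This is only a sketch under the assumption that columns are renamed from source names to dimension names, as the quoted remark suggests; it is not primap2's actual implementation, and the sample row is made up.

```python
# Simplified illustration only; this is NOT primap2's implementation.
# Assumption: source columns are renamed to dimension names.

coords_cols = {
    "entity": "Variable",
    "category": "Variable",  # same source column used a second time
}

# One made-up row of a wide CSV, keyed by the source column names.
row = {"Variable": "Emissions|CO2", "2015": 1.0}

# Inverting the mapping for a rename keeps only one entry per source column,
# so "Variable" can be renamed to just one of the two dimensions.
rename_map = {source: dim for dim, source in coords_cols.items()}

renamed = {rename_map.get(col, col): value for col, value in row.items()}

# Dimensions that were requested but no longer exist after the rename.
missing = [dim for dim in coords_cols if dim not in renamed]
print(rename_map)  # {'Variable': 'category'}
print(missing)     # ['entity']
```

Any later lookup of the missing dimension (here `renamed["entity"]`) would raise a KeyError, which matches the symptom reported above.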

Expected behavior

Allow re-using a column when reading the data.

A potential workaround is described in #82.

@JGuetschow
Contributor

Yes, that would indeed make sense. I currently copy the column before reading the data. So far I only needed it for less important columns like the category name in the original data (before mapping to e.g. IPCC2006 terminologies). But I realize that it's necessary for all the IIASA database type data, and thus we should add it soon. I'll assign the issue to myself, but can't promise I'll implement it in the next few weeks.
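
The "copy the column before reading" workaround could be sketched with the standard library roughly as follows. The helper name `duplicate_column` and the new column name `Variable_copy` are hypothetical, chosen only for illustration, and the in-memory CSV is a tiny stand-in for the real file.

```python
import csv
import io


def duplicate_column(src, dst, column, new_column):
    """Copy `column` into `new_column` for every row read from `src`, writing to `dst`."""
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames or [])
    if column not in fieldnames:
        raise KeyError(f"column {column!r} not found in input")
    writer = csv.DictWriter(dst, fieldnames=fieldnames + [new_column])
    writer.writeheader()
    for row in reader:
        row[new_column] = row[column]
        writer.writerow(row)


# Tiny in-memory stand-in for the real CSV file (made-up data).
src = io.StringIO("Model,Variable,2015\nm1,Emissions|CO2,1.0\n")
dst = io.StringIO()
duplicate_column(src, dst, "Variable", "Variable_copy")
print(dst.getvalue())
```

With real files, the copy would be written to a new CSV, and coords_cols could then point "entity" at "Variable" and "category" at the copied column, so that each source column is used only once.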

@JGuetschow JGuetschow self-assigned this Dec 8, 2021
@JGuetschow JGuetschow added enhancement New feature or request priority: medium medium priority issue. Good to solve soon if possible labels Mar 27, 2023