Xclim : Xarray-based climate data analytics #73

Closed
18 of 23 tasks
Zeitsperre opened this issue Jan 16, 2023 · 60 comments
Zeitsperre opened this issue Jan 16, 2023 · 60 comments
Assignees
Labels
6/pyOS-approved 9/joss-approved Pangeo Community partner tag for pangeo

Comments

@Zeitsperre

Zeitsperre commented Jan 16, 2023

Submitting Author: Trevor James Smith (@Zeitsperre)
All current maintainers: (@Zeitsperre, @tlogan2000, @aulemahal)
Package Name: xclim
One-Line Description of Package: Climate indices computation package based on Xarray
Repository Link:  https://github.com/Ouranosinc/xclim
Version submitted: v0.40.0
Editor: @Batalex
Reviewer 1: @jmunroe
Reviewer 2: @aguspesce
Archive: DOI
JOSS DOI: DOI
Version accepted: v0.42.0
Date accepted (month/day/year): 04/11/2023


Description

xclim is an operational Python library for climate services, providing numerous climate-related indicator tools with an extensible framework for constructing custom climate indicators, statistical downscaling and bias adjustment of climate model simulations, as well as climate model ensemble analysis tools.

xclim is built using xarray and can seamlessly benefit from the parallelization handling provided by dask. Its objective is to make it as simple as possible for users to perform typical climate services data treatment workflows. Leveraging xarray and dask, users can easily bias-adjust climate simulations over large spatial domains or compute indices from large climate datasets.

Scope 

  • Please indicate which category or categories this package falls under:
  • Data retrieval
  • Data extraction
  • Data munging
  • Data deposition
  • Reproducibility
  • Geospatial
  • Education
  • Data visualization*

Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see notes on categories of our guidebook.

  • For all submissions, explain how and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each): 

  - Who is the target audience, and what are scientific applications of this package?

xclim aims to position itself as a climate services tool for any researchers interested in using Climate and Forecast Conventions compliant datasets to perform climate analyses. This tool is optimized for working with Big Data in the climate science domain and can function as an independent library for one-off analyses in Jupyter notebooks or as a backend engine for performing climate data analyses over PyWPS (e.g. Finch). It was primarily developed targeting earth and environmental science audiences and researchers, originally for calculating climate indicators for the Canadian government web service ClimateData.ca.

xclim is primarily built for calculating climate indicators, performing statistical correction / bias adjustment of climate model output variables/simulations, and computing climate model simulation ensemble statistics.
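
To make "climate indicator" concrete, here is a minimal plain-Python sketch of a frost-days count per year. It is not xclim's API: xclim operates on xarray objects and handles units and metadata, while the function name `frost_days_per_year`, its parameters, and the data below are made up for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta


def frost_days_per_year(dates, tasmin_degc, threshold_degc=0.0):
    """Count, per calendar year, the days whose minimum temperature is below a threshold."""
    if len(dates) != len(tasmin_degc):
        raise ValueError("dates and tasmin_degc must have the same length")
    counts = defaultdict(int)
    for day, tmin in zip(dates, tasmin_degc):
        # Adding 0 for non-frost days keeps every year present in the result.
        counts[day.year] += int(tmin < threshold_degc)
    return dict(counts)


# Synthetic daily minimum temperatures (degrees Celsius) spanning a year boundary.
start = date(2000, 12, 30)
dates = [start + timedelta(days=i) for i in range(4)]
tasmin = [-2.0, 1.5, -0.5, -3.0]

result = frost_days_per_year(dates, tasmin)
print(result)
```

An indicator in this sense is a reduction of a daily series to one value per period (here, per year).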

  - Are there other Python packages that accomplish the same thing? If so, how does yours differ?

icclim is another library for the computation of climate indices. Starting with version 5.0 of icclim, some of the core computations rely on xclim. See explanations about differences between xclim and icclim.

scikit-downscale is a library offering algorithms for statistical downscaling. xclim drew inspiration from its fit-predict architecture, but the suites of downscaling algorithms offered differ.
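
As a rough illustration of what a fit-predict architecture means for bias adjustment, the sketch below learns an additive offset from a reference period and applies it to new simulated values. The class `MeanBiasAdjuster` is hypothetical and is not taken from xclim or scikit-downscale; it uses the simplest possible method (mean-bias removal) only to show the two-step interface.

```python
from statistics import mean


class MeanBiasAdjuster:
    """Additive mean-bias correction with a fit/predict interface (illustrative only)."""

    def __init__(self):
        self.offset_ = None

    def fit(self, reference, simulated):
        # Learn the mean difference between observations and the model
        # over a period where both are available.
        self.offset_ = mean(reference) - mean(simulated)
        return self

    def predict(self, simulated):
        # Apply the learned offset to any simulated series.
        if self.offset_ is None:
            raise RuntimeError("call fit() before predict()")
        return [value + self.offset_ for value in simulated]


reference = [10.0, 12.0, 14.0]   # made-up observations
historical = [12.0, 14.0, 16.0]  # made-up model output, 2 degrees too warm
future = [13.0, 15.0]            # made-up model projection

adjusted = MeanBiasAdjuster().fit(reference, historical).predict(future)
print(adjusted)
```

Separating the training step from the adjustment step lets the same fitted object be reused on several simulation periods.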

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box.  This package:

  • does not violate the Terms of Service of any service it interacts with. 
  • has an OSI approved license.
  • contains a README with instructions for installing the development version. (Under CONTRIBUTING)
  • includes documentation with examples for all functions.
  • contains a vignette with examples of its essential functions and uses.
  • has a test suite.
  • has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.

Publication options

 JOSS Checks  
  • The package has an obvious research application according to JOSS's definition in their submission requirements. Be aware that completing the pyOpenSci review process does not guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
  • The package is not a "minor utility" as defined by JOSS's submission requirements: "Minor 'utility' packages, including 'thin' API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
  • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
  • The package is deposited in a long-term repository with the DOI: https://doi.org/10.5281/zenodo.2795043

Note: Do not submit your package separately to JOSS
  

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.

  • Yes, I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

Code of conduct

Please fill out our survey

P.S. *Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

Editor and review templates can be found here

@Zeitsperre
Author

To whom it may concern, our paper is available on a separate branch within xclim. The Pull Request containing the paper can be found here: Ouranosinc/xclim#250

@lwasser
Member

lwasser commented Jan 17, 2023

hi @Zeitsperre welcome to pyOpenSci!! are you a part of the pangeo community by chance? i think i saw xclim discussed in a few pangeo threads.

@Zeitsperre
Author

Hi @lwasser, nice to meet you!

We are not formally part of the Pangeo community (we do not receive compensation from them); however, xclim is mentioned in their list of xarray-based projects (https://docs.xarray.dev/en/latest/ecosystem.html#ecosystem). A handful of xclim's developers have also made significant contributions to packages in the Pangeo ecosystem (xarray, xesmf, flox, cf-xarray, etc.), and some key developers of xarray have contributed to xclim through issue raising or code contributions. I think it's safe to say that the objectives of xclim are in line with those of Pangeo.

@lwasser
Member

lwasser commented Jan 17, 2023

awesome! nice to meet you as well @Zeitsperre ! I asked because we are working on a partnership with pangeo to curate that list via peer review. as a part of that curation, we have a small set of checks that are specific to the pangeo community including:

  • Consume and produce high-level data structures (e.g. xarray datasets / pandas dataframes) wherever feasible
  • Operate lazily when called on dask data structure
  • Avoid file I/O unless specifically an I/O package

So we'd want to include you in that given you are in the list of tools. i haven't done a deep dive into your package but these are universal pangeo standards. If the review goes smoothly, we'd then tag your package as being pangeo community related / vetted.

does this package adhere to the above pangeo standards? Many thanks - i'm just trying to figure out the best path here to support our partnership :) i just started working on this and have an open pr here to update our review submission template!

@Zeitsperre
Author

I believe we tick all the boxes you listed:

  • Consume and produce high-level data structures (e.g. xarray datasets / pandas dataframes) wherever feasible (all functions operate on either xr.DataArray or xr.Dataset objects)
  • Operate lazily when called on dask data structure (both indices and indicators seamlessly integrate with dask)
  • Avoid file I/O unless specifically an I/O package (xarray is left to handle all file loading or writing)

I know that the core developers of xclim would gladly welcome feedback on how to better organize/structure our code base, and xclim being considered high enough in quality to be formally Pangeo-endorsed would be great!

@Zeitsperre
Author

Just to add some clarification, I just remembered that we have some functionality to build indicator catalogues using YAML configurations (https://xclim.readthedocs.io/en/latest/notebooks/extendxclim.html#YAML-file).

We also built a translation engine for providing multi-language climate indicator metadata based on JSON descriptions (currently, we have translations to French, but any other languages are supported; https://xclim.readthedocs.io/en/latest/internationalization.html).

These files are loaded on import or can be supplied by the user explicitly and are solely used for configuration. Hope this helps!

@lwasser
Member

lwasser commented Jan 17, 2023

perfect! Thank you @Zeitsperre for the speedy replies! I am going to follow up and also get our editor in chief involved, @NickleDave. I stepped in only because i'm actively working with pangeo now so i just wanted to check in to see if this review could be supported via that collaboration as well. And it does sound like it could. more soon!

@NickleDave
Contributor

Welcome @Zeitsperre!

At first glance looks like you have got most everything in line with our initial checks.

I will add the completed checks by end of day tomorrow.

@NickleDave
Contributor

NickleDave commented Jan 18, 2023

Hi again @Zeitsperre -- Happy to report that xclim passed initial editor-in-chief checks (below) with flying colors.

I have one suggestion, which we strongly recommend, but do not require:

  • switch to pyproject.toml with static metadata, to adhere to PEP 517 and PEP 621, away from legacy setup.py file
    • looks like you mainly use setup.py to clean up README for build if I'm not mistaken?
    • I wonder if you could use other tools to the same ends? E.g., if you use hatch in place of setuptools there's a plug-in that will "prettify" the README for you, see https://talkpython.fm/episodes/show/395/tools-for-readme.md-creation-and-maintenance
    • I know you have a lot of config stuff in the setup.cfg, so switching to another dev tool might be a hassle, but having metadata in the pyproject.toml instead of a setup.py would be ideal
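
For readers unfamiliar with this suggestion, the sketch below shows what declarative metadata in a pyproject.toml can look like and how it can be read without executing any code, which is the main difference from a setup.py. The file contents are hypothetical (package name, dependency, and version bounds are made up, and this is not xclim's actual configuration), and the helper `check_metadata` is an illustration, not a complete validator.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # older interpreters: assumes the third-party tomli backport
    import tomli as tomllib

# Hypothetical pyproject.toml using flit_core as the build backend.
EXAMPLE_PYPROJECT = """
[build-system]
requires = ["flit_core >=3.2,<4"]
build-backend = "flit_core.buildapi"

[project]
name = "example-package"
description = "A hypothetical package used only for this sketch."
readme = "README.rst"
requires-python = ">=3.8"
license = {file = "LICENSE"}
dependencies = ["xarray"]
dynamic = ["version"]
"""


def check_metadata(pyproject_text):
    """Return a list of problems found in the [build-system] and [project] tables."""
    config = tomllib.loads(pyproject_text)
    problems = []
    if "build-backend" not in config.get("build-system", {}):
        problems.append("no build-system.build-backend declared")
    project = config.get("project")
    if project is None:
        return problems + ["no [project] table"]
    if "name" not in project:
        problems.append("project.name is required")
    # The version must either be given statically or be listed as dynamic.
    if "version" not in project and "version" not in project.get("dynamic", []):
        problems.append("project.version must be static or listed in dynamic")
    return problems


problems = check_metadata(EXAMPLE_PYPROJECT)
print(problems or "metadata looks complete")
```

Because the metadata is plain TOML, any tool can inspect it this way without importing or running the package's build script.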

We expect to know how we will go about selecting editors by the end of day Monday (Jan. 23rd).
@lwasser needs to talk a little more with people about the Pangeo collaboration.
We will report back to you then with a timeline for the review.

Editor in Chief Checks

Please check our Python packaging guide for more information on the elements
below.

  • Installation The package can be installed from a community repository such as PyPI (preferred), and/or a community channel on conda (e.g. conda-forge, bioconda).
    • The package imports properly into a standard Python environment import package-name.
      • verified for both pip (version 0.40.0)
  • Fit The package meets criteria for fit and overlap.
  • Documentation The package has sufficient online documentation to allow us to evaluate package function and scope without installing the package. This includes:
    • User-facing documentation that overviews how to install and start using the package.
      • two notes:
        • I wonder if it would help to move some text from about.md to the index? So a user immediately knows what xclim is and does. Getting hit with a long table of contents can be daunting
        • "Basic usage" is great although looks like it could maybe use some very minor revision of text just for flow + readability
    • Short tutorials that help a user understand how to use the package and what it can do for them.
      • plenty of tutorials, looks excellent
    • API documentation (documentation for your code's functions, classes, methods and attributes): this includes clearly written docstrings with variables defined using a standard docstring format. We suggest using the Numpy docstring format.
  • Core GitHub repository Files
    • README The package has a README.md file with clear explanation of what the package does, instructions on how to install it, and a link to development instructions.
    • Contributing File The package has a CONTRIBUTING.md file that details how to install and contribute to the package.
    • Code of Conduct The package has a Code of Conduct file.
    • License The package has an OSI approved license.
      NOTE: We prefer that you have development instructions in your documentation too.
  • Issue Submission Documentation All of the information is filled out in the YAML header of the issue (located at the top of the issue template).
  • Automated tests Package has a testing suite and is tested via GitHub actions or another Continuous Integration service.
  • Repository The repository link resolves correctly.
  • Package overlap The package doesn't entirely overlap with the functionality of other packages that have already been submitted to pyOpenSci.
  • Archive (JOSS only, may be post-review): The repository DOI resolves correctly.
  • Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

  • Initial onboarding survey was filled out
    We appreciate each maintainer of the package filling out this survey individually. 🙌
    Thank you authors in advance for setting aside five to ten minutes to do this. It truly helps our organization. 🙌


Editor comments

@Zeitsperre
Author

Thanks, @NickleDave! I'm glad to see that all the hours spent reading PEPs and adopting good practices have been helpful here!

I'll be opening a PR later today to address the PEP 517 and 621 compliance comment. We don't do anything too fancy in our setup.py, so I don't imagine it should be difficult to adopt pyproject.toml. I'll link that here once it's underway.

@Zeitsperre
Author

@NickleDave

I just merged changes to address some of your comments, primarily for pyproject.toml and the ReadMe suggestion. I opted to adopt flit as it seemed to be a very stable and no-nonsense setup engine. It took a bit of time to migrate some configurations around, but it looks/works much better.

Agreed on your point about Basic Usage. I'll be opening a ticket to address that down the line. There are a fair number of Pull Requests currently open, but we tend to address things in a reasonable amount of time.

Let me know if you have any other suggestions, thanks!

@NickleDave
Contributor

NickleDave commented Jan 24, 2023

Hi @Zeitsperre!

> I just merged changes to address some of your comments

Awesome! Glad to hear it.

> Agreed on your point about Basic Usage.

Thank you for even making note of that. It was just a nitpick and really more of a list to give an editor things they might want to revisit. Your docs look really great overall and I'm sure they'll look better after the review.

I'm sorry for not getting back to you by the end of the day yesterday--totally my fault.

I want to let you know we do have an editor, @Batalex! 🎉🎉🎉 who will introduce themselves and officially start the review in the next couple of days. We are actively recruiting reviewers right now as well.

@Batalex
Contributor

Batalex commented Jan 25, 2023

Hi @Zeitsperre, I am pleased to meet you!
I will be the editor for this review, and I will get up to speed during the next few days with the pre-submission discussions, the editorial process and the package itself before recruiting reviewers.
I appreciate your patience; I am looking forward to working with you 🫡

@Zeitsperre
Author

Hello @Batalex, nice to meet you as well!

That all sounds great! I'm looking forward to hearing how we can improve xclim. Feel free to ping me if you have any questions or see anything obvious that could be improved in advance of the formal peer-review process.

Thanks for your time and effort towards this!

@Batalex
Contributor

Batalex commented Feb 2, 2023

👋 Hi @jmunroe and @aguspesce! Thank you for volunteering to review Xclim for pyOpenSci!

Meet @Zeitsperre, our lovely submission author, and feel free to introduce yourselves 🥰

The following resources will help you complete your review:

  1. Here is the reviewers guide. This guide contains all the steps and information needed to complete your review.
  2. You will need the review template to fill out and submit in this issue as a comment once your review is complete.

Please get in touch with any questions or concerns! I just wanted to let you know that your review is due: Feb 26th.

Also, I'd like to point out that the next PR should be part of your review: Ouranosinc/xclim#250.

PS: I'm sorry for the information dump. Please make sure that you review the submitted version. Some issues you point out might have already been solved in the meantime, but that's ok.

@Zeitsperre
Author

Hey all,

I just wanted to keep people in the loop: our next version (v0.41) is slated for release on February 24, 2023. Assuming that the changes suggested/recommended are reasonable to address, I feel that the xclim devs can likely prioritize adjustments for the subsequent release (v0.42).

One question that we had concerning the JOSS paper: is it required that the paper branch be merged into the main development branch, or can it be hosted solely in its own branch? What is typically done to meet the JOSS requirements?

@NickleDave
Contributor

https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements

> Your paper (paper.md and BibTeX files, plus any figures) must be hosted in a Git-based repository together with your software (although they may be in a short-lived branch which is never merged with the default).

@aguspesce

Hello @Zeitsperre,
You need to put the paper in the main branch, as @NickleDave told you.

@NickleDave
Contributor

NickleDave commented Feb 7, 2023

I might be reading their docs wrong, but I understood it to say that the paper files can be on some other branch that never gets merged with main.

See for example this published paper that has the files in a joss-rev branch: https://github.com/parmoo/parmoo/commits/joss-rev

@Zeitsperre
Author

> I might be reading their docs wrong, but I understood it to say that the paper files can be on some other branch that never gets merged with main.

That's what I read as well, but I wanted to confirm. In any case, we can cross that bridge when we get there. The paper is still up-to-date with the current aims and structure of the package, but I would like to add a few comments and clarifications here and there; I anticipate that we'll be modifying it some more in the coming weeks.

@Batalex
Contributor

Batalex commented Apr 12, 2023

@all-contributors
please add @Zeitsperre for code
please add @tlogan2000 for code
please add @aulemahal for code
please add @aguspesce for review
please add @jmunroe for review

@Batalex
Contributor

Batalex commented Apr 12, 2023

@Zeitsperre, @tlogan2000, @aulemahal
We would love to have you on our Slack if you'd like!
Give me a shout, and we will do so

@Zeitsperre
Author

@Batalex Sure! Feel free to add me! (email(s) are in https://github.com/Ouranosinc/xclim/blob/master/AUTHORS.rst)

@Batalex
Contributor

Batalex commented Apr 12, 2023

@all-contributors
please add @aulemahal for code
please add @aguspesce for review
please add @jmunroe for review

Plz, be a good bot so I can stop spamming people

@tlogan2000

> @Zeitsperre, @tlogan2000, @aulemahal We would love to have you on our Slack if you'd like! Give me a shout, and we will do so

Yes please. That would be great

@allcontributors
Contributor

@Batalex

I've put up a pull request to add @aulemahal! 🎉

I've put up a pull request to add @aguspesce! 🎉

We had trouble processing your request. Please try again later.

@Batalex
Contributor

Batalex commented Apr 12, 2023

@all-contributors please add @jmunroe for review

bad bad bot

@allcontributors
Contributor

@Batalex

I've put up a pull request to add @jmunroe! 🎉

Zeitsperre added a commit to Ouranosinc/xclim that referenced this issue May 19, 2023
### What kind of change does this PR introduce?

This PR finalizes the submission and publishing steps for the [Journal
of Open Source
Software](https://joss.readthedocs.io/en/latest/index.html). The paper
is expected to have a length of 250 - 1000 words and demands that the
software is *feature-complete*. As such, the submission process should
not be started until we have at the very least added all necessary
indicators from our backlog and/or stabilized our API (i.e.: v1.0-alpha
or release-candidate).

Updates (May 2023): 
With xclim v0.40.0, the software was deemed ready for submission. The
review process for JOSS was completed via PyOpenSci
(pyOpenSci/software-submission#73), and the
final JOSS review was performed in
openjournals/joss-reviews#5415. The software
(v0.43.0) and paper were published on 18 May 2023
(https://doi.org/10.21105/joss.05415).

### Does this PR introduce a breaking change?

No.

---------

Co-authored-by: Philippe Roy <borghor@yahoo.ca>
Co-authored-by: David Huard <huard.david@ouranos.ca>
Co-authored-by: Abel Aoun <aoun@cerfacs.fr>
Co-authored-by: Pascal Bourgault <bourgault.pascal@ouranos.ca>
Co-authored-by: tlogan2000 <logan.travis@ouranos.ca>
@Zeitsperre
Author

Hi @Batalex and @lwasser

The article was just published yesterday in JOSS! Here's the review: openjournals/joss-reviews#5415. I think we're good to close this issue!

@lwasser
Member

lwasser commented May 22, 2023

hi there @Zeitsperre !! what wonderful news!! congratulations on the JOSS acceptance. I just tagged this review as JOSS-approved and will close it!! were there any hiccups in the JOSS side of things from your perspective? or did it go smoothly?

@lwasser lwasser closed this as completed May 22, 2023
@lwasser
Member

lwasser commented Jun 7, 2023

hey there @Zeitsperre just following up on this review! i hope the JOSS component is going well. I have a small request - can you take 5-10 minutes to fill out our post-review survey please? i'd greatly appreciate it. The most important part for us is how the review impacted / improved etc your package (if it did) and feedback on the process. many thanks in advance for doing this!!

@Zeitsperre
Author

Hi Leah, for sure. I have some time this evening.

The JOSS side of things went relatively well. I think there was some confusion over the order in which things needed to get done (the release with all reviewer comment changes needs to happen first before it can be formally approved and published in JOSS, which wasn't clear at first).

Other than that, no issues. It was relatively easy. I have some thoughts that could maybe benefit the pyOpenSci-side, so I'll share them in the survey.

Thanks again!

@lwasser
Member

lwasser commented Jun 7, 2023

oh yes - any thoughts you have that could help us clarify what the process looks like would be greatly appreciated. many thanks for this. i'm happy to update our review guides with clearer guidance on that side of things as it makes sense as well. Thank you so much, Trevor for both filling this out and for the input on the JOSS process!

@lwasser
Member

lwasser commented Jun 14, 2023

hey @Zeitsperre i wanted to invite you / your maintainer team to write a blog post (totally optional) on your package for us to promote your work! if you are interested - here are a few examples of other blog posts:

pandera
movingpandas

and here is a markdown example that you could use as a guide when creating your post.

it can even be a tutorial-like post that highlights what your package does. then we can share it with people to get the word out about your package.

If you are too busy for this no worries. But if you have time - we'd love to spread the word about your package!

@lwasser lwasser added the Pangeo Community partner tag for pangeo label Sep 18, 2023
Projects
Status: joss-accepted

7 participants