alan-turing-institute / environmental-ds-book

A computational notebook community for open environmental data science 🌎

Home Page:https://edsbook.org

[REVIEW] Variational data assimilation with deep prior

acocac opened this issue · comments

Notebook Review: Issue #177

Submitting author: @Mukulikaa @Rutika-16

Repository: https://github.com/eds-book-gallery/reproduce-deep-prior-4Dvar

Paper: https://doi.org/10.1017/eds.2022.31

Editor: @acocac

Reviewer: @crlna16 @tinaok @polpel

Managing EiC: @acocac

Status

Reviewer instructions & questions

Hi 👋 @crlna16 @tinaok @polpel, please carry out your review in this issue by updating the checklist below.

As a reviewer, you contribute to the technical quality of the content published by our community.

Before the review, the EiC checked that the submission meets the minimum requirements.

The quality of the proposed contribution can be assessed through scientific, technical and code criteria.

The reviewer guidelines are available here: https://edsbook.org/publishing/guidelines/guidelines-reviewers.html.
If you have any questions or concerns, please let @acocac know.

Review checklist for @crlna16

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.

Conflict of interest

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Code of conduct and peer-review principles

General checks

  • Notebook: Is the notebook file (notebook.ipynb) part of the notebook repository?
  • Contribution and authorship: Does the author list seem appropriate and complete (full name, affiliation, and GitHub/ORCID handle)?
  • Scope and eligibility: Does the submission contain an original and complete analysis according to the scope of EDS book?

Reproducibility

  • Does the notebook run in a local environment?
  • Does the notebook build and run in binder?
  • Are all data sources openly accessible and properly cited (e.g. with citation to a persistent DOI) in the heading section?

Pedagogy

  • Are the notebook purpose and highlights clear?
  • Does the notebook demonstrate some specific data analysis or visualisation techniques?
  • Is the notebook well documented, using both markdown cells and comments in code cells?
  • Does the conclusion section provide clear and concise final say on the tools, analysis and/or datasets used?
  • Is the notebook narrative well written (it does not require editing for structure, language, or writing quality)?

Ethical

  • Is any linkage of datasets in the notebook unlikely to lead to an increased risk of the personal identification of individuals?
  • Is the notebook truthful and clear about any limitations of the analysis (and potential biases in data and/or tools)?
  • Is the notebook unlikely to lead to negative social outcomes, such as (but not limited to) increasing discrimination or injustice?

Other Requirements

  • All mentioned software should be formally and consistently cited wherever possible.
  • Acronyms should be spelled out upon first mention.
  • License conditions on images and figures must be respected (Creative Commons, etc.).

Final approval (post-review)

  • The authors have responded to my review and made changes to my satisfaction. I recommend approving the notebook for publication.

Review checklist for @tinaok

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.

Conflict of interest

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Code of conduct and peer-review principles

General checks

  • Notebook: Is the notebook file (notebook.ipynb) part of the notebook repository?
  • Contribution and authorship: Does the author list seem appropriate and complete (full name, affiliation, and GitHub/ORCID handle)?
  • Scope and eligibility: Does the submission contain an original and complete analysis according to the scope of EDS book?

Reproducibility

  • Does the notebook run in a local environment?
  • Does the notebook build and run in binder?
  • Are all data sources openly accessible and properly cited (e.g. with citation to a persistent DOI) in the heading section?

Pedagogy

  • Are the notebook purpose and highlights clear?
  • Does the notebook demonstrate some specific data analysis or visualisation techniques?
  • Is the notebook well documented, using both markdown cells and comments in code cells?
  • Does the conclusion section provide clear and concise final say on the tools, analysis and/or datasets used?
  • Is the notebook narrative well written (it does not require editing for structure, language, or writing quality)?

Ethical

  • Is any linkage of datasets in the notebook unlikely to lead to an increased risk of the personal identification of individuals?
  • Is the notebook truthful and clear about any limitations of the analysis (and potential biases in data and/or tools)?
  • Is the notebook unlikely to lead to negative social outcomes, such as (but not limited to) increasing discrimination or injustice?

Other Requirements

  • All mentioned software should be formally and consistently cited wherever possible.
  • Acronyms should be spelled out upon first mention.
  • License conditions on images and figures must be respected (Creative Commons, etc.).

Final approval (post-review)

  • The authors have responded to my review and made changes to my satisfaction. I recommend approving the notebook for publication.

Review checklist for @polpel

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.

Conflict of interest

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Code of conduct and peer-review principles

General checks

  • Notebook: Is the notebook file (notebook.ipynb) part of the notebook repository?
  • Contribution and authorship: Does the author list seem appropriate and complete (full name, affiliation, and GitHub/ORCID handle)?
  • Scope and eligibility: Does the submission contain an original and complete analysis according to the scope of EDS book?

Reproducibility

  • Does the notebook run in a local environment?
  • Does the notebook build and run in binder?
  • Are all data sources openly accessible and properly cited (e.g. with citation to a persistent DOI) in the heading section?

Pedagogy

  • Are the notebook purpose and highlights clear?
  • Does the notebook demonstrate some specific data analysis or visualisation techniques?
  • Is the notebook well documented, using both markdown cells and comments in code cells?
  • Does the conclusion section provide clear and concise final say on the tools, analysis and/or datasets used?
  • Is the notebook narrative well written (it does not require editing for structure, language, or writing quality)?

Ethical

  • Is any linkage of datasets in the notebook unlikely to lead to an increased risk of the personal identification of individuals?
  • Is the notebook truthful and clear about any limitations of the analysis (and potential biases in data and/or tools)?
  • Is the notebook unlikely to lead to negative social outcomes, such as (but not limited to) increasing discrimination or injustice?

Other Requirements

  • All mentioned software should be formally and consistently cited wherever possible.
  • Acronyms should be spelled out upon first mention.
  • License conditions on images and figures must be respected (Creative Commons, etc.).

Final approval (post-review)

  • The authors have responded to my review and made changes to my satisfaction. I recommend approving the notebook for publication.

Additional instructions

Reviewer general comments are welcome on this REVIEW issue or directly to the notebook repository.

If you do the latter, you will find a Pull Request titled REVIEW where you can carry out the discussion with the authors through ReviewNB, a third-party GitHub plugin for displaying and commenting on Jupyter Notebooks (see further details here).

In addition to ReviewNB, we suggest exploring or running the notebook in:

  • Binder (run): Click the Launch Binder button at the top level of this message.

The report below counts blank lines, comment lines, and physical lines of source code files using cloc. It was generated according to the latest commit c077106 of the review branch from the target repository.

Reviewers and authors, feel free to use this report; it is provided for information only. We will generate a similar report after the review process.

Software report (experimental):

github.com/AlDanial/cloc v 1.97  T=1.12 s (7.1 files/s, 907.3 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Jupyter Notebook                 1              0            580            192
YAML                             5             20             23            158
Markdown                         1             10              0             30
JSON                             1              0              0              5
-------------------------------------------------------------------------------
SUM:                             8             30            603            385
-------------------------------------------------------------------------------

👋 @crlna16 @tinaok @polpel we will conduct the review in this issue.

Please read through the above information and let me know if you have any questions about the review process.

Thank you 🙏

Hi @crlna16 @tinaok @polpel

Please note that the training process doesn't work in Binder. We suggest trying it in Colab and changing the runtime type (Runtime > Change Runtime type > set GPU in Hardware accelerator).
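A quick way to confirm that the GPU runtime is active before starting training might look like the following. This is only a minimal sketch: it assumes an NVIDIA GPU exposed through the `nvidia-smi` command-line tool, and `nvidia_gpu_visible` is a hypothetical helper name, not something taken from the notebook.

```python
import shutil
import subprocess


def nvidia_gpu_visible():
    """Return True if `nvidia-smi` is on PATH and lists at least one GPU.

    Assumption: the runtime exposes NVIDIA GPUs via `nvidia-smi`.
    Other accelerators (e.g. TPUs, Apple silicon) are not detected.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        # `nvidia-smi -L` prints one line per GPU, each starting with "GPU".
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if nvidia_gpu_visible():
    print("NVIDIA GPU detected.")
else:
    print("No NVIDIA GPU detected; check the runtime type before training.")
```

If the check reports no GPU in Colab, the runtime type has most likely not been switched yet.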

Hi @Mukulikaa @Rutika-16,

Thank you for the submission! I have added some specific comments on the PR page (best read on ReviewNB). Here are my more general comments:

The notebook successfully reproduces and showcases the results of the paper, which is great. My main suggestion for improving the submission would be for the authors to expand on their documentation and add more descriptions throughout the notebook to better explain the context of the different steps of the study. This would really enhance the narrative quality of the notebook and would help it better stand on its own without the reader having to refer to the paper too frequently. Finally, it would also be great if the authors could add some comments on the value of the proposed method and the ease of use of the codebase: could it facilitate further (open) science?

Hello @Mukulikaa @Rutika-16, thank you very much for your submitted notebook. Like the previous reviewer, I have added comments related to code directly in the pull request via ReviewNB.

General remarks:
The authors succeed at reproducing the results of the paper. I have two suggestions for improving the overall quality of the notebook.

  1. I felt it is not well motivated why n_samples is reduced to 20 from the original 100. Is it due to limited computational resources, or has some convergence threshold been reached? I suggest either running for more than 20 iterations, e.g. in Colab, or giving a clear statement of why the cutoff was chosen at 20 iterations.

  2. Overall, I had to refer to the original paper often to understand the context of the notebook. I felt that more explanations and comments would improve the readability of the reproducing notebook as a standalone resource.

@Mukulikaa @Rutika-16 may I draw your attention to the general comments left by the reviewers above? As a reminder, you can find their specific comments here.

Please let me know if you have any questions, I'm happy to help.

We'd like to thank both reviewers for their detailed feedback! We are in the process of addressing them so you can expect updates to the notebook in the next few days.

Hello @Mukulikaa and @Rutika-16,

Thank you for submitting your notebook. I have successfully reproduced the notebooks on a pangeo-eosc cloud and a MacBook with Apple M1 Max CPU. I have also added comments related to the code directly in the pull request via ReviewNB.

I have a few general remarks:

  1. It would be helpful for someone trying to reproduce your work if you provide timing information for the parts of the code that take a long time to compute.

  2. The cell that computes four steps at a time could be separated into four cells. Additionally, you can save the result at each step in different outputs. Please provide a more detailed explanation of what each step is doing.
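The timing suggestion in the first remark could be addressed with something like the sketch below. It is only an illustration: `timed` is a hypothetical helper, and the loop is a placeholder standing in for one of the notebook's long-running steps, not the actual code.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, log=None):
    """Print (and optionally record) the wall-clock time of the enclosed block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if log is not None:
            log[label] = elapsed
        print(f"{label}: {elapsed:.2f} s")


# Hypothetical usage: the sum below is a stand-in for an expensive step.
timings = {}
with timed("placeholder step", timings):
    total = sum(i * i for i in range(100_000))
```

Wrapping each step in its own `with timed(...)` block would also fit the second remark, since every step then sits in a separate cell with its own recorded duration.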

@Mukulikaa and @Rutika-16, may I ask for an update on your notebook? It would be great if you could give an estimated date for when you will implement and/or reply to the reviewers' comments.

Please let me know if you have any questions. I'm happy to help.

Hi @acocac, sorry for the delay. Rutika and I both had some grad school matters to take care of. We should definitely be done with the changes by the end of this week!

Hi @Mukulikaa thanks for the update (:

@Mukulikaa @Rutika-16 I was wondering whether you have any updates on your notebook. We're planning to start publishing the notebooks in EDS book in the coming weeks. Thanks for your efforts on this!

@Mukulikaa @Rutika-16 we have already started preparing the publication of the notebooks submitted to the Reproducibility Challenge (see examples in the Pull Request tab). We would appreciate it if you could go through the reviewers' comments on your notebook and implement their suggestions where pertinent. The overall aim is to improve the quality of the notebook through the community-based open review offered by EDS book. Please let us know if you need any help with this.

👋 reviewers @crlna16 @tinaok @polpel, FYI we have already started publishing the notebooks submitted to the 2023 Climate Informatics Reproducibility Challenge. Due to the slow response from the authors of this notebook to your feedback, the EDS book maintainers have decided, as an exception, to implement most of the suggested changes related to styling, syntax and readability. Note that we will open issues in the notebook repository for the more technical changes and invite the EDS book community to contribute.

We really appreciate your effort in going through the notebook and contributing to improving its quality. If everything is OK, we will announce the publication between Monday and Tuesday next week.