[PRE REVIEW]: Cost-Effective Big Data Orchestration Using Dagster: A Multi-Platform Approach #7267

Closed
editorialbot opened this issue Sep 23, 2024 · 80 comments
Labels
Makefile pre-review Python TeX Track: 7 (CSISM) Computer science, Information Science, and Mathematics

Comments

@editorialbot
Collaborator

editorialbot commented Sep 23, 2024

Submitting author: @HPicatto (Hernan Picatto)
Repository: https://github.com/ascii-supply-networks/ascii-hydra
Branch with paper.md (empty if default branch): main
Version: v1.0.0
Editor: @HaoZeke
Reviewers: @abhishektiwari, @Midnighter
Managing EiC: Daniel S. Katz

Status


Status badge code:

HTML: <a href="https://joss.theoj.org/papers/b0bfeb890f8d120de4e13dd52f9d5177"><img src="https://joss.theoj.org/papers/b0bfeb890f8d120de4e13dd52f9d5177/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/b0bfeb890f8d120de4e13dd52f9d5177/status.svg)](https://joss.theoj.org/papers/b0bfeb890f8d120de4e13dd52f9d5177)

Author instructions

Thanks for submitting your paper to JOSS @HPicatto. Currently, there isn't a JOSS editor assigned to your paper.

@HPicatto if you have any suggestions for potential reviewers then please mention them here in this thread (without tagging them with an @). You can search the list of people that have already agreed to review and may be suitable for this submission.

Editor instructions

The JOSS submission bot @editorialbot is here to help you find and assign reviewers and start the main review. To find out what @editorialbot can do for you type:

@editorialbot commands
editorialbot added the pre-review and Track: 7 (CSISM) Computer science, Information Science, and Mathematics labels on Sep 23, 2024
@editorialbot
Collaborator Author

Hello human, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

@editorialbot
Collaborator Author

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

✅ OK DOIs

- 10.1016/j.future.2023.12.026 is OK
- 10.1145/2934664 is OK
- 10.1145/3472883.3486982 is OK
- 10.1145/3514221.3526054 is OK
- 10.1007/s11192-020-03726-9 is OK
- 10.5281/zenodo.7196590 is OK

🟡 SKIP DOIs

- No DOI given, and none found for title: Dagster | Cloud-native Orchestration of Data Pipel...
- No DOI given, and none found for title: Cost efficient alternative to databricks lock-in

❌ MISSING DOIs

- None

❌ INVALID DOIs

- None

@editorialbot
Collaborator Author

Software report:

github.com/AlDanial/cloc v 1.90  T=0.04 s (986.8 files/s, 128171.7 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          29            568            416           3557
Markdown                         3            126              0            442
TOML                             4             27              2            128
TeX                              1              1              0            103
make                             1              6             21             51
JSON                             4              0              0              7
-------------------------------------------------------------------------------
SUM:                            42            728            439           4288
-------------------------------------------------------------------------------

Commit count by author:

    24	geoHeil
     9	Georg Heiler
     8	HPicatto
     2	Hernan
     1	CI Hotfix

@editorialbot
Collaborator Author

Paper file info:

📄 Wordcount for paper.md is 2025

✅ The paper includes a Statement of need section

@editorialbot
Collaborator Author

License info:

✅ License found: MIT License (Valid open source OSI approved license)

@editorialbot
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@editorialbot
Collaborator Author

Five most similar historical JOSS papers:

High-performance neural population dynamics modeling enabled by scalable computational infrastructure
Submitting author: @a9p
Handling editor: @emdupre (Active)
Reviewers: @richford, @tachukao
Similarity score: 0.6596

fseval: A Benchmarking Framework for Feature Selection and Feature Ranking Algorithms
Submitting author: @dunnkers
Handling editor: @diehlpk (Active)
Reviewers: @mcasl, @estefaniatalavera
Similarity score: 0.6578

EspressoDB: A scientific database for managing high-performance computing workflows
Submitting author: @ckoerber
Handling editor: @gkthiruvathukal (Active)
Reviewers: @remram44, @ixjlyons
Similarity score: 0.6475

SCAS dashboard: A tool to intuitively and interactively analyze Slurm cluster usage
Submitting author: @wathom
Handling editor: @danielskatz (Active)
Reviewers: @aturner-epcc, @phargogh
Similarity score: 0.6428

strucscan: A lightweight Python-based framework for high-throughput material simulation
Submitting author: @thohamm
Handling editor: @ppxasjsm (Active)
Reviewers: @mturiansky, @wcwitt
Similarity score: 0.6416

⚠️ Note to editors: If these papers look like they might be a good match, click through to the review issue for that paper and invite one or more of the authors before considering asking the reviewers of these papers to review again for JOSS.

@danielskatz

👋 @HPicatto - thanks for your submission.

Before we proceed, there are a few items that need updating:

  1. For the affiliations, we don't really need addresses, just institutions and countries.
  2. Your paper is too long. JOSS requests papers that are roughly 250-1000 words, and yours is over twice as long. See the example paper. Perhaps there is material you can remove and either move to the README or elsewhere in the repo, or in the documentation? (For example, the implementation challenges section might fit a traditional paper about the software, but it doesn't really fit a JOSS paper. And the platform comparison part might be moved to the repo.) Feel free to use the command @editorialbot check repository to run some checks, one of which calculates the word count of the paper. editorialbot commands need to be the first entry in a new comment.

Once these changes are made, please ping me and we can get the review started.

@HPicatto

Hi @danielskatz,
We have finished resizing the content in an internal branch, and we moved two sections into .md files. Could you guide us on how we should reference these sections within the main Markdown file so that they are properly linked in the paper? Should we use the final GitHub URL, or is there a specific way that JOSS handles appendices or external references?
Thank you for your help!

@danielskatz

JOSS does not treat these as appendices; they are external references, so you can just link to them by URL and say that they are in the GitHub repo.

@HPicatto

HPicatto commented Oct 4, 2024

Hi @danielskatz, we updated main. Could you please give us new feedback?

@danielskatz

@HPicatto - The commands I'm now going to run to check the length of the paper, check references, and regenerate the paper are all commands you can run too. Note that editorialbot commands need to be the first entry in a new comment.

@danielskatz

@editorialbot check repository

@danielskatz

@editorialbot check references

@danielskatz

@editorialbot generate pdf

@editorialbot
Collaborator Author

Software report:

github.com/AlDanial/cloc v 1.90  T=0.04 s (992.2 files/s, 124633.3 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          31            585            427           3606
Markdown                         3            126              0            442
TOML                             4             27              1            124
TeX                              1              1              0            103
make                             1              6             21             51
JSON                             4              0              0              7
-------------------------------------------------------------------------------
SUM:                            44            745            449           4333
-------------------------------------------------------------------------------

Commit count by author:

    26	geoHeil
     9	Georg Heiler
     8	HPicatto
     2	CI Hotfix
     2	Hernan

@editorialbot
Collaborator Author

Paper file info:

📄 Wordcount for paper.md is 2025

✅ The paper includes a Statement of need section

@editorialbot
Collaborator Author

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

✅ OK DOIs

- 10.1016/j.future.2023.12.026 is OK
- 10.1145/2934664 is OK
- 10.1145/3472883.3486982 is OK
- 10.1145/3514221.3526054 is OK
- 10.1007/s11192-020-03726-9 is OK
- 10.5281/zenodo.7196590 is OK

🟡 SKIP DOIs

- No DOI given, and none found for title: Dagster | Cloud-native Orchestration of Data Pipel...
- No DOI given, and none found for title: Cost efficient alternative to databricks lock-in

❌ MISSING DOIs

- None

❌ INVALID DOIs

- None

@editorialbot
Collaborator Author

License info:

✅ License found: MIT License (Valid open source OSI approved license)

@editorialbot
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@editorialbot
Collaborator Author

Five most similar historical JOSS papers:

High-performance neural population dynamics modeling enabled by scalable computational infrastructure
Submitting author: @a9p
Handling editor: @emdupre (Active)
Reviewers: @richford, @tachukao
Similarity score: 0.6595

fseval: A Benchmarking Framework for Feature Selection and Feature Ranking Algorithms
Submitting author: @dunnkers
Handling editor: @diehlpk (Active)
Reviewers: @mcasl, @estefaniatalavera
Similarity score: 0.6576

EspressoDB: A scientific database for managing high-performance computing workflows
Submitting author: @ckoerber
Handling editor: @gkthiruvathukal (Active)
Reviewers: @remram44, @ixjlyons
Similarity score: 0.6474

SCAS dashboard: A tool to intuitively and interactively analyze Slurm cluster usage
Submitting author: @wathom
Handling editor: @danielskatz (Active)
Reviewers: @aturner-epcc, @phargogh
Similarity score: 0.6427

strucscan: A lightweight Python-based framework for high-throughput material simulation
Submitting author: @thohamm
Handling editor: @ppxasjsm (Active)
Reviewers: @mturiansky, @wcwitt
Similarity score: 0.6414

⚠️ Note to editors: If these papers look like they might be a good match, click through to the review issue for that paper and invite one or more of the authors before considering asking the reviewers of these papers to review again for JOSS.

@danielskatz

@HPicatto - One issue is that the paper still appears to be twice as long as JOSS recommends. As I skim it, the Implementation Challenges section does not really seem to fit JOSS's model, though it would make sense in a paper for another venue about the process of developing the software. That's the only section I would suggest removing, and without it, I would be OK going ahead with the review even if the paper was still a bit long.

Another minor issue is that we don't need your mailing addresses in your affiliations. Institution and country are enough, along with the unit (e.g. department, division, etc.) if you want.

Thanks for the progress!

@HPicatto

@editorialbot check repository

@HPicatto

@editorialbot check references

@HPicatto

Hi, is there any update on this?

@HPicatto

HPicatto commented Dec 6, 2024

Hi @danielskatz, could you tell us what the normal process is for this kind of paper?
Thanks.

@danielskatz

This does seem to be going slower than I would expect.

👋 @HaoZeke - are you able to push this forward to get reviewers and to start the review?

@danielskatz

@editorialbot remind me in 5 days

@editorialbot
Collaborator Author

Reminder set for @danielskatz in 5 days

@HaoZeke
Member

HaoZeke commented Dec 7, 2024

> This does seem to be going slower than I would expect.
>
> 👋 @HaoZeke - are you able to push this forward to get reviewers and to start the review?

Sorry, I'm traveling this weekend, but I will try to solicit some reviewers.

@HaoZeke
Member

HaoZeke commented Dec 7, 2024

@editorialbot assign @abhishektiwari as reviewer.

Thanks for stepping up @abhishektiwari :)

@editorialbot
Collaborator Author

I'm sorry human, I don't understand that. You can see what commands I support by typing:

@editorialbot commands

@HaoZeke
Member

HaoZeke commented Dec 7, 2024

@editorialbot add @abhishektiwari as reviewer

@editorialbot
Collaborator Author

@abhishektiwari added to the reviewers list!

@HaoZeke
Member

HaoZeke commented Dec 7, 2024

Hi @thohamm @wathom 👋 would you be interested in and available to review this JOSS submission? We carry out our checklist-driven reviews here in GitHub issues and follow these guidelines: joss.readthedocs.io/en/latest/review_criteria.html

If not, could you recommend any potential reviewers? I was hoping to have your insights because of your past authorship of related JOSS publications.

@thohamm

thohamm commented Dec 8, 2024

Sorry, but I have no time at the moment.

@editorialbot
Collaborator Author

👋 @danielskatz, please take a look at the state of the submission (this is an automated reminder).

@danielskatz

@HaoZeke - thanks for looking for more reviewers to move this forward. Please continue to do so, so that the review can start soon.

@wathom

wathom commented Dec 12, 2024

Dear @HaoZeke, unfortunately I also have no time at the moment to do the review.

@picattoh

Hi @danielskatz @HaoZeke, is there any way we can help with finding a reviewer?

@danielskatz

@picattoh - you can certainly suggest people to @HaoZeke (see #7267 (comment))

@danielskatz

👋 @HaoZeke - Can you get this going please?

@danielskatz

👋 @picattoh - please do suggest potential reviewers, as mentioned above

@HPicatto

What are the criteria for a reviewer?

@danielskatz

> What's the criteria for the reviewer?

@Midnighter

Hi there,

I saw @geoHeil asking for reviewers on Slack. I'm interested in this topic and would be happy to provide a review. I'm generally experienced with data engineering, have a strong background in Python, and have published with and reviewed for JOSS and pyOpenSci before.

@HaoZeke
Member

HaoZeke commented Jan 18, 2025

@danielskatz and @HPicatto, many apologies for the delay. I had unexpected travel delays. I'll get started right away.

@Midnighter thank you for volunteering! I'll add you as a reviewer, and then we can get started.

@HaoZeke
Member

HaoZeke commented Jan 18, 2025

@editorialbot add @Midnighter as reviewer

@editorialbot
Collaborator Author

@Midnighter added to the reviewers list!

@HaoZeke
Member

HaoZeke commented Jan 18, 2025

@editorialbot start review

@editorialbot
Collaborator Author

OK, I've started the review over in #7695.
