Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added some safeguards when the necessary imports are not available #291

Merged

Conversation

Harthi7
Copy link
Contributor

@Harthi7 Harthi7 commented Oct 22, 2024

Context Issue:

#250 (comment)

Implemented error handling for the import of deepspeed to prevent runtime crashes
when the module is unavailable. This improves the robustness of the application.

Issue: #250

@Harthi7
Copy link
Contributor Author

Harthi7 commented Oct 22, 2024

Hey @RobotSail, I had a rebasing/signing-off problem in my previous branch (#267) so I made a new branch with the requested changes. Please let me know what you think

Copy link
Member

@RobotSail RobotSail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Harthi7
Copy link
Contributor Author

Harthi7 commented Oct 28, 2024

Hello @Maxusmusti, This PR is ready for review please let me know what you think.

@Maxusmusti
Copy link
Contributor

@Harthi7 will take a look today, thanks!

src/instructlab/training/main_ds.py Outdated Show resolved Hide resolved
@Harthi7 Harthi7 force-pushed the feature/add-try-catch-import-to-deepspeed branch from 5d7b743 to b5bc651 Compare October 30, 2024 08:55
@mergify mergify bot added the ci-failure label Oct 30, 2024
abdullah-ibm added 2 commits October 30, 2024 12:01
…cessary imports are not available

Signed-off-by: Harthi7 <[email protected]>
Signed-off-by: abdullah-ibm <[email protected]>
Signed-off-by: abdullah-ibm <[email protected]>
@Harthi7 Harthi7 force-pushed the feature/add-try-catch-import-to-deepspeed branch from e045f2d to a8aab02 Compare October 30, 2024 09:01
@mergify mergify bot added CI/CD Affects CI/CD configuration documentation Improvements or additions to documentation testing Relates to testing dependencies Pull requests that update a dependency file and removed ci-failure labels Oct 30, 2024
@Harthi7 Harthi7 force-pushed the feature/add-try-catch-import-to-deepspeed branch from a8aab02 to 2d3e3cf Compare October 30, 2024 09:14
@Harthi7 Harthi7 requested a review from Maxusmusti October 30, 2024 09:53
@mergify mergify bot added the ci-failure label Oct 30, 2024
@Maxusmusti
Copy link
Contributor

Looks like you are hitting some linting/formatting issues, but otherwise looks good

@mergify mergify bot removed the ci-failure label Nov 1, 2024
@Harthi7
Copy link
Contributor Author

Harthi7 commented Nov 3, 2024

Looks like you are hitting some linting/formatting issues, but otherwise looks good

Fixed, Please take a look and let me know what you think @Maxusmusti

@Maxusmusti
Copy link
Contributor

Re-triggering tests

@mergify mergify bot added the ci-failure label Nov 4, 2024
@Maxusmusti
Copy link
Contributor

Maxusmusti commented Nov 4, 2024

@Harthi7 looks like it is still failing linting and formatting, you can run the following to identify the linting issues and automatically reformat:

python -m tox p -e lint,mypy,ruff

@mergify mergify bot removed the ci-failure label Nov 5, 2024
@JamesKunstle
Copy link
Contributor

@Harthi7 this is a great addition, would you please fix the linting + minor issues?

@mergify mergify bot removed the ci-failure label Nov 7, 2024
@Harthi7 Harthi7 force-pushed the feature/add-try-catch-import-to-deepspeed branch from 85c045b to da624f4 Compare November 7, 2024 10:40
@mergify mergify bot added the ci-failure label Nov 7, 2024
@Harthi7 Harthi7 force-pushed the feature/add-try-catch-import-to-deepspeed branch from da624f4 to 9c03b6b Compare November 7, 2024 11:03
@mergify mergify bot removed the ci-failure label Nov 7, 2024
@Harthi7
Copy link
Contributor Author

Harthi7 commented Nov 7, 2024

I fixed the Pylint/Formatting issue, Please take a look again @Maxusmusti

@Maxusmusti
Copy link
Contributor

@JamesKunstle


# pylint: disable=no-name-in-module
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You separated these try-except blocks because DeepSpeedCPUAdam is only imported for CPU offloading training correct? It'd be appreciated to have this documented since you're repeating code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I believe that is correct as @RobotSail have stated in a previous comment #267 (comment), as for the documentation how should we move forward?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct

Copy link
Contributor

@JamesKunstle JamesKunstle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanting clarification on the import try-except with a comment, otherwise it'll be good to go in.

Signed-off-by: abdullah-ibm <[email protected]>
@Harthi7 Harthi7 force-pushed the feature/add-try-catch-import-to-deepspeed branch from 37c4fa6 to 5417952 Compare November 11, 2024 08:45
@Harthi7 Harthi7 requested a review from JamesKunstle November 11, 2024 08:46
Copy link
Contributor

@JamesKunstle JamesKunstle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this is really helpful. I appreciate your contribution!

@mergify mergify bot removed the one-approval label Nov 11, 2024
@mergify mergify bot merged commit b812604 into instructlab:main Nov 12, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI/CD Affects CI/CD configuration dependencies Pull requests that update a dependency file documentation Improvements or additions to documentation testing Relates to testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants