PR4: Write code to test your data after labeling (can use Cleanlab or Deepchecks) #14
nice!
homework_4/pr4/requirements.txt
@@ -0,0 +1 @@
cleanlab
add version
done
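One way to find the exact version to pin is to read it from the environment in which the script was last run. The sketch below is illustrative only: the helper name `pin_line` is hypothetical, and it assumes that `cleanlab` is installed in the active environment (it returns `None` otherwise).

```python
from importlib.metadata import PackageNotFoundError, version


def pin_line(package):
    """Return a 'package==X.Y.Z' requirements line, or None if not installed."""
    try:
        return f"{package}=={version(package)}"
    except PackageNotFoundError:
        # The package is missing from this environment, so nothing can be pinned.
        return None


# Prints the line to paste into requirements.txt, or None if cleanlab is absent.
print(pin_line("cleanlab"))
```

Pinning the version that actually produced the results keeps the duplicate report reproducible if a later cleanlab release changes its issue detection.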
duplicates_df = pd.concat([df.loc[duplicate_issues_idx].reset_index(drop=True),
                           df.loc[duplicate_issues_idx_2].reset_index(drop=True)], axis=1)
duplicates_df.columns = ['Original_Email', 'Original_Category', 'Duplicate_Email', 'Duplicate_Category']
duplicates_df.to_csv('duplicate_issues.csv', index=False)
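The snippet above depends on variables defined elsewhere in the PR. As a rough, self-contained illustration of the pairing step only, the sketch below uses a hypothetical toy DataFrame and hand-written index lists. The column names `email` and `category` and the index values are assumptions, not taken from the PR, and in the real script the index lists would come from cleanlab's duplicate detection.

```python
import pandas as pd

# Hypothetical toy data standing in for the labeled email dataset.
df = pd.DataFrame({
    "email": ["Reset my password", "Invoice attached",
              "reset my password!", "Meeting at 10"],
    "category": ["support", "billing", "support", "calendar"],
})

# Hypothetical index lists: position i in the first list is the "original"
# row, and position i in the second list is its suspected duplicate.
duplicate_issues_idx = [0]
duplicate_issues_idx_2 = [2]

# reset_index(drop=True) makes both halves share the index 0..n-1, so the
# column-wise concat lines up each pair on one row instead of filling with NaN.
duplicates_df = pd.concat(
    [
        df.loc[duplicate_issues_idx].reset_index(drop=True),
        df.loc[duplicate_issues_idx_2].reset_index(drop=True),
    ],
    axis=1,
)
duplicates_df.columns = ["Original_Email", "Original_Category",
                         "Duplicate_Email", "Duplicate_Category"]
print(duplicates_df)
```

Without the two `reset_index(drop=True)` calls, `pd.concat(..., axis=1)` would align on the original row labels (0 and 2 here) and produce two half-empty rows rather than one paired row.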
Nice! Could you add a report to the README? What was cleanlab able to find?
Done
# PR4: Write code for transforming your dataset into a vector format, and utilize VectorDB for ingestion and querying.

# Cleanlab Discoveries
great!