
Commit

Merge pull request galaxyproject#5225 from anuprulez/update_confusion_matrix

Update confusion matrices
bgruening authored Aug 12, 2024
2 parents 88616d9 + b1f8bea commit 9cc53bc
Showing 4 changed files with 5 additions and 5 deletions.
@@ -184,7 +184,7 @@ We will evaluate the classification by comparing the predicted with the expected
Now we visualize and analyze the classification using the "Plot confusion matrix, precision, recall and ROC and AUC curves" tool.
> <hands-on-title>Check and visualize the classification</hands-on-title>
- > {% tool [Plot confusion matrix, precision, recall and ROC and AUC curves](toolshed.g2.bx.psu.edu/repos/bgruening/plotly_ml_performance_plots/plotly_ml_performance_plots/0.3) %}:
+ > {% tool [Plot confusion matrix, precision, recall and ROC and AUC curves](toolshed.g2.bx.psu.edu/repos/bgruening/plotly_ml_performance_plots/plotly_ml_performance_plots/0.4) %}:
> - {% icon param-file %} *"Select input data file"*: `test_rows_labels_noheader`
> - {% icon param-file %} *"Select predicted data file"*: `LogisticRegression_result`
> - {% icon param-file %} *"Select trained model"*: `LogisticRegression_model`
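For readers who want to check these numbers outside Galaxy, the short Python sketch below computes the same quantities (confusion matrix, precision, recall, ROC AUC) with scikit-learn. It is only an illustration, not the tool's own code, and it assumes that `test_rows_labels_noheader` and `LogisticRegression_result` are tab-separated files whose last column holds the class label.

```python
# Illustrative sketch (not the Galaxy tool's implementation): compute the
# metrics the plotting tool reports from true and predicted labels.
# Assumption: both files are tab-separated, labels in the last column.
import pandas as pd
from sklearn.metrics import confusion_matrix, precision_score, recall_score, roc_auc_score

y_true = pd.read_csv("test_rows_labels_noheader", sep="\t", header=None).iloc[:, -1]
y_pred = pd.read_csv("LogisticRegression_result", sep="\t", header=None).iloc[:, -1]

print(confusion_matrix(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred, average="weighted"))
print("recall:   ", recall_score(y_true, y_pred, average="weighted"))
# ROC AUC is normally computed from predicted probabilities; using hard 0/1
# labels here only gives a rough, binary-case approximation.
print("ROC AUC:  ", roc_auc_score(y_true, y_pred))
```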
@@ -298,7 +298,7 @@ Now, we should evaluate the performance on the test dataset to find out whether
Now we visualize and analyze the classification. As you can see, `NearestNeighbors_result` has a header, so use `test_rows_labels` to compare.
> <hands-on-title>Check and visualize the classification</hands-on-title>
- > {% tool [Plot confusion matrix, precision, recall and ROC and AUC curves](toolshed.g2.bx.psu.edu/repos/bgruening/plotly_ml_performance_plots/plotly_ml_performance_plots/0.3) %}:
+ > {% tool [Plot confusion matrix, precision, recall and ROC and AUC curves](toolshed.g2.bx.psu.edu/repos/bgruening/plotly_ml_performance_plots/plotly_ml_performance_plots/0.4) %}:
> - {% icon param-file %} *"Select input data file"*: `test_rows_labels`
> - {% icon param-file %} *"Select predicted data file"*: `NearestNeighbors_result`
> - {% icon param-file %} *"Select trained model"*: `NearestNeighbors_model`
@@ -369,7 +369,7 @@ Now we will evaluate the performance of the SVM classifier:
Now let's visualize the results:
> <hands-on-title>Check and visualize the classification</hands-on-title>
- > {% tool [Plot confusion matrix, precision, recall and ROC and AUC curves](toolshed.g2.bx.psu.edu/repos/bgruening/plotly_ml_performance_plots/plotly_ml_performance_plots/0.3) %}:
+ > {% tool [Plot confusion matrix, precision, recall and ROC and AUC curves](toolshed.g2.bx.psu.edu/repos/bgruening/plotly_ml_performance_plots/plotly_ml_performance_plots/0.4) %}:
> - {% icon param-file %} *"Select input data file"*: `test_rows_labels`
> - {% icon param-file %} *"Select predicted data file"*: `SVM_result`
> - {% icon param-file %} *"Select trained model"*: `SVM_model`
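The tool's identifier points to plotly for the actual drawing. As a rough, hedged sketch of how such a ROC plot could be produced (not the tool's code), the snippet below draws a ROC curve with plotly from binary labels and per-sample scores; the labels and scores here are synthetic placeholders, not tutorial data.

```python
# Hedged sketch: draw a ROC curve with plotly from binary labels and scores.
# Not the Galaxy tool's code; y_true / y_score below are placeholders.
import numpy as np
import plotly.graph_objects as go
from sklearn.metrics import roc_curve, auc

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)            # placeholder binary labels
y_score = y_true * 0.6 + rng.random(200) * 0.4   # placeholder scores

fpr, tpr, _ = roc_curve(y_true, y_score)
fig = go.Figure(go.Scatter(x=fpr, y=tpr, mode="lines",
                           name=f"ROC (AUC = {auc(fpr, tpr):.2f})"))
# Diagonal reference line for a random classifier.
fig.add_shape(type="line", x0=0, y0=0, x1=1, y1=1, line=dict(dash="dash"))
fig.update_layout(xaxis_title="False positive rate",
                  yaxis_title="True positive rate")
fig.write_html("roc_curve.html")
```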
@@ -408,8 +408,8 @@ One big advantage of random forest is that it can be used for both classification
> What are the advantages of random forest classifier compared with KNN and SVM?
>
> > <solution-title></solution-title>
- > > 1. The overfitting problem will never arise when we use the random forest algorithm in any classification problem.
- > > 2. The same random forest algorithm can be used for both classification and regression task.
+ > > 1. It is resistant to overfitting as it produces results based on multiple decision trees.
+ > > 2. It scales well with large datasets.
> > 3. The random forest algorithm can be used for feature engineering, which means identifying the most important features out of the available features from the training dataset.
> {: .solution}
>
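To make the revised solution points concrete, the hedged scikit-learn sketch below trains a random forest, which averages many decision trees (this is what tempers overfitting) and exposes per-feature importances (the basis of the feature-engineering point). The bundled breast-cancer dataset is used purely as a stand-in; this is not part of the tutorial's Galaxy workflow.

```python
# Illustrative sketch only (not the tutorial's workflow): a random forest
# averages many decision trees, which damps overfitting, and exposes
# per-feature importances that can guide feature selection.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)

print("test accuracy:", forest.score(X_test, y_test))
# Rank the five most important features (useful for feature engineering).
for name, importance in sorted(
        zip(load_breast_cancer().feature_names, forest.feature_importances_),
        key=lambda pair: pair[1], reverse=True)[:5]:
    print(f"{name}: {importance:.3f}")
```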
