Preliminary classifier trained on the continuous (numeric) features of the production line data, with a precision of 89.2% (few functional parts are misclassified as failures).
https://siddhanthunnithan.github.io/bosch_plp/
Refer to analysis.txt for notes taken while modelling.
https://www.kaggle.com/c/bosch-production-line-performance
- Matthews correlation coefficient (MCC)
- MCC = ((TP * TN) - (FP * FN))/sqrt((TP+FP) * (TP+FN) * (TN+FP) * (TN+FN))
- Bosch manufactures hundreds of thousands of parts at a high frequency
- defective parts cost as much to manufacture as functional parts
- losses are incurred when defective parts are shipped to and used by the end user, in the form of refunds, free exchanges, and lawsuits
- two possible forms of prediction:
  - right after manufacturing => defective parts are discarded
  - while manufacturing, after particular steps => non-trivial
- 968 numeric features
- 1,183,747 unique parts
- 6,879 recorded failures (the positive class, about 0.58% of parts)
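
To illustrate why MCC is a stricter metric than accuracy on data this imbalanced, here is a minimal sketch that implements the MCC formula above from raw confusion-matrix counts. The function names `mcc` and `all_pass_baseline` are illustrative and not taken from the project code, and returning 0.0 when the denominator is zero is a common convention assumed here, not something stated in the notes. The baseline uses only the part and failure counts listed above.

```python
import math

# Counts taken from the dataset notes above.
TOTAL_PARTS = 1_183_747
FAILURES = 6_879


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: an undefined MCC (empty row or column) counts as 0.
        return 0.0
    return numerator / denominator


def all_pass_baseline() -> tuple[float, float]:
    """Accuracy and MCC of a hypothetical model that never predicts a failure."""
    tp, fp = 0, 0
    fn = FAILURES                # every real failure is missed
    tn = TOTAL_PARTS - FAILURES  # every functional part is classified correctly
    accuracy = (tp + tn) / TOTAL_PARTS
    return accuracy, mcc(tp, tn, fp, fn)


if __name__ == "__main__":
    accuracy, score = all_pass_baseline()
    print(f"all-pass baseline: accuracy={accuracy:.4f}, MCC={score:.4f}")
```

A model that labels every part as functional scores above 99% accuracy on these counts, yet its MCC is 0, which is why accuracy alone says little about how well failures are detected.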