This project demonstrates a simple linear regression model to predict the salary based on years of experience using Python and scikit-learn. The dataset used is Salary_Data.csv
.
Ensure you have the following libraries installed:
- numpy
- pandas
- matplotlib
- scikit-learn
You can install them using pip:
pip install numpy pandas matplotlib scikit-learn
The dataset Salary_Data.csv
should be placed in the same directory as the script. It contains two columns:
YearsExperience
: The number of years of experience.Salary
: The corresponding salary in dollars.
- Clone the repository or download the script and dataset.
- Run the script using a Python environment.
python salary_predictor.py
The plot of the training set shows the relationship between years of experience and salary. The blue points represent the actual data points from the training set, while the red line represents the predictions made by the linear regression model. This line should closely follow the trend of the blue points if the model is well-fitted.
see the file Salary_vs_Experience(Training_set).png
The plot of the test set similarly shows the relationship between years of experience and salary, but for the test data. The blue points represent the actual data points from the test set, and the red line is the same as in the training set visualization. This helps to visually assess how well the model generalizes to new data.
See the file Salary_vs_Experience(Test_set).png
After training the model, you can input a specific number of years of experience, and the script will output the predicted salary for that experience level. This demonstrates the model's ability to make practical predictions based on the learned relationship from the data.
- The dataset used in this project is for educational purposes.
- Thanks to the contributors of scikit-learn, numpy, pandas, and matplotlib for providing excellent libraries for machine learning and data visualization.
Feel free to add or modify any sections to better suit your project needs.