The Dataset
- Satisfaction Level (0-1)
- Last evaluation (0-1)
- Number of projects (integer)
- Average monthly hours (integer)
- Time spent at the company (integer)
- Whether they have had a work accident (0-no, 1-yes)
- Whether they have had a promotion in the last 5 years (0-no, 1-yes)
- Department name (text)
- Salary (text: low, medium, high)
- Whether the employee has left (0-no, 1-yes)
satisfaction | last | number | average | time | work | left | promotion | sales | salary
---|---|---|---|---|---|---|---|---|---
0.38 | 0.53 | 2 | 157 | 3 | 0 | 1 | 0 | 7 | 1
0.8 | 0.86 | 5 | 262 | 6 | 0 | 1 | 0 | 7 | 2
0.11 | 0.88 | 7 | 272 | 4 | 0 | 1 | 0 | 7 | 2
0.72 | 0.87 | 5 | 223 | 5 | 0 | 1 | 0 | 7 | 1
... | ... | ... | ... | ... | ... | ... | ... | ... | ...
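The import step later in this article asks for the zero-based index of the decision column. The following is a minimal sketch, not part of AI-TOOLKIT, that parses the sample rows listed above as tab-separated text and looks up that index. The helper name `load_table` and the inline sample string are assumptions made for illustration; a real export would be read from the downloaded file.

```python
import csv
import io

# Header and sample rows copied from the table above, joined with tabs
# to mimic a '.tsv' export (illustrative data only, not the full dataset).
SAMPLE_TSV = "\n".join([
    "satisfaction\tlast\tnumber\taverage\ttime\twork\tleft\tpromotion\tsales\tsalary",
    "0.38\t0.53\t2\t157\t3\t0\t1\t0\t7\t1",
    "0.8\t0.86\t5\t262\t6\t0\t1\t0\t7\t2",
    "0.11\t0.88\t7\t272\t4\t0\t1\t0\t7\t2",
    "0.72\t0.87\t5\t223\t5\t0\t1\t0\t7\t1",
])


def load_table(text, header_rows=1):
    """Split tab-separated text into the header row and numeric data rows."""
    lines = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header = lines[header_rows - 1]
    rows = [[float(value) for value in line] for line in lines[header_rows:]]
    return header, rows


header, rows = load_table(SAMPLE_TSV)

# Zero-based position of the decision column ('left').
decision_index = header.index("left")
print(decision_index)

# Separate the decision values from the remaining feature columns.
decisions = [row[decision_index] for row in rows]
features = [row[:decision_index] + row[decision_index + 1:] for row in rows]
print(decisions)
```

Counting from zero, the 'left' column sits at position 6, which is the value entered during the database import.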
Training the AI Model
Support Vector Machine (SVM) model
- Create a new AI-TOOLKIT project (Open AI-TOOLKIT Editor + New Project).
- Insert the SVM model template (Insert ML Template + choose Supervised Learning + Support Vector Machine).
- Save the project.
- Download the data (at the end of the article) and change the file extension to ‘.tsv’. Import the data into a new AI-TOOLKIT database (on the DATABASE tab: Import Data Into Database + follow the instructions on the screen). Make sure you indicate the correct number of (non-numerical) header rows and the zero-based index of the decision column (6 in this example, which is the 'left' column). Use ‘hr_data’ as the table name.
- Save the database in the same folder as the project. Use the name ‘hr.sl3’.
- Run the SVM parameter optimization module to find the optimal parameters (SVM Parameter Optimizer on the AI-TOOLKIT tab). You may stop the optimization early once the accuracy is high enough, or skip the optimization and use the values shown below.
- Adjust the SVM model template as shown below (some of the unneeded parameters and comments are not shown). The optimal parameters are filled in.
model:
    id: 'ID-EFnMmvBNWr'
    type: SVM
    path: 'hr.sl3'
    params:
        - svm_type: C_SVC
        - kernel_type: RBF
        - gamma: 15.0
        - C: 281.8
    training:
        - data_id: 'hr_data'
        - dec_id: 'decision'
    test:
        - data_id: 'hr_data'
        - dec_id: 'decision'
    input:
        - data_id: 'input_data'
        - dec_id: 'decision'
    output:
        - data_id: 'output_data'
        - col_id: 'decision'
- Save the project.
- Train the AI model (AI-TOOLKIT tab).
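To give a feel for what the `kernel_type: RBF` and `gamma: 15.0` settings in the template mean, here is a minimal sketch of the radial basis function kernel in its usual form, exp(-gamma * ||x - y||^2). It assumes that AI-TOOLKIT follows this common convention, which the article does not state; the function name `rbf_kernel` is illustrative and not part of the tool.

```python
import math


def rbf_kernel(x, y, gamma=15.0):
    """RBF kernel exp(-gamma * ||x - y||^2); gamma defaults to the template value."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two (satisfaction, last evaluation) pairs taken from the sample rows of the dataset.
row_a = [0.38, 0.53]
row_b = [0.80, 0.86]

print(rbf_kernel(row_a, row_a))  # identical points give 1.0
print(rbf_kernel(row_a, row_b))  # between 0 and 1, shrinking as the points move apart
```

With a large gamma such as 15.0 the similarity drops off quickly with distance, so each training point influences only its close neighbourhood. In a C-SVC model the `C` parameter is generally the penalty for misclassified training points, so a large value such as 281.8 favours fitting the training data closely.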
Performance Evaluation Results
Confusion Matrix [predicted x original] (number of classes: 2):
predicted x original | (0) | (1)
---|---|---
(0) | 11427 | 0
(1) | 1 | 3571

Accuracy 99.99%, Error 0.01%, C.Kappa 99.98%

Metric | (0) | (1)
---|---|---
Precision | 100.00% | 99.97%
Recall | 99.99% | 100.00%
FNR | 0.01% | 0.00%
F1 | 100.00% | 99.99%
TNR | 100.00% | 99.99%
FPR | 0.00% | 0.01%
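The reported figures can be rechecked from the four confusion matrix counts alone. The sketch below uses the standard definitions of accuracy, precision, recall, F1 and Cohen's kappa, with rows as predicted classes and columns as original classes, as the matrix label indicates. The function names are illustrative, and whether AI-TOOLKIT computes the values in exactly this way is an assumption.

```python
# Confusion matrix from the results above, indexed as matrix[predicted][original].
matrix = [
    [11427, 0],
    [1, 3571],
]


def accuracy(m):
    """Share of all records that lie on the diagonal."""
    total = sum(sum(row) for row in m)
    return sum(m[k][k] for k in range(len(m))) / total


def precision(m, k):
    """Correct predictions of class k divided by all predictions of class k (row sum)."""
    return m[k][k] / sum(m[k])


def recall(m, k):
    """Correct predictions of class k divided by all original members of class k (column sum)."""
    return m[k][k] / sum(row[k] for row in m)


def f1(m, k):
    """Harmonic mean of precision and recall for class k."""
    p, r = precision(m, k), recall(m, k)
    return 2 * p * r / (p + r)


def cohen_kappa(m):
    """Agreement between predicted and original classes, corrected for chance agreement."""
    total = sum(sum(row) for row in m)
    observed = accuracy(m)
    expected = sum(
        sum(m[k]) * sum(row[k] for row in m) for k in range(len(m))
    ) / total ** 2
    return (observed - expected) / (1 - expected)


print(f"Accuracy {accuracy(matrix):.2%}, C.Kappa {cohen_kappa(matrix):.2%}")
for k in range(len(matrix)):
    print(f"({k}) Precision {precision(matrix, k):.2%}, "
          f"Recall {recall(matrix, k):.2%}, F1 {f1(matrix, k):.2%}")
```

The single off-diagonal count is one record of original class 0 that was predicted as class 1, which is why only the recall of class 0 and the precision of class 1 fall below 100%.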
DeepAI Educational Neural Network Model
- Number of iterations: 10
- Learning rate: 0.01
- Regularization rate: 0.001
- Batch size: 10
- Activation Function: TANH
- Regularization Function: NONE
- Test data %: 10
- Treat data as X-Y Classification / Regression
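As a rough illustration of how the test data percentage and batch size settings interact, the sketch below holds out 10% of a small stand-in dataset and counts the mini-batches per iteration. The function names, the dictionary keys, the shuffling and the rounding are all assumptions made for illustration; the article does not describe how DeepAI Educational performs the split internally.

```python
import math
import random

# Values copied from the settings list above; the key names are illustrative only.
SETTINGS = {
    "iterations": 10,
    "learning_rate": 0.01,
    "batch_size": 10,
    "test_fraction": 0.10,
}


def split_train_test(rows, test_fraction, seed=0):
    """Shuffle a copy of the rows and hold out the requested fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    test_size = round(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


def batches(rows, batch_size):
    """Yield consecutive mini-batches; the last one may be smaller."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


rows = list(range(100))  # stand-in for 100 data records
train, test = split_train_test(rows, SETTINGS["test_fraction"])
updates_per_iteration = math.ceil(len(train) / SETTINGS["batch_size"])

print(len(train), len(test), updates_per_iteration)
```

For 100 stand-in records this leaves 90 for training and 10 for testing, so each iteration performs 9 weight updates of 10 records each.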
References
- The Application of Artificial Intelligence, Zoltan Somogyi.
- HR Analytics Dataset: Attribution-Share Alike 4.0 International (CC BY-SA 4.0) license, Source: https://www.kaggle.com/ludobenistant/hr-analytics.
You can download the dataset in MS Excel format here: HR_COMMA_SEP_U.XLS
Learn about the application of Artificial Intelligence and Machine Learning from the book "The Application of Artificial Intelligence | Step-by-Step Guide from Beginner to Expert", Springer 2020 (~400 pages, ISBN 978-3-030-60031-0). The book offers a unique, understandable view of machine learning through many practical examples and introduces AI-TOOLKIT, freely available software that allows the reader to test and study the examples in the book. No programming or scripting skills are needed. It is suitable for self-study by professionals and is also useful as a supplementary resource for advanced undergraduate and graduate courses on AI. More information can be found on the Springer website.