ARTIFICIAL INTELLIGENCE

TOOLKIT

The Application of AI made Easy!

Build and Apply Machine Learning Without Any Programming!
Supervised, Unsupervised & Reinforcement Learning.

MS Windows & Open Source Software
FREE FOR NON-COMMERCIAL USE!



Part of The Intelligent Enterprise Group


AI Toolkit







Decision AI Professional
Software Toolkit for building and using state of the art Machine Learning models (easy Training, Testing and Inference) and for building Intelligent Systems (several AI models working together). Supervised Learning + Unsupervised Learning + Reinforcement Learning. Several built-in Tools and Apps for editing and transforming audio, images, large text files, Face Recognition, Speaker Recognition, Fingerprint Recognition, etc.
VoiceData
VoiceData can be used for generating data for training Automatic Speech Recognition (ASR) models in many languages. The generated data includes both the transcription files and the synchronized audio (the input text is read by a machine-trained, very human-sounding synthesized voice; male or female). + Text Normalization + Text Recognition.

DocumentSummary
Can be used to create a short summary of any text document, such as simple text, PDF files, HTML files, etc., on your computer or on the internet. Uses Artificial Intelligence (AI) powered language models. Able to take into account specialized words specific to your discipline (law, medicine, chemistry, etc.).


VectorML
Bitmap to vector (svg) conversion (machine learning) and fast svg view with presentation mode. Combined GPU and CPU acceleration.




FacilityNetworkML
Process design aided by machine learning. Define your connected facilities (departments, work cells, service stations, etc.) and the software will guide you in sizing your network (number of servers/employees, waiting time, queues, etc.).



Open Source Software


VoiceBridge
VoiceBridge is an Open Source state of the art Speech Recognition C++ Toolkit







Knowledge




Learn about the application of Artificial Intelligence and Machine Learning from the book "The Application of Artificial Intelligence | Step-by-Step Guide from Beginner to Expert", Springer 2020 (~400 pages) (ISBN 978-3-030-60031-0). Unique, understandable view of machine learning using many practical examples. Introduces AI-TOOLKIT, freely available software that allows the reader to test and study the examples in the book. No programming or scripting skills needed! Suitable for self-study by professionals, also useful as a supplementary resource for advanced undergraduate and graduate courses on AI. More information can be found at the Springer website: Springer book: The Application of Artificial Intelligence.


@book{Somogyi_2021, doi = {10.1007/978-3-030-60032-7}, url = {https://doi.org/10.1007%2F978-3-030-60032-7}, year = 2021, publisher = {Springer International Publishing}, author = {Zolt{\'{a}}n Somogyi}, title = {The Application of Artificial Intelligence} }
The Application of Artificial Intelligence | Step-by-Step Guide from Beginner to Expert

AI in Detecting Diseases

The breast cancer diagnosis process is complex and unpleasant for patients. This example presents a possible improvement of this process using machine learning (ML).

The patient goes through several process steps. In one of them a digitized image of a breast mass is created and analyzed by computer, and the so-called cell nucleus characteristics are measured and recorded. By collecting these characteristics for many patients who do or do not have cancer and feeding the data to an ML model, the model can learn which characteristics indicate cancer. The necessary ML training data attributes are chosen by specialists and computed from a digitized image of a fine needle aspirate (FNA) of a breast mass.

Building and using an ML model in the decision process not only decreases process time significantly but can also make the process more reliable (depending, of course, on the accuracy of the ML model) because it reduces the chance of human error. Another advantage is that the input data can be fed to the ML model automatically, which eliminates a very time-consuming manual process step.

The input data

Each record contains a series of attributes and the final diagnosis whether the patient with these attributes has cancer (malignant tumor) or not. The aim is to collect all possible combinations of the attributes in a way that the ML model can be trained well and that it then can decide very accurately whether the patient has breast cancer or not.

Two digitized images with the cell nucleus present are shown below.

The different attributes in the data are as follows:
  • Column 1
    • Diagnosis (Malignant=1, Benign=2)
  • Columns 2-31
    • Ten real-valued features are computed for each cell nucleus. The mean, standard error, and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features:
      • Radius (mean of distances from center to points on the perimeter)
      • Texture (standard deviation of gray-scale values)
      • Perimeter
      • Area
      • Smoothness (local variation in radius lengths)
      • Compactness (perimeter^2 / area - 1.0)
      • Concavity (severity of concave portions of the contour)
      • Concave points (number of concave portions of the contour)
      • Symmetry
      • Fractal dimension ("coastline approximation" - 1)
Figure: digitized images with the cell nucleus present, source [2].
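The column layout described above (one diagnosis column followed by ten measurements times three statistics) can be sketched as follows. The column names and their ordering are hypothetical and chosen only for illustration; the data file itself may name or order the columns differently.

```python
# Base measurements, in the order they are listed in the article.
BASE_FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave_points", "symmetry",
    "fractal_dimension",
]
# The three statistics computed per image for every base measurement.
STATISTICS = ["mean", "se", "worst"]


def build_column_names():
    """Return the decision column followed by the 30 feature columns.

    The naming scheme (e.g. 'radius_mean') is an assumption for this sketch.
    """
    features = [f"{base}_{stat}" for stat in STATISTICS for base in BASE_FEATURES]
    return ["diagnosis"] + features


columns = build_column_names()
print(len(columns))  # 31: column 1 is the diagnosis, columns 2-31 are features
```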

The data file (can be downloaded at the end of the article) has a simple tab separated format. In order to use the data in the AI-TOOLKIT we need to change the extension of the data file to ‘.TSV’ (the AI-TOOLKIT expects this extension for tab delimited data files).

To use the fully numerical ML model, all attributes must be converted to numerical values. In our case there is only one non-numerical attribute: the decision variable, which is the diagnosis of whether the patient has breast cancer or not. The two possible values can simply be converted to Malignant=1, Benign=2. The AI-TOOLKIT can do this conversion automatically for you while importing the data (select the ‘Automatically Convert Categorical or Text values’ option) or you can just do a text replace in a text editor.
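For readers who prefer to prepare the file outside the AI-TOOLKIT, the conversion can be sketched in a few lines of Python. This is only an illustration: the label spellings 'Malignant' and 'Benign', the position of the decision column and the sample values are assumptions, not taken from the data file.

```python
import csv
import io

# Mapping used in the article: Malignant=1, Benign=2.
DIAGNOSIS_CODES = {"Malignant": "1", "Benign": "2"}


def encode_diagnosis(rows, decision_index=0):
    """Replace the textual diagnosis in each row with its numeric code."""
    encoded = []
    for row in rows:
        new_row = list(row)
        new_row[decision_index] = DIAGNOSIS_CODES[new_row[decision_index]]
        encoded.append(new_row)
    return encoded


def to_tsv(rows):
    """Serialize rows as tab separated text (the content of a .tsv file)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative rows only (diagnosis followed by two made-up feature values).
sample = [["Malignant", "14.2", "20.1"], ["Benign", "11.5", "17.3"]]
print(to_tsv(encode_diagnosis(sample)))
```

The resulting text can be written to a file with the ‘.tsv’ extension before importing it.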

After preparing the input data in the appropriate format (tab separated values) the type of the ML model must be chosen. Let us choose an SVM model for this example.

First, in case of an SVM model the ML model parameters need to be optimized. This can be done automatically by the AI-TOOLKIT by using the built-in SVM Parameter Optimization module. The AI-TOOLKIT will report the best parameter combination for the input data which then can be filled in as follows:
model:
    id: 'ID-WFcqHlreYm'
    type: SVM
    path: 'wdbc.sl3'
    params:
        - svm_type: C_SVC 
        - kernel_type: RBF 
        - gamma: 15.0 
        - C: 1.779 
    training: 
        - data_id: 'wdbc' 
        - dec_id: 'decision' 
    test: 
        - data_id: 'wdbc_t' 
        - dec_id: 'decision'
    input: 
        - data_id: 'input_data' 
        - dec_id: 'decision'
    output:
        - data_id: 'output_data'
        - col_id: 'decision'
After importing the data, defining the data table names (wdbc and wdbc_t) and entering the optimal model parameters the ML model can be trained. 
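To give a feeling for what the gamma parameter does, here is a minimal sketch of the conventional Gaussian RBF kernel, exp(-gamma * ||x - y||^2). It assumes that the RBF kernel type in the project file refers to this standard definition; the article does not describe the AI-TOOLKIT internals, and the two input points are made up.

```python
import math


def rbf_kernel(x, y, gamma):
    """Conventional RBF kernel: similarity decays with squared distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical, already scaled feature vectors.
a = [0.10, 0.20]
b = [0.15, 0.25]

# A larger gamma makes the similarity fall off faster with distance,
# so each training record influences a smaller neighbourhood.
for gamma in (0.5, 15.0):
    print(gamma, rbf_kernel(a, b, gamma))
```

The C parameter is not part of the kernel; in a C-SVC model it controls how strongly misclassified training records are penalized.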

When the ML model has finished training, the AI-TOOLKIT reports the accuracy of the model on the training dataset:

Performance Evaluation Results: TRAINING

Confusion Matrix [predicted x original] (number of classes: 2):

        (0)     (1)
(0)     199     0
(1)     0       342

Accuracy    100.00%
Error       0.00%
C.Kappa     100.00%

            (0)        (1)
Precision   100.00%    100.00%
Recall      100.00%    100.00%
FNR         0.00%      0.00%
F1          100.00%    100.00%
TNR         100.00%    100.00%
FPR         0.00%      0.00%

The ML model correctly predicts whether the patient has breast cancer in all of the training cases. Do not forget, however, that the model still needs to be tested with an appropriate number of data records (attribute sets) unseen during training, in order to make sure that the ML model has learned enough about the phenomenon and that it generalizes well!
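Holding back unseen records for such a test can be sketched as a simple random split. This is a plain shuffle under stated assumptions: the function name, the seed and the 5% default are illustrative, 569 is the number of records in the Wisconsin diagnostic data set, and the AI-TOOLKIT may select its test records differently.

```python
import random


def holdout_split(records, test_fraction=0.05, seed=0):
    """Shuffle the records and set aside a fraction as unseen test data."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    # The first n_test shuffled records become the test set.
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder records: only their count matters for this sketch.
records = list(range(569))
train, test = holdout_split(records)
print(len(train), len(test))  # 541 28
```

With so few test records, a stratified split that keeps the class proportions equal in both parts would be a reasonable refinement.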

If we use 5% of the input data for testing (removing it from the training data) and let the AI-TOOLKIT test the trained ML model with this test data, we get the following results:

Performance Evaluation Results: TEST

Confusion Matrix [predicted x original] (number of classes: 2):

        (0)     (1)
(0)     12      1
(1)     1       14

Accuracy    92.86%
Error       7.14%
C.Kappa     85.64%

            (0)       (1)
Precision   92.31%    93.33%
Recall      92.31%    93.33%
FNR         7.69%     6.67%
F1          92.31%    93.33%
TNR         93.33%    92.31%
FPR         6.67%     7.69%

The test results are worse than the training results, but the trained ML model still predicts 26 of the 28 cases correctly, which is a very good result considering that the data is new to the model! The ML model makes 1 mistake by incorrectly predicting cancer and 1 mistake by predicting no cancer when there is cancer. Incorrectly predicting cancer is the less severe mistake because the diagnosis can still be checked by a medical doctor, but predicting no cancer when there is cancer should be eliminated! This is an important aspect of ML model evaluation in the healthcare sector: not all mistakes have the same weight!
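The reported test figures can be reproduced from the confusion matrix with the standard definitions of accuracy, precision, recall and Cohen's kappa. The sketch below assumes that 'C.Kappa' denotes Cohen's kappa and that rows are predicted classes and columns are original classes, as the '[predicted x original]' label suggests; the function name is hypothetical.

```python
def evaluate(matrix):
    """matrix[p][o] = records predicted as class p whose original class is o."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    predicted_totals = [sum(row) for row in matrix]
    original_totals = [sum(matrix[p][o] for p in range(n)) for o in range(n)]
    correct = sum(matrix[c][c] for c in range(n))

    accuracy = correct / total
    # Agreement expected by chance, from the row and column totals.
    expected = sum(p * o for p, o in zip(predicted_totals, original_totals)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    precision = [matrix[c][c] / predicted_totals[c] for c in range(n)]
    recall = [matrix[c][c] / original_totals[c] for c in range(n)]
    return {"accuracy": accuracy, "kappa": kappa,
            "precision": precision, "recall": recall}


# Test confusion matrix from the article.
test_matrix = [[12, 1], [1, 14]]
result = evaluate(test_matrix)
print(f"accuracy {result['accuracy']:.2%}, kappa {result['kappa']:.2%}")
# accuracy 92.86%, kappa 85.64%
```

The false negative rate per class is simply 1 minus the recall, which is the figure to watch when missing a cancer case is the most costly mistake.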

The above SVM model can still be improved by adding more data and/or changing the input features. It is of course also possible to choose another ML model, e.g., a neural network model.

The extended performance evaluation results of the AI-TOOLKIT allow us to make a thorough analysis of the performance of the ML model, but this is left as an exercise for the reader.

The trained ML model can be used to make important decisions. The input data could be fed to the ML model automatically and the results could also be collected automatically. The ML algorithm could even be integrated into digital devices to make a fully automatic analysis possible.

Conclusion

As we have seen above an ML model can be very useful in the improvement of business processes. The techniques explained in this article can be used not only in the healthcare sector but in many other sectors too! There are two important considerations while using an ML model:
  1. The attributes and the data records (attribute sets) used to train the ML model are very important. The capabilities of the ML model will depend on the data it gets for learning a specific phenomenon. You can of course add more data and/or attributes and re-train the model. Not only the amount of input data but the selection of the right attributes (features) is also very important.
  2. Extensive testing of the ML model is very important in order to make sure that it is trained well in all aspects of the studied phenomenon and that it generalizes well (performs well on input data unseen during training).

References

  1. The Application of Artificial Intelligence, Zoltan Somogyi.
  2. Breast Cancer Wisconsin (Diagnostic) Data Set: Dr. William H. Wolberg, General Surgery Dept. University of Wisconsin, Clinical Sciences Center Madison, WI 53792. You can download the dataset here: Breast Cancer Diagnosis data set.


AI in Healthcare process improvement

The main aim of this simple example is to demonstrate the applications of machine learning in business process improvement in the healthcare sector. Many of the principles and ideas applied in this case study are also applicable in other sectors!

We will try to improve the post-operative patient care process in a hospital. After an operation, according to the current post-operative patient care process, patients need to be examined by a medical doctor to determine where the patients should be sent from the postoperative recovery area. The possibilities are the following:
  • The patient may go home,
  • The patient needs to go to the general care hospital floor (GC),
  • The patient needs to be transferred to intensive care (IC).
In order to improve this process (make it much faster and more reliable) the hospital needs to collect, for many patients, all the data needed to make this decision and then use this data to train a machine learning model. After the machine learning model is successfully trained, a hospital employee (e.g. a nurse) can simply feed the specific patient data to the model and the model will tell instantly (inference) what should happen with the patient. The improved process has several advantages: it is much faster because the waiting time for the medical doctor is eliminated; in many cases the decision is more reliable because the machine learning model does not get tired or confused by external factors; the medical doctor or specialist can do other important things; and, last but not least, the patient will be more satisfied with the faster process! These are several important reasons to implement such a process improvement!

The dataset can be seen below (can be downloaded at the end of the article):

L-CORE  L-SURF  L-O2       L-BP  SURF-STBL  CORE-STBL  BP-STBL     COMFORT  DECISION
mid     low     excellent  mid   stable     stable     stable      15       GC
mid     high    excellent  high  stable     stable     stable      10       Home
high    low     excellent  high  stable     stable     mod-stable  10       GC
mid     low     good       high  stable     unstable   mod-stable  15       GC
mid     mid     excellent  high  stable     stable     stable      10       GC
high    low     good       mid   stable     stable     unstable    15       Home
mid     low     excellent  high  stable     stable     mod-stable  5        Home


Most of the numerical attributes, such as the temperature, are grouped and converted to textual categories. This is one of the tricks which can be used while training a machine learning model. You may of course use numerical values but then you will give more freedom to the model! In many situations it is sufficient to group the data into well-chosen textual categories. The AI-TOOLKIT can handle both categorical and numerical attributes.

The collected attributes and groupings are the following:
  • L-CORE: the patient's internal temperature: high > 37°, 37° ≥ mid ≥ 36°, low < 36°.
  • L-SURF: the patient's surface temperature: high > 36.5°, 36.5° ≥ mid ≥ 35°, low < 35°.
  • L-O2: the oxygen saturation: excellent ≥ 98%, 98% > good ≥ 90%, 90% > fair ≥ 80%, poor < 80%.
  • L-BP: the last measurement of blood pressure: high > 130/90, 130/90 ≥ mid ≥ 90/70, low < 90/70.
  • SURF-STBL: the stability of patient's surface temperature: stable, mod-stable and unstable.
  • CORE-STBL: the stability of patient's core temperature: stable, mod-stable and unstable.
  • BP-STBL: the stability of patient's blood pressure: stable, mod-stable and unstable.
  • COMFORT: the patient's perceived comfort at discharge, measured as an integer between 0 and 20.
  • DECISION: the discharge decision: Home: the patient needs to be prepared to go Home, GC: the patient must be sent to the General Care hospital floor. IC: the patient must be sent to Intensive Care.
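The grouping of numerical measurements into categories can be sketched as follows for two of the attributes, using the thresholds listed above. The function names are hypothetical, and reading the temperature in degrees Celsius is an assumption, since the list gives degrees without a unit.

```python
def core_temperature_group(degrees):
    """L-CORE: high > 37, 37 >= mid >= 36, low < 36 (assumed Celsius)."""
    if degrees > 37.0:
        return "high"
    if degrees >= 36.0:
        return "mid"
    return "low"


def oxygen_saturation_group(percent):
    """L-O2: excellent >= 98, good >= 90, fair >= 80, otherwise poor."""
    if percent >= 98:
        return "excellent"
    if percent >= 90:
        return "good"
    if percent >= 80:
        return "fair"
    return "poor"


print(core_temperature_group(36.4), oxygen_saturation_group(97))  # mid good
```

Checking the boundary values explicitly (for example 37.0 and 98) is worthwhile, because an off-by-one category at a threshold silently changes the training data.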
There is a huge imbalance in the data. There are 64 cases with a ‘GC’ decision, 24 cases with a ‘Home’ decision and only 2 cases with an ‘IC’ decision (these are the class totals that also appear in the confusion matrix of the evaluation results). The dataset is also very small with 90 records. We will need to take the imbalance into account while evaluating the machine learning model, or we could resample the data and remove the imbalance. We will try both in this example.

The first step is to create an AI-TOOLKIT database with the “Create New AI-TOOLKIT Database” command on the Database tab on the left taskbar. Save the database in a directory of your choice. The second step is to import all data into the database created in the former step with the “Import Data into Database” command. Do not forget to indicate the number of header rows (if any) and the correct zero based index of the decision column! We also have to select the ‘Conversion/Resampling’ option for automatic conversion of the categorical values and, if we want to resample the dataset, the option for resampling for imbalance reduction as well.

In this example we will import the dataset two times, first without resampling and then with resampling, in order to be able to test both cases. Both datasets can be imported into the same AI-TOOLKIT database but with different names. For the categorical conversion we use the default option, which is integer encoding. While resampling the dataset we must also set the ‘Majority Limit’ option to 64 (the number of cases in the majority class) in order to have 64 resampled cases for all three classes (Home, GC and IC).
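The effect of resampling up to a majority limit can be illustrated with plain random oversampling with replacement. This is a swapped-in, generic technique used only as a sketch; the article does not state which resampling algorithm the AI-TOOLKIT applies. The class labels below are placeholders and only the class sizes (64, 24 and 2) come from the article.

```python
import random
from collections import defaultdict


def oversample(records, label_of, majority_limit, seed=0):
    """Randomly duplicate records until every class has majority_limit cases."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for record in records:
        by_class[label_of(record)].append(record)

    resampled = []
    for members in by_class.values():
        resampled.extend(members[:majority_limit])
        shortfall = majority_limit - len(members)
        if shortfall > 0:
            # Draw the missing cases from the existing ones, with replacement.
            resampled.extend(rng.choices(members, k=shortfall))
    return resampled


# Placeholder records: (class label, record number).
class_sizes = {"largest": 64, "middle": 24, "smallest": 2}
records = [(label, i) for label, size in class_sizes.items() for i in range(size)]

balanced = oversample(records, label_of=lambda r: r[0], majority_limit=64)
print(len(balanced))  # 192 = 3 classes x 64 cases
```

Note that a class with only 2 original cases is repeated many times, so the resampled data contains no new information about that class; it only changes how much weight the class receives during training.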

Next we must create the AI-TOOLKIT project file. Use the “Open AI-TOOLKIT Editor” command and then insert the chosen model template with the “Insert ML Template” button. We will train a simple neural network classification model with three internal layers. The project file and the optimal parameters are as follows:
model:
    id: 'ID-BBEMTNTNEY'
    type: FFNN1_C 
    path: 'postoperative.sl3'
    params:
        - layers:
            - Linear: 
            - TanHLayer: 
                nodes: 100
            - Linear: 
            - TanHLayer: 
                nodes: 80
            - Linear: 
            - TanHLayer: 
                nodes: 40
            - Linear: 
        - iterations_per_cycle: 1000
        - num_cycles: 10
        - step_size: 5e-5
        - batch_size: 1
        - optimizer: SGD_ADAM 
        - stop_tolerance: 1e-5
        - sarah_gamma: 0.125
    training: 
        - data_id: 'postoperative' 
        - dec_id: 'decision'
    test: 
        - data_id: 'postoperative' 
        - dec_id: 'decision'
    input: 
        - data_id: 'po_input_data' 
        - dec_id: 'decision'
    output:
        - data_id: 'po_output_data'
        - col_id: 'decision'
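To make the layer structure of the project file more concrete, the sketch below runs a forward pass through a network of the same shape: linear layers followed by tanh activations with 100, 80 and 40 nodes, and a final linear layer. It rests on several assumptions that the article does not confirm: that 'nodes' sets the width of each hidden layer, that the 8 encoded attributes are the inputs, and that the last layer produces one score per class. The weights are random and untrained, and the encoded patient is hypothetical.

```python
import math
import random

# Inputs (8 attributes), three hidden layers, outputs (3 decisions).
LAYER_SIZES = [8, 100, 80, 40, 3]


def init_network(sizes, seed=0):
    """Create one (weights, biases) pair per linear layer with random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, inputs):
    """Apply Linear -> tanh for the hidden layers and a final Linear layer."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        activations = [sum(w * a for w, a in zip(row, activations)) + b
                       for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            activations = [math.tanh(v) for v in activations]
    return activations


network = init_network(LAYER_SIZES)
encoded_patient = [1, 0, 2, 1, 0, 0, 0, 15]  # hypothetical integer-encoded record
scores = forward(network, encoded_patient)
predicted_class = scores.index(max(scores))  # index of the highest class score
print(len(scores), predicted_class)
```

Training adjusts the weights so that the highest score corresponds to the correct decision; the optimizer settings in the project file control how that adjustment is carried out.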
The training process and the extended evaluation results with the original (not resampled) dataset are as follows:
AI Training... (Model ID: ID-BBEMTNTNEY).
0 - training accuracy = 71.11 % (Model ID: ID-BBEMTNTNEY) (training time: 0m 2s ).
1 - training accuracy = 81.11 % (Model ID: ID-BBEMTNTNEY) (training time: 0m 0s ).
2 - training accuracy = 90.00 % (Model ID: ID-BBEMTNTNEY) (training time: 0m 0s ).
3 - training accuracy = 91.11 % (Model ID: ID-BBEMTNTNEY) (training time: 0m 0s ).
4 - training accuracy = 92.22 % (Model ID: ID-BBEMTNTNEY) (training time: 0m 0s ).
5 - training accuracy = 92.22 % (Model ID: ID-BBEMTNTNEY) (training time: 0m 0s ).
6 - training accuracy = 92.22 % (Model ID: ID-BBEMTNTNEY) (training time: 0m 0s ).
7 - training accuracy = 93.33 % (Model ID: ID-BBEMTNTNEY) (training time: 0m 0s ).
8 - training accuracy = 93.33 % (Model ID: ID-BBEMTNTNEY) (training time: 0m 0s ).
9 - training accuracy = 93.33 % (Model ID: ID-BBEMTNTNEY) (training time: 0m 0s ).
10 - training accuracy = 93.33 % (Model ID: ID-BBEMTNTNEY) (training time: 0m 0s ).

The best model is chosen with training accuracy = 93.33 % (Model ID: ID-BBEMTNTNEY). 

Performance Evaluation Results

Confusion Matrix [predicted x original] (number of classes: 3):

        GC      Home    IC
GC      63      5       0
Home    1       19      0
IC      0       0       2

Accuracy    93.33%
Error       6.67%
C.Kappa     83.46%

            GC        Home      IC
Precision   92.65%    95.00%    100.00%
Recall      98.44%    79.17%    100.00%
FNR         1.56%     20.83%    0.00%
F1          95.45%    86.36%    100.00%
TNR         80.77%    98.48%    100.00%
FPR         19.23%    1.52%     0.00%

Although the dataset is small and imbalanced, the results are quite good. Sending patients to general care when they could go home is less dangerous for the patients, and the doctor can still send them home from general care, so this kind of error is preferable to the others. The model never makes the worst mistake of sending a patient home or to general care when he or she should go to intensive care. This is very important in this case!
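The idea that some errors are preferable to others can be made explicit by weighting each cell of the confusion matrix with a cost. In the sketch below the confusion matrix is the one reported above, while the cost values are purely hypothetical numbers chosen to illustrate the technique; in practice they would have to be set by medical specialists.

```python
LABELS = ["GC", "Home", "IC"]

# Confusion matrix from the article: CONFUSION[predicted][original].
CONFUSION = [
    [63, 5, 0],   # predicted GC
    [1, 19, 0],   # predicted Home
    [0, 0, 2],    # predicted IC
]

# Hypothetical costs: COST[predicted][original]. Under-treating a patient
# (e.g. predicting Home for an IC patient) is given the highest cost.
COST = [
    [0, 1, 10],   # predicted GC
    [5, 0, 20],   # predicted Home
    [1, 2, 0],    # predicted IC
]


def total_cost(confusion, cost):
    """Sum of (number of cases x cost) over all cells of the matrix."""
    return sum(confusion[p][o] * cost[p][o]
               for p in range(len(confusion))
               for o in range(len(confusion)))


errors = sum(CONFUSION[p][o] for p in range(3) for o in range(3) if p != o)
print(errors, total_cost(CONFUSION, COST))  # 6 10
```

Two models with the same number of errors can have very different total costs, which makes this a useful complement to accuracy when comparing models for this kind of decision.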

A simple visual form could be presented to a nurse who must enter the appropriate attributes and push the Ask AI button to get the instant answer from the machine learning model.

For further explanation of the results and for the results with the resampled dataset read the book “The Application of Artificial Intelligence”. 


This article is a slightly modified excerpt from the book “The Application of Artificial Intelligence”. If you are interested in the subject then it is strongly recommended to read the book, which contains many more details and real-world case studies for several sectors and disciplines! The book explains several examples step by step using the AI-TOOLKIT. The book is going through the publishing process at the time of writing this article. You may use the contact form for info about pre-ordering the book.

References

  1. The Application of Artificial Intelligence, Zoltan Somogyi.
  2. Post-Operative Patient Data Set: Sharon Summers, School of Nursing, University of Kansas Medical Center, Kansas City, KS 66160. Linda Woolery, School of Nursing, University of Missouri, Columbia, MO 65211.
    You can download the data here: Post-Operative Patient Care Process Decision AI data set
