First Microsoft.ML steps!

Machine learning is the new kid on the block and has become accessible with the advent of the Microsoft.ML library. In medicine, machine learning hasn’t been used that much; most scientific epidemiological research still relies on established statistical analysis. The core principle, however, is the same: known exposure is used to predict outcome. In Pediatric Intensive Care, prediction of mortality is used to benchmark and monitor performance, and in the Netherlands data is gathered on a national basis for this purpose. This blog will discuss a machine learning setup to analyze a data set with 13,793 PICU admissions.

F# is used to extract the data and feed it to the ML algorithms. The code is written in a regular F# script file, which allows for running and testing the code, or parts of it, in the F# interactive. This process is called a REPL: Read, Evaluate, Print, Loop. It is an extremely efficient way to write code.

The first step is to setup the infrastructure:

  • install a dotnet tool chain : dotnet new tool-manifest
  • install paket for package management : dotnet tool install paket
  • initiate paket: dotnet paket init
  • add Microsoft.ML : dotnet paket add Microsoft.ML
  • generate load scripts : dotnet paket generate-load-scripts

The generate-load-scripts command, no surprise, generates load scripts which can be used in a script file to get access to the needed libraries. There is a caveat, however: the code is not compiled but interpreted. Therefore, a required native runtime library, CpuMathNative, is not copied over to the folder containing the Microsoft.ML assemblies. This has to be done manually in order to get things up and running.
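That manual step can be scripted; below is a hedged sketch (the paths in the comment are placeholders, the real source and target depend on the Microsoft.ML version and the runtime in use):

```fsharp
// Sketch of the manual fix: copy the native CpuMathNative library
// next to the managed Microsoft.ML assemblies so the interactive
// session can resolve it. Only copies when the target is missing.
open System.IO

let copyNative (source : string) (target : string) =
    if File.Exists source && not (File.Exists target) then
        File.Copy(source, target)

// Placeholder paths; these differ per machine and package version:
// copyNative
//     "packages/Microsoft.ML/runtimes/win-x64/native/CpuMathNative.dll"
//     "packages/Microsoft.ML/lib/netstandard2.0/CpuMathNative.dll"
```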

The second step is to create an F# script file and open it using Visual Studio Code or Visual Studio. The script file starts with the load scripts, after which the Microsoft.ML library can be opened.

// Load all dependencies
#load "./.paket/load/netstandard2.0/main.group.fsx"

open System 
open System.IO

open Microsoft.ML
open Microsoft.ML.Data

// Make sure that code uses the current source directory
Environment.CurrentDirectory <- __SOURCE_DIRECTORY__

The third step is to load the data, in this case from a tab delimited text file.

// Randomly order an array by pairing
// each element with a guid, sorting
// on the guid, then returning the
// shuffled elements
let randomOrder xs =
    xs
    |> Array.map (fun x -> Guid.NewGuid (), x)
    |> Array.sortBy fst
    |> Array.map snd

// Get the data from the file
// put this in an array of arrays
// i.e. a table structure. Also make
// sure that data is in random order
let source =
    File.ReadAllLines "Scores.txt"
    |> fun xs ->
        xs
        |> Array.skip 1
        |> randomOrder
        |> Array.append (xs |> Array.take 1)
    |> Array.map (fun r -> r.Split('\t'))

// get table value from a row r with 
// column name c
let getRowColumn c (r : string[]) =
    let i =
        source
        |> Array.head
        |> Array.findIndex ((=) c)
    r.[i]

F# excels at this kind of programming problem. There is also a specific CSV type provider, but reading and getting the data out is trivial anyway, and doing it by hand makes it easy to add features like randomizing the data rows.
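For comparison, a hedged sketch of that CSV type provider route (this assumes the FSharp.Data package is referenced and that Scores.txt is available at compile time; it is not the approach used in the rest of this post):

```fsharp
// Hypothetical alternative using FSharp.Data's CSV type provider;
// the provider infers typed rows from the tab separated sample file.
open FSharp.Data

type Scores = CsvProvider<"Scores.txt", Separators = "\t">

// Rows then come back as typed values instead of string arrays:
let rows = Scores.Load("Scores.txt").Rows
```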

// The type to hold the data
// with the features and the
// label, -> Death
[<CLIMutable>]
type Data =
    {
        Age : Single
        Elective : Single
        SystolicBloodPressure : Single
        Ventilated : Single
        Oxygen : Single
        NoRecovery : Single
        NonNeuroScore : Single
        NeuroScore : Single
        LowRisk : Single
        HighRisk : Single
        VeryHighRisk : Single
        Cancer : Single
        PIM3Score : Single
        Death : bool
    }

From a medical epidemiological view we look at data in terms of exposure and outcome.

Epidemiology is The Study of Diseases in Populations

In ML land, exposure variables are called features and the outcome is called the label. So, the above data record holds the features in the top fields, and the last field, Death, is in fact the label, i.e. the outcome.

Also, note that you can use a regular F# record to hold the data. You do need to add the CLIMutable attribute to make sure the record has an object initializer. You do not need to add the Column attributes you often see in ML code. Transforming the columns to the appropriate data type, however, is something that can be really easily achieved in F#.
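To illustrate with a minimal, hypothetical record (not part of the PICU data set): the attribute lets ML.NET instantiate the record through reflection, while F# code continues to treat it as an ordinary immutable record.

```fsharp
// Minimal sketch: [<CLIMutable>] emits a hidden parameterless
// constructor and property setters (which ML.NET uses through
// reflection) without changing how the record is used from F#.
[<CLIMutable>]
type Sample =
    {
        Age : single
        Death : bool
    }

// From F# the record is still created and read immutably:
let sample = { Age = 365.f; Death = false }
```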

To create the records the following utility functions are used:

// Low-risk diagnosis:
let pimLowRisk =
    [ "Asthma"
      "SeizureDisorder" ]
// High-risk diagnosis:
let pimHighRisk =
    [ "CerebralHemorrhage"
      "HIVPositive"
      "NecrotizingEnterocolitis" ]
// Very high-risk diagnosis:
let pimVeryHighRisk =
    [ "CardiacArrestInHospital"
      "LiverFailure" ]

// Map a specific diagnosis to
// either a low, high or very high risk.
let mapRiskDiagnosis xs x =
    (if xs |> List.exists ((=) x) then 1. else 0.)
    |> single
let mapLowRisk = mapRiskDiagnosis pimLowRisk
let mapHighRisk = mapRiskDiagnosis pimHighRisk
let mapVeryHighRisk = mapRiskDiagnosis pimVeryHighRisk

// Map a string to a value with type Single
let parseSingleWithDefault d (s : string) =
    s
    |> Single.TryParse
    |> function
    | true, x -> x
    | _ -> d

// Map a string to a boolean with type Single
let mapBoolean s2 s1 =
    (if s1 = s2 then 1 else 0)
    |> single
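The two helpers can be tried out directly in the F# interactive; a self-contained sketch (the helper bodies are repeated, with Single.TryParse matched explicitly, so the snippet runs on its own):

```fsharp
open System

// Same helpers as above, repeated so this snippet stands alone.
let parseSingleWithDefault d (s : string) =
    match Single.TryParse s with
    | true, x -> x
    | _ -> d

let mapBoolean s2 s1 =
    (if s1 = s2 then 1 else 0) |> single

// Example inputs and the values the helpers return:
parseSingleWithDefault 120.f "90"   // 90.0f
parseSingleWithDefault 120.f "n/a"  // falls back to 120.0f
mapBoolean "True" "True"            // 1.0f
mapBoolean "True" "False"           // 0.0f
```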

// Create an array of Data records
let data =
    source
    |> Array.filter ((getRowColumn "Age(days)") >> ((=) "") >> not)
    |> Array.skip 1
    |> Array.map (fun r ->
        {
            Age =
                r
                |> getRowColumn "Age(days)"
                |> fun x ->
                    try
                        x |> single
                    with _ -> sprintf "cannot parse %s" x |> failwith
            Elective =
                r
                |> getRowColumn "Urgency"
                |> mapBoolean "Elective"
            SystolicBloodPressure =
                r
                |> getRowColumn "SystolicBP"
                |> parseSingleWithDefault 120.f
            Ventilated =
                r
                |> getRowColumn "Ventilated"
                |> mapBoolean "True"
            Oxygen =
                let o =
                    r
                    |> getRowColumn "PaO2"
                    |> parseSingleWithDefault 0.f
                let f =
                    r
                    |> getRowColumn "FiO2"
                    |> parseSingleWithDefault 1.f
                if o > 0.f then o / f else 0.23f
            NoRecovery =
                r
                |> getRowColumn "Recovery"
                |> mapBoolean "NoRecovery"
            NonNeuroScore =
                r
                |> getRowColumn "PRISM3Score"
                |> parseSingleWithDefault 0.f
            NeuroScore =
                r
                |> getRowColumn "PRISM3Neuro"
                |> parseSingleWithDefault 0.f
            LowRisk =
                r
                |> getRowColumn "RiskDiagnoses"
                |> mapLowRisk
            HighRisk =
                r
                |> getRowColumn "RiskDiagnoses"
                |> mapHighRisk
            VeryHighRisk =
                r
                |> getRowColumn "RiskDiagnoses"
                |> mapVeryHighRisk
            Cancer =
                r
                |> getRowColumn "Cancer"
                |> mapBoolean "True"
            PIM3Score =
                r
                |> getRowColumn "PIM3Score"
                |> parseSingleWithDefault 0.f
            Death =
                r
                |> getRowColumn "Status"
                |> fun x -> x = "Death"
        }
    )

In this case we want a binary prediction, whether the patient will survive a PICU admission or not.

The next step is to divide the data set into a training set and a test set.

// Divide the data into a training set
// and a test set, making sure that
// the training set is balanced, i.e.
// contains an equal number of deaths
// and survivors. Also, the test set
// will not contain any records that
// were included in the training data.
let trainData, testData =
    // get all the cases in the dataset
    let cases =
        data
        |> Array.filter (fun d -> d.Death)
    // calculate the case incidence
    let incidence =
        cases |> Array.length |> float
        |> fun x -> x / (data |> Array.length |> float)
    // create a training set with 80% of cases and
    // keep track of selected cases in selected
    let selected, trainData =
        let selected =
            cases
            |> Array.take (0.8 * (cases |> Array.length |> float) |> int)
        data
        |> Array.filter (fun d -> d.Death |> not)
        |> Array.take (selected |> Array.length)
        |> Array.append selected
        |> fun train -> selected, train
    // pick the not-selected cases
    let notSelected =
        data
        |> Array.filter (fun x -> x.Death)
        |> Array.filter (fun x -> selected |> Array.exists ((=) x) |> not)

    // take a random sample for the test data
    // making sure that it has the right incidence
    data
    |> randomOrder
    |> Array.filter (fun x ->
        x.Death |> not &&
        trainData |> Array.exists ((=) x) |> not)
    |> Array.take (1. / incidence * (notSelected |> Array.length |> float) |> int)
    |> Array.append notSelected
    |> fun test -> trainData, test

There is also a Microsoft.ML method to split the data; however, this will not create a balanced training set, i.e. a training set that contains an equal number of cases and controls. The above code will do that. At the same time, the test data set will be completely independent of the training data set and have the same case incidence as the whole data set.
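For reference, a hedged sketch of that built-in split (assuming the data value from above; TrainTestSplit draws a random fraction, so the training set keeps the skewed incidence instead of being balanced):

```fsharp
// Sketch of the built-in Microsoft.ML split, for comparison:
// a random 80/20 split that does NOT balance deaths vs survivors.
open Microsoft.ML

let context = MLContext()
let dataView = context.Data.LoadFromEnumerable data
let split = context.Data.TrainTestSplit(dataView, testFraction = 0.2)
// split.TrainSet and split.TestSet both keep the original case
// incidence, which is why the manual split above is used instead.
```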

The ML library can generate a metrics object which can be used to print out the model metrics.

let printDataMetrics (trainData : Data seq) (testData : Data seq) =
    printfn "*       Metrics for train and test data      " 
    printfn "*-----------------------------------------------------------"
    printfn "*       Model trained with %i records" (trainData |> Seq.length)
    printfn "*       Containing %i deaths" (trainData |> Seq.filter (fun d -> d.Death) |> Seq.length)
    printfn "*       Model tested with %i records" (testData |> Seq.length)
    printfn "*       Containing %i deaths" (testData |> Seq.filter (fun d -> d.Death) |> Seq.length)
    printfn ""

let printCalibratedMetrics (metrics : CalibratedBinaryClassificationMetrics) =
    printfn "*       Metrics for binary classification model      " 
    printfn "*-----------------------------------------------------------"
    printfn "*       Accuracy: %.3f" metrics.Accuracy
    printfn "*       Area Under Roc Curve: %.3f" metrics.AreaUnderRocCurve
    printfn "*       Area Under PrecisionRecall Curve: %.3f" metrics.AreaUnderPrecisionRecallCurve
    printfn "*       F1 Score: %.3f" metrics.F1Score
    printfn "*       LogLoss: %.3f" metrics.LogLoss
    printfn "*       LogLoss Reduction: %.3f" metrics.LogLossReduction
    printfn "*       Positive Precision: %.3f" metrics.PositivePrecision
    printfn "*       Positive Recall: %.3f" metrics.PositiveRecall
    printfn "*       Negative Precision: %.3f" metrics.NegativePrecision
    printfn "*       Negative Recall: %.3f" metrics.NegativeRecall

The actual calculation is relatively simple:

// Calculate the model using the training data,
// and test data for the metrics. Include the features
// (Data field names) that have to be included in the model.
let calculate trainData testData features =
    let context = MLContext()

    let trainView = context.Data.LoadFromEnumerable trainData
    let testView = context.Data.LoadFromEnumerable testData

    let pipeline =
        let features = features |> Seq.toArray
        EstimatorChain()
            .Append(context.Transforms.Concatenate("Features", features))
            .Append(context.BinaryClassification.Trainers.SdcaLogisticRegression("Death", "Features"))

    let trained = pipeline.Fit(trainView)

    let predicted = trained.Transform(testView)

    let metrics =
        //context.BinaryClassification.EvaluateNonCalibrated(data=predicted, labelColumnName="Death", scoreColumnName="Score")
        context.BinaryClassification.Evaluate(data=predicted, labelColumnName="Death", scoreColumnName="Score")

    printDataMetrics trainData testData

    metrics

The above function takes a training data set, a test data set and a list of features (field names from the Data record). From this, a metrics object is calculated to assess the performance of the generated model.

This code can be directly used from the script file like:

// analyze a features set
let analyze features =
    // Calculate the model; metrics
    // will be printed
    features
    |> calculate trainData testData
    |> fun m ->
        m |> printCalibratedMetrics
        printfn ""
        printfn ""
        printfn "%s" (m.ConfusionMatrix.GetFormattedConfusionTable())

// Calculate the model with all the
// features; metrics will be printed
[ "Age"
  "Elective"
  "SystolicBloodPressure"
  "Ventilated"
  "Oxygen"
  "NoRecovery"
  "NonNeuroScore"
  "NeuroScore"
  "LowRisk"
  "HighRisk"
  "VeryHighRisk"
  "Cancer"
  "PIM3Score" ]
|> analyze

This will print out the following metrics (note that the training data set is balanced, while the test data set represents the actual incidence):

*	Metrics for train and test data      
*	Model trained with 744 records
*	Containing 372 deaths
*	Model tested with 2851 records
*	Containing 93 deaths

*	Metrics for binary classification model      
*	Accuracy: 0.805
*	Area Under Roc Curve: 0.834
*	Area Under PrecisionRecall Curve: 0.252
*	F1 Score: 0.194
*	LogLoss: 0.906
*	LogLoss Reduction: -3.368
*	Positive Precision: 0.112
*	Positive Recall: 0.720
*	Negative Precision: 0.988
*	Negative Recall: 0.807

TEST POSITIVE RATIO:	0.0326 (93.0/(93.0+2758.0))
Confusion table
PREDICTED || positive | negative | Recall
TRUTH     ||======================
 positive ||       67 |       26 | 0.7204
 negative ||      531 |    2,227 | 0.8075
Precision ||   0.1120 |   0.9885 |

These metrics and the confusion table are really clearly described in this blog.
