F# Neural Networks with FsLab

Neural networks are a very powerful tool, but it is not easy to use all of their power. Now we are one step closer to doing so from F# and .NET. We will delegate model training to R using R Provider. We will also use Deedle (which was announced a few days ago) for handy data manipulation.

Prerequisites:

Learning from Data:

First of all, we need to load the required assemblies into our FSI session. This is pretty easy with FsLab because the package includes a bootstrapping script.

#load "..\packages\FsLab.0.1.4\FsLab.fsx"

The next step is to download and install the missing R packages. For this demo, we need neuralnet for training the neural network model and making predictions, and caret for data visualization.

open RProvider.utils
R.install_packages("MASS")
R.install_packages("pbkrtest")
R.install_packages("lattice")
R.install_packages("Matrix")
R.install_packages("mgcv")
R.install_packages("grid")
R.install_packages("neuralnet")
R.install_packages("caret")
R.install_packages("zoo")

Now we are ready to start working. We need to open the namespaces and load a data set. For this demo, we have chosen the iris data set, a classic that appears in lots of demos.

open Deedle
open RDotNet
open RProvider
open RProvider.``base``
open RProvider.datasets
open RProvider.neuralnet
open RProvider.caret

let iris : Frame<int, string> = R.iris.GetValue()

To better understand what we are going to do, let’s plot this data set. First of all, split the data into two parts: features (Sepal.Length; Sepal.Width; Petal.Length; Petal.Width) and a target variable (Species). After that, plot the features against each other pair by pair (different colors represent different species).

let features =
    iris
    |> Frame.filterCols (fun c _ -> c <> "Species")
    |> Frame.mapColValues (fun c -> c.As<double>())
let targets =
    R.as_factor(iris.Columns.["Species"])

R.featurePlot(x = features, y = targets, plot = "pairs")

[Figure: pairs plot of the iris features, colored by species]

As you can see, our task is not trivial: we have 3 classes instead of 2 (which is not the classic binary situation), and the classes are not clearly separable. Nevertheless, let’s try! First of all, we need to split our data into 2 parts, a training set and a testing set (70% vs 30%). The first part will be sent to the neural network for learning; the second will be used for measuring model quality. Let’s also shuffle the data to keep the evaluation honest.

iris.ReplaceColumn("Species", targets.AsNumeric())
let range = [1..iris.RowCount]
let trainingIdxs : int[] = R.sample(range, iris.RowCount*7/10).GetValue()
let testingIdxs : int[] = R.setdiff(range, trainingIdxs).GetValue()
let trainingSet = iris.Rows.[trainingIdxs]
let testingSet = iris.Rows.[testingIdxs]
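
The same shuffle-and-split idea can be sketched in plain F#, without a round trip to R. This is only an illustration, not the code used in this post: `shuffle` and `splitIndices` are hypothetical helpers, and the fixed seed is an assumption added to make the result reproducible.

```fsharp
open System

/// Fisher-Yates shuffle that returns a shuffled copy of the input array.
let shuffle (rnd: Random) (items: 'a[]) =
    let a = Array.copy items
    for i in a.Length - 1 .. -1 .. 1 do
        let j = rnd.Next(i + 1) // random position with 0 <= j <= i
        let tmp = a.[i]
        a.[i] <- a.[j]
        a.[j] <- tmp
    a

/// Shuffle the 1-based row indices 1..rowCount and split them into a
/// training part (trainPercent of the rows, rounded down) and a testing part.
/// Hypothetical helper, not part of Deedle or RProvider.
let splitIndices (seed: int) (trainPercent: int) (rowCount: int) =
    let shuffled = shuffle (Random(seed)) [| 1 .. rowCount |]
    Array.splitAt (rowCount * trainPercent / 100) shuffled

// The iris data set has 150 rows, so a 70/30 split gives 105 and 45 rows.
let trainIdx, testIdx = splitIndices 42 70 150
printfn "training rows: %d, testing rows: %d" trainIdx.Length testIdx.Length
```

Integer arithmetic is used for the split size, as in the `iris.RowCount*7/10` expression above, so the row counts do not depend on floating-point rounding.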

Now we are ready to train a neural network. All we need to do is provide a formula that specifies the input and the output of our model (“Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width”), provide a data set, and specify the structure of the hidden layers. In the following example, we will train the network with two layers of hidden nodes: the first layer with 3 nodes and the second layer with 2 nodes.

let nn =
    R.neuralnet(
        "Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width",
        data = trainingSet, hidden = R.c(3,2),
        err_fct = "ce", linear_output = true)

// Plot the resulting neural network with coefficients
R.eval(R.parse(text="library(grid)"))
R.plot_nn nn

[Figure: plot of the trained neural network with its coefficients]
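
The formula string passed to the network lists the target on the left of `~` and every feature on the right, joined by `+`. As a small sketch, such a string can be assembled from column names instead of being typed by hand. `buildFormula` is a hypothetical helper introduced here for illustration; it is not part of RProvider or neuralnet.

```fsharp
/// Build an R-style formula string "target ~ f1 + f2 + ...".
/// Hypothetical helper for illustration only.
let buildFormula (target: string) (features: seq<string>) =
    sprintf "%s ~ %s" target (String.concat " + " features)

let formula =
    buildFormula "Species" [ "Sepal.Length"; "Sepal.Width"; "Petal.Length"; "Petal.Width" ]

printfn "%s" formula
// prints: Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width
```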

Cool! See how simple it is. To measure the quality of the classification, we need to split our testing set into features and targets.

let testingFeatures =
    testingSet
    |> Frame.filterCols (fun c _ -> c <> "Species")
    |> Frame.mapColValues (fun c -> c.As<double>())
let testingTargets =
    testingSet.Columns.["Species"].As<int>().Values

To execute the neural network on new data (that is, to apply our classification), we call the R.compute method and pass it the features of the testing data set.

let prediction =
    R.compute(nn, testingFeatures)
        .AsList().["net.result"].AsVector()
    |> Seq.cast<double>
    // Round each real-valued network output to the nearest class number
    |> Seq.map (round >> int)
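
Because the network returns a real number for each row, rounding can occasionally produce a value outside the valid class numbers 1 to 3 (an output of 3.7 rounds to 4, for example). The following sketch shows one possible way to guard against that by clamping. `toClass` is a hypothetical helper and the raw outputs are made-up numbers, not results from the model in this post.

```fsharp
/// Round a raw network output to the nearest class number and clamp it
/// into the valid range 1..classCount. Hypothetical helper for illustration.
let toClass (classCount: int) (output: float) =
    output |> round |> int |> max 1 |> min classCount

// Made-up raw outputs, chosen to include values that round outside 1..3.
let rawOutputs = [ 0.97; 2.08; 2.6; 3.7; 0.2 ]
let labels = rawOutputs |> List.map (toClass 3)

printfn "%A" labels
// prints: [1; 2; 3; 3; 1]
```

Whether clamping is appropriate depends on the task; counting out-of-range predictions as misclassified, as the plain rounding does, is an equally reasonable choice.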

Finally, let’s compare prediction results with testing values:

let misclassified =
    Seq.zip prediction testingTargets
    |> Seq.filter (fun (a,b) -> a<>b)
    |> Seq.length

printfn "Misclassified irises '%d' of '%d'" misclassified (testingSet.RowCount)

If you execute all these steps one by one, you will see that only about 3 of the 45 testing samples are misclassified. That is pretty good quality.
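
A single misclassification count hides which species get confused with each other. As a rough sketch, the same pairs of predicted and actual labels can be turned into an accuracy value and a per-class confusion count. `accuracy` and `confusion` are hypothetical helpers, and the sample labels below are made up for illustration; they are not output from the network above.

```fsharp
/// Share of predictions that match the actual labels.
let accuracy (predicted: seq<int>) (actual: seq<int>) =
    let pairs = Seq.zip predicted actual |> Array.ofSeq
    let hits = pairs |> Array.filter (fun (p, a) -> p = a) |> Array.length
    float hits / float pairs.Length

/// Count how often each (actual, predicted) combination occurs,
/// sorted by the (actual, predicted) pair.
let confusion (predicted: seq<int>) (actual: seq<int>) =
    Seq.zip actual predicted
    |> Seq.countBy id
    |> Seq.sortBy fst
    |> List.ofSeq

// Made-up labels for six samples.
let samplePredicted = [ 1; 1; 2; 3; 3; 2 ]
let sampleActual    = [ 1; 1; 2; 2; 3; 3 ]

printfn "accuracy: %.2f" (accuracy samplePredicted sampleActual)
for ((actualClass, predictedClass), count) in confusion samplePredicted sampleActual do
    printfn "actual %d predicted %d: %d" actualClass predictedClass count
```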

Full script:

#load "..\packages\FsLab.0.1.4\FsLab.fsx"

// You need to install the 'neuralnet' and 'caret' packages if you do not have them
open RProvider.utils
R.install_packages("MASS")
R.install_packages("pbkrtest")
R.install_packages("lattice")
R.install_packages("Matrix")
R.install_packages("mgcv")
R.install_packages("grid")
R.install_packages("neuralnet")
R.install_packages("caret")
R.install_packages("zoo")

open Deedle
open RDotNet
open RProvider
open RProvider.``base``
open RProvider.datasets
open RProvider.neuralnet
open RProvider.caret

// Load data from R to Deedle frame
let iris : Frame<int, string> = R.iris.GetValue()

// Observe iris data set
let features =
    iris
    |> Frame.filterCols (fun c _ -> c <> "Species")
    |> Frame.mapColValues (fun c -> c.As<double>())
let targets =
    R.as_factor(iris.Columns.["Species"])

R.featurePlot(x = features, y = targets, plot = "pairs")

iris.ReplaceColumn("Species", targets.AsNumeric())
// Split data to training and testing sets (70% vs 30%)
let range = [1..iris.RowCount]
let trainingIdxs : int[] = R.sample(range, iris.RowCount*7/10).GetValue()
let testingIdxs : int[] = R.setdiff(range, trainingIdxs).GetValue()
let trainingSet = iris.Rows.[trainingIdxs]
let testingSet = iris.Rows.[testingIdxs]

// Train neural network
let nn =
    R.neuralnet(
        "Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width",
        data = trainingSet, hidden = R.c(3,2),
        err_fct = "ce", linear_output = true)

// Plot the resulting neural network with coefficients
R.eval(R.parse(text="library(grid)"))
R.plot_nn nn

// Split testing set into features and targets
let testingFeatures =
    testingSet
    |> Frame.filterCols (fun c _ -> c <> "Species")
    |> Frame.mapColValues (fun c -> c.As<double>())
let testingTargets =
    testingSet.Columns.["Species"].As<int>().Values

// Predict `Species` for testingFeatures with neural network
let prediction =
    R.compute(nn, testingFeatures)
        .AsList().["net.result"].AsVector()
    |> Seq.cast<double>
    |> Seq.map (round >> int)

// Calculate number of misclassified irises
let misclassified =
    Seq.zip prediction testingTargets
    |> Seq.filter (fun (a,b) -> a<>b)
    |> Seq.length

printfn "Misclassified irises '%d' of '%d'" misclassified (testingSet.RowCount)

P.S.

Note: if you have problems bootstrapping RProvider and/or converting R data frames to Deedle frames, verify that all assemblies were copied to RProvider’s lib sub-folder during installation of the NuGet packages (see the following picture).

[Figure: assemblies in RProvider’s lib sub-folder]

21 thoughts on “F# Neural Networks with FsLab”

  2. Hi Sergey, thanks for this excellent work.

    I have been trying to replicate what you have done as a first step. However on line 3,

    1 #I @"..\packages\Deedle.1.0.0"
    2 #I @"..\packages\RProvider.1.0.5\"
    3 #load "C:\Quant_Codes\FSharp\RProvider_Console\packages\RProvider.1.0.10\RProvider.fsx"
    4 #load "Deedle.fsx"

    I got following error:

    Error 1 One or more errors in loaded file.
    The type provider ‘RProvider.RProvider’ reported an error: The type provider constructor has thrown an exception: Value cannot be null.
    Parameter name: path1 C:\Quant_Codes\FSharp\RProvider_Console\RProvider_Console\Script1.fsx 4 1 RProvider_Console

    I use visual studio 2013 express edition for desktop with Microsoft (R) F# Interactive version 12.0.30110.0

    Any idea?

  3. Under the current version, the following line:

    let targets =
    R.as_factor(iris.Columns.[“Species”].As())

    throws the exception: “Input string was not in a correct format.”

  4. Hi Sergey,

    Thank you for your email message. After learning that the code on your website does not run with the current versions of Visual Studio, and after a brief investigation of F# (and RProvider) I have decided to abandon F#.

    I predict Microsoft will stand by and watch Swift and Julia bury F#. It is clear to me that Microsoft has no interest in making their languages work with Python and R. It is a fatal flaw in the culture at Microsoft.

    Charles

    1. Hi Charles,

      Sorry for your bad experience with F#.

      To be honest, RProvider and PythonProvider are not Microsoft’s projects. They are community projects. Cross-language interoperability is hard by definition.

      I am not sure that Swift or Julia offer users anything similar to F# type providers… But nevertheless, good luck on your journey.

      Sergey

  5. Hi Sergey,
    I got this error message “neuralNetworks.fsx(16,16): error FS0039: The namespace ‘neuralnet’ is not defined”

    1. Visual Studio Express 2013 version 12.0.21005.1 REL
    2. Windows 8.1
    3. RProvider.1.1.8
    4. R.NET.Community.1.5.16
    5. R.NET.Community.FSharp.0.1.9
    6. R x64 3.1.3

      1. I installed both 32 and 64 bit version of R 3.1.3. It works, BUT I got a new error

        Binding session to ‘C:\sync\fsharp\AlgTrade\AlgTrade\packages\RProvider.1.1.8\lib/net40\RProvider.Runtime.dll’…
        Binding session to ‘C:\sync\fsharp\AlgTrade\AlgTrade\packages\RProvider.1.1.8\../R.NET.Community.FSharp.0.1.9/lib/net40\RDotNet.FSharp.dll’…
        System.FormatException: Input string was not in a correct format.
        at System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal)
        at System.Number.ParseInt32(String s, NumberStyles style, NumberFormatInfo info)
