JuliaML — Flux a (very) beginner example — Part 1
When I tried Flux for the first time, I faced a challenge in understanding how it works. I had experimented with neural networks in the mid-90s for academic use, but since then I had never used them again (at least not my own NNs).
Julia
Julia is a “recent” language, currently used mostly for academic purposes (as far as I know). The basics are quite simple, but the deeper concepts and its most powerful features (like multiple dispatch) make it difficult for beginners to manage.
The speed and efficiency of Julia make it very interesting, but I think the main issue to solve before it can become a good general-purpose language is having a package manager like NPM for JavaScript, and a good compiler/building infrastructure.
Flux
This machine learning package is 100% written in Julia and contains all the basic tools to build machine learning applications.
Neural network basics
Flux provides all the tools to build, train and evaluate neural networks.
The main components are:
- a model (the structure of the neural network)
- a loss function, which will be used during training to optimize the network
- a gradient function, which will help optimize the network parameters given the result of the loss function
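These three components fit together in just a few lines. Here is a minimal, hedged sketch (assuming Flux with the implicit-parameter API used in this tutorial, where `Flux.params` collects the trainable parameters):

```julia
using Flux  # assumes the Flux package is installed

m = Dense(1, 1)                    # the model: 1 input, 1 output
loss(x, y) = sum((m(x) .- y).^2)   # the loss function
# the gradient of the loss with respect to the model's parameters:
gs = gradient(() -> loss([1.0], [2.0]), Flux.params(m))
```

Training then consists of repeatedly computing such gradients and nudging the parameters in the direction that decreases the loss.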
Let’s start !
To begin, I found a simple regression to be the perfect base to learn what to do, what not to do, and to experiment with multiple options…
and you should end up with a beautiful plot…
OK, this might be a bit complicated for just getting a regression line, but we will see in more complex examples that this approach can be very powerful.
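For reference, here is the full script, assembled from the lines explained one by one below:

```julia
using Plots
using Flux
using NNlib
using Flux: @epochs

# the simplest possible model: 1 input, 1 output, identity activation
m = Dense(1, 1)

# sum-of-squared-errors loss
loss(x, y) = sum((m(x) .- y).^2)

# each pair is ([input], [output])
dataset = [([0.8], [1.0]),
           ([2.0], [3.0]),
           ([2.4], [2.0]),
           ([0], [0.5]),
           ([1.5], [2]),
           ([3], [2.5]),
           ([4.0], [1.5])]

# train 100 epochs with the ADAM optimizer
@epochs 100 Flux.train!(loss, Flux.params(m), dataset, ADAM())

# evaluate the trained model on x = 0..5
x = 0:5
y = zeros(6)
for i in x
    y[i+1] = m([i])[1]
end

# scatter the data, then overlay the fitted line
plot(dataset, seriestype = :scatter, legend = false); plot!(x, y)
```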
Now I will explain every line:
using Plots
This line allows us to use a graphics library that will generate plots of our data.
To use this library, you first need to install the Plots package in Julia, so from the Julia REPL, either enter the package manager (]) or just type:
using Pkg;
Pkg.add("Plots");
(https://datatofish.com/install-package-julia/)
using Flux
The main package of this tutorial, which will enable you to do machine learning.
As with Plots, you need to install it the first time:
using Pkg;
Pkg.add("Flux");
using NNlib
A package that brings lots of activation functions (see Part 3); as above, you need to install it the first time:
using Pkg;
Pkg.add("NNlib");
using Flux: @epochs
This line imports the macro @epochs from the Flux package. This special macro allows a command to be executed multiple times. We will use it to train our model several times.
For example,
@epochs 2 println("Hello")
will print the line "Hello" twice.
This time there is no need to install anything, because it is a macro from the already installed Flux package.
m = Dense(1,1)
This line will create the simplest neural network possible:
one input, no hidden layers, one output, and identity as the activation.
As you know, neural networks are linear models followed by an activation, so in this case:
output = identity(input * W + b)
where W and b are vectors of length one… so it's just a line, y = ax + b!
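You can check this yourself by inspecting the model's parameters. A hedged sketch (assuming Flux, where `Flux.params` iterates over the weight matrix and then the bias vector):

```julia
using Flux  # assumes the Flux package is installed

m = Dense(1, 1)
W, b = Flux.params(m)   # collect the trainable parameters

size(W)    # (1, 1): a single weight, the slope 'a'
size(b)    # (1,):   a single bias, the intercept 'b'
m([2.0])   # identity(W * [2.0] .+ b), a one-element output
```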
loss(x, y) = sum((m(x) .- y).^2)
This function determines the strategy for optimizing the neural network, which in our case means fitting the regression line to our data.
Training will try to minimize this loss function; here we calculate the sum of squared differences, also called the sum of squared errors.
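The same loss can be written in plain Julia, without Flux, to see exactly what it computes:

```julia
# the same sum-of-squared-errors loss, in plain Julia
sse(ŷ, y) = sum((ŷ .- y) .^ 2)

# example: predictions [1.0, 2.0] against targets [1.5, 1.0]
sse([1.0, 2.0], [1.5, 1.0])  # (1.0 - 1.5)^2 + (2.0 - 1.0)^2 = 1.25
```

The broadcast operators `.-` and `.^` apply element-wise, which is why the same definition works for any number of outputs.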
dataset = [([0.8], [1.0]),
([2.0], [3.0]),
([2.4], [2.0]),
([0], [0.5]),
([1.5], [2]),
([3], [2.5]),
([4.0], [1.5])]
Now we define our data. This part seems simple, but you must understand the structure exactly to be able to expand this example for future applications:
if you ask Julia for the type of this variable:
typeof(dataset)
you will get :
Array{Tuple{Array{T,1} where T,Array{T,1} where T},1}
That seems very complex, but it can be split into very simple levels:
Level 1 : Array{ 'Level 2' ,1}
So we have a one dimensional array. Each element of the array is a Tuple :
Level 2 : Tuple{'Level 3','Level 3'}
A tuple is an ordered set, like (1, 2). Each element of the Tuple is:
Level 3 : Array{T,1} where T
Which means: a one-dimensional array where each element is of type T
If we put all together, dataset is :
a one-dimensional array of Tuples of one-dimensional arrays of elements of type T…
In fact, each Tuple is a pair X and Y where f(X) = Y, with f being my regression line function; X and Y are one-dimensional arrays because they represent the inputs and outputs of my network.
In this example my network has one input and one output, so each array has just one element, but if I had 28x28 input images (784 pixels) and 10 outputs (like MNIST), then my tuple would be:
( [ 0, ….. 784 numbers representing each pixels……,1] , [0, … 10 numbers representing outputs, 1])
so dataset is: [ Tuple1, Tuple2, …, TupleN ] if my dataset has N points
where Tuplek : ( X, Y )
and X = [ x1, x2, …, xi ] for a neural network with i inputs
and Y = [ y1, y2, …, yj ] for a neural network with j outputs
To come back to my own dataset:
Tuple1 = ([0.8], [1.0]), Tuple2 = ([2.0], [3.0]) …
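You can verify this structure yourself in plain Julia (no Flux needed), by unpacking one pair:

```julia
# a small subset of the dataset above
dataset = [([0.8], [1.0]), ([2.0], [3.0])]

X1, Y1 = dataset[1]   # unpack the first (input, output) pair

# X1 == [0.8] and Y1 == [1.0]: one-element vectors,
# matching the single input and single output of the network
```

This unpacking is exactly what Flux.train! does internally for each pair when iterating over the dataset.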
@epochs 100 Flux.train!(loss, Flux.params(m), dataset, ADAM())
This line is the computation part: it 'just' trains the neural network 100 times with our 'loss' function, the parameters of our model 'm', our 'dataset', and the 'ADAM()' optimizer from Flux.
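To demystify train!, here is a hedged sketch of roughly what one epoch does in this implicit-parameter Flux API (the real implementation has extra callbacks and error handling):

```julia
using Flux  # assumes the Flux package is installed

m = Dense(1, 1)
loss(x, y) = sum((m(x) .- y).^2)
opt = ADAM()
ps = Flux.params(m)

# roughly what one Flux.train!(loss, ps, dataset, opt) pass does:
for (x, y) in [([0.8], [1.0]), ([2.0], [3.0])]
    gs = gradient(() -> loss(x, y), ps)  # gradients of the loss
    Flux.Optimise.update!(opt, ps, gs)   # one ADAM step per parameter
end
```

So @epochs 100 simply repeats this pass over the whole dataset 100 times.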
and now we only have to display our results:
x = 0:5
y = zeros(6)
for i in x
    y[i+1] = m([i])[1]
end;
This will create 'x' points from 0 to 5. For each point 'x', we get the value from our trained model and store it in 'y'.
To do this, be very careful to give your model an input array of the proper size (in this case a one-element array: [i]).
In the same way, you get an array as an output, so to get its first element, use array indexing properly: m(…)[1] gets element n°1.
plot(dataset, seriestype = :scatter, legend = false); plot!(x,y)
And now we display the plot: first we show our dataset as a scatter plot, and then we add the points computed by our trained model!
Finished…
See you next time !!!