JuliaML — Flux a (very) beginner example — Part 3
In part 1 and part 2, we saw the parameters used to create a regression line and how those parameters influence the final result.
Now let’s explore the structure of the neural network itself.
So far we have used a very simple network: one input, no hidden layer, one output, and identity as the activation function.
The model for such a network in Flux is Dense(1,1).
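As a quick reminder of what that single layer really is, here is a minimal sketch (the exact output values depend on the random initialisation):

using Flux

layer = Dense(1, 1)        # y = W*x + b, with identity as the activation
layer([2.0f0])             # one output value: W*2 + b
Flux.params(layer)         # the two trainable parameters, W and b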
Let’s create a NN with a 16-neuron hidden layer.
We still have one input and one output.
We will “chain” two models:
Dense(1, 16) and Dense(16, 1)
In Flux, it is very easy to create such a model:
model = Chain( Dense(1,16), Dense(16,1) )
So let’s try this new model:
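Here is a minimal sketch of what trying it could look like. The data points and the training loop are assumptions on my part (the line y = 3x + 2 as toy data, plain gradient descent, and the implicit-parameters Flux.train! API from the Flux versions of that era; newer Flux versions prefer the explicit Flux.setup style), not the exact code of the series:

using Flux

# assumed toy data: points on the line y = 3x + 2
x = reshape(collect(0f0:0.1f0:1f0), 1, :)    # 1×11 input matrix, one sample per column
y = 3f0 .* x .+ 2f0

model = Chain( Dense(1,16), Dense(16,1) )

loss(x, y) = sum(abs2, model(x) .- y) / length(y)    # mean squared error
opt = Descent(0.05)                                   # plain gradient descent

for epoch in 1:500
    Flux.train!(loss, Flux.params(model), [(x, y)], opt)
end

model(x)    # the predictions still fall on a straight line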
Oops… this looks almost the same as with a single layer!
In fact it’s perfectly normal: if you compose linear functions, you get a linear function!
What will make the difference is the activation function. Up to now we have used ‘identity’, which is linear, so everything stays linear.
Activation functions
Activation functions are non-linear functions that give neural networks all their power.
We will use the NNlib package, which provides a lot of such functions (a few sample values are shown right after the list):
- relu : Rectified Linear Unit
- gelu : Gaussian Error Linear Unit
- sigmoid : logistic (S-shaped) function
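To get a feel for them, here is a small sketch evaluating them at a few points (Flux re-exports these functions from NNlib):

using Flux    # brings relu, gelu and sigmoid into scope

relu(-1.0), relu(2.0)    # (0.0, 2.0) : negative inputs are clipped to zero
sigmoid(0.0)             # 0.5 : squashes any input into (0, 1)
gelu(-1.0), gelu(1.0)    # smooth curve, slightly negative then roughly linear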
So let’s introduce these non-linear functions.
We chain 2 layers, with “gelu” activation between them:
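Keeping the same data and training loop as in the sketch above, only the model line changes (the activation is passed as the third argument of Dense):

model = Chain( Dense(1,16,gelu), Dense(16,1) )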
and with 4 neurons in the hidden layer:
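With the assumed setup above, that would be:

model = Chain( Dense(1,4,gelu), Dense(4,1) )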
With sigmoid activation:
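Again, only the activation changes:

model = Chain( Dense(1,16,sigmoid), Dense(16,1) )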
and with relu:
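And likewise:

model = Chain( Dense(1,16,relu), Dense(16,1) )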
Be careful: every time we create a model, its internal parameters are initialised with random values, so if we train our network for too few cycles, the final result will differ from one run to the next.
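One way to make runs reproducible (an addition on my part, not something from the original series) is to fix Julia’s random seed before creating the model:

using Random, Flux

Random.seed!(1234)    # fix the RNG so the initial weights are the same on every run
model = Chain( Dense(1,16,relu), Dense(16,1) )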
to be continued !