JuliaML — Flux a (very) beginner example — Part 3
In part 1 and part 2, we saw the parameters used to create a regression line and how those parameters influence the final result.
Now let’s explore the structure of the neural network itself.
So far we have used a very simple network: one input, no hidden layer, one output, and identity as the activation function.
The model for such a network in Flux is Dense(1,1).
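As a quick reminder of what that single layer really is, here is a minimal sketch (the exact output values depend on the random initialisation):

using Flux

layer = Dense(1, 1)        # y = W*x + b, with identity as the activation
layer([2.0f0])             # one output value: W*2 + b
Flux.params(layer)         # the two trainable parameters, W and b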
Let’s create a NN with a 16-neuron hidden layer.
We still have one input and one output.
We will “chain” two models:
Dense(1, 16) and Dense(16, 1)
In Flux, it is very easy to create such a model:
model = Chain( Dense(1,16), Dense(16,1) )
So let’s try this new model:
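Here is a minimal sketch of what trying it could look like. The data points and the training loop are assumptions on my part (the line y = 3x + 2 as toy data, plain gradient descent, and the implicit-parameters Flux.train! API from the Flux versions of that era; newer Flux versions prefer the explicit Flux.setup style), not the exact code of the series:

using Flux

# assumed toy data: points on the line y = 3x + 2
x = reshape(collect(0f0:0.1f0:1f0), 1, :)    # 1×11 input matrix, one sample per column
y = 3f0 .* x .+ 2f0

model = Chain( Dense(1,16), Dense(16,1) )

loss(x, y) = sum(abs2, model(x) .- y) / length(y)    # mean squared error
opt = Descent(0.05)                                   # plain gradient descent

for epoch in 1:500
    Flux.train!(loss, Flux.params(model), [(x, y)], opt)
end

model(x)    # the predictions still fall on a straight line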
Oops… this looks almost the same as with a single layer!
In fact it’s perfectly normal: if you compose linear functions, you get a linear function!
What will make the difference is the activation function. Up to now we have used ‘identity’, which is linear, so everything stays linear.
Activation functions
Activation functions are non-linear functions that give neural networks all their power.
We will use the NNlib package, which provides a lot of such functions (a few sample values are shown right after the list):
- relu : Rectified Linear Unit
- gelu : Gaussian Error Linear Unit
- sigmoid : logistic (S-shaped) function
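To get a feel for them, here is a small sketch evaluating them at a few points (Flux re-exports these functions from NNlib):

using Flux    # brings relu, gelu and sigmoid into scope

relu(-1.0), relu(2.0)    # (0.0, 2.0) : negative inputs are clipped to zero
sigmoid(0.0)             # 0.5 : squashes any input into (0, 1)
gelu(-1.0), gelu(1.0)    # smooth curve, slightly negative then roughly linear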
So let’s introduce these non-linear functions.
We chain 2 layers, with “gelu” activation between them:
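Keeping the same data and training loop as in the sketch above, only the model line changes (the activation is passed as the third argument of Dense):

model = Chain( Dense(1,16,gelu), Dense(16,1) )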
and with 4 neurons in the hidden layer:
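With the assumed setup above, that would be:

model = Chain( Dense(1,4,gelu), Dense(4,1) )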
With sigmoid activation:
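Again, only the activation changes:

model = Chain( Dense(1,16,sigmoid), Dense(16,1) )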
and with relu:
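And likewise:

model = Chain( Dense(1,16,relu), Dense(16,1) )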
Be careful: every time we create a model, its internal parameters are initialised with random values, so if we train our network for too few cycles, the final result will differ from one run to the next.
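One way to make runs reproducible (an addition on my part, not something from the original series) is to fix Julia’s random seed before creating the model:

using Random, Flux

Random.seed!(1234)    # fix the RNG so the initial weights are the same on every run
model = Chain( Dense(1,16,relu), Dense(16,1) )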
to be continued !