Tag Archives: Neural network

TensorFlow and Keras on RTX2060 for pattern recognition

The MNIST database is a catalog of handwritten digits for image processing. With TensorFlow and Keras, training a neural network classifier on the Nvidia RTX2060 GPU is a walk in the park.

Using the default import of the MNIST dataset via tf.keras, which comprises 60,000 handwritten digit images of 28 x 28 pixels, training a neural network to classify them can be accomplished in a matter of seconds, depending on the target accuracy. The same training on an ordinary CPU is not as quick as on the GPU because of architectural differences. In this sample run, the digit “eight” is correctly identified by the neural network.
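Under the hood, the training amounts to only a few lines. Below is a minimal sketch of such a classifier using the standard tf.keras MNIST loader; the exact layer sizes and epoch count are assumptions, not the figures behind the runs above.

```python
# A minimal MNIST classifier sketch with tf.keras; layer sizes,
# dropout rate, and epoch count are illustrative assumptions.
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)   # seconds per epoch on the GPU
model.evaluate(x_test, y_test)          # report test-set accuracy
```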

A simple comparison of MNIST training results on my RTX2060 with varying numbers of training samples shows only slight differences in the final accuracy.
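The comparison itself can be scripted as a loop over subsets of the training data. Continuing from the sketch above, it might look like the following; the sample counts are assumptions, not necessarily those behind the chart.

```python
# Train the same model on varying numbers of MNIST samples and
# record the final test accuracy; sample counts are assumptions.
def build_model():
    # same architecture as in the sketch above
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

for n in (10000, 30000, 60000):         # assumed sample counts
    model = build_model()
    model.fit(x_train[:n], y_train[:n], epochs=5, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    print(n, acc)
```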

 


Performance testing the Movidius Neural Compute Stick

The Movidius Neural Compute Stick is great for visual recognition projects with low power consumption and small form factor requirements. A basic introduction was covered in the previous installment, and it is time for a few more field tests on performance.

The first test is on calculators: one a Texas Instruments TI-84 Plus Pocket SE and the other a Casio fx-4500PA. Both are recognized as a hand-held computer with fairly high confidence.

The second test is a luxury watch photo, recognized as an analog clock.

The final test is a feather. At some point the stick returned 100% confidence, and the result is correct.
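For reference, each of these classifications boils down to a few calls against the NCSDK v1 Python API (mvnc). The sketch below is illustrative only: the graph file name and the image preprocessing are assumptions, and the SDK examples should be consulted for the exact pipeline.

```python
# Rough single-image classification on the stick via the NCSDK v1
# Python API; file names and preprocessing here are assumptions.
import numpy as np
from mvnc import mvncapi as mvnc

devices = mvnc.EnumerateDevices()            # find attached sticks
device = mvnc.Device(devices[0])
device.OpenDevice()

with open('graph', 'rb') as f:               # a pre-compiled network graph
    graph = device.AllocateGraph(f.read())

image = np.zeros((224, 224, 3), np.float16)  # stand-in for a real photo
graph.LoadTensor(image, 'user object')       # queue the inference
output, userobj = graph.GetResult()          # blocking read of the result
print(output.argmax())                       # index of the top class

graph.DeallocateGraph()
device.CloseDevice()
```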

This neural computing module did a good job with impressive results for a development kit, and it is also great for hobby projects. As hardware AI acceleration technology matures, we may one day see it integrated with CPUs and become a ubiquitous product on the market.

Deep Learning with the Movidius Neural Compute Stick


Deep Learning is a breakthrough in Artificial Intelligence. With its roots in neural networks, advances in modern computing hardware have enabled new possibilities through sophisticated integrated circuit technology.

Deep learning belongs to the exciting area of machine learning in AI. The leading development frameworks include TensorFlow and Caffe. Pattern recognition is a practical application of machine learning in which photos or videos are analysed by a machine to produce usable output as if a human had done the analysis. The GPU has been a favorite choice for its specialized architecture, delivering supreme processing power not only in graphics but also among the neural network community. A previous installment covered how to deploy an Amazon Web Services GPU instance to analyse real-time traffic camera images using Caffe.

To bring this kind of machine learning power to IoT, Intel shrank and packaged a specialized Vision Processing Unit into the form factor of a USB thumb drive in the Movidius™ Neural Compute Stick.

It sports an ultra-low-power Vision Processing Unit (VPU) inside an aluminium casing and weighs only 30 g (without the cap). Support for the Raspberry Pi 3 Model B makes it a very attractive add-on for development projects involving AI applications on that platform.

In the form factor of a USB thumb drive, the specialized VPU in the Movidius, geared for machine learning, acts as an AI accelerator for the host computer.

To put this neural compute stick into action, the SDK provided by Movidius (available via git) is required. Although the SDK runs on Ubuntu, Windows users with VirtualBox can easily install it in an Ubuntu 16.04 VM.

While the SDK comes with many examples and the setup is a walk in the park, running these examples is not so straightforward, especially on a VM. There are points to note, from making the stick available in the VM (including the USB 3 and filter settings in VirtualBox) to the actual execution of the provided sample scripts. Some examples require two sticks to run. Developers should be comfortable with Python, Unix make and git commands, as well as installing packages in Ubuntu.

The results from the examples in the SDK alone are quite convincing, considering the form factor of the stick and its electrical power consumption. This neural computing stick "kept its cool" literally throughout the test drive, unlike the FPGA stick I occasionally use for bitcoin mining, which turns really hot.

Experimenting with convergence time in neural network models

After setting up Keras and Theano and running some basic benchmarks on the Nvidia GPU, the next step in getting a taste of neural networks through these deep learning frameworks is a comparison against the same problem (an XOR classification) solved on a modern calculator, the TI Nspire, using the Nelder-Mead algorithm to converge the neural network weights.

A sample SGD configuration in Keras on Theano with 30,000 iterations converged in around 84 seconds, while the TI Nspire completed with comparable results in 19 seconds. This is not a fair comparison of course, as there are many parameters that can be tuned in the model.
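A minimal sketch of the Keras side of this experiment is below; the layer sizes, learning rate, and loss are assumptions, not the exact settings used in the timing above.

```python
# An XOR classifier trained with SGD in Keras on the Theano backend;
# topology and hyperparameters are illustrative assumptions.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

model = Sequential([
    Dense(2, input_dim=2, activation='sigmoid'),  # hidden layer
    Dense(1, activation='sigmoid'),               # output layer
])
model.compile(loss='mean_squared_error', optimizer=SGD(lr=0.5))
model.fit(X, y, epochs=30000, verbose=0)          # 30,000 iterations
print(model.predict(X))                           # should approach 0, 1, 1, 0
```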


Exploring Theano with Keras

Theano needs no introduction in the field of deep learning. It is based on Python and supports CUDA. Keras is a library that wraps the complexity of Theano to provide a high-level abstraction for developing deep learning solutions.

Installing Theano and Keras is easy, and there are tons of resources available online. However, my primary CUDA platform is Windows, so most standard guides, which are based on Linux, required some adaptation. Most notable are the proper setting of the PATH variable and the use of the Visual Studio command prompt.

The basic installation steps include setting up CUDA, a scientific Python environment, and then Theano and Keras. cuDNN is optional and requires a Compute Capability greater than 3.0, which unfortunately my GPU, being a bit old, does not meet.
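A quick way to confirm the build is actually using the GPU is the classic sanity check adapted from the Theano documentation:

```python
# Compile a trivial function and inspect the graph for GPU ops;
# when Theano runs on the GPU the op names contain 'Gpu'.
import numpy as np
import theano
import theano.tensor as T

x = T.vector('x')
f = theano.function([x], T.exp(x))
f(np.ones(10, dtype=theano.config.floatX))
print(any('Gpu' in type(node.op).__name__
          for node in f.maker.fgraph.toposort()))
```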


Some programs on the Windows platform encountered errors that turned out to be library-related issues. This one, for example, failed to compile under Spyder but can be resolved by running from the Visual Studio Cross Tools Command Prompt.

The Nvidia profiler checks the performance of the GPU while running the Keras MNIST digits example with an MLP.

Training neural network using Nelder-Mead algorithm on TI Nspire

In this installment the Nelder-Mead method is used to train a simple neural network on the XOR problem. The network has two inputs, one output, and two hidden layers, and is fully connected. In mainstream practice, back propagation and evolutionary algorithms are much more popular for training neural networks on real-world problems. Nelder-Mead is used here just out of curiosity, to see how this general optimization routine performs in a neural network setting on the TI Nspire.

The sigmoid function is declared as a TI Nspire function.

For the XOR problem, the inputs are defined as two lists, and the expected outputs in another.

The activation functions for each neuron are declared.

To train the network, the sum of squared errors is fed into the Nelder-Mead algorithm for minimization. Random numbers are used as initial parameters.
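For readers without the calculator at hand, the whole procedure translates into a few lines of Python with scipy's Nelder-Mead; the topology here (one hidden layer of two sigmoid neurons) is an assumption based on the usual minimal XOR network and may differ from the Nspire program.

```python
# XOR network trained by minimizing the sum of squared errors with
# Nelder-Mead; topology and iteration limits are assumptions.
import numpy as np
from scipy.optimize import minimize

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(p, x):
    # p packs 2x2 hidden weights, 2 hidden biases, 2 output weights, 1 output bias
    h = sigmoid(x @ p[0:4].reshape(2, 2) + p[4:6])
    return sigmoid(h @ p[6:8] + p[8])

def sse(p):
    # sum of squared errors over the four XOR patterns
    return np.sum((forward(p, X) - y) ** 2)

result = minimize(sse, np.random.randn(9), method='Nelder-Mead',
                  options={'maxiter': 5000, 'maxfev': 10000})
print(result.fun)            # final sum of squared errors
print(forward(result.x, X))  # should approach 0, 1, 1, 0
```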

Finally, the resulting weights and biases are obtained by running the Nelder-Mead program.

A comparison graph plots the output of the Nelder-Mead-trained XOR network against the expected values.

 

Stochastic Gradient Descent in R

Stochastic Gradient Descent (SGD) is an optimization method commonly used in machine learning, especially for neural networks. As the name implies, it is aimed at minimizing a function.
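As a minimal illustration of the idea (sketched in Python rather than R), SGD updates the parameters one randomly drawn sample at a time; the data and learning rate below are made up:

```python
# Fit a linear model by stochastic gradient descent on synthetic
# data; learning rate and step count are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
true_w = np.array([2.0, -3.0])
y = X @ true_w + rng.normal(scale=0.1, size=1000)

w = np.zeros(2)
lr = 0.01
for step in range(10000):
    i = rng.integers(len(X))              # pick one random sample
    grad = 2 * (X[i] @ w - y[i]) * X[i]   # gradient of its squared error
    w -= lr * grad                        # one small descent step
print(w)                                  # should approach [2, -3]
```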

In R, there is an sgd package for this purpose. As a warm-up for the newly upgraded R and RStudio, it is taken for a test drive.


Running the documentation example.

Running the included demo for logistic regression.