image_neural_net

Animal recognition using a neural network.

commit 29ab57879f5108493b3d6f3bead3744458d79f84
parent e09707321c7b31a93e6c08695064e17e5dc2f768
Author: John Kubach <johnkubach@gmail.com>
Date:   Tue, 28 Sep 2021 11:19:55 -0400

README

Diffstat:
A README | 201 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 201 insertions(+), 0 deletions(-)

diff --git a/README b/README
@@ -0,0 +1,201 @@
+% Computer Vision: Assignment 4
+% John Kubach, Abhinav Bhaskara, Krupa Patel
+% 04/10/2019
+
+# Summary
+
+This program takes a set of images with classifications,
+learns to properly classify each image, and attempts to classify
+new images. The program uses a neural network with 4 layers.
+
+The program is able to correctly classify all test digits from 0 to 9.
+The program can handle some minor alterations to a digit. If a larger
+classification table was implemented, along with more complex feature
+extraction, the program would likely be able to identify more extreme
+changes to a digit.
+
+# Structure
+
+The neural network consists of an input layer, two hidden layers, and
+an output layer, with 64, 64, 16, and 4 neurons, respectively.
+The expected input is a 16x16 image. Images are scaled to this size
+beforehand and flattened into a 1-dimensional array. This array is then
+fed into the neural network, one pixel per input neuron.
+
+# Training
+
+A 2-dimensional array of weights is created for each non-input layer.
+The initial values are randomized in an attempt to provide 'neutral' values.
+The neural network is then trained for a desired number of iterations.
+During each iteration, the current error rate is calculated, and appropriate
+weights are adjusted. This continues for however many iterations were specified.
+After training, the current output layer values are displayed, and each
+layer's current weights are saved for the testing phase.
+
+Finding the optimal number of iterations requires some testing and some luck
+with the randomly assigned weights. One interesting note is that it is
+possible to *over* train the neural network. This causes it to be unable to
+classify images that do not look exactly like the training images.
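The structure and training loop described above can be sketched in plain Python. This is only an illustration, not the project's code: the README does not name the activation function, the error measure, or the learning rate, so the sigmoid activation, squared error, backpropagation update, `rate=0.5`, and every function name below are assumptions. The input is a random stand-in of 64 values, chosen to match the stated input layer size.

```python
import math
import random

LAYER_SIZES = [64, 64, 16, 4]  # input, two hidden layers, output (from the README)


def sigmoid(x):
    # Assumed activation; the input is clamped so math.exp cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, x))))


def init_weights(sizes, rng):
    # One 2-D weight array per non-input layer: weights[k][j][i] links
    # neuron i of layer k to neuron j of layer k + 1.
    return [[[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
            for n_in, n_out in zip(sizes, sizes[1:])]


def forward(weights, pixels):
    # Returns the activations of every layer, starting with the input itself.
    activations = [list(pixels)]
    for layer in weights:
        prev = activations[-1]
        activations.append([sigmoid(sum(w * a for w, a in zip(row, prev)))
                            for row in layer])
    return activations


def train_step(weights, pixels, target, rate=0.5):
    # One iteration: measure the error, then adjust weights from the output
    # layer back toward the input (plain backpropagation, assumed here).
    acts = forward(weights, pixels)
    out = acts[-1]
    deltas = [(t - o) * o * (1.0 - o) for t, o in zip(target, out)]
    for k in range(len(weights) - 1, -1, -1):
        layer, prev = weights[k], acts[k]
        prev_deltas = []
        if k > 0:
            # Error signal for the layer below, using the weights before update.
            prev_deltas = [
                sum(layer[j][i] * deltas[j] for j in range(len(layer)))
                * prev[i] * (1.0 - prev[i])
                for i in range(len(prev))
            ]
        for j, row in enumerate(layer):
            for i in range(len(row)):
                row[i] += rate * deltas[j] * prev[i]
        deltas = prev_deltas
    # Mean squared error measured before this step's update.
    return sum((t - o) ** 2 for t, o in zip(target, out)) / len(out)


rng = random.Random(0)
weights = init_weights(LAYER_SIZES, rng)
pixels = [rng.random() for _ in range(LAYER_SIZES[0])]  # stand-in for a flattened image
dog = [1, 0, 0, 0]  # label array from the README

errors = [train_step(weights, pixels, dog) for _ in range(50)]
print(f"error before: {errors[0]:.4f}, after 50 iterations: {errors[-1]:.4f}")
```

Training on a single image, as in this sketch, drives the error down quickly but also shows how a network can fit its training data without learning anything that carries over to new images.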
+
+## Example Training Output
+
+```text
+Training Output:
+[[9.98888174e-01 6.82591060e-04 8.04198012e-04 5.93295192e-04] // Dog Set
+[9.98707815e-01 8.08405508e-04 8.64003748e-04 8.53519317e-04]
+[9.98794009e-01 8.35549210e-04 8.21596776e-04 8.06750883e-04]
+[9.98636681e-01 7.35252658e-04 1.02774762e-03 7.58062747e-04]
+[9.98546622e-01 1.50103573e-03 5.16936628e-04 1.03439701e-03]
+[2.60304468e-04 9.99974825e-01 2.29332040e-04 1.69056971e-04] // Cat Set
+[1.07440031e-03 9.98568536e-01 9.69294389e-05 3.40544296e-05]
+[6.91610270e-04 9.99719712e-01 1.98320394e-04 5.56304841e-05]
+[1.18627780e-03 9.99020156e-01 2.18108343e-03 5.71687197e-05]
+[1.16542730e-03 9.98645448e-01 9.63009320e-05 3.32109809e-05]
+[3.39606256e-04 3.47142378e-04 9.98805323e-01 8.90485423e-04] // Bird Set
+[2.16299157e-04 6.55176578e-04 9.99191789e-01 9.02445843e-04]
+[1.73870465e-04 1.08641555e-03 9.98324536e-01 1.07892674e-03]
+[1.97466255e-04 5.50309150e-04 9.98364132e-01 1.23132418e-03]
+[7.70361146e-04 7.12426479e-04 9.98765734e-01 3.95692047e-04]
+[2.21954250e-04 1.39516012e-04 1.08161029e-03 9.99118352e-01] // Dolphin Set
+[3.60208262e-04 1.44197118e-04 4.30607822e-04 9.99354617e-01]
+[1.38041875e-03 1.49779824e-05 1.27121142e-03 9.98606011e-01]
+[1.09901220e-03 2.33248134e-05 9.43514418e-04 9.98818368e-01]
+[5.48383802e-04 6.02698258e-05 7.61811226e-04 9.98620953e-01]]
+Training Accuracy: 99.92408797440399%
+```
+Every image in each set has been correctly classified. The highest number in a row
+(typically 9.Xe-01) indicates what the neural network deems to be the most
+likely classification. This is derived from the array of labels provided to
+the neural network.
+
+```text
+/* Label arrays */
+
+Dog = [1, 0, 0, 0]
+Cat = [0, 1, 0, 0]
+Bird = [0, 0, 1, 0]
+Dolphin = [0, 0, 0, 1]
+```
+
+# Testing
+
+An image or set of images is loaded into the testing suite. Here,
+the images are loaded into an array (or groups of arrays) and run through
+the neural network using its previously calculated weights. The results for
+each group of arrays are then displayed.
+
+All images used for testing can be found at
+http://elvis.rowan.edu/~kubachj9/test/
+
+# Results
+
+The neural network is able to classify new images with an acceptable
+degree of accuracy. Some images were chosen to deliberately throw off the
+neural network. Some end up classified correctly, some do not. Three test
+runs are shown below. Each run uses the same test images, but the neural
+network has been re-trained between each run.
+
+```text
+Dog Output:
+[[9.92946312e-01 5.13394720e-03 4.40893902e-04 2.11938835e-03]
+[5.16077993e-04 8.66850125e-02 1.25762657e-01 2.41557723e-04]
+[8.16348130e-01 1.20886021e-04 2.89027234e-03 2.47092104e-02]
+[8.39228245e-01 7.88785240e-04 1.00745923e-04 5.33087572e-02]
+[1.94548654e-02 1.99387163e-03 2.17110484e-04 3.41123925e-01]
+[7.07005020e-01 5.83498359e-01 1.53408880e-05 2.95748952e-04]
+[9.97196647e-01 3.83429749e-03 3.41908344e-04 2.14276994e-03]]
+Cat Output:
+[[4.47759781e-04 8.18766948e-01 2.11805016e-03 6.79393286e-05]
+[2.19098873e-01 9.83914807e-04 3.04876521e-02 4.37712414e-04]
+[2.39054539e-04 1.15335029e-02 9.57995828e-01 4.47584065e-04]
+[7.01469601e-04 9.99686310e-01 8.64621916e-05 5.80064861e-05]
+[5.97478994e-05 9.41148733e-01 8.49791525e-03 4.99674179e-04]]
+Bird Output:
+[[4.67698446e-02 5.51885959e-02 1.32107761e-01 1.10466507e-05]
+[3.35017940e-03 3.21389726e-03 5.85197874e-01 6.76088527e-05]
+[1.69694787e-04 1.21550027e-03 9.96513058e-01 1.09756966e-03]
+[2.12303469e-04 4.20100127e-04 9.99060237e-01 8.69037275e-04]
+[2.28823606e-04 4.18970232e-04 9.98585494e-01 9.85507606e-04]]
+Dolphin Output:
+[[3.26050685e-02 3.85284434e-05 7.11450048e-01 3.28085677e-01]
+[4.99723602e-04 7.13053502e-04 3.01765292e-05 9.98332469e-01]
+[2.24624469e-01 4.42831937e-06 7.73170125e-01 1.32411367e-01]
+[4.72816283e-01 4.22773667e-05 1.54391360e-02 8.82644792e-02]
+[3.69741707e-01 7.21510992e-06 2.04173069e-01 1.75370263e-01]]
+
+
+Dog Output:
+[[6.30028965e-01 7.65779685e-04 1.41203486e-02 3.37035716e-03]
+[7.85611661e-03 5.99072250e-06 1.56075824e-01 2.19917355e-01]
+[6.95952890e-01 1.00695132e-03 8.47136595e-01 7.96077662e-05]
+[9.81858956e-01 5.16115888e-05 3.53474470e-04 8.75940979e-03]
+[9.98113438e-01 2.60358683e-05 1.10719655e-03 1.89139239e-03]
+[9.42393628e-02 1.93356741e-03 9.90377805e-01 7.04090901e-05]
+[9.99052824e-01 8.78798624e-05 1.27929834e-03 8.43979401e-04]]
+Cat Output:
+[[1.43149110e-01 4.06166027e-03 9.72594378e-01 5.66885911e-05]
+[7.93407950e-06 2.19519453e-02 9.61920824e-01 4.76949079e-01]
+[8.29368075e-03 8.70782429e-01 3.84326505e-02 8.39343867e-06]
+[1.14608710e-05 9.98416086e-01 4.52121848e-04 4.11318760e-03]
+[7.99861660e-01 2.38319696e-02 2.81453634e-04 7.71789299e-03]]
+Bird Output:
+[[1.40948747e-02 6.63199349e-06 9.78398586e-02 2.91727180e-01]
+[4.71099710e-03 3.78062135e-05 9.78174407e-01 2.96259498e-03]
+[5.82817937e-03 2.90389957e-04 9.87300855e-01 1.07471737e-03]
+[2.23767335e-03 1.02802741e-03 3.05612965e-01 1.89408887e-01]
+[2.45536585e-04 3.31128652e-03 9.68748782e-01 4.19595914e-02]]
+Dolphin Output:
+[[7.58683819e-05 1.21636670e-05 9.95259789e-01 6.15061392e-02]
+[1.79361673e-03 1.71031619e-03 2.69409674e-05 9.95182306e-01]
+[5.47264181e-06 6.21264886e-04 4.69341926e-01 9.90294448e-01]
+[2.56329261e-03 1.07000062e-04 1.56927726e-03 9.88918495e-01]
+[5.02611255e-01 6.81189948e-06 9.74481891e-03 1.06574704e-01]]
+
+
+Dog Output:
+[[9.96400142e-01 1.97708262e-04 6.77067649e-07 1.04714627e-02]
+[1.18393157e-06 9.99851308e-01 5.15722273e-04 1.28717576e-01]
+[5.74074507e-03 1.04569865e-03 9.86427936e-01 1.14124466e-06]
+[9.99670447e-01 9.59188238e-06 2.23122563e-03 1.18118484e-06]
+[9.24513929e-01 9.27379280e-02 1.79431243e-06 2.38075810e-05]
+[6.40694394e-01 9.60086468e-01 1.06194472e-04 2.20960359e-05]
+[1.22750679e-02 9.76273384e-01 4.64674969e-06 4.65350887e-04]]
+Cat Output:
+[[4.62843653e-04 3.49801442e-03 1.13481111e-01 9.96811172e-04]
+[1.39896546e-02 9.43618854e-01 1.50248051e-02 2.13342898e-05]
+[1.94886339e-05 9.99849255e-01 5.04574869e-04 3.69805434e-03]
+[4.41025113e-05 9.99676185e-01 1.10710283e-03 6.78510057e-04]
+[1.82662942e-01 9.89296675e-01 3.36913388e-05 3.27509366e-05]]
+Bird Output:
+[[1.22798068e-06 4.47640675e-03 9.85119181e-01 5.27644360e-02]
+[2.03495974e-04 6.81640997e-01 1.18247815e-01 3.27804476e-04]
+[1.63215381e-03 3.73775800e-04 9.86614980e-01 1.36626474e-03]
+[3.17849786e-02 6.24507124e-05 9.96535527e-01 6.08654305e-06]
+[1.11357598e-03 9.49210423e-04 9.97605104e-01 1.60149642e-03]]
+Dolphin Output:
+[[1.45198838e-02 2.60925718e-05 4.54587724e-07 9.98977069e-01]
+[2.61312888e-04 1.29922718e-04 3.96288314e-03 9.66529138e-01]
+[7.87979924e-05 6.39200055e-01 3.76149999e-01 1.15555043e-03]
+[4.19385722e-01 6.27590480e-06 8.86406071e-03 2.15754644e-01]
+[1.36959602e-01 8.51370210e-06 8.73003711e-03 8.88510541e-01]]
+```
+
+## Common Errors
+
+The neural network seems to have the most trouble differentiating cats
+and dogs, especially images of sitting dogs, which it often misidentifies
+as either a cat or a bird.
+
+![Often Classified as 'Cat' or 'Bird'](../test/dog/testdog2.png){ width=250px }
+
+This is likely because none of the training data for dogs contains an
+image of a sitting dog, while all of the cat training images show sitting cats.
+
+Another common misidentification is the image of a cat lying down. This is
+often classified as either a dog or a bird.
+
+![Often Classified as 'Dog' or 'Bird'](../test/cat/testcat2.png){ width=250px }
+
+This error likely has a similar cause, since there are no
+training images of cats lying down. The spread-out shape is likely most
+similar to a bird flying or a dog walking.
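To see which test images are misclassified, each output row can be mapped back to a label by taking the position of its largest value, in the order given by the label arrays. The sketch below does this for the Dog Output rows of the first test run. The names `LABELS` and `classify` are illustrative assumptions; the README does not show how the program itself picks a class.

```python
LABELS = ["Dog", "Cat", "Bird", "Dolphin"]  # same order as the one-hot label arrays


def classify(row):
    # The predicted class is the position of the largest output value.
    best = max(range(len(row)), key=row.__getitem__)
    return LABELS[best]


# Dog Output rows copied from the first test run in the README.
dog_output = [
    [9.92946312e-01, 5.13394720e-03, 4.40893902e-04, 2.11938835e-03],
    [5.16077993e-04, 8.66850125e-02, 1.25762657e-01, 2.41557723e-04],
    [8.16348130e-01, 1.20886021e-04, 2.89027234e-03, 2.47092104e-02],
    [8.39228245e-01, 7.88785240e-04, 1.00745923e-04, 5.33087572e-02],
    [1.94548654e-02, 1.99387163e-03, 2.17110484e-04, 3.41123925e-01],
    [7.07005020e-01, 5.83498359e-01, 1.53408880e-05, 2.95748952e-04],
    [9.97196647e-01, 3.83429749e-03, 3.41908344e-04, 2.14276994e-03],
]

predictions = [classify(row) for row in dog_output]
correct = sum(label == "Dog" for label in predictions)

print(predictions)
# ['Dog', 'Bird', 'Dog', 'Dog', 'Dolphin', 'Dog', 'Dog']
print(f"{correct} of {len(predictions)} dog images classified correctly")
# 5 of 7 dog images classified correctly
```

Note that the winning value in a misclassified row can be small (0.13 for the second image), so the largest output alone says little about how confident the network is.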