commit 29ab57879f5108493b3d6f3bead3744458d79f84
parent e09707321c7b31a93e6c08695064e17e5dc2f768
Author: John Kubach <johnkubach@gmail.com>
Date: Tue, 28 Sep 2021 11:19:55 -0400
README
Diffstat:
A README | 201 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 201 insertions(+), 0 deletions(-)
diff --git a/README b/README
@@ -0,0 +1,201 @@
+% Computer Vision: Assignment 4
+% John Kubach, Abhinav Bhaskara, Krupa Patel
+% 04/10/2019
+
+# Summary
+
+This program takes a set of images with classifications,
+learns to properly classify each image, and attempts to classify
+new images. The program uses a neural network with 4 layers.
+
+The program is able to correctly classify all test digits from 0 to 9.
+The program can handle some minor alterations to a digit. If a larger
+classification table were implemented, along with more complex feature
+extraction, the program would likely be able to identify more extreme
+changes to a digit.
+
+# Structure
+
+The neural network consists of an input layer, two hidden layers, and
+an output layer, with 64, 64, 16, and 4 neurons, respectively.
+The expected input is a 16x16 image. Images are scaled to this size
+beforehand and flattened into a one-dimensional array. This array is then
+fed into the neural network, one pixel per input neuron.
+
+# Training
+
+A two-dimensional array of weights is created for each non-input layer.
+The initial values are randomized in an attempt to provide 'neutral' values.
+The neural network is then trained for the desired number of iterations.
+During each iteration, the current error rate is calculated and the
+weights are adjusted accordingly. This repeats until the specified number of
+iterations is reached. After training, the output layer values are displayed,
+and each layer's current weights are saved for the testing phase.
+
+Finding the optimal number of iterations requires some testing and some luck
+with the randomly assigned weights. One interesting note is that it is
+possible to *over*train the neural network. An overtrained network is unable
+to classify images that do not look exactly like the training images.
+
+## Example Training Output
+
+```text
+Training Output:
+[[9.98888174e-01 6.82591060e-04 8.04198012e-04 5.93295192e-04] // Dog Set
+[9.98707815e-01 8.08405508e-04 8.64003748e-04 8.53519317e-04]
+[9.98794009e-01 8.35549210e-04 8.21596776e-04 8.06750883e-04]
+[9.98636681e-01 7.35252658e-04 1.02774762e-03 7.58062747e-04]
+[9.98546622e-01 1.50103573e-03 5.16936628e-04 1.03439701e-03]
+[2.60304468e-04 9.99974825e-01 2.29332040e-04 1.69056971e-04] // Cat Set
+[1.07440031e-03 9.98568536e-01 9.69294389e-05 3.40544296e-05]
+[6.91610270e-04 9.99719712e-01 1.98320394e-04 5.56304841e-05]
+[1.18627780e-03 9.99020156e-01 2.18108343e-03 5.71687197e-05]
+[1.16542730e-03 9.98645448e-01 9.63009320e-05 3.32109809e-05]
+[3.39606256e-04 3.47142378e-04 9.98805323e-01 8.90485423e-04] // Bird Set
+[2.16299157e-04 6.55176578e-04 9.99191789e-01 9.02445843e-04]
+[1.73870465e-04 1.08641555e-03 9.98324536e-01 1.07892674e-03]
+[1.97466255e-04 5.50309150e-04 9.98364132e-01 1.23132418e-03]
+[7.70361146e-04 7.12426479e-04 9.98765734e-01 3.95692047e-04]
+[2.21954250e-04 1.39516012e-04 1.08161029e-03 9.99118352e-01] // Dolphin Set
+[3.60208262e-04 1.44197118e-04 4.30607822e-04 9.99354617e-01]
+[1.38041875e-03 1.49779824e-05 1.27121142e-03 9.98606011e-01]
+[1.09901220e-03 2.33248134e-05 9.43514418e-04 9.98818368e-01]
+[5.48383802e-04 6.02698258e-05 7.61811226e-04 9.98620953e-01]]
+Training Accuracy: 99.92408797440399%
+```
+Every image in each set has been correctly classified. The highest number in a row
+(typically 9.Xe-01) indicates what the neural network deems to be the most
+likely classification. The position of that number corresponds to the array
+of labels provided to the neural network.
+
+```text
+/* Label arrays */
+
+Dog = [1, 0, 0, 0]
+Cat = [0, 1, 0, 0]
+Bird = [0, 0, 1, 0]
+Dolphin = [0, 0, 0, 1]
+```
+
+# Testing
+
+An image or set of images is loaded into the testing suite. Here,
+the images are loaded into an array (or groups of arrays) and run through
+the neural network using its previously calculated weights. The results for
+each group of arrays are then displayed.
+
+All images used for testing can be found at
+http://elvis.rowan.edu/~kubachj9/test/
+
+# Results
+
+The neural network is able to classify new images with an acceptable
+degree of accuracy. Some images were chosen to deliberately throw off the
+neural network. Some end up classified correctly, some do not. Some test
+runs are shown below. Each run uses the same test images, but the neural
+network has been retrained between runs.
+
+```text
+Dog Output:
+[[9.92946312e-01 5.13394720e-03 4.40893902e-04 2.11938835e-03]
+[5.16077993e-04 8.66850125e-02 1.25762657e-01 2.41557723e-04]
+[8.16348130e-01 1.20886021e-04 2.89027234e-03 2.47092104e-02]
+[8.39228245e-01 7.88785240e-04 1.00745923e-04 5.33087572e-02]
+[1.94548654e-02 1.99387163e-03 2.17110484e-04 3.41123925e-01]
+[7.07005020e-01 5.83498359e-01 1.53408880e-05 2.95748952e-04]
+[9.97196647e-01 3.83429749e-03 3.41908344e-04 2.14276994e-03]]
+Cat Output:
+[[4.47759781e-04 8.18766948e-01 2.11805016e-03 6.79393286e-05]
+[2.19098873e-01 9.83914807e-04 3.04876521e-02 4.37712414e-04]
+[2.39054539e-04 1.15335029e-02 9.57995828e-01 4.47584065e-04]
+[7.01469601e-04 9.99686310e-01 8.64621916e-05 5.80064861e-05]
+[5.97478994e-05 9.41148733e-01 8.49791525e-03 4.99674179e-04]]
+Bird Output:
+[[4.67698446e-02 5.51885959e-02 1.32107761e-01 1.10466507e-05]
+[3.35017940e-03 3.21389726e-03 5.85197874e-01 6.76088527e-05]
+[1.69694787e-04 1.21550027e-03 9.96513058e-01 1.09756966e-03]
+[2.12303469e-04 4.20100127e-04 9.99060237e-01 8.69037275e-04]
+[2.28823606e-04 4.18970232e-04 9.98585494e-01 9.85507606e-04]]
+Dolphin Output:
+[[3.26050685e-02 3.85284434e-05 7.11450048e-01 3.28085677e-01]
+[4.99723602e-04 7.13053502e-04 3.01765292e-05 9.98332469e-01]
+[2.24624469e-01 4.42831937e-06 7.73170125e-01 1.32411367e-01]
+[4.72816283e-01 4.22773667e-05 1.54391360e-02 8.82644792e-02]
+[3.69741707e-01 7.21510992e-06 2.04173069e-01 1.75370263e-01]]
+
+
+Dog Output:
+[[6.30028965e-01 7.65779685e-04 1.41203486e-02 3.37035716e-03]
+[7.85611661e-03 5.99072250e-06 1.56075824e-01 2.19917355e-01]
+[6.95952890e-01 1.00695132e-03 8.47136595e-01 7.96077662e-05]
+[9.81858956e-01 5.16115888e-05 3.53474470e-04 8.75940979e-03]
+[9.98113438e-01 2.60358683e-05 1.10719655e-03 1.89139239e-03]
+[9.42393628e-02 1.93356741e-03 9.90377805e-01 7.04090901e-05]
+[9.99052824e-01 8.78798624e-05 1.27929834e-03 8.43979401e-04]]
+Cat Output:
+[[1.43149110e-01 4.06166027e-03 9.72594378e-01 5.66885911e-05]
+[7.93407950e-06 2.19519453e-02 9.61920824e-01 4.76949079e-01]
+[8.29368075e-03 8.70782429e-01 3.84326505e-02 8.39343867e-06]
+[1.14608710e-05 9.98416086e-01 4.52121848e-04 4.11318760e-03]
+[7.99861660e-01 2.38319696e-02 2.81453634e-04 7.71789299e-03]]
+Bird Output:
+[[1.40948747e-02 6.63199349e-06 9.78398586e-02 2.91727180e-01]
+[4.71099710e-03 3.78062135e-05 9.78174407e-01 2.96259498e-03]
+[5.82817937e-03 2.90389957e-04 9.87300855e-01 1.07471737e-03]
+[2.23767335e-03 1.02802741e-03 3.05612965e-01 1.89408887e-01]
+[2.45536585e-04 3.31128652e-03 9.68748782e-01 4.19595914e-02]]
+Dolphin Output:
+[[7.58683819e-05 1.21636670e-05 9.95259789e-01 6.15061392e-02]
+[1.79361673e-03 1.71031619e-03 2.69409674e-05 9.95182306e-01]
+[5.47264181e-06 6.21264886e-04 4.69341926e-01 9.90294448e-01]
+[2.56329261e-03 1.07000062e-04 1.56927726e-03 9.88918495e-01]
+[5.02611255e-01 6.81189948e-06 9.74481891e-03 1.06574704e-01]]
+
+
+Dog Output:
+[[9.96400142e-01 1.97708262e-04 6.77067649e-07 1.04714627e-02]
+[1.18393157e-06 9.99851308e-01 5.15722273e-04 1.28717576e-01]
+[5.74074507e-03 1.04569865e-03 9.86427936e-01 1.14124466e-06]
+[9.99670447e-01 9.59188238e-06 2.23122563e-03 1.18118484e-06]
+[9.24513929e-01 9.27379280e-02 1.79431243e-06 2.38075810e-05]
+[6.40694394e-01 9.60086468e-01 1.06194472e-04 2.20960359e-05]
+[1.22750679e-02 9.76273384e-01 4.64674969e-06 4.65350887e-04]]
+Cat Output:
+[[4.62843653e-04 3.49801442e-03 1.13481111e-01 9.96811172e-04]
+[1.39896546e-02 9.43618854e-01 1.50248051e-02 2.13342898e-05]
+[1.94886339e-05 9.99849255e-01 5.04574869e-04 3.69805434e-03]
+[4.41025113e-05 9.99676185e-01 1.10710283e-03 6.78510057e-04]
+[1.82662942e-01 9.89296675e-01 3.36913388e-05 3.27509366e-05]]
+Bird Output:
+[[1.22798068e-06 4.47640675e-03 9.85119181e-01 5.27644360e-02]
+[2.03495974e-04 6.81640997e-01 1.18247815e-01 3.27804476e-04]
+[1.63215381e-03 3.73775800e-04 9.86614980e-01 1.36626474e-03]
+[3.17849786e-02 6.24507124e-05 9.96535527e-01 6.08654305e-06]
+[1.11357598e-03 9.49210423e-04 9.97605104e-01 1.60149642e-03]]
+Dolphin Output:
+[[1.45198838e-02 2.60925718e-05 4.54587724e-07 9.98977069e-01]
+[2.61312888e-04 1.29922718e-04 3.96288314e-03 9.66529138e-01]
+[7.87979924e-05 6.39200055e-01 3.76149999e-01 1.15555043e-03]
+[4.19385722e-01 6.27590480e-06 8.86406071e-03 2.15754644e-01]
+[1.36959602e-01 8.51370210e-06 8.73003711e-03 8.88510541e-01]]
+```
+
+## Common Errors
+
+The neural network seems to have the most trouble differentiating cats
+and dogs, especially with images of sitting dogs, which it often
+misidentifies as either a cat or a bird.
+
+![Often Classified as 'Cat' or 'Bird'](../test/dog/testdog2.png){ width=250px }
+
+This is likely because none of the training images for dogs shows a
+sitting dog, while all of the training images for cats show sitting cats.
+
+Another common misidentification is the image of a cat lying down. This is
+often classified as either a dog or a bird.
+
+![Often Classified as 'Dog' or 'Bird'](../test/cat/testcat2.png){ width=250px }
+
+This error likely has a similar cause to the previous one, since there are no
+training images of cats lying down. The spread-out shape is likely most
+similar to a bird flying or a dog walking.