% Computer Vision: Assignment 4
% John Kubach, Abhinav Bhaskara, Krupa Patel
% 04/10/2019

# Summary

This program takes a set of images with classifications,
learns to properly classify each image, and attempts to classify
new images. The program uses a neural network with 4 layers.

The program is able to correctly classify all test digits from 0 to 9.
The program can handle some minor alterations to a digit. If a larger
classification table were implemented, along with more complex feature
extraction, the program would likely be able to identify more extreme
changes to a digit.

# Structure

The neural network consists of an input layer, two hidden layers, and
an output layer, with 64, 64, 16, and 4 neurons, respectively.
The expected input is a 16x16 image. Images are scaled to this size
beforehand and flattened into a 1-dimensional array. This array is then
fed into the neural network, one pixel per input neuron.

# Training

A 2-dimensional array of weights is created for each non-input layer.
The initial values are randomized in an attempt to provide 'neutral' values.
The neural network is then trained for a desired number of iterations.
During each iteration, the current error rate is calculated, and the appropriate
weights are adjusted. This continues for however many iterations were specified.
After training, the current output layer values are displayed, and each
layer's current weights are saved for the testing phase.

Finding the optimal number of trials requires some testing and some luck
with the randomly assigned weights. One interesting note is that it is
possible to *over* train the neural network. This causes it to be unable to
classify images that do not look exactly like the training images.

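
The training loop described above can be sketched roughly as follows. This is a minimal illustration, not the program's actual code: the sigmoid activation, the squared-error gradient, the learning rate, the `[-1, 1)` initialization range, and every function name here (`init_weights`, `forward`, `train`, `mean_error`) are assumptions. Only the layer sizes and the "one weight array per non-input layer" layout come from the text, and the flattened input is assumed to have as many values as the input layer.

```python
import numpy as np

# Layer sizes from the Structure section: input, two hidden layers, output.
LAYER_SIZES = (64, 64, 16, 4)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_weights(sizes, rng):
    """One 2-D weight array per non-input layer, uniform in [-1, 1) (assumed range)."""
    return [rng.uniform(-1.0, 1.0, size=(n_in, n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def forward(x, weights):
    """Return the activations of every layer, starting with the input itself."""
    activations = [x]
    for w in weights:
        activations.append(sigmoid(activations[-1] @ w))
    return activations


def mean_error(x, labels, weights):
    """Mean absolute difference between the output layer and the labels."""
    return float(np.mean(np.abs(labels - forward(x, weights)[-1])))


def train(x, labels, weights, iterations, learning_rate=0.1):
    """Adjust the weights in place with plain backpropagation (assumed method)."""
    for _ in range(iterations):
        activations = forward(x, weights)
        out = activations[-1]
        # Output error scaled by the sigmoid derivative out * (1 - out).
        delta = (labels - out) * out * (1.0 - out)
        for i in reversed(range(len(weights))):
            a = activations[i]
            grad = a.T @ delta
            # Propagate the error backwards using the weights before the update.
            # (For i == 0 the propagated value is computed but never used.)
            delta = (delta @ weights[i].T) * a * (1.0 - a)
            weights[i] += learning_rate * grad
    return weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in data: 8 random "images" and one-hot labels for 4 classes.
    x = rng.random((8, LAYER_SIZES[0]))
    labels = np.eye(4)[np.arange(8) % 4]
    weights = init_weights(LAYER_SIZES, rng)
    print("error before training:", mean_error(x, labels, weights))
    train(x, labels, weights, iterations=1000)
    print("error after training: ", mean_error(x, labels, weights))
```

Because the starting weights are random, two runs with different seeds generally end with different weights, which matches the note above that finding a good number of trials takes some luck.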

## Example Training Output

```text
Training Output:
[[9.98888174e-01 6.82591060e-04 8.04198012e-04 5.93295192e-04] // Dog Set
 [9.98707815e-01 8.08405508e-04 8.64003748e-04 8.53519317e-04]
 [9.98794009e-01 8.35549210e-04 8.21596776e-04 8.06750883e-04]
 [9.98636681e-01 7.35252658e-04 1.02774762e-03 7.58062747e-04]
 [9.98546622e-01 1.50103573e-03 5.16936628e-04 1.03439701e-03]
 [2.60304468e-04 9.99974825e-01 2.29332040e-04 1.69056971e-04] // Cat Set
 [1.07440031e-03 9.98568536e-01 9.69294389e-05 3.40544296e-05]
 [6.91610270e-04 9.99719712e-01 1.98320394e-04 5.56304841e-05]
 [1.18627780e-03 9.99020156e-01 2.18108343e-03 5.71687197e-05]
 [1.16542730e-03 9.98645448e-01 9.63009320e-05 3.32109809e-05]
 [3.39606256e-04 3.47142378e-04 9.98805323e-01 8.90485423e-04] // Bird Set
 [2.16299157e-04 6.55176578e-04 9.99191789e-01 9.02445843e-04]
 [1.73870465e-04 1.08641555e-03 9.98324536e-01 1.07892674e-03]
 [1.97466255e-04 5.50309150e-04 9.98364132e-01 1.23132418e-03]
 [7.70361146e-04 7.12426479e-04 9.98765734e-01 3.95692047e-04]
 [2.21954250e-04 1.39516012e-04 1.08161029e-03 9.99118352e-01] // Dolphin Set
 [3.60208262e-04 1.44197118e-04 4.30607822e-04 9.99354617e-01]
 [1.38041875e-03 1.49779824e-05 1.27121142e-03 9.98606011e-01]
 [1.09901220e-03 2.33248134e-05 9.43514418e-04 9.98818368e-01]
 [5.48383802e-04 6.02698258e-05 7.61811226e-04 9.98620953e-01]]
Training Accuracy: 99.92408797440399%
```

Every image in each set has been correctly classified. The highest number in a row
(typically 9.Xe-01) indicates what the neural network deems to be the most
likely classification. The column holding that number maps to a class through
the label arrays provided to the neural network.

```text
/* Label arrays */

Dog = [1, 0, 0, 0]
Cat = [0, 1, 0, 0]
Bird = [0, 0, 1, 0]
Dolphin = [0, 0, 0, 1]
```

# Testing

An image or set of images is loaded into the testing suite.
Here, the images are loaded into an array (or groups of arrays) and run through
the neural network using its previously calculated weights. The results for
each group of arrays are then displayed.

All images used for testing can be found at
http://elvis.rowan.edu/~kubachj9/test/

# Results

The neural network is able to classify new images with an acceptable
degree of accuracy. Some images were chosen to deliberately throw off the
neural network. Some end up classified correctly, some do not. Some test
runs are shown below. Each run uses the same test images, but the neural
network has been re-trained between each run.

```text
Dog Output:
[[9.92946312e-01 5.13394720e-03 4.40893902e-04 2.11938835e-03]
 [5.16077993e-04 8.66850125e-02 1.25762657e-01 2.41557723e-04]
 [8.16348130e-01 1.20886021e-04 2.89027234e-03 2.47092104e-02]
 [8.39228245e-01 7.88785240e-04 1.00745923e-04 5.33087572e-02]
 [1.94548654e-02 1.99387163e-03 2.17110484e-04 3.41123925e-01]
 [7.07005020e-01 5.83498359e-01 1.53408880e-05 2.95748952e-04]
 [9.97196647e-01 3.83429749e-03 3.41908344e-04 2.14276994e-03]]
Cat Output:
[[4.47759781e-04 8.18766948e-01 2.11805016e-03 6.79393286e-05]
 [2.19098873e-01 9.83914807e-04 3.04876521e-02 4.37712414e-04]
 [2.39054539e-04 1.15335029e-02 9.57995828e-01 4.47584065e-04]
 [7.01469601e-04 9.99686310e-01 8.64621916e-05 5.80064861e-05]
 [5.97478994e-05 9.41148733e-01 8.49791525e-03 4.99674179e-04]]
Bird Output:
[[4.67698446e-02 5.51885959e-02 1.32107761e-01 1.10466507e-05]
 [3.35017940e-03 3.21389726e-03 5.85197874e-01 6.76088527e-05]
 [1.69694787e-04 1.21550027e-03 9.96513058e-01 1.09756966e-03]
 [2.12303469e-04 4.20100127e-04 9.99060237e-01 8.69037275e-04]
 [2.28823606e-04 4.18970232e-04 9.98585494e-01 9.85507606e-04]]
Dolphin Output:
[[3.26050685e-02 3.85284434e-05 7.11450048e-01 3.28085677e-01]
 [4.99723602e-04 7.13053502e-04 3.01765292e-05 9.98332469e-01]
 [2.24624469e-01 4.42831937e-06 7.73170125e-01 1.32411367e-01]
 [4.72816283e-01 4.22773667e-05 1.54391360e-02 8.82644792e-02]
 [3.69741707e-01 7.21510992e-06 2.04173069e-01 1.75370263e-01]]


Dog Output:
[[6.30028965e-01 7.65779685e-04 1.41203486e-02 3.37035716e-03]
 [7.85611661e-03 5.99072250e-06 1.56075824e-01 2.19917355e-01]
 [6.95952890e-01 1.00695132e-03 8.47136595e-01 7.96077662e-05]
 [9.81858956e-01 5.16115888e-05 3.53474470e-04 8.75940979e-03]
 [9.98113438e-01 2.60358683e-05 1.10719655e-03 1.89139239e-03]
 [9.42393628e-02 1.93356741e-03 9.90377805e-01 7.04090901e-05]
 [9.99052824e-01 8.78798624e-05 1.27929834e-03 8.43979401e-04]]
Cat Output:
[[1.43149110e-01 4.06166027e-03 9.72594378e-01 5.66885911e-05]
 [7.93407950e-06 2.19519453e-02 9.61920824e-01 4.76949079e-01]
 [8.29368075e-03 8.70782429e-01 3.84326505e-02 8.39343867e-06]
 [1.14608710e-05 9.98416086e-01 4.52121848e-04 4.11318760e-03]
 [7.99861660e-01 2.38319696e-02 2.81453634e-04 7.71789299e-03]]
Bird Output:
[[1.40948747e-02 6.63199349e-06 9.78398586e-02 2.91727180e-01]
 [4.71099710e-03 3.78062135e-05 9.78174407e-01 2.96259498e-03]
 [5.82817937e-03 2.90389957e-04 9.87300855e-01 1.07471737e-03]
 [2.23767335e-03 1.02802741e-03 3.05612965e-01 1.89408887e-01]
 [2.45536585e-04 3.31128652e-03 9.68748782e-01 4.19595914e-02]]
Dolphin Output:
[[7.58683819e-05 1.21636670e-05 9.95259789e-01 6.15061392e-02]
 [1.79361673e-03 1.71031619e-03 2.69409674e-05 9.95182306e-01]
 [5.47264181e-06 6.21264886e-04 4.69341926e-01 9.90294448e-01]
 [2.56329261e-03 1.07000062e-04 1.56927726e-03 9.88918495e-01]
 [5.02611255e-01 6.81189948e-06 9.74481891e-03 1.06574704e-01]]


Dog Output:
[[9.96400142e-01 1.97708262e-04 6.77067649e-07 1.04714627e-02]
 [1.18393157e-06 9.99851308e-01 5.15722273e-04 1.28717576e-01]
 [5.74074507e-03 1.04569865e-03 9.86427936e-01 1.14124466e-06]
 [9.99670447e-01 9.59188238e-06 2.23122563e-03 1.18118484e-06]
 [9.24513929e-01 9.27379280e-02 1.79431243e-06 2.38075810e-05]
 [6.40694394e-01 9.60086468e-01 1.06194472e-04 2.20960359e-05]
 [1.22750679e-02 9.76273384e-01 4.64674969e-06 4.65350887e-04]]
Cat Output:
[[4.62843653e-04 3.49801442e-03 1.13481111e-01 9.96811172e-04]
 [1.39896546e-02 9.43618854e-01 1.50248051e-02 2.13342898e-05]
 [1.94886339e-05 9.99849255e-01 5.04574869e-04 3.69805434e-03]
 [4.41025113e-05 9.99676185e-01 1.10710283e-03 6.78510057e-04]
 [1.82662942e-01 9.89296675e-01 3.36913388e-05 3.27509366e-05]]
Bird Output:
[[1.22798068e-06 4.47640675e-03 9.85119181e-01 5.27644360e-02]
 [2.03495974e-04 6.81640997e-01 1.18247815e-01 3.27804476e-04]
 [1.63215381e-03 3.73775800e-04 9.86614980e-01 1.36626474e-03]
 [3.17849786e-02 6.24507124e-05 9.96535527e-01 6.08654305e-06]
 [1.11357598e-03 9.49210423e-04 9.97605104e-01 1.60149642e-03]]
Dolphin Output:
[[1.45198838e-02 2.60925718e-05 4.54587724e-07 9.98977069e-01]
 [2.61312888e-04 1.29922718e-04 3.96288314e-03 9.66529138e-01]
 [7.87979924e-05 6.39200055e-01 3.76149999e-01 1.15555043e-03]
 [4.19385722e-01 6.27590480e-06 8.86406071e-03 2.15754644e-01]
 [1.36959602e-01 8.51370210e-06 8.73003711e-03 8.88510541e-01]]
```

## Common Errors

The neural network seems to have the most trouble differentiating cats
and dogs, especially images of sitting dogs, which it often misidentifies
as either a cat or a bird.

![Often Classified as 'Cat' or 'Bird'](../test/dog/testdog2.png){ width=250px }

This is likely because none of the training images for dogs shows a
sitting dog, while the cat training images all show sitting cats.

Another common misidentification is the image of a cat lying down. This is
often classified as either a dog or a bird.

![Often Classified as 'Dog' or 'Bird'](../test/cat/testcat2.png){ width=250px }

This error likely has a similar cause to the previous one, since there are no
training images of cats lying down. The spread-out shape is likely most
similar to a bird flying or a dog walking.
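
To spot misclassifications like these in the raw output, each row can be reduced to the column with the highest value and mapped back to a class name. The sketch below is illustrative only and is not taken from the program: the helper name `classify` and the `LABELS` list are assumptions, with the column order following the label arrays. The data is the Dog Output from the first run in the Results section.

```python
import numpy as np

# Column order follows the label arrays: Dog, Cat, Bird, Dolphin.
LABELS = ["Dog", "Cat", "Bird", "Dolphin"]


def classify(outputs):
    """Return the label with the highest score for each output row."""
    return [LABELS[i] for i in np.argmax(np.asarray(outputs), axis=1)]


# Dog Output from the first test run shown in the Results section.
dog_run_1 = np.array([
    [9.92946312e-01, 5.13394720e-03, 4.40893902e-04, 2.11938835e-03],
    [5.16077993e-04, 8.66850125e-02, 1.25762657e-01, 2.41557723e-04],
    [8.16348130e-01, 1.20886021e-04, 2.89027234e-03, 2.47092104e-02],
    [8.39228245e-01, 7.88785240e-04, 1.00745923e-04, 5.33087572e-02],
    [1.94548654e-02, 1.99387163e-03, 2.17110484e-04, 3.41123925e-01],
    [7.07005020e-01, 5.83498359e-01, 1.53408880e-05, 2.95748952e-04],
    [9.97196647e-01, 3.83429749e-03, 3.41908344e-04, 2.14276994e-03],
])

predicted = classify(dog_run_1)
correct = sum(label == "Dog" for label in predicted)

print(predicted)
print(f"{correct} of {len(predicted)} dog images classified as Dog")
```

For this run, the second row comes out as Bird and the fifth as Dolphin, so five of the seven dog images are labelled Dog. Note that taking the maximum ignores how confident the network is: the second row's highest value is only about 0.13.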