Term of Award

Fall 2020

Degree Name

Master of Science in Computer Science (M.S.)

Document Type and Release Option

Thesis (open access)

Copyright Statement / License for Reuse

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


Department of Computer Science

Committee Chair

Kai Wang

Committee Member 1

Gursimran Singh Walia

Committee Member 2

Wen-Ran Zhang


When the MNIST dataset was introduced in 1998, training a network on it could take several weeks and produced results far less accurate than an average CPU can achieve within a couple of hours today. Although training a network on such a dataset is no longer the difficult problem it was twenty years ago, MNIST remains a good tool for studying and testing neural networks of beginner and medium complexity. This paper follows the work presented in the online textbook “Neural Networks and Deep Learning” by Michael Nielsen, using an updated repository of his Python code examples that has been brought up to date with the most recent version of Python. The convolutional neural networks outlined in chapter 6 of “Neural Networks and Deep Learning” will be built and run, and the results will be analyzed and compared with those shown in Nielsen’s work to see how convolutional layers improve the accuracy of the network. Using Nielsen’s network3.py, I will conduct four experiments. The first test uses a single hidden layer to establish a baseline accuracy against which the later tests are compared, in order to determine whether the additional, more complex layers improve the network’s accuracy. The second test builds on the first network by adding a convolutional layer with a 5x5 local receptive field and 20 feature maps, followed by a 2x2 max-pooling layer. The third test keeps those features and inserts a second convolutional-pooling layer identical in design to the first. The fourth test builds the network using rectified linear units and some L2 regularization (lmbda=0.1). Each of the four networks attempts to identify the correct digit for images drawn from a dataset of thousands of handwritten digits. I will compare how these modifications affect the networks and compare my results with those presented in Michael Nielsen’s textbook.
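As a rough illustration of how the layer choices above change the size of the data passed to the fully connected part of the network, the following sketch computes the number of values that remain after zero, one, or two convolutional-pooling layers. It is not taken from network3.py. The helper names `conv_pool_output` and `flattened_inputs` are hypothetical, and the sketch assumes 28x28 MNIST images, a stride of 1, no padding, and non-overlapping 2x2 pooling. It also takes the second convolutional-pooling layer to have 20 feature maps, following the "identical in design" wording of the abstract.

```python
def conv_pool_output(side, field=5, pool=2):
    """Side length of one feature map after a 5x5 convolution
    (stride 1, no padding) followed by non-overlapping 2x2 max-pooling.
    These settings are assumptions made for this sketch."""
    conv_side = side - field + 1          # e.g. 28 -> 24
    if conv_side <= 0 or conv_side % pool != 0:
        raise ValueError("feature map size does not fit the pooling window")
    return conv_side // pool              # e.g. 24 -> 12


def flattened_inputs(n_conv_pool_layers, feature_maps=20, side=28):
    """Number of values fed to the first fully connected layer."""
    if n_conv_pool_layers == 0:
        return side * side                # raw pixels of one image
    for _ in range(n_conv_pool_layers):
        side = conv_pool_output(side)
    return feature_maps * side * side


if __name__ == "__main__":
    for layers in (0, 1, 2):
        print(layers, "conv-pool layers ->", flattened_inputs(layers), "inputs")
    # 0 conv-pool layers -> 784 inputs
    # 1 conv-pool layers -> 2880 inputs
    # 2 conv-pool layers -> 320 inputs
```

Under these assumptions, the baseline network sees the 784 raw pixels directly, the second test passes 20 feature maps of 12x12 values to the fully connected layer, and the third test passes 20 feature maps of 4x4 values.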

Included in

Data Science Commons