INITIALIZATION STRATEGY AND ACTIVATION FUNCTION SELECTION FOR NEURAL NETWORKS BASED ON GAUSSIAN PROCESS OPTIMIZATION
Date
2021-05
Abstract
To achieve better prediction performance, much research effort in deep/machine learning has been devoted to improving neural network model development and training. Neural networks have been applied in healthcare, image classification, navigation, and exploration. However, neural network models may fail to train effectively without careful preparation. In view of this, choosing appropriate initialization schemes and activation functions remains an important research topic within the deep learning and machine learning communities. My dissertation studies the connection between Gaussian processes and neural networks. It seeks to leverage their synergies to enhance neural network initialization and activation function selection for improved model accuracy.
In my dissertation, I investigate marginal likelihood maximization, a Gaussian process procedure that learns from data. Prediction tasks on real-world and simulated data are performed with networks initialized with learned hyperparameters. The objective is to evaluate the statistical technique for assisting model initialization in single- and multi-layer neural networks. Furthermore, a simulation is carried out to assess the method for activation function selection in single-hidden-layer neural networks. Empirical results suggest that the proposed Gaussian process technique is a promising approach for guiding neural network model initialization and activation function selection to achieve improved prediction performance.
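The core step described above, learning Gaussian process hyperparameters by maximizing the marginal likelihood, can be sketched as follows. This is a minimal illustration using scikit-learn's `GaussianProcessRegressor`; the simulated data, the RBF kernel, and the noise level are illustrative assumptions, not the dissertation's actual experimental setup.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Simulated 1-D regression data (illustrative stand-in for the
# real-world and simulated datasets mentioned in the abstract).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(50)

# ConstantKernel * RBF: the signal variance and length-scale are the
# hyperparameters to be learned. fit() maximizes the log marginal
# likelihood over these hyperparameters internally.
kernel = ConstantKernel(1.0) * RBF(length_scale=1.0)
gp = GaussianProcessRegressor(kernel=kernel, alpha=0.1**2,
                              n_restarts_optimizer=5)
gp.fit(X, y)

# The fitted kernel exposes the learned hyperparameters, which a
# practitioner could then map onto a network's weight-variance scales.
print(gp.kernel_)
print(gp.log_marginal_likelihood_value_)
```

The optimized log marginal likelihood is at least as large as its value at the initial hyperparameters, which is the sense in which the procedure "learns from data" before any network training begins.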
There are three main contributions in this dissertation. First, I investigate the link between neural networks and Gaussian processes. Second, I implement and validate the method of marginal likelihood maximization for improving initialization and model prediction in single- and multi-layer neural networks. Lastly, I demonstrate that under certain conditions the Gaussian process technique is also effective in selecting activation functions in neural network models.
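The link between neural networks and Gaussian processes mentioned in the first contribution can be illustrated numerically: as a single hidden layer grows wide, the covariance between its outputs at two inputs converges to a closed-form kernel. The sketch below uses the ReLU arc-cosine kernel (Cho and Saul, 2009) as the limiting covariance and checks it against a Monte Carlo estimate from one wide random layer; the choice of ReLU and unit-variance weights is an assumption for illustration, not necessarily the activation studied in the dissertation.

```python
import numpy as np

def relu_nngp_kernel(x, xp):
    """Arc-cosine kernel of order 1: the exact covariance
    E[relu(w @ x) * relu(w @ xp)] for w ~ N(0, I)."""
    nx, nxp = np.linalg.norm(x), np.linalg.norm(xp)
    cos_t = np.clip(x @ xp / (nx * nxp), -1.0, 1.0)
    theta = np.arccos(cos_t)
    return nx * nxp * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

rng = np.random.default_rng(1)
d, width = 3, 200_000
x, xp = rng.standard_normal(d), rng.standard_normal(d)

# Monte Carlo: one very wide random hidden layer with N(0, 1) weights.
W = rng.standard_normal((width, d))
h, hp = np.maximum(W @ x, 0.0), np.maximum(W @ xp, 0.0)
mc = (h @ hp) / width  # empirical covariance across hidden units

exact = relu_nngp_kernel(x, xp)
print(mc, exact)  # the two values should be close at this width
```

The agreement between the empirical and analytic kernels is the correspondence that motivates using Gaussian process machinery, such as marginal likelihood maximization, to reason about network initialization and activation choices.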
Description
Thesis (Ph.D.)
Keywords
neural networks, Gaussian processes, model initialization, activation function selection
Type
Doctoral Dissertation