Tip #5: My Starting-Point CNN Model

Go to the beginning of the article

Read the previous tip: Follow Tested Practices

Assuming you are working with non-image data (so transfer learning is not an option) and there is a good reason to use a CNN-based model (i.e., there is some continuity between neighbors in at least one dimension of the data; see this post for further explanation), I would start with a CNN architecture with five hidden layers: a first convolutional layer, a first max pooling layer, a second convolutional layer, a second max pooling layer, and a flattening layer. I would fix the mini-batch size at 32, the activation function as ReLU, and the optimizer as Adam. These are well-tested choices that typically perform well without much adjustment.
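For concreteness, here is a minimal Keras sketch of this starting point, assuming one-dimensional non-image data. The input shape, kernel sizes, filter counts, and number of classes below are placeholders, not recommendations; adapt them to your data as discussed in the next paragraphs.

```python
# Minimal sketch of the starting-point CNN using tensorflow.keras.
# n_timesteps, n_channels, n_classes, and the kernel size of 5 are
# hypothetical placeholders for illustration only.
import tensorflow as tf
from tensorflow.keras import layers, models

n_timesteps, n_channels = 128, 1   # placeholder input shape (1D data)
n_classes = 2                      # placeholder number of classes

model = models.Sequential([
    layers.Input(shape=(n_timesteps, n_channels)),
    # first convolutional layer + first max pooling layer
    layers.Conv1D(filters=32, kernel_size=5, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    # second convolutional layer + second max pooling layer
    layers.Conv1D(filters=32, kernel_size=5, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    # flattening layer, feeding the output layer
    layers.Flatten(),
    layers.Dense(n_classes, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Mini-batch size fixed at 32, as suggested above (X_train and y_train
# are hypothetical training arrays):
# model.fit(X_train, y_train, batch_size=32, epochs=50)
```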

Initial filter sizes should be determined based on the characteristics of the data (see this post for more information), and the number of filters or neurons should depend on the number of training samples. As a general rule of thumb, if you have around 400 training samples (with or without augmentation), 30-40 filters/neurons per convolutional layer would be appropriate. If you have around 800 training samples, you could use around 60 filters/neurons per convolutional layer. If you have around 2000 training samples, it might be a good idea to add a third convolutional layer and a third max pooling layer, with 40-60 filters/neurons per convolutional layer.
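These rules of thumb can be written down as a small helper function. The thresholds and counts below simply mirror the text; they are illustrative starting points, not tuned values.

```python
# Rough encoding of the rules of thumb above, for illustration only.
def suggested_config(n_train):
    """Return (number of conv + pool blocks, filters per conv layer)."""
    if n_train >= 2000:
        return 3, 50      # add a third conv/pool block, ~40-60 filters each
    if n_train >= 800:
        return 2, 60      # ~60 filters per convolutional layer
    if n_train >= 400:
        return 2, 35      # ~30-40 filters per convolutional layer
    if n_train >= 150:
        return 1, 35      # single conv/pool block (see next paragraph)
    return 0, 0           # consider augmentation or an SVM instead

print(suggested_config(800))   # -> (2, 60)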

On the other hand, if you only have around 150-200 training samples, you should probably remove the second convolutional and second max pooling layers, and be aware that you are approaching the limits of what CNN-based (or any DNN-based) models can handle. If your training set has fewer than 150 samples, you should consider whether it is possible to make data augmentation work (see this post) and whether it might be better to switch to an SVM model.
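For the very-small-sample case, a hedged sketch of the SVM alternative using scikit-learn is shown below; X_train and y_train are hypothetical placeholder arrays, and the RBF kernel and C value are common defaults rather than tuned settings.

```python
# Minimal scikit-learn SVM sketch for a small training set.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_train = np.random.rand(120, 64)          # 120 samples, 64 features (placeholder)
y_train = np.random.randint(0, 2, 120)     # binary labels (placeholder)

# Standardizing features before an RBF-kernel SVM is standard practice.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)
# predictions = clf.predict(X_test)  # X_test: hypothetical held-out data
```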

About the Author: Justin T. Li received his Ph.D. in Neurobiology from the University of Wisconsin-Madison in 2000 and an M.S. in Computer Science from the University of Houston in 2001. From 2004 to 2009, he served as an Assistant Professor in the Medical School at the University of Minnesota Twin Cities campus, and from 2009 to 2013, he worked as the Chief Bioinformatics Officer at LC Sciences in Houston. In June 2013, Justin joined AccuraScience as a Lead Bioinformatician. He has been working on machine learning and AI applications in the biological and biomedical fields since 2004 and has published approximately 10 articles on related topics in bioinformatics journals over the past 15 years. You can find more information about Justin at http://www.accurascience.com/our_team.html.

Need assistance in your AI/deep learning project? We may be able to help. Take a look at the intro to our bioinformatician team, see some of the advantages of using our team's help here, and check out our FAQ page!

Send us an inquiry, chat with us online (during our business hours 9-5 Mon-Fri U.S. Central Time), or reach us in other ways!


