AI Models in Bioinformatics: Some General Tips

The use of deep neural network (DNN) models, often referred to as artificial intelligence (AI) models, has risen sharply in biological and biomedical domains over the past few years. Biological and biomedical data are generally costly to acquire (in contrast to, say, dog and cat image data). Understandably, frontline developments in high-power AI techniques, those that require very large amounts of data, do not happen often in these domains. In fact, even not-so-new developments in DNNs, e.g. adversarial networks and auto-encoders, are not often seen in biological/biomedical applications. By my count, about 90% of all AI applications in bioinformatics/biomedical informatics fall into a very narrow realm of supervised learning: classification. About 90% of those applications use convolutional neural networks (CNNs), and about 90% of the time the input data are biomedical images, signal traces, or nucleic acid and/or protein sequences.

Hence, it is safe to say that if you are comfortable with CNN-based classification models for image and sequence data, you are well on your way to handling the vast majority of AI problems in biological and biomedical domains.

In this series of blog posts, I introduce some general tips for building the AI models most useful in bioinformatics/biomedical informatics.

Tip #1: Augmentation Is King
Tip #2: Do Transfer Learning When You Can
Tip #3: Watch Your Own Data
Tip #4: Follow Tested Practices
Tip #5: My Starting-Point CNN Model
-- About the author

Read Tip #1: Augmentation Is King
