A Biomarker or Classification Modeling Project (11/7/2015)
Mon, Nov 2, 2015 at 9:13 AM
Customer inquires about AccuraScience's capability in biomarker discovery.
Thu, Nov 12, 2015 at 2:47 PM
AccuraScience LB: When biomedical researchers talk about biomarker discovery, they usually mean two related tasks: constructing a predictive model that can predict the phenotype or group label (e.g., case vs. control) of future samples, and identifying the genes, mutations, or other features or "independent variables" (the so-called "biomarkers") that are most informative in predicting that label. In some biomedical disciplines, researchers call these predictive models "fingerprints" or "signatures". These terms all refer to the same thing.
In machine learning terms, building the predictive model is a "classification problem", and identifying the biomarkers is a "feature selection problem". We have Lead Bioinformaticians who, as professors and group leaders before joining AccuraScience, led groups that published multiple research articles in these domains. A typical plan for a biomarker discovery project is to try several classification methods (e.g., support vector machine (SVM), random forest (RF) and naïve Bayes (NB)) and several feature selection techniques (e.g., recursive feature elimination (RFE) and chi-squared-based filtering), and to apply a 2-layer (nested) cross-validation scheme to objectively evaluate the performance of the predictive model. In such a scheme, the inner layer chooses the classifier and feature set, and the outer layer estimates performance on samples that played no part in that choice. This gives us confidence that we have obtained the most effective predictive model among those tried, and the set of features (biomarkers) most informative for predicting the group label in future samples.
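As a rough illustration of the plan described above, here is a minimal sketch of a 2-layer cross-validation in Python with scikit-learn. It is not AccuraScience's actual pipeline: the choice of scikit-learn, the synthetic data from `make_classification`, the number of features kept (10), the fold counts, and all variable names are assumptions made only for this example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectKBest, chi2
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for a samples-by-features matrix (assumption: real
# data would be e.g. gene expression values with case/control labels).
X, y = make_classification(n_samples=80, n_features=30, n_informative=5,
                           random_state=0)

# Scaling and feature selection live inside the pipeline so they are
# re-fitted on each training fold and never see the held-out samples.
pipe = Pipeline([
    ("scale", MinMaxScaler()),            # chi2 needs non-negative inputs
    ("select", SelectKBest(chi2, k=10)),  # placeholder, replaced by the grid
    ("clf", SVC(kernel="linear")),        # placeholder, replaced by the grid
])

# Candidate feature selectors and classifiers to compare.
param_grid = {
    "select": [
        SelectKBest(chi2, k=10),
        RFE(SVC(kernel="linear"), n_features_to_select=10),
    ],
    "clf": [
        SVC(kernel="linear"),
        RandomForestClassifier(n_estimators=50, random_state=0),
        GaussianNB(),
    ],
}

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Inner layer: pick the best selector/classifier combination.
search = GridSearchCV(pipe, param_grid, cv=inner_cv, scoring="accuracy")

# Outer layer: estimate how well the whole selection procedure generalizes.
outer_scores = cross_val_score(search, X, y, cv=outer_cv, scoring="accuracy")
print("Outer-fold accuracy: %.3f +/- %.3f"
      % (outer_scores.mean(), outer_scores.std()))

# Refit on all samples to obtain the final model and candidate biomarkers.
search.fit(X, y)
selected_mask = search.best_estimator_.named_steps["select"].get_support()
print("Selected feature indices:", selected_mask.nonzero()[0])
```

The outer-fold scores are the performance estimate to report. The features selected in the final refit are the candidate biomarkers, and their stability across folds would normally be checked as well.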
Note: LB stands for Lead Bioinformatician. An AccuraScience LB is a senior bioinformatics expert and leader of an AccuraScience data analysis team.
Disclaimer: This text was selected and edited based on genuine communications that took place between a customer and the AccuraScience data analysis team at the specified dates and times. The editing was done to protect the customer's privacy and for brevity. The edited text may or may not have been reviewed and approved by the customer. AccuraScience is solely responsible for the accuracy of the information reflected in this text.