Sun, Apr 17, 2016 at 4:20 PM

Customer asks about our experience with PacBio data and their assemblers.

Mon, Apr 18, 2016 at 3:02 PM

AccuraScience LB: We have some experience with PacBio data, though we work with Illumina data most often. Current assemblers for PacBio data include Falcon, MHAP and CA (Celera Assembler). Assembling PacBio data is not technically challenging, but it can be very demanding on computational resources if the genome you work on is large.

Mon, Apr 18, 2016 at 3:11 PM

Customer: We have SMRT data for a lower eukaryotic species, ~20 isolates in total. A reference genome is available in NCBI, but we would also want to compare the synteny of our isolates to the “classical” sequenced and assembled yeast genomes in GenBank. These ~20 isolates are similar: one of them is the ancestor, and the others are derivatives of it.

Tue, Apr 19, 2016 at 4:11 PM

AccuraScience LB: We should first assemble the ancestor genome and then compare it with the "wild type" genome in NCBI using a tool such as SyMAP. After the ancestor genome is assembled, the derivative isolates could be analyzed using mapping-based methods, with the genome of the ancestor isolate used as the reference. Depending on how large the differences between the derivatives and the ancestor are expected to be, we will decide whether it is necessary to call structural variants on top of SNV/Indel calling.
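
The plan in the reply above could be laid out roughly as follows. This is only an illustrative sketch under assumptions: the isolate names, the `plan_analysis` helper and the `expect_large_differences` flag are hypothetical, the step descriptions are plain labels, and no assembler, aligner or variant caller is actually invoked.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IsolatePlan:
    isolate: str
    strategy: str      # "de_novo_assembly" or "map_to_ancestor"
    steps: List[str]   # human-readable step labels, not commands


def plan_analysis(isolates, ancestor, expect_large_differences=False):
    """Build one plan per isolate: assemble the ancestor, map the derivatives.

    `expect_large_differences` is a hypothetical switch standing in for the
    judgment call about whether structural variant calling is needed.
    """
    if ancestor not in isolates:
        raise ValueError(f"ancestor {ancestor!r} is not among the isolates")

    plans = []
    for name in isolates:
        if name == ancestor:
            # The ancestor is assembled de novo and compared to the reference.
            steps = [
                "assemble PacBio reads de novo",
                "compare synteny with the NCBI reference (e.g. SyMAP)",
            ]
            plans.append(IsolatePlan(name, "de_novo_assembly", steps))
        else:
            # Derivatives are analyzed against the ancestor assembly.
            steps = [
                "map reads to the ancestor assembly",
                "call SNVs and indels",
            ]
            if expect_large_differences:
                steps.append("call structural variants")
            plans.append(IsolatePlan(name, "map_to_ancestor", steps))
    return plans
```

Calling `plan_analysis(["anc", "d1", "d2"], "anc")` with these made-up names would yield one assembly plan for `anc` and two mapping-based plans for the derivatives; structural variant calling is added only when the flag is set.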

