Plant Journal

A transcriptome atlas of Brassica napus resolved with single-molecule long-read isoform sequencing and Illumina-based RNA-seq data

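Brassica napus is a recently formed allopolyploid that arose from an interspecific cross between Brassica rapa (ArAr) and Brassica oleracea (CoCo) followed by genome doubling. Because the An and Cn subgenomes share highly similar sequences, conventional second-generation sequencing struggles to produce an accurate transcriptome map of B. napus. To address this, the authors used full-length transcriptome sequencing (single-molecule long-read isoform sequencing, Iso-Seq), whose long reads can capture complete mRNA sequences from the 5' end to the 3' poly(A) tail, to explore the complex B. napus transcriptome directly at the level of transcript isoforms. Analysis of the Iso-Seq data yielded […] non-redundant isoforms covering […] annotated genes. Of the multi-exon genes, 18.1% ([…]/[…]) showed alternative splicing (AS). The authors also identified […] long non-coding RNAs (lncRNAs), most of which showed tissue-specific expression. Isoforms containing alternative polyadenylation (APA) sites were detected for […] annotated genes. In addition, […] AS events located in ORFs produced novel protein isoforms through in-frame or frameshift changes. The authors also performed Illumina RNA-seq on the five tissues that had been pooled for Iso-Seq. The results showed that 69% of AS events were tissue-specific. In summary, the authors provide a rich transcriptome resource for identifying transcript isoforms in B. napus. This resource should help re-annotate the B. napus genome, deepen our understanding of its transcriptome, and support future functional genomics research in rapeseed.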

Figure 1. Analysis workflow

(a) The pipeline of Iso-Seq data analysis. This pipeline includes the workflow for the quality control of the raw data, the classification of the reads of insert, isoform clustering, correction and transcriptome analysis. (b) The pipeline of Illumina data analysis. This pipeline includes the workflow for the filtering of raw data, mapping, assembly, identification of AS events and differential AS events.

Figure 2. Quality analysis of reads of insert (ROIs) in the Iso-Seq data

(a) Division of the full-length non-chimeric reads based on their genome mapping characteristics. The number of reads in each group is depicted in the pie chart. (b) Comparison of PacBio and annotated isoforms.

Figure 3. Alignment of ROIs to the An and Cn subgenomes

The ROIs are from one homoeologous gene pair (BnaA08gD and BnaC08gD). The gene models in the reference annotation are shown by blue boxes. Colored lines indicate homoeologous single-nucleotide polymorphisms (SNPs) between the An and the Cn subgenomes, as shown by arrows. (a) Visualization of read mapping in the An subgenome. The upper track shows that the ROIs from BnaA08gD were mapped to the An subgenome. The lower track shows that the ROIs from BnaC08gD were also mapped to the An subgenome, with some homoeologous SNPs. (b) Visualization of read mapping in the Cn subgenome. The upper track shows that the ROIs from BnaC08gD were mapped to the Cn subgenome. The lower track shows that the ROIs from BnaA08gD were also mapped to the Cn subgenome, with some homoeologous SNPs.
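The idea behind Figure 3, that homoeologous SNPs reveal which subgenome a long read came from, can be illustrated with a toy mismatch count. This is only a sketch under strong assumptions: the read and both reference segments are taken to be already aligned, gap-free and of equal length, and the function names `count_mismatches` and `assign_subgenome` are hypothetical. It is not the mapping procedure used in the paper.

```python
def count_mismatches(read: str, ref: str) -> int:
    """Count mismatching positions between two pre-aligned, equal-length sequences."""
    if len(read) != len(ref):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(read, ref))


def assign_subgenome(read: str, an_ref: str, cn_ref: str) -> str:
    """Assign a read to the subgenome copy it matches with fewer mismatches.

    A read that maps to the 'wrong' homoeolog shows the homoeologous SNPs
    as mismatches, so the copy with fewer mismatches is the likelier origin.
    """
    an_mm = count_mismatches(read, an_ref)
    cn_mm = count_mismatches(read, cn_ref)
    if an_mm < cn_mm:
        return "An"
    if cn_mm < an_mm:
        return "Cn"
    return "ambiguous"
```

With short reads, few or no homoeologous SNPs may fall inside a read, so many reads would come back as ambiguous; a full-length read spans every SNP in the transcript, which is the advantage the article attributes to Iso-Seq.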

Figure 4. Analysis of AS and splice isoforms in the Iso-Seq data

(a) Classification of AS events. Cartoons show the AS event types: intron retention (IR), alternative donor (AD), alternative acceptor (AA), and exon skipping (ES). Filled boxes represent exons; introns are represented by black lines. The number of AS events of each type is shown, with the proportion of that type in parentheses. (b) Distribution of genes that produce one or more splice isoforms from Iso-Seq data. (c) Gene Ontology enrichment analysis of AS genes.
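The four event types in panel (a) can be told apart by comparing the introns of two isoforms of the same gene. The following is a minimal sketch, assuming plus-strand transcripts given as lists of 1-based inclusive exon coordinates; the function names are hypothetical, and this is not the tool used in the paper. On the minus strand the donor and acceptor labels would be swapped.

```python
from typing import List, Set, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive, plus strand


def introns(exons: List[Interval]) -> List[Interval]:
    """Return the introns lying between consecutive exons."""
    ordered = sorted(exons)
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])]


def classify_as(ref: List[Interval], alt: List[Interval]) -> Set[str]:
    """Label the AS events that distinguish the alt isoform from the ref isoform."""
    events: Set[str] = set()
    ref_introns, alt_introns = set(introns(ref)), set(introns(alt))

    # IR: a reference intron lies entirely inside an exon of the alt isoform.
    for start, end in ref_introns - alt_introns:
        if any(ex_start <= start and end <= ex_end for ex_start, ex_end in alt):
            events.add("IR")

    for start, end in alt_introns - ref_introns:
        # ES: an alt intron spans at least one whole reference exon.
        if any(start <= ex_start and ex_end <= end for ex_start, ex_end in ref):
            events.add("ES")
            continue
        for ref_start, ref_end in ref_introns:
            if end == ref_end and start != ref_start:
                events.add("AD")  # same acceptor, different donor (5' splice site)
            elif start == ref_start and end != ref_end:
                events.add("AA")  # same donor, different acceptor (3' splice site)
    return events
```

For example, with a reference of three exons `[(1, 100), (201, 300), (401, 500)]`, an isoform `[(1, 100), (401, 500)]` that drops the middle exon is labelled ES, and an isoform `[(1, 300), (401, 500)]` that reads through the first intron is labelled IR.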

Figure 5. PCR validation of AS events

RT-PCR validation of AS events for six genes. Gel bands in each figure show DNA markers and PCR results in five tissues/samples (bud, root, leaf, callus, and silique). The transcript structure of each isoform is shown in the right panel. Exons are represented by yellow filled boxes, and introns are represented by lines. Primers were designed to span the splicing events. PCR primers (F, forward and R, reverse) are shown on the first isoform of each gene. The length of each expected PCR product is shown after the transcript structure.

Figure 6. Domain changes in annotated genes caused by AS events

(a) Histogram indicating the frequency of ORF-located AS events at different lengths of nucleotide change. (b) Percentages of frameshift and in-frame changes among different kinds of domain changes. (c) Example of the annotated gene BnaC03gD with AS events that caused domain changes.
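The in-frame versus frameshift distinction in panel (b) follows from the length change plotted in panel (a): a gain or loss that is a multiple of three nucleotides keeps the downstream codons in frame, and any other change shifts the frame. A minimal sketch of that rule, with a hypothetical function name and coding-sequence lengths as assumed inputs:

```python
def classify_orf_change(ref_cds_len: int, alt_cds_len: int) -> str:
    """Classify an ORF-located AS event by its effect on the reading frame.

    Both arguments are coding-sequence lengths in nucleotides, for the
    reference isoform and the alternatively spliced isoform.
    """
    delta = alt_cds_len - ref_cds_len
    if delta == 0:
        return "no length change"
    # A change divisible by 3 adds or removes whole codons only.
    return "in-frame" if delta % 3 == 0 else "frameshift"
```

For instance, retaining a 6-nt segment inside the ORF is an in-frame change that inserts two codons, while losing 4 nt is a frameshift.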

Figure 7. Transcriptome analysis of the Iso-Seq data

(a) Comparison of the lengths of unannotated lncRNAs identified in this study with previously reported lncRNAs. (b) Proportions of four kinds of lncRNAs classified according to biogenesis. (c) Number of exons of lncRNAs and non-lncRNAs. (d) Heat map of lncRNA expression in five tissues. (e) Distribution of the number of poly(A) sites per gene model. (f) Relative frequency of each nucleotide around poly(A) cleavage sites. Sequences upstream (-50 bp) and downstream (+50 bp) of each poly(A) cleavage site were analyzed. (g) MEME analysis identified poly(A) signals in transcripts. I. An over-represented motif (AAUAAA) upstream of the poly(A) site, similar to the known signal in dicots, was identified. II. Another over-represented motif (UGUA) was also found upstream of the poly(A) site. (h) Chromosomal landscape of isoforms in the reference annotation and PacBio data at the genome level. The data type that each track represents is shown in the left corner. The inner lines show loci for fusion genes. Gene density was calculated in a 1-Mb sliding window at […]-kb intervals.
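Panels (f) and (g) rest on two simple computations: per-position nucleotide frequencies in fixed windows around each cleavage site, and the prevalence of a signal such as AAUAAA in the upstream window. The sketch below illustrates both with hypothetical function names. It assumes the windows have already been extracted and written in the RNA alphabet, and it uses a plain substring search, which is not a substitute for the MEME motif discovery described in the caption.

```python
from collections import Counter
from typing import Dict, List


def position_frequencies(windows: List[str]) -> List[Dict[str, float]]:
    """Relative frequency of A, C, G and U at each position of equal-length windows."""
    if not windows:
        return []
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(w[i] for w in windows)
        freqs.append({nt: counts[nt] / len(windows) for nt in "ACGU"})
    return freqs


def fraction_with_motif(upstream_seqs: List[str], motif: str = "AAUAAA") -> float:
    """Fraction of upstream sequences that contain the motif at least once."""
    if not upstream_seqs:
        return 0.0
    return sum(motif in seq for seq in upstream_seqs) / len(upstream_seqs)
```

Calling `fraction_with_motif(seqs, motif="UGUA")` on the same upstream windows gives the corresponding figure for the second motif mentioned in panel (g).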

Figure 8. Detailed comparison of AS events revealed by SMRT-based and Illumina RNA-seq

(a) Hierarchical cluster analysis of the expression profile of PacBio isoforms from different tissues. Clusters are delimited by red frames. (b) Line chart showing the percentage of each of the four main AS types (Y-axis), namely, alternative acceptor site (AA), intron retention (IR), alternative donor site (AD) and exon skipping (ES), among the RNA-seq tissue samples (X-axis). (c) Bar plot showing the number of AS genes identified from Illumina data. (d) Box plot showing the exon number of AS and non-AS genes in Illumina data. (e) Box plot showing the expression levels of AS and non-AS genes in Illumina data. (f) Venn diagram depicting
