A Next Generation Sequencing approach to
elucidate the existence of ten viral species
associated with CSSD in West Africa
Emmanuelle Muller, Andrew Wetten, Joel Allainguillaume, Francis
Abrokwah, Koffié Kouakou, Henry K. Dzahini-Obiatey
Cacao swollen shoot disease
▪
Characteristic disease symptoms
on Theobroma cacao :
red vein banding in young leaves
shoot, stem and root swelling
rounded pods
➢Family Caulimoviridae
Genus Badnavirus
➢dsDNA (7-7.3kbp)
➢Bacilliform particles
(30-150nm)
➢Semi-persistent transmission
by mealybugs
(Pseudococcidae)
➢ No transmission confirmed
by seeds
Côte d’Ivoire 1946 Nigeria 1944 Togo 1949 Ghana 1922
CSSV epidemics in West Africa:
differentiated situations according to the countries
In Ghana: the disease diffused rather quickly almost
everywhere,
New outbreaks in the West
In Côte d’Ivoire: The disease has not diffused from
outbreaks in the East until 2002
New outbreaks in the West central area and now
everywhere
In Togo: the disease diffused only in the south producing area (Kloto) until 2000,
New outbreaks since 2000 in the main producing area
(Litimé) Geographic and historical distribution of
Probable origin : host shift (s) from indigenous reservoir plants
(Sterculiaceae : Cola species, Bombaceae : Adansonia
digitata -baobab-, Ceiba pentandra -kapoktree-)
End of 19th century
Pst1 3000 pb 1000 pb 2000 pb 6000 pb 5000 pb 4000 pb 7000 pb
CSSV
7-7.3 kpb
3 Movement protein Orf3AF Orf3AR 598pb 534pb 1 2 Y X RTase protein•
Prospections over severalsuccessive years in Ivory Coast, Togo and Ghana
•
Detection of samples with primersspecific to CSSV in two genome regions
•
Variability by direct sequencing ofamplified products (or cloning if mixed infection)
•
Sequence alignment andphylogenetic studies
CSSV Variability studies
Movement protein chosen because assumed to be more conserved RTase protein taxonomical significance according to ICTV100 km
Lôh Djiboua Marahoué Haut Sassandra Guémon Nawa KlotoFoci of CSSV sampled 1998-2016
Volta Western Ashanti Central Eastern CRIG Museum Mé Gôh Gbôkle Brong Ahafo Litimé 250 samples 480 sampes 1150 samples 900 samples Indénié-Juablin Sud ComoéCRIG Museum diversity (CSSV collection)
- CRIG Museum established in the 1940s when the first surveys began - 74 isolates analysed
Maximum likelihood phylogeny, based on alignment of the first part of ORF3 (movement protein)
Gvolta52 Gvolta36 Gashanti81 GBA380 GWR307 GCR228 GWR320 'NJ10-00B' Gash112 Gash108 Gashanti88 GWR315 CI431BC3 CI22209BC4 CI28509BC2 CI30309BC1 CI424BC6 CI27909BC5 Gvolta3 Gvolta1 Gvolta13 Gvolta11 'Agou1-93C' Ghana22 GWR327 GWR324 Gash117 'Ghana6-3' 'Ghana18-5' Gvolta9 Gvolta48 'Ahl1-00A' Gvolta54 'G-AH5-14' 'Ghana2-00' 'CI152-09D' CiYMV outgroup 'GWR3-14' GWR198 CI617F CI632E CI631E GWR148 GWR145 GWR338 GWR344 GWR333 GWR352 GBA388 GBA378 GBA404 GWR212 GWR207 GWR188 GWR194 GBA354 Ghana26 GWR249 GWR244 GWR290 GWR280 GWR166 GWR176 GWR260 GCR346 GCR327 GCR350 Ghana28 'Ghana9-10G' 95 87 95 81 69 100 95 100 91 60 99 100 95 93 99 100 72 86 98 100 99 98 100 90 90 100 100 68 83 73 97 100 12 different phylogenetic groups B A D M F J N K G L Côte d’Ivoire Togo Ghana Origin of isolates: E C P
Maximum likelihood phylogeny of CSSV sequences, based on alignment of the RTase region in ORF3
T R M 0.99 0.90 0.99 0.94 0.88 0.96 1.00 0.96 0.80 1.00 Q E S 0.98 0.97 B D A C G 0 1. Gha47-15Gha46-15 Gha6-00 Gha54-15 Gha12-00 GhaNewJuaben-00 Gha30-15Gha40-3-16 Gha9-00 Gha23-00 Gha21-00 Gha20-00 Gha16-00 Gha36-1-16 Gha26-1-16 Gha26-15 Gha57-15 CI303-09 CI444-09 CI569-10 ToAgou1-93 CSSTAV-ToWOB12-02 Gha22-00 CI152-09 CI632-10 Gha62-15 Gha58-15 Gha52-15 Gha53-2-16 Gha28-15Gha50-15 Gha55-15 Gha29-15 Gha32-11 Gha64-16 Gha16-15 Gha4-15 Gha64-99 Gha1-00 Gha69-16 Gha40-15 Gha2bis-00 Gha2-00 GWR198-13SangerGWR3-14Sanger GWR246-13Sanger Gha28-11 Gha29-16 CI243-09Sanger GCR329-14SangerCI232-09Sanger Gha69bis-16 Gha35-15 Gha68-16 GW145-13Sanger CiYMV outgroup Gha63-15 Côte d’Ivoire Togo Ghana 4 additional phylogenetic groups
A
B
C
D
F
E
J
K
L
M
100 km
Indénié-Juablin Sud Comoé Lôh Djiboua Marahoué Haut Sassandra Guémon Nawa Litimé KlotoGeographical distribution of CSSV diversity
Volta Western Ashanti Brong Ahafo Central Eastern Mé Gôh Gbôkle Gontougo Grands Ponts + species S everywhere S Moronou Agneby Tiassa Iffou
Field CSSV diversity versus CRIG
Museum diversity in Ghana
Cacao farms 9 groups
830 samples from the cacao farms of the six cacao
growing regions
70 isolates (and 123 samples) analysed from CRIG (with 40 mixed infection)
Tafo CRIG Museum 12 groups
A B E K G M N P Q R S T
Conclusions of the CSSV diversity work
➢ High variability of the CSSV populations in West Africa : a complex of different viral species is responsible for the Cacao swollen shoot disease
It remains to describe viral diversity in Nigeria
➢ One ubiquitous group, B, the other groups disseminated more locally.
➢Some groups are only detected in the CRIG Museum (G, N, P, Q, R) Some groups only present in the cacao farms (C, L, J absent from the Museum, E present but underrepresented in the Museum) probably corresponds to recently emerged groups
➢ New outbreaks in the West of Ghana and in the West central area of Côte d’Ivoire associated with new groups, respectively E and D.
Emerging issues
on the CSSV diversity described
➢ A better understanding of the extent of CSSV diversity in terms of different taxonomical species in the different countries
➢ Improving the detection primers because a lot of symptomatic samples are not detected (depending of the region and of the ages of the leaves)
➢ Need to obtain the complete genomes corresponding to the different groups/species.
➢ Need to amplify CSSV sequences without a priori in all symptomatic samples.
The next generation sequencing as Illumina
sequencing was able to help to obtain more
new sequences and CSSV complete genomes
Results obtained from the Illumina
sequencing
➢
Illumina HiSeq sequencing was done on DNA obtained from purified cacao leaves infected by CSSV (by Fasteris SA Switzerland),
- Samples containing new groups of CSSV (and potentially new species) in order to reconstruct complete genomes
- Samples impossible to detect with diagnostic primers (pooled field samples) Reads from Theobroma cacao were removed, and the remaining reads are assembled with a pipeline including SPAdes software
21 complete genomes obtained, of which 18 belong to groups containing only partial sequences.
Maximum likelihood
phylogeny of the
Badnavirus genus,
based on complete viral
sequences
CSSV species A, B, D, E, M and N CSSV species Q and RTwo clades of complete genomes of CSSV with same molecular characteristics as previously sequenced viral genomes 0.96 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.99 1.00 1.00 1.00 Around 25 to 30% divergence between species of this clade 40 % divergence 0 1. DrMV PVBVSriLankaNGS YaNMoV TaBV GVCV-VRU GVCV GVBV-RC GVBV-BC GVBV-RIB90 GVBV-GB1 RYNV-BSRYNV PYMAV BCVBV GRLDaV FigBV DBSNV DBALV TaBCHV1PYMoV CiYMV-ROL CiYMV CiYMV-MBV9 HiBV G63-15NGS G57-15NGS CI569NGS CSSV-N1A CSSV-Agou1 G25-15NGS CSSV-Wobe1Divo2015NGS CSSV-CI152 GCR329corrGWR198 CI632NGS GWR3-14corr GWR198-G G34-15R-NGSG37-15NGS G29-15 G54-15NGS G53-15NGS G39-15NGS G40-15NGSG2-15NGS G64-99NGS G34-15Q-NGS SPPV-B SCBMOVSCBIMV BSUIVBSULV BSUMV BSMYV SCBV3D PBCoVHawai PBCOV CLNV BSPEV SCBV-GAR57 SCBV-GAB51 BSOLV BSCAV BSUAV KTSV BSVNV BSAcYuV BSIMV BSGFV ComYMV RTBV-Phi 1.00 1.00 0.96 0.98 0.98 0.96 0.74 1.00 0.92 0.97
Maximum likelihood phylogeny of CSSV sequences, based on alignment of the RTase region in ORF3
10 species according to International Committee of Taxonomy of Viruses recommendations (20% divergence threshold) N R M 0.99 1.00 0.99 0.66 0.85 0.64 1.00 0.99 0.61 1.00 Q E (subgroups F, K, J, L, G) = CSSCEV
New complete viral genomes obtained with NGS analysis 6 4 1 1 1 1 5 0 0.5 CI302-B N1A14-B CI168-B Agou1-C GWR246-Egroup G9-10-GB14 GWR145-E_group GWR198-J GCR329NGS14 GWR3-14-Kgroup CI152-Dgroup Ana1bis-A G63-15-Ngroup GAH514-M G2-15-Qgroup G29-15-Rgroup GWR145-Sgroup GWR198-Sgroup CiYMV Gha2-00 S 0.99 0.84 1 B (subgroups B, C ) = CSSV D = CSSCDV A= CSSTAV T
Four species very
different from the others (<60% identity)
S
R Q
A
B
D
E
M
N
Q
R
S
T
100 km
Indénié-Juablin Sud Comoé Lôh Djiboua Marahoué Haut Sassandra Guémon Nawa Litimé KlotoGeographical distribution of species responsible of CSSD
Volta Western Ashanti Brong Ahafo Central Eastern Mé Gôh Gbôkle Gontougo Grands Ponts + species S everywhere S Moronou Agneby Tiassa Iffou CRIG Museum E N N
Conclusions of NGS work
➢ no new groups identified by the NGS technology but obtention of new complete genomes corresponding to groups already identified with partial sequences
➢ identification of mixed infection in the CRIG Museum and in the field (partial and complete sequences)
➢Identification of viral sequences in some field samples where the PCR detection was not possible (P and N in some Côte d’Ivoire samples)
➢ The results obtained from NGS technology suggest that the problem of detection is more probably due to the low sensitivity of the PCR protocol
Design of new
polyvalent primers for
the detection of the 10
viral species in the
RTase protein
primers ORF3A movement RTase specific RTase Polyvalent RTase Species S samples detection 50% 66% 82% 51%• More polyvalent primers compared to the previously designed in movement protein
• Cleaning and concentration of DNA to be improved for a better efficiency of the PCR detection (Yield of extraction, RCA step, purification step, more efficient Taq Polymerase)
Conclusions
➢ The Cacao swollen shoot disease is caused by a complex of 10 different species as other diseases caused by badnaviruses (Banana streak virus,
Dioscorea bacilliform virus).
➢ From the high variability described on CSSV populations compared to the very short evolutionary history of CSSV on cocoa trees and the differential distribution of the CSSV groups in the different regions, we suggest parallel emergences of the disease in at least two countries (CSSCDV species in Côte d’Ivoire, E
species in Ghana).
Perspectives
Investigate the role of the potential alternative hosts in disease spread and the viral diversity inside these hosts
Understand the role played by the different viral species in epidemics (molecular groups described in the new outbreaks, aggressiveness of the different isolates inside each species, species in coinfection).
Obtention of full genomes of the two species S and T (specially S, ubiquitous in the field as the first described CSSV species -B-)
Screening of replanted cocoa material against the viral species present in the corresponding country or area
Ackowledgements
ITRA- CRA/F Kpalimé Togo Francis ABROKWAH Henry K. DZAHINI-OBIATEY Komlan WEGBE Angèle MISSISSO CRIG, Ghana Koffié KOUAKOU Brice SERIKPA Ismaël KEBE