| Journal of General Virology |
| First posted online 26 January 2001 | FULL-LENGTH ARTICLE |
| Rec 23 November 2000; Acc 16 January 2001 | DOI: 10.1099/vir.0.17562-0 |
Armando Arias,1 Ester Lázaro,2 Cristina Escarmís1 and Esteban Domingo1
1 Centro de Biología
Molecular 'Severo Ochoa' (CSIC-UAM), Universidad Autónoma de
Madrid, Cantoblanco, 28049 Madrid, Spain
2 Centro de Astrobiología (CSIC-INTA), Carretera de
Ajalvir, km 4, 28850 Torrejón de Ardoz, Madrid, Spain
The mutant spectrum of a virus quasispecies in the process of fitness gain of a debilitated foot-and-mouth disease virus (FMDV) clone has been analysed. The mutant spectrum was characterized by nucleotide sequencing of three virus genomic regions (internal ribosome entry site; region between the two AUG initiation codons; VP1-coding region) from 70 biological clones (virus from individual plaques formed on BHK-21 cell monolayers) and 70 molecular clones (RT-PCR products cloned in E. coli). The biological and molecular clones provided statistically indistinguishable definitions of the mutant spectrum with regard to the distribution of mutations among the three genomic regions analysed and with regard to the types of mutations, mutational hot-spots and mutation frequencies. Therefore, the molecular cloning procedure employed provides a simple protocol for the characterization of mutant spectra of viruses that do not grow in cell culture. The number of mutations found repeated among the clones analysed was higher than expected from the mean mutation frequencies. Some components of the mutant spectrum reflected genomes that were dominant in the prior evolutionary history of the virus (previous passages), confirming the presence of memory genomes in virus quasispecies. Other components of the mutant spectrum were genomes that became dominant at a later stage of evolution, suggesting a predictive value of mutant spectrum analysis with regard to the outcome of virus evolution. The results underline the observation that greater insight into evolutionary processes of viruses may be gained from detailed clonal analyses of the mutant swarms at the sequence level.
Introduction
RNA viruses replicate as complex mutant distributions termed virus quasispecies (Eigen, 1971; Domingo et al., 1978, 2001; Eigen & Biebricher, 1988; Holland et al., 1992; Nowak, 1992). RNA genome replication must be viewed as a dynamic process in which mutants arise at high rates and participate in a continuous process of competitive rating (Batschelet et al., 1976; Domingo et al., 1978, 1988, 2001; Eigen & Biebricher, 1988; Drake & Holland, 1999). One of the consequences of competition among mutant genomes is fitness gain when large populations of RNA viruses are allowed to replicate in a defined environment (Novella et al., 1995a, c; Weaver et al., 1999). Virus fitness is defined as the relative capacity of a virus to produce infectious progeny under a set of environmental conditions. It is necessarily a dimensionless, relative value, which is frequently determined in growth-competition experiments involving co-infection of cells or animals with the virus to be tested and a reference virus (Holland et al., 1991; reviewed in Domingo et al., 2001).
In contrast to large population passages, repeated plaque-to-plaque transfers of RNA viruses result in fitness loss (Chao, 1990; Duarte et al., 1992; Escarmís et al., 1996; Yuste et al., 1999). This is due to the accumulation of deleterious mutations as a result of repeated sampling (bottlenecking) of components of the mutant spectrum of the virus quasispecies. It must be stressed that a virus population may either gain or lose fitness depending on both the initial fitness of the population and the size of the genetic bottleneck, as shown by Novella et al. (1995c) using vesicular stomatitis virus clones and populations of different initial fitness. In a study with the animal picornavirus foot-and-mouth disease virus (FMDV), a highly debilitated (low fitness) clone was derived by 22 successive plaque-to-plaque transfers of an FMDV clone termed C91; the debilitated clone was termed C922 and its fitness value was 0.1 times that of C91 (Escarmís et al., 1996). The consensus genomic nucleotide sequence of C922 differed from that of C91 in seven point mutations and in the acquisition of an internal polyadenylate tract as an extension of four adenylate residues at genomic positions 1119–1122, preceding the second functional AUG initiation codon (Fig. 1). FMDV genome residues have been numbered according to Escarmís et al. (1996). One of the mutations was a deletion of a U residue at position 1056 (ΔU-1056), which rendered the region between the two AUG initiation codons non-functional with regard to protein-coding, since ΔU-1056 led to a termination codon at the position that would encode the eleventh residue of the large form of L protease (termed Lab) (Escarmís et al., 1996). The internal polyadenylate at genomic positions 1119–1122 was the first genetic lesion to revert to the wild-type sequence in the process of fitness gain when C922 was subjected to large population passages, while ΔU-1056 did not revert, and served as a specific genetic marker for the C922 lineage (Escarmís et al., 1999) (Fig. 1).
Fig. 1. Scheme of the FMDV C-S8c1 genome and the location of some genetic alterations within residues 605–1122 and 3208–3834 in C922 p0, C922 p20, C922 p50 and C922 p100. VPg is the protein covalently linked to the 5′ end of FMDV RNA and (A)n is the polyadenylate tract at the 3′ end of the genome. The C-S8c1 genome is 8115 nucleotides in length, not counting the heterogeneous internal poly(C) and terminal poly(A) tracts (Toja et al., 1999). The positions of the non-structural (L and 2A to 3D) and structural (1A or VP4 to 1D or VP1) proteins are indicated. Below the C-S8c1 genome, the genetic lesions within residues 605–1122 and 3208–3834 that are relevant to the interpretation of the composition of the mutant spectrum of C922 p50 are indicated. ΔU-1056 is the deletion of U-1056 of the C-S8c1 genome; the deletion is present in and serves as a genetic marker for the C922 lineage. C922 p0 refers to the virus population amplified minimally from about 10⁵ p.f.u. (C922) to about 10⁷ p.f.u. (C922 p0) prior to measurement of fitness and nucleotide sequencing (Escarmís et al., 1996). (A)n in C922 p0 indicates the heterogeneous internal polyadenylate tract which was the first lesion to revert upon large population passages of C922 p0 (Escarmís et al., 1999) (reversions are indicated by asterisks). Within the 1D (VP1)-coding region, transversion C-3653→A, which leads to amino acid replacement T-149→K, was present transiently in C922 p50 and was then replaced by two neighbouring transversions in C922 p100; amino acid replacements are boxed. The entire genomic consensus nucleotide sequences of C-S8c1, C922 p0, C922 p20, C922 p50 and C922 p100 have been determined (Escarmís et al., 1996, 1999; Toja et al., 1999). For general reviews of the FMDV genome and its expression, see Belsham (1993), Rueckert (1996) and Sobrino et al. (2001); for additional information on the origin of C-S8c1, C922 and derived populations, see Escarmís et al. (1996, 1999).
The change of fitness values from FMDV clone C91 to C922 and then to C922 p100 (C922 subjected to 100 large population passages in BHK-21 cells) is depicted in Fig. 2. The relative fitness of C922 p100 was about 30-fold that of C922 (Escarmís et al., 1999). In the present report, we analyse the composition of the mutant spectrum of population FMDV C922 p50, positioned half-way in the exponential phase of the process of fitness recovery of clone C922 (Fig. 2). The mutant spectrum has been analysed by two different procedures: nucleotide sequences of genomic RNA from biological clones of C922 p50 (referred to as biological cloning) and nucleotide sequences of molecular clones obtained in E. coli following RT-PCR of RNA from C922 p50 (referred to as molecular cloning). The analysis of the mutant spectrum of C922 p50 had three objectives: (i) to determine the complexity of the mutant spectrum of a virus quasispecies during fitness gain in a constant environment; (ii) to define molecular intermediates of fitness gain in relation to the initial (C922 p0) and subsequent (C922 p100) consensus genomic sequences (Fig. 2); and (iii) to compare the composition of the mutant spectrum as deduced from the analysis of biological clones and molecular clones.
Methods
Cells, viruses and infections. Host cells used for all infections were derived from a clone of BHK-21 cells obtained by limiting dilution, as described previously (de la Torre et al., 1988). Procedures for infections with FMDV in liquid culture medium and in semi-solid agar medium for titration of FMDV infectivity were described in detail elsewhere (Sobrino et al., 1983; Escarmís et al., 1996). FMDV C-S8c1 is a biological clone derived from the natural isolate C-Sta Pau Sp/70 (Sobrino et al., 1983) and FMDV C91 is a clone derived from C-S8c1 passaged twice in BHK-21 cells (Escarmís et al., 1996). The origin of C922 p50 is described in Escarmís et al. (1999) and in Fig. 2. Biological clones from C922 p50 were obtained by isolating virus from randomly chosen individual virus plaques, as described previously (Sobrino et al., 1983). RNA extraction, cDNA synthesis, PCR amplification and nucleotide sequencing were carried out as detailed in Escarmís et al. (1999).
Fig. 2. Change in relative fitness of FMDV clone C91 upon plaque-to-plaque transfer (left) and large population passage (right) in BHK-21 cells. Plaque isolations (progeny from a single genome) are indicated as filled squares and uncloned populations as open circles. Clone C91 was obtained from population C-S8c1 p2 and subjected to 22 serial plaque transfers, as detailed in Escarmís et al. (1996). Each large population passage of C922 p0 (the initial preparation derived from clone C922 by amplification from about 10⁵ to 10⁷ p.f.u.) involved infection of 4×10⁶ BHK-21 cells with 10⁶–10⁷ p.f.u. of the virus progeny of the previous infection. The number of p.f.u. employed ensured exponential fitness gain over the relative fitness range covered from passage 0 to passage 100 (Novella et al., 1995c). Relative fitness values are taken from Escarmís et al. (1999) and were determined by growth-competition experiments of the different populations to be tested and a reference virus derived from C-S8c1, as described previously (Holland et al., 1991; Duarte et al., 1992; Escarmís et al., 1996, 1999). The short horizontal segments above populations indicate that the consensus nucleotide sequence of the entire FMDV genome is known and can be found in Escarmís et al. (1996, 1999). The mutant spectrum of the C922 p50 quasispecies was analysed by biological and molecular cloning (filled rectangles). Additional details of the procedures used are given in Methods.
Molecular cloning. RNA extracted from C922 p50 was amplified by RT-PCR using the thermostable Pfu polymerase (Promega), which has a proof-reading activity (Cline et al., 1996). The synthesis of cDNA was carried out with 5 U AMV RT (Promega) in a final volume of 25 µl in a buffer containing 20 U RNasin (Promega), 10 mM Tris-HCl, pH 8.3, 1.5 mM MgCl₂, 50 mM KCl, 0.8 mM dNTPs, 200 ng oligodeoxynucleotide primer and about 7 ng FMDV RNA. For amplification with Pfu, the buffer recommended by the supplier was added to a final volume of 100 µl plus 200 ng of the second primer and 2.5 U of enzyme. The primers used contained restriction sites at their 5′ ends to facilitate cloning in appropriately digested pGEM4Z. To amplify the internal ribosome entry site (IRES) and the region between the two initiation AUG codons, the primer for cDNA synthesis was complementary to positions 1200–1183 of FMDV RNA and contained a SacI restriction site; the second primer for PCR amplification corresponded to nucleotides 569–587 of FMDV RNA and had a BamHI restriction site. To amplify the VP1-coding region, the primer for cDNA synthesis was complementary to positions 3888–3869 of FMDV RNA and contained a BamHI restriction site; the second primer for PCR amplification corresponded to nucleotides 3171–3192 of FMDV RNA and had a SacI restriction site. After treatment with phenol to inactivate the DNA polymerase, the DNA was recovered by ethanol precipitation (Sambrook et al., 1989). After digestion of both the PCR-amplified products and the vector pGEM4Z with SacI and BamHI, the DNA was electrophoresed through 1 % SeaPlaque agarose in 40 mM Tris-acetate, 1 mM EDTA, pH 8.0, the appropriate bands were excised from the gel and the DNA was purified by using the Gene Clean II kit as indicated by the manufacturer (Bio 101). After ligation overnight at 16 °C, the ligation product was transformed in E. coli DH-5α and transformants were isolated and analysed following standard procedures (Sambrook et al., 1989). Plasmid DNA was purified by using the Wizard Plus SV Minipreps kit (Promega). Two separate sets of molecular clones were obtained and analysed: 70 clones spanning the IRES and the region between the two AUG initiation codons and 70 clones corresponding to the VP1-coding region. Nucleotide sequencing was carried out in an ABI 373 automatic sequencer, as described previously (Escarmís et al., 1999).
Statistics. Standard statistical procedures used were those described in the package Hypothesis tests of the program Mathematica 3 (Wolfram Research).
Results
Mutant spectrum complexity and composition during the process of fitness gain of a virus clone
Large population passages of FMDV C922 p0 led to an exponential increase in virus fitness (Escarmís et al., 1999) (Fig. 2). In order to quantify the genetic heterogeneity of the virus population during the process of fitness gain, 70 biological clones and 70 molecular clones from population C922 p50 were analysed by nucleotide sequencing. Three genomic regions were sequenced: residues 605–1038 (spanning the IRES), 1039–1122 (those between the two functional AUG initiation codons) and 3208–3834 (the capsid protein VP1-coding region) (Fig. 1). Nucleotide sequences of individual biological and molecular clones were compared with the consensus nucleotide sequence of C922 p50. The numbers and types of mutations did not reveal significant differences between the analysed biological and molecular clones. Two independent RT-PCR amplifications of the same template RNA population yielded molecular clones with similar distributions of mutations (Table 1). In addition to the standard (minimum) mutation frequency, calculated by counting repeated mutations only once, Table 1 also includes values for the maximum mutation frequency. The latter was calculated by counting each occurrence of a repeated mutation, and it has been used for some statistical evaluations of the mutant distributions (described in the next section and in the Discussion). In the mutant spectrum of C922 p50, the mean minimum mutation frequency found in the IRES element from the analysis of biological and molecular clones was 4.1×10⁻⁴ substitutions per nucleotide. This value is 3.5- to 5.6-fold smaller than the mutation frequency found in the region between the two AUG codons (mean 1.8×10⁻³ substitutions per nucleotide). This difference is expected, since the region between the two AUGs has been rendered non-functional with regard to the synthesis of Lab by ΔU-1056 (Escarmís et al., 1996; Fig. 1). The mean minimum mutation frequency for the VP1-coding region was 5.3×10⁻⁴ substitutions per nucleotide, only 1.3-fold larger than in the IRES. The ratio of transitions to transversions amounted to more than 29 in the IRES, 6 in the region between the two AUGs and 4 in the VP1-coding region. This abundance of transition over transversion mutations is expected from the misincorporation tendencies of RNA-dependent RNA polymerases (Domingo et al., 1978; Kuge et al., 1989; Schneider & Roossinck, 2000) and both biological and molecular clones reflected such a tendency (Table 1).
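The transition/transversion split reported above can be illustrated with a short Python sketch. It assumes the usual definition (a transition exchanges a purine for a purine or a pyrimidine for a pyrimidine; any other substitution is a transversion). The names `classify` and `ts_tv_ratio` are illustrative and not part of the original analysis, and the counts are the Ts and Tv values of Table 1 pooled over biological and molecular clones.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}

def classify(ref: str, alt: str) -> str:
    """Return 'Ts' for a transition and 'Tv' for a transversion (RNA alphabet)."""
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError("expected two different residues from A, C, G, U")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "Ts" if same_class else "Tv"

def ts_tv_ratio(ts: int, tv: int) -> float:
    """Ts:Tv ratio; infinite when no transversion was observed."""
    return float("inf") if tv == 0 else ts / tv

# Pooled (BIO + MOL) transition and transversion counts taken from Table 1.
table1 = {
    "IRES": (14 + 15, 0),
    "between AUGs": (19 + 25, 2 + 5),
    "VP1": (26 + 28, 8 + 6),
}
ratios = {region: ts_tv_ratio(ts, tv) for region, (ts, tv) in table1.items()}
for region, ratio in ratios.items():
    print(region, round(ratio, 1))
```

Because no transversion was found among the 29 IRES mutations, the ratio for that region is unbounded, which is why the text can only state it as "more than 29".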
Table 1. Characterization of the mutant spectrum of C922 p50 quasispecies by analysis of biological and molecular clones
A total of 70 biological clones and 70 molecular clones was analysed; the molecular clones were derived from two independent RT-PCR amplifications (35 clones each). Procedures for the preparation and analysis of biological clones (BIO) and molecular clones (MOL) are described in Methods.

| Type of analysis | Number of mutations† | Ts‡ | Tv‡ | Nsyn‡ | Syn‡ | Minimum mutation frequency§ | Maximum mutation frequency§ |
|---|---|---|---|---|---|---|---|
| IRES (605–1038)* | | | | | | | |
| BIO | 14 | 14 | 0 | – | – | 4.3×10⁻⁴ | 4.6×10⁻⁴ |
| MOL | 15 | 15 | 0 | – | – | 3.9×10⁻⁴ | 4.9×10⁻⁴ |
| Between the two AUGs (1039–1122) | | | | | | | |
| BIO | 21 | 19 | 2 | – | – | 1.5×10⁻³ | 3.6×10⁻³ |
| MOL | 30 | 25 | 5 | – | – | 2.2×10⁻³ | 5.2×10⁻³ |
| VP1-coding region (3208–3834) | | | | | | | |
| BIO | 34 | 26 | 8 | 13 | 21 | 6.4×10⁻⁴ | 7.7×10⁻⁴ |
| MOL | 34 | 28 | 6 | 9 | 25 | 4.1×10⁻⁴ | 7.7×10⁻⁴ |

* Numbering of residues is according to Escarmís et al. (1996).
† Mutant residues are those that vary relative to the consensus nucleotide sequence of population C922 p50. Numbers of mutations in MOL found in the two independent RT-PCR amplifications were: IRES, 9 and 6; between AUGs, 14 and 16; VP1-coding region, 20 and 14.
‡ The mutations found have been separated into Ts (transitions), Tv (transversions), Nsyn (non-synonymous, which lead to an amino acid replacement) and Syn (synonymous or silent, which do not lead to an amino acid replacement). The latter division does not apply to non-coding regions, as indicated by –.
§ The minimum mutation frequency is the number of different mutations found divided by the total number of nucleotides sequenced. The maximum mutation frequency is the total number of mutations found (as given in Table 2) divided by the total number of nucleotides sequenced. Mutation frequencies are expressed as substitutions per nucleotide.
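The two definitions of mutation frequency given in the table footnote can be sketched in a few lines of Python. The function name `mutation_frequencies` is illustrative. The total of 14 mutations is the Table 1 entry for the IRES in biological clones; the value of 13 distinct mutations is not stated in the table and is an assumption, chosen because it reproduces the published minimum frequency.

```python
def mutation_frequencies(distinct: int, total: int, region_len: int, clones: int):
    """Return (minimum, maximum) mutation frequency in substitutions per nucleotide."""
    sequenced = region_len * clones  # total number of nucleotides sequenced
    return distinct / sequenced, total / sequenced

# IRES, biological clones: residues 605-1038 sequenced in 70 clones.
ires_len = 1038 - 605 + 1  # 434 residues per genome
# total=14 comes from Table 1; distinct=13 is an assumed, back-calculated value.
f_min, f_max = mutation_frequencies(distinct=13, total=14, region_len=ires_len, clones=70)
print(f"minimum {f_min:.1e}, maximum {f_max:.1e}")
```

Under these assumptions the sketch returns values that round to the 4.3×10⁻⁴ and 4.6×10⁻⁴ substitutions per nucleotide listed for BIO in the IRES.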
Repeated mutations and mutational hot spots are identified in both biological and molecular clones
In order to compare the numbers and types of point mutations found in biological clones and in molecular clones derived from population C922 p50, all variant positions have been listed, together with the number of clones in which each mutation was found (Table 2). Of 25 point replacements that occurred more than once in either biological or molecular clones, 19 (76 %) were found both in biological and molecular clones. At four sites (genomic residues 1069, 1118, 3780 and 3801), the same mutation was found in six or more clones, and the mutation was a G→A transition in all cases. Particularly striking was the occurrence of G-1118→A in seven biological clones and 12 molecular clones. This mutational hot spot may be facilitated by ΔU-1056, found in clone C922 and all its progeny populations (Escarmís et al., 1996, 1999) (Fig. 1) (see also Discussion).
The χ² test was applied to assess whether biological and molecular clones could be distinguished with regard to the numbers and types of mutations found in the two sets. No significant difference was found between biological clones and molecular clones in the number of times that a given mutation was represented (0.1<P<0.3; χ², 1 degree of freedom). Equally indistinguishable were the distribution of mutations among the three genomic regions analysed (0.3<P<0.7; χ², 2 degrees of freedom), the ratio of transition to transversion mutations (0.7<P<0.9; χ², 4 degrees of freedom) and the ratio of synonymous to non-synonymous mutations (0.1<P<0.3; χ², 1 degree of freedom). Student's t-test also indicated that the mean number of times that mutations were represented in the mutant spectrum was indistinguishable for biological clones and molecular clones (P=0.08). However, those mutations found three or more times among biological and molecular clones were slightly overrepresented in the latter (a total of 37 in molecular clones and 28 in biological clones). An exception was the case of C-886→U in the IRES; the mutation was found exclusively in three molecular clones (data in Table 2).
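As an illustration of the χ² comparison of the distribution of mutations among the three genomic regions, the following Python sketch computes a Pearson χ² statistic for a 2×3 contingency table. The text does not state which counts entered the original calculation (performed in Mathematica), so the use of the "Number of mutations" column of Table 1 is an assumption, and the helper name `chi_square` is illustrative.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Rows: BIO, MOL; columns: IRES, between the two AUGs, VP1 (assumed input: Table 1 counts).
counts = [[14, 21, 34], [15, 30, 34]]
stat, dof = chi_square(counts)
# For exactly 2 degrees of freedom the chi-square survival function is exp(-x/2).
p_value = math.exp(-stat / 2)
print(round(stat, 2), dof, round(p_value, 2))
```

With these assumed inputs the P value falls inside the interval 0.3<P<0.7 quoted in the text for 2 degrees of freedom. The closed-form survival function used here holds only for 2 degrees of freedom; other cases would need a general χ² distribution routine.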
Insertions and deletions (indels) were found in the IRES and in the region between the two AUGs (Table 3). G-1118, the site of a mutational hot spot for G→A transitions, was deleted in one molecular clone. A rich repertoire of adenylate insertions was found within positions 1119–1122, including one molecular clone with 20 additional adenylate residues, a length which is very close to the average present in the dominant sequence of the parental clone C922 (Escarmís et al., 1996).
Table 2. Point mutations in the mutant spectrum of C922 p50 identified by biological and molecular cloning
BIO and MOL refer to biological and molecular clones as defined in the Introduction. The numbers under BIO and MOL indicate the number of independent biological and molecular clones in which the mutation indicated was found. Procedures for preparation and analysis of biological and molecular clones are detailed in Methods. Nucleotide residues are numbered according to Escarmís et al. (1996). The location in the FMDV genome of the regions analysed is depicted in Fig. 1. Mutations in the IRES that are predicted to alter secondary structure or base pairings are underlined (see Discussion). Amino acid replacements are indicated in parentheses for non-synonymous mutations in the VP1-coding region. No amino acid replacements are indicated for genomic region 1039–1122 because a point deletion at residue 1056 rendered this region functionally irrelevant regarding the encoding of the Lab form of L (see Introduction). Amino acids are numbered according to their position in capsid protein VP1.
* These two mutations
occurred in the same clones (see Discussion).
This biological clone was heterogeneous and
contained A and G at this position, with A dominant (about 70 % of the
genomes, according to the peak pattern in the sequence
obtained).
Clonal analysis of C922 p50 may also provide information on possible subsets of mutations present at different frequencies in the mutant spectrum. To approach this question, the probability that the point substitutions found in biological clones are also found in molecular clones (probability of coincidence) was calculated and compared with the actual experimental values (Table 2). For the IRES region (maximum mutation frequency of 4.75×10⁻⁴ substitutions per nucleotide with 434 residues compared per genome; values taken from Table 1), the expected probability that any mutation found one, two, three, four or five times in one of the subsets of clones (biological or molecular) is also found one, two, three, four or five times in the other subset is 1.07×10⁻³, a value 4.3-fold smaller than that found experimentally (data in Table 2). This overrepresentation of coincident mutations is significant (P=0.01), assuming that the probability of coincidence of mutations per nucleotide in biological and molecular clones follows a binomial distribution. For the VP1-coding region, the expected number of coincidences was 6.9-fold smaller than that found experimentally (P<0.0001), while in the region between the two AUGs there was no significant difference between the expected and actual number of mutations found in the two subsets of clones (P=0.36). Identical conclusions were obtained using the minimum mutation frequency values for the IRES (P=0.002) and VP1-coding region (P<0.0001), but not in the region between the two AUGs (P=0.003) (Table 1) (see Discussion). Thus, in the IRES and VP1-coding regions, some mutations from the mutant spectrum of C922 p50 occur more frequently than expected from the mean values of mutation frequency.
Point replacements, insertions and deletions may reflect quasispecies memory or be predictive of dominant genomes
Virus quasispecies may contain memory genomes in the form of minority components of the mutant spectra that reflect their past evolutionary history (Ruiz-Jarabo et al., 2000). The clonal analysis of C922 p50 reinforces the previous evidence of quasispecies memory and, furthermore, reveals that some mutations may anticipate a future, dominant sequence in quasispecies evolution. Specifically, A-3653→C, which results in the amino acid replacement K-149→T in VP1, was present in two biological clones and two molecular clones from C922 p50, and also in genomes that had been dominant at passages 0 and 20, and which became dominant again at passage 100 (Escarmís et al., 1999). Mutation C-3650→A, which results in the amino acid replacement T-148→K in VP1, was found in two biological clones and two molecular clones and it became dominant at passage 100 (Escarmís et al., 1999) (compare Fig. 2 and Table 2). The clonal analysis of C922 p50 extends the previous study on memory genomes harbouring an internal polyadenylate tract as a memory marker (Ruiz-Jarabo et al., 2000) by 70 additional clones. Despite their debilitating effect on virus fitness (Escarmís et al., 1999), additional adenylate residues were detected in 17 % of the clones analysed (Table 3). Thus, at a passage in which the virus population was actively gaining fitness, sequence analysis of individual clones revealed mutations that anticipated the future evolution of the population and other mutations that were remains of its evolutionary history.
Discussion
The mutant spectrum of FMDV population C922 p50, in the exponential phase of fitness gain of a very debilitated virus (Fig. 2), has been analysed by nucleotide sequencing of 70 biological clones and 70 molecular clones. Three genomic regions have been sequenced, amounting to a total of 160160 nucleotides (Tables 1, 2 and 3). The results indicate that analyses of biological and molecular clones are equally valid experimental approaches to providing a representation of a virus quasispecies with regard to point substitutions, insertions and deletions. The initial studies that established the quasispecies structure and dynamics of RNA virus populations involved the sampling of nucleotide sequences and measurements of relative fitness of biological clones of a number of prokaryotic and eukaryotic viruses (Batschelet et al., 1976; Domingo et al., 1978; Holland et al., 1979; Spindler et al., 1982; for a review of the early work, see Domingo et al., 1988). However, the characterization of the mutant spectra of virus quasispecies is finding increasing application to the understanding of virus pathogenesis and evolution of viruses that either do not grow in cell culture or grow poorly (several human hepatitis viruses, the caliciviruses Norwalk virus and rabbit haemorrhagic disease virus, some enteric coronaviruses and papillomaviruses, among others; examples in Esteban et al., 1999; Forns et al., 1999; Flint et al., 2000; Domingo et al., 2001). For these systems, the characterization of mutant spectra may often necessitate the sequencing of molecular clones obtained after PCR or RT-PCR amplification of intracellular viral DNA or RNA. The conditions detailed in Methods using the high-fidelity Pfu DNA polymerase (Cline et al., 1996) provide a straightforward protocol to derive an accurate representation of point substitutions, indels and mutational hot spots. Under conditions of excess template (to avoid a molecular bottleneck in the amplification process), the reliability of the procedure is supported by the fact that the numbers and types of mutations and their distribution were statistically indistinguishable when biological clones and molecular clones were compared (Tables 1, 2 and 3). Although DNA polymerases of lower fidelity may also be appropriate when mutation frequencies in the target population are one or more orders of magnitude larger than in C922 p50 (examples in Nájera et al., 1995; Esteban et al., 1999; and references therein), the availability of increasing numbers of DNA polymerases with high fidelity and good processivity (Barnes, 1994; Cline et al., 1996) allows a reliable description of mutant spectra from the vast majority of populations of RNA and DNA viruses.
The genetic heterogeneity in the IRES of C922 p50 was comparable to that in the VP1-coding region (Table 1). Application of the M-fold program (included in the GCG package) indicated that 50 % of all point substitutions and the C insertion within residues 810–814 found in the IRES did not alter its predicted secondary structure, because these substitutions are located in loops or bulges (Pilipenko et al., 1989; Martínez-Salas et al., 1996). One of the molecular clones [MOL (IA)56; Table 4] included C-766→U and C-886→U, which are predicted to convert G:C into G:U base pairs in domains 3 and 4, respectively, with a total ΔG of +3.8 kcal/mol (15.9 kJ/mol). The most drastic mutation was U-799→C, present in one biological clone and one molecular clone, which is predicted to disrupt a stem in a very conserved hammerhead structure present in domain 3 with ΔG=+5.5 kcal/mol (23.0 kJ/mol) relative to the same IRES domain with the consensus nucleotide sequence. The remaining mutations are predicted to cause little modification in the secondary structure but various degrees of alteration of base pairing. The calculated differences in ΔG never exceeded 2.8 kcal/mol (11.7 kJ/mol). Mutations that are not in predicted loops or bulges in the IRES are underlined in Table 2.
The statistical analysis of the distribution of mutations among biological and molecular clones suggests that some subsets of mutations are overrepresented in the mutant spectrum of C922 p50. This observation agrees with evidence from model studies with small RNA molecules amplified by Qβ replicase, which indicate that the mutant spectra were determined primarily by selection values rather than by mutation rates (Rohde et al., 1995; Biebricher, 1999). Interestingly, a difference between the expected and actual number of mutations present in biological and molecular clones was not observed in the region between the two AUGs, provided that maximum mutation frequencies (Table 1) were used in the statistical calculation. This is consistent with the facts that functional constraints have been largely relaxed by ΔU-1056 (Escarmís et al., 1996) and that repeated mutations are the main contributors to blurring the difference between the observed and expected numbers of mutations. Either the virus replicase is more error-prone when copying some specific template residues or functional relaxation in the region between the two AUGs is non-uniform along the 84 positions. These results support the notion that virus quasispecies may enclose some sort of substructuring with regard to the abundance of subsets of mutants. This point is under further investigation.
Table 3. Insertions and deletions (indels) in the mutant spectrum of C922 p50 identified by biological and molecular cloning
BIO and MOL refer to biological and molecular clones. The numbers under BIO and MOL indicate the number of independent biological and molecular clones in which the indicated indel was found. Procedures for preparation and analysis of biological and molecular clones are detailed in Methods. Nucleotide residues are numbered according to Escarmís et al. (1999). The location in the FMDV genome of the regions analysed is depicted in Fig. 1. When the insertion of adenylate residues at positions 1119–1123 involved more than one additional adenylate, the number is given as a subscript; (A)₃₋₆ and (A)₈₋₁₃ mean that the numbers of additional adenylates in these two biological clones were heterogeneous.

| Residue(s) inserted or deleted (flanking or affected positions) | BIO | MOL |
|---|---|---|
| IRES (605–1038) | | |
| Insertion C (811, 814) | 1 | 0 |
| Between the two AUGs (1039–1123) | | |
| Deletion G (1118) | 0 | 1 |
| Insertion A (1119–1123) | 7 | 3 |
| Insertion (A)₂ (1119–1123) | 1 | 0 |
| Insertion (A)₃ (1119–1123) | 0 | 1 |
| Insertion (A)₃₋₆ (1119–1123) | 1 | 0 |
| Insertion (A)₄ (1119–1123) | 0 | 1 |
| Insertion (A)₅ (1119–1123) | 0 | 1 |
| Insertion (A)₆ (1119–1123) | 0 | 2 |
| Insertion (A)₇ (1119–1123) | 0 | 3 |
| Insertion (A)₈₋₁₃ (1119–1123) | 1 | 0 |
| Insertion (A)₁₀ (1119–1123) | 0 | 1 |
| Insertion (A)₁₁ (1119–1123) | 0 | 1 |
| Insertion (A)₂₀ (1119–1123) | 0 | 1 |
| No insertion (1119–1123) | 60 | 56 |
A second question that the analysis of the mutant spectrum of C922 p50 allows us to address is how the expected and actual numbers of genomes harbouring one, two, three, four or five mutations compare (Table 4). Considering the analysis of biological clones for the IRES and the VP1-coding region (a length of 1061 nucleotides), the maximum mutation frequency is $6.5 \times 10^{-4}$ substitutions per nucleotide, which gives an expected mean number of mutations within these regions of 0.69 ($6.5 \times 10^{-4} \times 1061$). The expected proportions of clones with no mutations or one, two, three, four or five mutations in the 1061 nucleotides analysed in biological clones are respectively 50, 35, 12, 2.7, 0.46 and 0.07 % (calculated according to the Poisson distribution $P_K = m^K e^{-m}/K!$, where $P_K$ is the probability of a genome having $K$ mutations and $m$ is the mean number of mutations per genome); the actual experimental values were respectively 50, 37, 10, 1.4, 0 and 1.4 %. The same calculation for the VP1-coding region among molecular and biological clones indicates that the predicted proportions of sequences with no mutations or one, two, three, four or five mutations are 62, 30, 7.3, 1.2, 0.14 and 0.01 %; the actual values were 64, 26, 7.9, 2.1, 0 and 0 %. In both cases, there is good agreement between the expected and actual distribution of mutations among the clones analysed (0.7<P<0.9; χ², 1 degree of freedom), further supporting the conclusion that molecular and biological clones provided an indistinguishable representation of the C922 p50 quasispecies.
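The Poisson calculation for the biological clones can be reproduced with a few lines of Python. This is a minimal sketch that uses only the numbers quoted in the text (the maximum mutation frequency, the 1061-nucleotide length and the observed percentages); the function and variable names are illustrative and do not come from the original study, and the sketch does not attempt to reproduce the χ² test.

```python
import math


def poisson_proportions(mean, max_k):
    """Return [P_0, ..., P_max_k] for a Poisson distribution with the given mean."""
    return [mean**k * math.exp(-mean) / math.factorial(k) for k in range(max_k + 1)]


# Values quoted in the text for biological clones (IRES plus VP1-coding region).
max_mutation_frequency = 6.5e-4  # substitutions per nucleotide
region_length = 1061             # nucleotides analysed
mean_mutations = max_mutation_frequency * region_length

expected = poisson_proportions(mean_mutations, 5)
observed_percent = [50, 37, 10, 1.4, 0, 1.4]  # experimental values from the text

for k, (p, obs) in enumerate(zip(expected, observed_percent)):
    print(f"K={k}: expected {100 * p:.2f} %, observed {obs} %")
```

Small differences in the last digit relative to the published percentages may arise from rounding of the mean before the proportions are computed.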
Analysis of the mutant spectrum of C922 p50 has detected the presence of memory genomes in virus quasispecies, documented previously in this and in another evolutionary lineage of FMDV (reviewed in Domingo, 2000; Ruiz-Jarabo et al., 2000). The memory markers identified were the replacement K-149→T in VP1 (Table 2) and the heterogeneous internal polyadenylate tract dominant in C922 p0, which contained an average of 19 additional adenylate residues (Escarmís et al., 1996) (Fig. 1). One molecular clone from C922 p50 included 20 additional adenylates (Table 3) and thus represents an accurate memory genome of the ancestral C922 p0 population, while other genomes with smaller numbers of adenylates must be regarded as derivatives of the founder genomes.
Additions or deletions of adenylate residues within positions 1119–1122, the hot-spot G-1118→A transition and the rare deletion of G-1118 can occur as a result of misalignment of the growing RNA strand or the template strand (Ripley, 1990) during viral RNA synthesis (Fig. 3). Misalignment mutagenesis events tend to occur in repeated sequences (Streisinger et al., 1966; Ripley, 1990; Denver et al., 2000; Funchain et al., 2000) and they may be favoured by the low stability of the poly(A)·(U) duplex. Homopolymeric poly(A)·poly(U) displays a low melting temperature and a tendency to form a triple helix as the salt concentration is raised (Saenger, 1984; and references therein). Other molecular mechanisms could also account for the repertoire of point substitutions and indels in C922 p50 (Ripley, 1990). The presence of U-1056 prevents the expression of Lab and hence may explain both the higher mutation frequency observed in the region located between the two AUG initiation codons and the high frequency of the transition G-1118→A. This mutation would lead, in the absence of U-1056, to the replacement G-27→E in protease Lab. Since G-27 in Lab is conserved among European FMDV isolates (Ryan & Flint, 1997), its substitution by E may be deleterious when Lab is functional.
Fig. 3. Model of misalignment mutagenesis to explain insertions and deletions in the genomic region preceding the second functional AUG in the mutant spectrum of C922 p50. Sequences and residue numbers are from Escarmís et al. (1996, 1999).
A. (D) Misalignment of the template plus-strand, once uridine residues in the minus strand have been incorporated up to position 1119, may lead to deletion of G-1118. Note that this is an unlikely event, due to the necessity of the misalignment occurring when the minus strand has grown to a specific position, while in (A)–(C), a number of alternative misalignment events during minus-strand synthesis can lead to insertion of A, deletions or the G-1118→A transition. The schemes are based on mechanisms proposed previously and reviewed by Ripley (1990).
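The way in which strand misalignment within a homopolymeric run produces insertions or deletions can be illustrated with a toy model. The sketch below is an assumption-laden simplification and not the scheme of Fig. 3: the template sequence is hypothetical (it is not the FMDV sequence), the product is written in the same sense as the template instead of as its complement, and a slip is accepted only when the realigned 3' end still faces the same residue type.

```python
def copy_with_slippage(template: str, slip_after: int, slip: int) -> str:
    """Copy `template` 5'->3' with one misalignment event.

    After `slip_after` residues have been copied, the growing strand is
    realigned by `slip` positions: a positive value moves it backwards so
    that residues are copied again (insertion), a negative value moves it
    forwards so that residues are skipped (deletion).
    """
    if not 0 < slip_after <= len(template):
        raise ValueError("slip_after must lie within the template")
    last = slip_after - 1                   # template index paired with the 3' end
    lo, hi = sorted((last, last - slip))    # span crossed by the realignment
    if lo < 0 or hi >= len(template):
        raise ValueError("slip runs off the template")
    if len(set(template[lo:hi + 1])) != 1:
        raise ValueError("realigned 3' end would be mispaired")
    resume = slip_after - slip              # next template index to be copied
    return template[:slip_after] + template[resume:]


toy_template = "CCGAAAAACC"  # hypothetical sequence with a run of five A residues

print(copy_with_slippage(toy_template, slip_after=6, slip=1))   # one extra A
print(copy_with_slippage(toy_template, slip_after=6, slip=-1))  # one A fewer
```

In this model a slip outside the run is rejected, which mirrors the observation that misalignment events tend to occur in repeated sequences.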
The mutant spectrum of C922 p50 also revealed two biological clones and two molecular clones with the replacement T-148→K, which predicted the dominance of this replacement in C922 p100 (Fig. 1; Table 2). The replacement T-148→K has previously been shown to occur in the course of passaging of FMDV C-S8c1 in BHK-21 cells (Díez et al., 1989; Borrego et al., 1993; Sevilla et al., 1996). Therefore, it could be argued that its presence in the mutant spectrum is fortuitous and unrelated to the dominance of this substitution in C922 p100. Although definitive proof that this replacement had a predictive value has not been obtained, other substitutions that have similarly been found upon passage of C-S8c1 in BHK-21 cells were not detected among the 140 clones analysed; such VP1 substitutions are Q-23→K, S-139→R, A-140→V, L-144→V, A-145→P, T-148→A, T-149→A, T-149→K and T-150→K (Díez et al., 1989; Borrego et al., 1993; Sevilla et al., 1996). None of these replacements, with the exception of T-148→K and T-148→A (which are mutually exclusive as dominant in an FMDV population), was represented among the 13 amino acid replacements in VP1 scored in the mutant spectrum of C922 p50 (Table 2). The biological clones BIO 12 and BIO 31 (Table 4) included C-3650→A (T-148→K) and A-3653→C (K-149→T) and predicted the dominance of K-148 and T-149 in C922 p100. The same biological clones, however, did not include K-150, which also became dominant in C922 p100. Two consecutive lysine residues have never been observed within VP1 residues 148–150 in consensus sequences or in individual clones, even after strong selection with polyclonal antibodies directed specifically to this VP1 loop (Borrego et al., 1993). Such a K duplex in a helical region of the GH loop of VP1 is likely to be deleterious for integrin recognition (Mason et al., 1994; Berinstein et al., 1995; Verdaguer et al., 1995). Therefore, it is likely that some components of a mutant spectrum have predictive value for the genomes that will become dominant later in the evolutionary lineage. This interesting possibility will be investigated further.
Table 4. Biological and molecular clones from the mutant spectrum of C922 p50 that contained more than one mutation
BIO and MOL indicate biological and molecular clones. Because the IRES and the region between the two AUGs were cloned in E. coli independently of the VP1-coding region, clones with two or more mutations had to be counted separately; this is indicated by MOL (IA) (molecular clones of the IRES and the region between the two AUGs) and MOL (VP) (molecular clones of the VP1-coding region). A total of 70 biological clones, 70 MOL (IA) clones and 70 MOL (VP) clones was analysed. Procedures are detailed in Methods. Residues are numbered according to Escarmís et al. (1996). All mutations found in the clones analysed are listed in Table 2.
| Clone | Mutations |
| BIO 2 | C3294U, C3796U |
| BIO 7 | G724A, U3606C |
| BIO 10 | C1077U, U3491C, A3822G |
| BIO 11 | U1008C, A1078G |
| BIO 12 | U612C, U1028C, U3432C, C3650A, A3653C |
| BIO 13 | U914C, G1087A |
| BIO 15 | G1118A, A3775C |
| BIO 16 | C3291U, G3780A |
| BIO 20 | C3726A, G3762A/g |
| BIO 24 | C706U, A1039C |
| BIO 29 | U799C, G1118A, C3354U |
| BIO 31 | C3650A, A3653C, G3697A |
| BIO 35 | A650G, C3750U |
| BIO 40 | G1118A, C3525A |
| BIO 42 | G1118A, A3347G |
| BIO 47 | G1118A, U3278C |
| BIO 62 | G1069A, C3645U |
| BIO 66 | G1118A, A3649G |
| BIO 67 | U1038C, G1091A |
| BIO 68 | G1069A, G3592U |
| MOL (IA)2 | U799C, A1107G |
| MOL (IA)10 | G676A, G1118A |
| MOL (IA)21 | G1069A, G1118A |
| MOL (IA)26 | U914C, A1082G |
| MOL (IA)52 | G1069A, A1122G |
| MOL (IA)56 | C766U, C886U, G1118A |
| MOL (IA)61, 63 | A701G, G1118A |
| MOL (IA)70 | A1039C, G1114A |
| MOL (VP)14 | U3278C, G3801A |
| MOL (VP)21 | U3432C, C3650A, A3653C |
| MOL (VP)24 | U3402C, G3801A |
| MOL (VP)27 | U3405C, G3697C |
| MOL (VP)28, 40 | C3291U, G3780A |
| MOL (VP)46 | C3354U, C3662U |
| MOL (VP)69 | C3650A, A3653C |
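The recurrence of individual mutations among the clones of Table 4 can be tallied directly from the table. The following sketch transcribes the table (rows listing two clones are expanded to one row per clone) and counts, for each mutation, the number of biological and molecular clones that carry it; the helper names are illustrative. Because Table 4 lists only clones with more than one mutation, the counts are lower bounds on the frequencies in the full mutant spectrum.

```python
from collections import Counter

TABLE_4 = """
BIO 2: C3294U, C3796U
BIO 7: G724A, U3606C
BIO 10: C1077U, U3491C, A3822G
BIO 11: U1008C, A1078G
BIO 12: U612C, U1028C, U3432C, C3650A, A3653C
BIO 13: U914C, G1087A
BIO 15: G1118A, A3775C
BIO 16: C3291U, G3780A
BIO 20: C3726A, G3762A/g
BIO 24: C706U, A1039C
BIO 29: U799C, G1118A, C3354U
BIO 31: C3650A, A3653C, G3697A
BIO 35: A650G, C3750U
BIO 40: G1118A, C3525A
BIO 42: G1118A, A3347G
BIO 47: G1118A, U3278C
BIO 62: G1069A, C3645U
BIO 66: G1118A, A3649G
BIO 67: U1038C, G1091A
BIO 68: G1069A, G3592U
MOL (IA)2: U799C, A1107G
MOL (IA)10: G676A, G1118A
MOL (IA)21: G1069A, G1118A
MOL (IA)26: U914C, A1082G
MOL (IA)52: G1069A, A1122G
MOL (IA)56: C766U, C886U, G1118A
MOL (IA)61: A701G, G1118A
MOL (IA)63: A701G, G1118A
MOL (IA)70: A1039C, G1114A
MOL (VP)14: U3278C, G3801A
MOL (VP)21: U3432C, C3650A, A3653C
MOL (VP)24: U3402C, G3801A
MOL (VP)27: U3405C, G3697C
MOL (VP)28: C3291U, G3780A
MOL (VP)40: C3291U, G3780A
MOL (VP)46: C3354U, C3662U
MOL (VP)69: C3650A, A3653C
"""


def parse_clones(table):
    """Map each clone name to its list of mutations."""
    clones = {}
    for line in table.strip().splitlines():
        name, mutations = line.split(":", 1)
        clones[name.strip()] = [m.strip() for m in mutations.split(",")]
    return clones


def recurrence(clones):
    """Count clones carrying each mutation, keyed by (mutation, 'BIO' or 'MOL')."""
    counts = Counter()
    for name, mutations in clones.items():
        kind = name.split()[0]
        for mutation in mutations:
            counts[(mutation, kind)] += 1
    return counts


clones = parse_clones(TABLE_4)
counts = recurrence(clones)
for mutation in ("C3650A", "A3653C", "G1118A"):
    print(mutation, "BIO:", counts[(mutation, "BIO")], "MOL:", counts[(mutation, "MOL")])
```

The tally for C3650A (T-148 to K) matches the two biological and two molecular clones mentioned in the text, and G1118A stands out as the most frequently repeated mutation in the table.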
In conclusion, a reliable characterization of the
mutant spectrum of a virus quasispecies at the nucleotide sequence level
can be obtained through the analysis of biological clones or molecular
clones. The results revealed the great complexity of a mutant spectrum of
a clonal population in the process of fitness gain in a constant
biological environment. The distribution of mutations provided evidence of
substructuring within the mutant spectrum of the C922
p50 quasispecies. Some mutants reflected the past evolutionary history of
the population, while other mutants anticipated those that will become
dominant at a later stage of the evolutionary process. This observation
may be of practical relevance, in that the quantification of mutations
related to variations in B cell or T cell epitopes and to resistance to
antiviral agents, which may be present in different proportions in the
mutant spectrum, may guide decisions on alternative immunotherapeutic or
antiviral regimens. There is increasing evidence that fitness values,
virus load and quasispecies complexity may be relevant to the pathogenic
potential of viruses (Rowe et al., 1997; Farci et al., 2000; Quiñones-Mateu et al., 2000) and to the response of an infected host to
antiviral treatment (Pawlotsky et al., 1998). The present study encourages quasispecies composition
analyses at the nucleotide sequence level for diagnostic and therapeutic
purposes.
We are indebted to J. Perez-Mercader for support, to M. Dávila and G. Gómez-Mariano for expert technical assistance and to F. J. Doblas-Reyes for help with statistical procedures. Work at the CBMSO was supported by grants FIS 98/0054-01 and PM 97-0060-C02-01 and Fundación R. Areces. Work at CAB was supported by the EU and INTA. A.A. was supported by fellowships from UAM and CAM.
References
Chao, L. (1990). Fitness of RNA virus decreased by Muller's ratchet. Nature 348, 454–455.
Domingo, E. (2000). Viruses at the edge of adaptation. Virology 270, 251–253.
Nowak, M. A. (1992). What is a quasispecies? Trends in Ecology & Evolution 4, 118–121.
Saenger, W. (1984). Principles of Nucleic Acid Structure. New York: Springer.
© 2001 SGM
This article is now available in the May 2001 print issue of JGV (vol. 82, 1049–1060). The complete issue of the journal may be seen in electronic form on JGV Online.