https://github.com/cov-lineages/pango-designation/issues/1651
ryhisner
commented
Feb 13, 2023
Description
Sub-lineage of: XBF
Earliest sequence: 2023-1-10, Australia, New South Wales — EPI_ISL_16584770
Most recent sequence: 2023-2-5, Australia, South Australia — EPI_ISL_16902443
Countries circulating: Australia (15), Singapore (2)
Number of Sequences: 17
GISAID Query: Spike_F486P, Spike_P621S, NSP1_K120N
CovSpectrum Query: Nextcladepangolineage:XBF* & [5-of: T3442C, T9931C, C12970T, C13255T, C22000T, C23423T, T25039C]
Substitutions on top of XBF:
Spike: P621S
Nucleotide: T3442C, T9931C, C12970T, C13255T, C22000T, C23423T, T25039C
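As I understand the `[5-of: ...]` clause in the CovSpectrum query, it matches any sequence carrying at least five of the seven listed nucleotide substitutions, which tolerates a couple of dropouts per sequence. A minimal Python sketch of that logic, assuming each sequence has already been reduced to a set of substitution strings. The function name `matches_n_of` is hypothetical and nothing here calls CovSpectrum itself:

```python
# Nucleotide substitutions listed in this proposal (on top of XBF).
LINEAGE_MUTATIONS = frozenset(
    {"T3442C", "T9931C", "C12970T", "C13255T", "C22000T", "C23423T", "T25039C"}
)


def matches_n_of(sample_mutations, required=LINEAGE_MUTATIONS, n=5):
    """Return True if the sample carries at least n of the required mutations.

    Hypothetical helper: an illustration of "n-of" matching, not CovSpectrum code.
    """
    return len(required & set(sample_mutations)) >= n


# A sequence missing two of the seven (e.g. due to dropout) still matches.
partial = {"T3442C", "T9931C", "C12970T", "C13255T", "C22000T"}
print(matches_n_of(partial))

# A sequence sharing only C23423T does not.
print(matches_n_of({"C23423T", "A23989G"}))
```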
Evidence
For reasons unknown, S:P621S has become a mutational hotspot of late,
particularly in lineages and individual sequences with other notable
mutations. It seems meaningful, though I have no idea what is behind it.
This lineage only recently appeared but seems to be growing quickly.
Whether that is due to chance or an isolated outbreak or because it
possesses some advantage isn't yet clear, but I think it bears watching.
There are an unusual number of synonymous mutations in this lineage,
all C->T or T->C.
Genomes
FedeGueli
I think two new ones have been added from England! It seems really fast, though unluckily
they may be unrelated. @ryhisner, could you check them: https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_3be10_a668b0.json?c=userOrOld&label=id:node_8273022
ryhisner
commented
Feb 13, 2023
Oh, wow, nice spot, Fede! It looks like there are actually six sequences
from England altogether, though four of them are not yet on GISAID.
Usher puts the English sequences on a separate branch from the
Australian ones, but I'm almost certain they really belong together. The
giveaway is that both branches have the synonymous mutation C23423T.
Hard to believe that's a coincidence. The four artifactual "reversions"
(T3796C, T3927C, T4586C, T5183C) that appear on every uploaded XBF
sequence (but not on the other sequences on the tree) might be mucking
things up somehow.
AngieHinrichs
commented
Feb 14, 2023
Those two branches have C23423T (S:P621S, non-synonymous) in common, but they each have several other mutations after XBF with no overlap aside from C23423T. The branch with most of the uploaded sequences is
XBF > C12970T > T9931C > T3442C,C22000T > C13255T,C23423T,T25039C
and the branch with the two sequences from England with C23423T (S:P621S) is
XBF > C24378T > G3728T > C23423T,A23989G
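The overlap between the two paths can be checked mechanically. This is only a sketch that parses the two path strings exactly as written above; `path_mutations` is a hypothetical helper, not part of UShER:

```python
def path_mutations(path):
    """Split an 'XBF > A > B,C' style path into its set of mutations.

    The first element is the root lineage label (XBF), not a mutation.
    """
    nodes = [node.strip() for node in path.split(">")]
    mutations = set()
    for node in nodes[1:]:
        mutations.update(m.strip() for m in node.split(","))
    return mutations


uploaded_branch = "XBF > C12970T > T9931C > T3442C,C22000T > C13255T,C23423T,T25039C"
england_branch = "XBF > C24378T > G3728T > C23423T,A23989G"

shared = path_mutations(uploaded_branch) & path_mutations(england_branch)
print(sorted(shared))  # ['C23423T']
```

The two paths share only C23423T; the other six mutations on the first path and the other three on the second do not overlap.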
(The four artifactual "reversions" (T3796C, T3927C, T4586C, T5183C) that
appear on every uploaded XBF sequence (but not on the other sequences on
the tree) might be mucking things up somehow).
Sorry about the reversions. Those are true for XBF (because that part of it comes from BA.5.2 not BA.2.75), but are masked out of the entire BA.2.75 branch of the UShER tree (where XBF is placed) because false reversions were an awful problem with BA.2.75 sequences. The masking on the BA.2.75 branch (but not in uploaded sequences) mucks up the details of how sequences are placed in the web interface relative to existing sequences in the tree, but should not affect which existing branches the sequences are placed on within XBF (because all of the BA.2.75 branch has those sites masked).
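To illustrate why the masking makes uploaded sequences look different from their neighbours in the tree, here is a rough Python sketch. It assumes the mask is just the four positions named in this thread and that mutations are plain strings; the real masking is driven by the UShER pipeline's own specification, which this does not reproduce:

```python
# Positions taken from the four "reversions" discussed in this thread.
# Illustrative stand-in only; the real branch-specific mask may cover more sites.
MASKED_SITES = {3796, 3927, 4586, 5183}


def position(mutation):
    """Extract the genome position from a string like 'T3796C'."""
    return int(mutation[1:-1])


def apply_mask(mutations, masked_sites=MASKED_SITES):
    """Drop mutations at masked positions, as happens for sequences in the tree."""
    return [m for m in mutations if position(m) not in masked_sites]


# An uploaded sequence keeps every call, including the four masked sites.
uploaded = ["T3796C", "T3927C", "T4586C", "T5183C", "C23423T"]

# The same sequence, once in the tree, would have those sites masked.
in_tree = apply_mask(uploaded)

# These show up as extra differences only on the uploaded sequence.
only_on_uploaded = sorted(set(uploaded) - set(in_tree), key=position)
print(in_tree)
print(only_on_uploaded)
```

Because every sequence on the masked branch loses the same four sites, the mask alone gives no reason to prefer one existing branch over another when placing an upload.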
BTW there is a new, hopefully more readable, source of info about branch-specific masking in the UShER tree -- instead of a bash script that just performed the masking, there is now a YAML specification of the masking (and a separate script that performs the masking according to the spec):
2 days ago Label: recombinant sublineage
-----
oobb45729
commented
Feb 15, 2023
S:P621S may be about immune escape. Initially I had some doubts about it, since P621 is not in the main
epitope. However, a closer look reveals that the growth of P621S or
P621H follows a pattern similar to that of E554K or E554V.
E554K
-----
ryhisner
commented
Feb 15, 2023
@AngieHinrichs, thanks for the insight on the tree! My assumption was that the South Australian sequences in particular are so shabby as to make branches that belong together appear as if they don't belong together. Possibly the tree is different now, but when I last checked, it showed four independent acquisitions of ORF1ab:N4899I on this lineage's branch as well as at least two independent acquisitions of ORF1a:L3715N and N:L139F, which seems unlikely. But since South Australia apparently doesn't show where there is missing coverage in their sequences, putting those branches together would probably require positing multiple reversions, which is even more unlikely.
Are there any efforts to standardize certain ways of reporting sequences? I imagine it would make things a lot clearer—and make your job a lot easier—if all labs indicated where their sequences lack coverage instead of reverting to the reference genome in those places, producing multitudes of artifactual reversions. The public health lab in the state of Utah in the USA is probably the worst offender in this regard. Sometimes I wish there was a way to screen out sequences from certain countries (Pakistan, Turkey, Chile, etc) or regions (Utah, Louisiana, Kerala) from GISAID/Covspectrum searches and Usher trees.
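A toy Python example of the difference between the two reporting styles. The ten-base "genome", the two lineage mutations, and both helper names are made up for illustration and do not correspond to any real lab pipeline:

```python
def build_consensus(true_sequence, reference, uncovered, fill_with_reference):
    """Build a consensus where `uncovered` holds 1-based positions lacking coverage.

    Uncovered positions become the reference base or 'N', depending on the flag.
    """
    out = []
    for i, base in enumerate(true_sequence, start=1):
        if i in uncovered:
            out.append(reference[i - 1] if fill_with_reference else "N")
        else:
            out.append(base)
    return "".join(out)


def apparent_reversions(lineage_mutations, sample, reference):
    """Lineage mutations whose position shows the reference base in the sample."""
    reverted = []
    for m in lineage_mutations:
        pos = int(m[1:-1])
        if sample[pos - 1] == reference[pos - 1]:
            reverted.append(m)
    return reverted


reference = "ACGTACGTAC"
true_seq = "ATGTACGCAC"          # toy lineage carrying C2T and T8C
lineage = ["C2T", "T8C"]
uncovered = {7, 8, 9}            # no reads over these positions

with_n = build_consensus(true_seq, reference, uncovered, fill_with_reference=False)
with_ref = build_consensus(true_seq, reference, uncovered, fill_with_reference=True)

print(apparent_reversions(lineage, with_n, reference))    # []
print(apparent_reversions(lineage, with_ref, reference))  # ['T8C']
```

With N at the uncovered positions, T8C is simply unknown; with reference fill, the same sequence appears to have reverted T8C.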