Author:
Chen Xiangzhong, Wang Mingzhao, Liu Xinglin, Zhang Wenjie, Yan Huan, Lan Xiang, Xu Yandi, Tang Sanyi, Xie Juanying
Abstract
To explore the differences and relationships among the available SARS-CoV-2 strains and to predict their potential evolutionary direction, we apply hierarchical clustering analysis to the genomic sequences collected in China up to January 7, 2023. We encode the sequences of the existing SARS-CoV-2 strains as numerical data using the k-mer algorithm, then propose four methods for selecting a representative sample from each strain type to form the dataset for clustering analysis. Three hierarchical clustering algorithms, named Ward-Euclidean, Ward-Jaccard, and Average-Euclidean, are obtained by combining the Euclidean and Jaccard distances with the Ward and Average linkage clustering algorithms embedded in the OriginPro software. Experimental results reveal that the BF.28, BE.1.1.1, BA.5.3, and BA.5.6.4 strains exhibit distinct characteristics that are not observed in other types of SARS-CoV-2 strains, suggesting that they are the most likely sources from which future SARS-CoV-2 strains will evolve. Moreover, the BA.2.75, CH.1.1, BA.2, BA.5.1.3, BF.7, and B.1.1.214 strains demonstrate enhanced immune evasion, transmissibility, and pathogenicity. Hence, closely monitoring the evolutionary trends of these strains is crucial to mitigating their impact on public health and society as far as possible.
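The pipeline described in the abstract (k-mer encoding, a distance measure, then hierarchical clustering) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' implementation: the abstract states that the clustering was run in OriginPro, and it does not give the value of k, so `k=3`, the toy sequences, and all function names below are assumptions made only for illustration. Only average linkage is shown; Ward linkage is omitted for brevity.

```python
from itertools import product
from math import sqrt


def kmer_vector(seq, k=3, alphabet="ACGT"):
    """Normalised k-mer frequencies in fixed lexicographic order (k=3 is an assumption)."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
    total = sum(counts.values()) or 1
    return [counts[m] / total for m in kmers]


def euclidean(u, v):
    """Euclidean distance between two frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def jaccard(u, v):
    """Jaccard distance computed on k-mer presence/absence."""
    a = {i for i, x in enumerate(u) if x > 0}
    b = {i for i, x in enumerate(v) if x > 0}
    union = a | b
    return 1 - len(a & b) / len(union) if union else 0.0


def average_linkage(vectors, dist=euclidean):
    """Naive agglomerative clustering; returns the merge history as
    (members_a, members_b, distance) tuples, closest pair first."""
    clusters = [(i,) for i in range(len(vectors))]
    merges = []
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[x] for j in clusters[y]]
                d = sum(dist(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((clusters[x], clusters[y], d))
        merged = clusters[x] + clusters[y]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (x, y)]
        clusters.append(merged)
    return merges


# Toy sequences (hypothetical, not real SARS-CoV-2 genomes).
seqs = ["ACGTACGTACGT", "ACGTACGTACGA", "TTTTGGGGCCCC"]
vecs = [kmer_vector(s) for s in seqs]
merges = average_linkage(vecs, dist=euclidean)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

Passing `dist=jaccard` instead of `dist=euclidean` gives an average-linkage variant on Jaccard distance. For real genomes a library implementation (for example SciPy's hierarchical clustering routines) would be far more efficient than this quadratic-per-merge loop.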
Funder
National Natural Science Foundation of China
Publisher
Springer Science and Business Media LLC