Abstract
Background
Bacteria of the genus Staphylococcus cause a wide range of infections, and the genes they carry play a pivotal role in determining their pathogenicity. In this study, we characterized open reading frames (ORFs), which represent potential functional gene sequences, from selected staphylococcal genomes.

Methods
We extracted, categorized, and annotated ORFs using several analytical methods. To assess the conservation of these ORFs and their relevance to pathogenicity, we used tblastn searches and multiple sequence alignment (MSA) with Clustal Omega.

Results
This approach revealed ORFs specific to pathogenic species, ORFs specific to non-pathogenic species, and some shared by both. We identified 23 ORFs that were highly conserved among pathogenic staphylococci, five of which were also conserved beyond the genus Staphylococcus. These ORFs may encode products associated with RNA catabolism and could function as regulatory small open reading frames (smORFs). Of particular interest, we found a single smORF situated within a conserved locus of the 50S ribosomal protein L1, present in 200 genomes, including 102 pathogenic strains.

Conclusions
Our findings highlight ORFs with highly conserved elements and suggest 23 novel smORFs that may play a role in the pathogenicity of Staphylococcus species.
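As an illustration of what ORF extraction involves, the following is a minimal Python sketch of a six-frame ORF scan. It is not the pipeline used in this study, whose tools and parameters are not given in the abstract. The function names (`find_orfs`, `is_smorf`), the ATG-only start codon, the minimum-length cutoff, and the 100-codon upper bound for calling an ORF a smORF are all assumptions made for the example.

```python
# Illustrative sketch only: a naive six-frame ORF scan (assumed ATG start,
# standard stop codons). Not the method used in the study.

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=10):
    """Return a list of (strand, frame, start, end, orf_seq) tuples.

    An ORF here runs from the first ATG in a frame to the next in-frame
    stop codon (stop included in orf_seq). start and end are 0-based,
    end-exclusive offsets on the scanned strand. min_codons counts
    codons excluding the stop and is a hypothetical cutoff.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None
    return orfs


def is_smorf(orf_seq, max_codons=100):
    """Flag an ORF as 'small' using an assumed 100-codon upper bound."""
    n_codons = len(orf_seq) // 3 - 1  # exclude the stop codon
    return n_codons <= max_codons


if __name__ == "__main__":
    toy = "CC" + "ATG" + "AAA" * 3 + "TAA" + "GG"  # toy sequence, not real data
    for orf in find_orfs(toy, min_codons=3):
        print(orf, "smORF" if is_smorf(orf[4]) else "ORF")
```

Candidate ORFs produced by a scan like this would then be translated and compared against other genomes, for example with tblastn, to judge conservation. Real annotation pipelines also handle alternative start codons, ambiguous bases, and overlapping ORFs, which this sketch ignores.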