
Genome Sequencing, Annotation & Structural Analysis of Bacteriophage Renkei

First identification of a small terminase in the B4 phage subcluster, confirmed through AlphaFold2 structural modeling.


Overview

Building on phage discovery work from Year 1, this project moved into the deeper challenge of understanding what a phage does at the molecular level. Working within a multi-institution collaborative team as part of the SEA-PHAGES program, I led genome annotation efforts and took full ownership of protein structural modeling for Mycobacterium smegmatis phage Renkei — a novel siphovirus isolated from a South Florida soil sample.


PhagesDB entry: Renkei ↗

MRA Abstract (SEA-PHAGES) ↗


The work produced a complete 97-gene genome annotation and, most significantly, the structural identification of gp65 as a small terminase — a DNA packaging protein never previously reported in any B4 subcluster phage. These findings are documented in a manuscript submitted to Microbiology Resource Announcements (MRA) and a research poster co-presented at the SEA-PHAGES national symposium.


Key Findings:

Gene 65 (gp65) was identified as the first small terminase in the B4 phage subcluster. Using AlphaFold2 and ChimeraX, I modeled its monomer (revealing a conserved helix-turn-helix DNA-binding motif) and its nonamer assembly (a ring with a large central channel) — the canonical quaternary architecture of functional small terminases. This structural evidence directly supports its functional classification and expands our understanding of DNA packaging evolution in this phage group.





Objectives

  • Sequence and fully annotate the Renkei genome using a multi-tool bioinformatics pipeline

  • Assign putative functions to gene products through structural homology and sequence analysis

  • Investigate gp65, a hypothetical protein with no known analog in B4 phages, using AlphaFold2

  • Model gp65 as both monomer and nonamer to assess structural homology to canonical small terminases

  • Contribute findings to peer-reviewed literature as an MRA co-author


Methods

Step 1: Genome Sequencing & Assembly

Phage DNA was extracted and sequenced on the Illumina platform (single-end, 100 bp reads) at approximately 465× shotgun coverage. The reads assembled into a 70,746 bp genome with circularly permuted ends and 69.0% GC content. Renkei is assigned to the B4 subcluster.
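The coverage and GC figures above follow from simple arithmetic, sketched here in Python. The read count is not reported in this write-up; it is only an estimate derived from the stated coverage, read length, and genome length, and the function names and toy sequence are illustrative assumptions.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def reads_for_coverage(genome_length: int, read_length: int, coverage: float) -> int:
    """Estimate read count from coverage = reads * read_length / genome_length."""
    return round(coverage * genome_length / read_length)


# Values reported above for Renkei.
GENOME_LENGTH = 70_746   # bp
READ_LENGTH = 100        # bp, single-end
COVERAGE = 465           # approximate fold coverage

# Derived estimate only; the actual read count is not stated in the source.
estimated_reads = reads_for_coverage(GENOME_LENGTH, READ_LENGTH, COVERAGE)

# Toy sequence to show the GC calculation; this is not Renkei sequence.
toy_gc = gc_content("GGCCGGCATA")
```

Under these assumptions, 465× coverage of a 70,746 bp genome with 100 bp reads corresponds to roughly 329,000 reads.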

Step 2: Genome Annotation

I coordinated gene start prediction using Glimmer and GeneMark within PECAAN, then systematically evaluated all 97 predicted genes using NCBI-BLAST, HHpred, the Conserved Domain Database, Phamerator, and PhagesDB. Of 97 genes, 33 received putative functional annotations. No tRNA genes, integrase, or immune repressor genes were detected — consistent with a lytic lifecycle.
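Part of evaluating gene starts is spotting where the two gene callers disagree. The sketch below shows one way that comparison could be expressed; it is not how PECAAN works internally, and the dictionaries, coordinates, and function name are invented for illustration.

```python
from typing import Dict, List, Optional, Tuple

StartCalls = Dict[int, int]  # gene number -> predicted start coordinate


def flag_start_disagreements(
    glimmer: StartCalls, genemark: StartCalls
) -> List[Tuple[int, Optional[int], Optional[int]]]:
    """List genes whose start differs between callers or is missing from one."""
    flagged = []
    for gene in sorted(set(glimmer) | set(genemark)):
        g_start = glimmer.get(gene)
        m_start = genemark.get(gene)
        if g_start != m_start:
            flagged.append((gene, g_start, m_start))
    return flagged


# Invented coordinates for illustration only; not Renkei data.
glimmer_calls = {1: 120, 2: 980, 3: 2105}
genemark_calls = {1: 120, 2: 1004, 4: 3310}

# Genes 2, 3, and 4 would need manual review against BLAST, HHpred, etc.
needs_review = flag_start_disagreements(glimmer_calls, genemark_calls)

# Fraction of genes with a putative function, from the counts reported above.
annotated_fraction = 33 / 97
```

With the reported counts, 33 of 97 genes is about 34% of the genome carrying a putative function.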

Step 3: gp65 Monomer Structural Modeling (AlphaFold2 + ChimeraX)

I generated a predicted 3D structure of the gp65 monomer using AlphaFold2, then superimposed it onto canonical small terminase structures in ChimeraX. The model revealed a conserved helix-turn-helix (HTH) motif at the N-terminus, the defining structural feature of DNA-binding small terminase subunits.
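A common sanity check before interpreting a predicted motif is to look at per-residue confidence. AlphaFold2 writes its pLDDT score into the B-factor column of the output PDB file, so it can be read with the standard library. This write-up does not report pLDDT values; the sketch below is an assumed follow-up check, and the toy records, helper names, and the threshold of 70 (the usual boundary for "confident" predictions) are illustrative choices.

```python
from statistics import mean
from typing import Dict, Iterable, List


def plddt_per_residue(pdb_lines: Iterable[str]) -> Dict[int, float]:
    """Map residue number to pLDDT, read from the B-factor column of CA atoms."""
    scores = {}
    for line in pdb_lines:
        # PDB is fixed-width: atom name cols 13-16, resSeq 23-26, B-factor 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def low_confidence_residues(scores: Dict[int, float], threshold: float = 70.0) -> List[int]:
    """Return residue numbers whose pLDDT falls below the threshold."""
    return [res for res, score in sorted(scores.items()) if score < threshold]


def toy_ca_record(serial: int, res_name: str, res_num: int, plddt: float) -> str:
    """Build a fixed-width CA ATOM record with dummy coordinates (demo only)."""
    return (f"ATOM  {serial:>5}  CA  {res_name} A{res_num:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}")


# Invented records; a real run would read lines from the AlphaFold2 output file.
toy_model = [
    toy_ca_record(1, "MET", 1, 91.25),
    toy_ca_record(2, "LYS", 2, 88.50),
    toy_ca_record(3, "THR", 3, 72.00),
    toy_ca_record(4, "ALA", 4, 45.50),
]

scores = plddt_per_residue(toy_model)
mean_plddt = mean(scores.values())
flagged = low_confidence_residues(scores)
```

Applied to a real model, the same functions could be restricted to the N-terminal residues to check that the HTH region is predicted with high confidence before it is compared with known small terminases.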

Step 4: gp65 Nonamer Assembly Modeling

To assess oligomeric behavior, I modeled gp65 as a nonamer (9-subunit ring assembly). The resulting quaternary structure forms a ring with a large internal channel, directly matching the conserved architecture of functional small terminases involved in DNA packaging. This was the critical structural evidence supporting functional classification.
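Multimer-capable AlphaFold2 pipelines generally take a homo-oligomer as repeated copies of the same chain. The sketch below prepares a nine-record FASTA of that kind. The exact input format depends on the front end used, which is not specified here, and the sequence shown is an invented placeholder, not the gp65 sequence.

```python
def homo_oligomer_fasta(name: str, sequence: str, copies: int) -> str:
    """Return a multi-record FASTA with one identical record per subunit."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    records = [f">{name}_chain{i}\n{sequence}" for i in range(1, copies + 1)]
    return "\n".join(records) + "\n"


# Invented placeholder; substitute the real gp65 amino acid sequence.
PLACEHOLDER_SEQUENCE = "MKTAYIAKQR"

nonamer_fasta = homo_oligomer_fasta("gp65", PLACEHOLDER_SEQUENCE, copies=9)

# In an ideal nine-fold symmetric ring, neighbouring subunits are related
# by a rotation of 360 / 9 degrees about the channel axis.
rotation_per_subunit = 360 / 9
```

The 40-degree spacing gives a quick geometric check on a predicted ring: subunits that deviate strongly from it suggest the assembly is not a closed nonamer.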


Results


Relevance to engineering & applied research

Structural Modeling

AlphaFold2 structure prediction and ChimeraX visualization are widely used in protein engineering, drug discovery, and biosensor design.

Bioinformatics Pipeline

Genome annotation requires synthesizing outputs across multiple probabilistic tools — the same multi-source data integration central to environmental monitoring and sensing systems.

Phage Therapy

Mycobacteriophages show clinical promise against antibiotic-resistant infections. Understanding DNA packaging proteins like terminases is foundational to engineering phages for therapeutic use.

Scientific Publication

Contributing to peer-reviewed literature as an undergraduate demonstrates the ability to design experiments, interpret ambiguous data, and communicate findings at a professional research standard.


