The human pathogen Clostridium difficile is responsible for many severe cases of hospital-acquired diarrhoea, and the many mobile genetic elements (MGEs) in the C. difficile genome contribute to its virulence through the transfer of toxin and antibiotic resistance genes. The resulting repeat-rich regions make it difficult to resolve these elements and to generate high-quality draft genomes with short-read NGS technologies.
Using Illumina reads from a previous sequencing project, an assembly strategy was developed to create an improved draft genome of the C. difficile R078 estuarine isolate CD105HS26. The newly SMRT-sequenced genomes of the clinical reference strain M120 and estuarine isolate CD105HS27 (both R078) were used as references to resolve difficult sequence regions. Annotation of the draft assembly identified its MGE content and showed improved resolution of repeat regions and pathogenicity-related genes. CD105HS26 was found to contain both transposons present in M120 and a unique transposon-like element from CD105HS27, which could be partly resolved using reference sequences during assembly.
The CRISPR/Cas system provides adaptive immunity against bacteriophage infections by storing viral sequences as spacers. Spacers from the R078 CRISPR arrays were searched for identical matches to C. difficile phages, genomes and plasmids to determine phage resistance. Non-identical matches were used to assess potential phage evolution, predict the existence of novel phages and establish a potential host range of existing uncharacterised phages. This analysis indicates that the R078 isolates have high resistance to bacteriophage infection and shows that their CRISPR arrays acquire new spacers slowly, suggesting that CRISPR spacer analysis could be used for strain typing.
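The distinction between identical and non-identical spacer matches can be illustrated with a minimal Python sketch. It is not the method used in this work, which is not specified here; a real analysis would more plausibly rely on a dedicated sequence-search tool. The function names, the ungapped sliding-window comparison and the `max_mismatches=3` threshold are all assumptions made for illustration, and sequences are assumed to be uppercase A/C/G/T strings.

```python
# Illustrative sketch only: classify a CRISPR spacer against a target
# sequence (e.g. a phage genome) as an identical or non-identical match.
# All names and the mismatch threshold are hypothetical assumptions.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def best_hit(spacer, target):
    """Find the best ungapped placement of the spacer on either strand.

    Returns (mismatches, position, strand), or None if the target is
    shorter than the spacer.
    """
    best = None
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        for pos in range(len(target) - len(query) + 1):
            window = target[pos:pos + len(query)]
            mismatches = sum(a != b for a, b in zip(query, window))
            if best is None or mismatches < best[0]:
                best = (mismatches, pos, strand)
    return best


def classify_spacer(spacer, target, max_mismatches=3):
    """Label the best hit as 'identical', 'near' or 'none'.

    max_mismatches is an assumed cut-off for a non-identical match.
    """
    hit = best_hit(spacer, target)
    if hit is None:
        return "none"
    if hit[0] == 0:
        return "identical"
    if hit[0] <= max_mismatches:
        return "near"
    return "none"


if __name__ == "__main__":
    spacer = "ACGTTGCAAT"                    # made-up spacer
    phage = "GGG" + "ACGTAGCAAT" + "CCC"     # made-up target, one substitution
    print(classify_spacer(spacer, phage))
```

In this framing, an identical hit corresponds to a spacer that would be expected to confer resistance to the matched phage, while a near hit is the kind of non-identical match used above to reason about phage evolution and uncharacterised phages.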