Development of a deep learning-based pipeline for detection of nosocomial pathogens from metagenomic data
Implementing Organization
Principal Investigator
Dr. Rachana Banerjee
JIS University, Kolkata, West Bengal
Co-Principal Investigator
Dr. Sandip Paul
JIS University, Kolkata, West Bengal-700109
Co-Principal Investigator
Dr. Kausik Basak
JIS University, Kolkata, West Bengal-700109
About
Multidrug-resistant nosocomial infections are a global concern, particularly in India, where overcrowded and unhealthy hospital environments and the unrestrained use of broad-spectrum antibiotics aggravate the problem. Rapid identification of nosocomial strains can improve patient outcomes, make antibiotic therapy more effective, and shorten hospital stays. Current computational tools mostly rely on sequence homology, which frequently fails to reveal novel strains when closely related genomes are missing from the reference database. This proposal aims to develop a deep learning-based approach that extracts nosocomial strain-specific features by training on a broad range of species with known nosocomial characteristics. Common nosocomial bacteria include Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter species. These bacteria are not confined to human hosts; they also occur in other animals, and can be present without causing disease in humans. The success of these nosocomial strains is mainly due to mutations in their virulence factors (VFs) and antibiotic resistance genes (ARGs), which help these superbugs escape antibiotics. Searching for these genes directly in metagenomic data is the most suitable way to obtain unbiased results.
The proposed deep learning algorithm will be used to build a pipeline for fast detection of the pathogenic potential and taxonomic composition of nosocomial strains directly from metagenomic reads. The efficacy of the pipeline will be tested by characterizing publicly available shotgun metagenomic datasets with well-established pathogenic characteristics. The pipeline can then be packaged as user-friendly software for clinicians and scientists.
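As an illustration of what working "directly from metagenomic reads" could involve, the sketch below shows one common way to turn a DNA read into numeric input for a neural network: one-hot encoding. It is only a minimal sketch and not the project's actual method. The function names, the fixed read length of 150 bases, and the handling of ambiguous bases as all-zero rows are assumptions made for this example.

```python
# Hypothetical preprocessing sketch: names, read length, and encoding
# choices are assumptions for illustration, not the proposal's design.

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(read):
    """Return the reverse complement of a read; non-ACGT symbols are kept as-is."""
    return read.upper().translate(COMPLEMENT)[::-1]


def one_hot_encode(read, length=150):
    """Encode a read as a `length` x 4 matrix (columns: A, C, G, T).

    Reads longer than `length` are truncated and shorter reads are
    zero-padded. Ambiguous bases such as N become all-zero rows.
    """
    read = read.upper()[:length]
    matrix = []
    for base in read:
        row = [0.0, 0.0, 0.0, 0.0]
        index = BASES.find(base)
        if index >= 0:
            row[index] = 1.0
        matrix.append(row)
    # Pad short reads so every input has the same shape.
    matrix.extend([0.0, 0.0, 0.0, 0.0] for _ in range(length - len(read)))
    return matrix


if __name__ == "__main__":
    read = "ACGTN"
    encoded = one_hot_encode(read, length=8)
    print(len(encoded), "rows x", len(encoded[0]), "columns")
    print(reverse_complement(read))
```

Because a sequenced read may come from either DNA strand, a classifier of this kind could be given both the read and its reverse complement; whether the proposed pipeline does so is not stated in the summary above.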