
Metagenomic and Bioinformatics Tools for Marine Actinomycetes Genome Mining to Discover Novel Biomolecules

Deepa P Mathew


Microbial secondary metabolites, especially those from marine actinomycetes, are a prominent resource for the discovery of novel biomolecules. The major hurdles in drug discovery from marine actinomycetes are their resistance to cultivation and the lack of optimal bioinformatics tools for mining bacterial genomes for novel biosynthetic genes. Advances in microbial genomics and bioinformatics over the last few decades have accelerated drug discovery from microbes. The present review focuses on the culture-independent metagenomic approach and introduces the bioinformatics tools and platforms available for mining bacterial genomes, especially those of marine actinomycetes, to discover novel biomolecules that can be used for human welfare.


Bioinformatics, Biosynthetic Gene Clusters, Genome mining, Metagenomics, Biomolecules, Metabolites







Informatics Studies:  ISSN: 2583-8994 (Online), 2320-530X (Print)