[scholarship] [info] Bioinformatics analysis of complex DNA-sequencing data to assess microbial diversity in geological radioactive waste disposal environments (PhD)


Recent developments in high-throughput technologies have revolutionized molecular biology. This technological progress has led to explosive growth in biological information (e.g. via DNA sequencing, RNA microarrays, proteomics), creating new opportunities in the field of bioinformatics to deal computationally with the dramatic increase in data. More specifically, the recent availability of next-generation DNA-sequencing methods has opened up new horizons in microbial community analysis. Such ‘metagenomics’ applications allow the simultaneous high-throughput analysis of the genetic material of most of the microbes present in a given sample, without the need to culture the bacteria first. Over the last few years, this approach has been used to reveal the complex microbial diversity in a wide range of previously unexplored environmental samples (e.g. oceans, the human gut), and it is of particular interest for environments containing microorganisms that are difficult to culture under laboratory conditions. Deep subsurface geological clay formations, which have been selected and are currently being studied as candidate host rocks for geological radioactive waste disposal (HADES, Boom clay, Belgium; and Mont Terri, Opalinus clay, Switzerland), clearly belong to this type of environment.
However, these high-throughput sequencing methods produce a massive amount of data. For example, a single run on Illumina's HiSeq2000 sequencing platform today produces gigabytes of DNA-sequencing data. Moreover, the domain of next-generation sequencing data analysis is still maturing, and even more powerful sequencing methods are being developed. Therefore, in addition to important challenges on the molecular biology side (e.g. extracting DNA of sufficient quantity and quality from environmental samples), any metagenomic project will require reliable, high-performance bioinformatics methods to extract the useful biological information from these data. Because the suitability of existing bioinformatics algorithms for the analysis of this new type of large-scale community data remains unknown (Caporaso et al., 2011), benchmarking and eventually adapting existing open-source algorithms (Schloss et al., 2009; Caporaso et al., 2010; Zhou et al., 2011) will be needed.


The objective of this PhD research is the development of a bioinformatics pipeline able to determine the microbial diversity in environmental samples based on next-generation DNA-sequencing technologies.
As a proof of concept, the bioinformatics algorithms developed in this research will be used to assess the presence of microbial activity in deep subsurface geological clay formations in the context of radioactive waste disposal (Boom and Opalinus clay). Preliminary results from our ongoing projects (e.g. the AWM PostDoc of Katinka Wouters at SCK•CEN, and the Bitumen-Nitrate experiment in Mont Terri) indicate that complex microbial communities are present and active in the Boom and Opalinus clay, but that a significant fraction of these communities consists of ‘novel’ bacteria (not yet cultivated, characterized, or described), whose survival strategies in these clay layers and metabolic properties are unknown. We therefore aim to explore these environments further with cultivation-based and DNA-based technologies. First, a methodology focusing only on the 16S rRNA genes, which are genetic markers for microbe identification and taxonomic classification, will be developed. Second, the complete gene pool of the full microbial community will be analyzed to assess the metabolic potential of the microbial population in the clay layers. The bioinformatics analysis of DNA-sequencing data obtained from this kind of environment is challenging. As these data sets can no longer be processed with conventional bioinformatics tools, the adaptation of existing algorithms, and potentially the development of novel ones, will be needed.

Required education level of potential candidates: Master in engineering sciences, Master in sciences
Candidates must have a background in: Biology, Informatics

