This course is an introduction to the management and use of data produced by high-throughput sequencing (HTS) technologies. What these technologies have in common is that they produce a high volume of biological data (DNA sequences), which in turn places a high demand on computer resources (i.e. processing and storage) and on tools that are usually not available on personal computers. There is a continuous “arms race” between the speed of DNA data production and the availability of computational resources to store and analyze these data.
Scientists have been developing bioinformatic tools to help with the biological analysis of these HTS data. Bioinformatic needs have therefore evolved along two lines: a) processing and storage, and b) biological analysis. To meet both, several software tools from the UNIX world now work hand in hand with bioinformatic tools to analyze these biological data.
In this course we will learn the basics of using a UNIX tool, BASH, to manage HTS data, launch bioinformatic tools and store results on a remote server, taking advantage of its shared, high-capacity computational resources. Then, we will use these HTS data to assemble DNA sequences in order to analyze genomes and transcriptomes.