Analytics and Algorithms for Omics Data

for Life Sciences

Advanced Omics for Life Sciences

Aims

At the end of the course the student:

can read and understand a paper from the current computational and systems biology literature,
can identify the parts of a paper that concern data generation and the algorithms used to analyse these data, and criticise the computational approaches taken,
can list and describe several high-throughput data types and computer algorithms to analyse these data, and motivate why a certain algorithm is suitable for the analysis of a certain data type,
can apply the algorithms discussed in this course to toy problems, and derive and design adaptations of these algorithms for new data types,
can draw biologically meaningful conclusions from results obtained with an analysis algorithm,
understands and can explain the basics of unsupervised machine learning (ML) and the specifics of k-means, hierarchical and spectral clustering,
understands and can explain the basics of supervised machine learning, including concepts such as cross-validation and overtraining, and the specifics of probabilistic, k-nearest-neighbour (k-NN) and random forest classifiers,
understands and can explain the basics of dimension reduction and the specifics of PCA, NMF and t-SNE,
understands and can explain the basics of Hidden Markov Models and their application to (epi)genomic data,
understands and can explain the basics of sequence analysis and alignment and the specifics of dynamic programming, variant calling and modern next-generation sequencing analysis.

Content

Lecturer(s):
Name, faculty/department, participation (%) in course
Dr. Jeroen de Ridder, UMC Utrecht, 60%
Dr. Alexander Schoenhuth, Utrecht University, 40%

Extended course description (for Osiris):
Bioinformatics is at the heart of much of modern genomics research, and encompasses the application of statistics and computer science to (large-scale) biomolecular datasets. In essence, bioinformatics is about smart ways of extracting knowledge from the enormous amounts of data that can be generated using modern measurement techniques. For instance, it plays an important role in finding the genetic origins of various diseases, such as cancer, diabetes and Alzheimer's disease.

In this course we will study some key examples of bioinformatics analyses, i.e. data analytics and computational algorithms, by reading a set of selected papers that present significant biological conclusions. Instead of the teachers lecturing on the methodologies, the students are encouraged to read, study and comprehend the available course material themselves. Some lectures will be provided to ensure that the basic concepts are clear.

Schedule: The course runs for five days, from 9.00 until approximately 17.00. Each day starts with a lecture, followed by two rounds of paper discussions that go into depth on the computational approaches taken.

Content:

Unsupervised learning, hierarchical and k-means clustering, spectral clustering
Supervised learning, cross-validation, overtraining, Bayes classifier, random forest classifier
Dimension reduction, PCA, NMF, t-SNE
Hidden Markov Models, forward-backward algorithm, Viterbi algorithm
Sequence alignment, Dynamic programming
Read mapping techniques
Sequence data indexes, such as Burrows-Wheeler Transform
Genome assembly basics, de Bruijn graphs, overlap graphs
Hash-based techniques, for example for overlap detection
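
To give a flavour of the kind of toy problem the aims refer to, the Burrows-Wheeler Transform from the list above can be sketched in a few lines of Python. This is a minimal illustration and not part of the provided course materials: the function names `bwt` and `inverse_bwt` and the use of `$` as end-of-text sentinel are assumptions made for this example, and the naive sorted-rotations approach is only suitable for very short strings.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler Transform via sorted rotations (toy sizes only)."""
    if sentinel in text:
        raise ValueError("text must not contain the sentinel character")
    s = text + sentinel
    # Build every cyclic rotation of s and sort them lexicographically.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation table.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Invert the transform by repeatedly prepending the last column and sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(c + row for c, row in zip(last_column, table))
    # The original text is the rotation that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)               # annb$aa
    print(inverse_bwt(transformed))  # banana
```

Practical read mappers do not sort rotations explicitly; they derive the transform from a suffix array and add auxiliary tables for fast lookup, which is beyond the scope of this sketch.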

Literature/study material used:
Provided course materials (slides) will be made available through our online learning platform: elearning.ubc.uu.nl

Mandatory for students in own Master’s programme:
No

Optional for students in other GSLS Master’s programmes:
Yes

Prerequisite knowledge:
Basic knowledge of Linear Algebra and Statistics.

Registration:

Please register online on the CS&D website: www.CSnD.nl/courses.
Bioinformatics Profile students will have priority when this course is followed as a part of their profile.
Thereafter, registration is on a first-come, first-served basis until the maximum number of 20 participants is reached.

Consultancy service

For expert advice on your research and the possibilities UBEC offers to get the most out of your data, request a meeting by emailing us. Experimental design, data management, bioinformatics analysis, results and follow-up experiments are discussed. The facility manager ensures that experts from participating organizations are present during this meeting.

You can request a meeting at bec@umcutrecht.nl.