Linux terminal for beginners 251106
                        
This course provides an introduction to using the Linux terminal. It is suitable for complete beginners who have never used the command line before.
                    
            Resources from SIT
                        
Resources from the SIT department, from cluster usage to programming languages.
                    
            Cluster conversion course 2025
                        This course will give a quick revision of essential general concepts for using a cluster well followed by specific examples utilising
                    
            Python for beginners
                        
The course will guide you through setting up a development environment using VS Code together with GitLab and creating a personal Git repository for the course. It then introduces Python programming concepts, including variables, functions, loops, lists, and dictionaries. Along the way we will show you tools that help keep your code clean and enforce proper Python syntax and style. Later, we will cover the basics of file handling in Python, so that you can read and write files and manipulate data. Finally, we will visualize data with the packages matplotlib and seaborn through practical examples.
                    
            Complete cluster course - 250213
                        
Welcome to the "Complete Cluster Course"!
This course will consolidate material presented in the beginner cluster course and expand on the concepts to be aware of when trying to optimize use of the cluster. The main message of the course is to embrace the parallelism available within the cluster: pipelines should be made from lots of small independent pieces that are spread throughout the cluster rather than large, monolithic, long jobs that run on a single node. The course will show why this should be done and how to achieve it. Topics that will be addressed:
Video tour of the data centre 
What is a cluster 
Logging in 
Queuing / the scheduler 
What resources are available at the CRG cluster 
Simple batch scripts - directives 
Troubleshooting - what happened to my jobs? 
Interactive sessions 
Supercomputers, Beowulf clusters, horizontal vs vertical scaling 
Hardware considerations 
Multithreaded jobs, parallelism, Amdahl's Law 
Job arrays 
Job dependencies 
Building a pipeline 
Storage issues, treemap 
Job stats, resource estimation 
Scaling analysis 
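As an illustration of the Amdahl's Law topic listed above, the following is a minimal sketch in Python. It is not taken from the course materials: the function name `amdahl_speedup` and the example value of 95% parallelisable runtime are assumptions chosen for illustration. It computes the theoretical speedup 1 / ((1 - p) + p / n) for a job in which a fraction p of the runtime can be spread over n workers.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Theoretical speedup under Amdahl's Law.

    parallel_fraction: share of the runtime (0.0 to 1.0) that can run in parallel.
    workers: number of cores or nodes the parallel part is spread across.
    Both names are hypothetical, chosen for this sketch only.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unaffected by extra workers; only the parallel part shrinks.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Assumed example: 95% of the job is parallelisable.
    for workers in (1, 4, 16, 64):
        print(f"{workers:>3} workers: {amdahl_speedup(0.95, workers):.2f}x")
```

Under this model the serial fraction caps the achievable speedup (at most 20x for the assumed 95% case, however many workers are added), which is one way to see why the amount of truly independent work matters when scaling a job.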
                    
            Cluster conversion course - 250221
                        
This course gives a quick revision of the essential general concepts for using a cluster well, followed by specific examples utilising Slurm to run jobs on the cluster. 
                    
            Linux terminal for beginners 250210
                        
This course provides an introduction to using the Linux terminal. It is suitable for complete beginners who have never used the command line before.
                    
            NGS courses - Babraham Institute
                        
This package includes a set of short courses:
Quality control in Sequencing Experiments
Analysing Mapped Sequence Data with SeqMonk
RNA-Seq Analysis
10X Single Cell RNA-Seq Analysis
ChIP-Seq Analysis
Analysing bisulfite methylation sequence data
                    
            Complete R courses - Babraham Institute
                        
Course 1. Introduction to Tidyverse
R is a
popular language and environment that allows powerful and fast manipulation of
data, offering many statistical and graphical options. This course aims to
introduce R as a tool for statistics and graphics, with the main aim being to
become comfortable with the R environment. As well as introducing core R
language concepts this course also provides the basics of using the Tidyverse
for data manipulation, and ggplot for plotting. It will focus on entering and
manipulating data in R and producing simple graphs. A few functions for basic
statistics will be briefly introduced, but statistical functions will not be
covered in detail.
Course 2. Advanced Tidyverse
The 'Tidyverse' is a set of add-in R packages for data loading, modelling, manipulation and plotting. It is an attempt to make data analysis and plotting cleaner, simpler and more consistent by addressing some poor design decisions in the original language. This course follows on from our Introduction to R with tidyverse and focusses on the manipulation and restructuring of data using the tidyverse packages. The course shows how to do complex transformations on large data structures and how to deal efficiently with data which is both large and sometimes not well behaved.
Course 3. Introduction to ggplot
This course is normally taught as part of the R with Tidyverse bootcamp. Ggplot is the most popular plotting extension to R and replicates many of the graph types found in the core plotting libraries. This course provides an introduction to the ggplot2 libraries and gives a practical guide for how to use these to create different types of graphs.
Course 4. Introduction to Core R
R is a popular language and environment that allows powerful and fast manipulation of data, offering many statistical and graphical options. This course aims to introduce R as a tool for statistics and graphics, with the main aim being to become comfortable with the R environment. It will focus on entering and manipulating data in R and producing simple graphs. A few functions for basic statistics will be briefly introduced, but statistical functions will not be covered in detail.
Course 5. Advanced Core R
This course follows on from the introductory course. It goes into more detail on practical guides to filtering and combining complex data sets. It also looks at other core R concepts such as looping with apply statements and using packages. Finally, it looks at how to document your R analyses and generate complete analysis reports.
Course 6. Plotting complex figures with Core R
This course is a comprehensive guide to the use of the built-in R plotting functionality to construct everything from customised simple plots to complex multi-layered figures. It follows on from the material in our introductory R course and participants are expected to have a basic understanding of R - enough to load and do basic manipulation of datasets.
Course 7. Introduction to Shiny
Shiny is an R package that enables interactive web applications to be built using R. They are a great way of allowing users to explore a dataset and make use of the graphical and statistical functionality of R without having to write any code.
Course 8. Using R Notebooks
This course is designed for people who are already familiar with R and are ready for a more integrated way to perform and report their analyses. It will show the use of R Notebooks for interactive analysis and then demonstrate how to apply this to the production of complete reports.
Course 9. Writing R Packages
R packages are the best way to create robust re-usable code, either for internal use or for sharing with the wider community. In this course we will look at how to write functions which are robust for use by others. We will then go through the process of authoring function based R packages with the help of the recommended development tools.
Course 10. Using git and GitHub with RStudio
RStudio has embedded tools to facilitate the use of git with RProjects. This short course explores this functionality.
                    
            Python, Perl, Unix, ML - Babraham Institute
                        
Python, Perl, Unix, and ML are considered core bioinformatics skills. Here we provide a package for learning these skills.
Course 1. Python.
Part 1. Introduction to Python. Python has established itself as one of the most commonly used programming languages. It is a very powerful language, which makes it relatively easy to write programs from simple automation scripts to more fully featured applications. In bioinformatics Python has become widely used both as a language to write scripts and applications and, via packages like pandas, numpy and seaborn, as an environment for data analysis, competing with more focussed languages such as R. In this course we focus on the use of Python to develop simple scripts and larger applications. These can be used for simple data processing and aggregation, for automating repeated tasks or to write larger user-facing command line programs. We start from the ground up, and make no assumption of any previous programming experience.
Part 2. Advanced Python. This course builds on the basic features of Python3 introduced in the Introduction to Python course. At the end of this course you should be able to write moderately complicated programs, and be aware of additional resources and wider capabilities of the language to undertake more substantial projects. The course tries to provide a grounding in the basic theory you'll need to write programs in any language as well as an appreciation of the right way to do things in Python.
Part 3. Python: Object Oriented Programming. A strength of Python, and a feature that makes the language attractive to so many, is that it is an object-oriented programming (OOP) language. This is a short course that introduces the basic concepts of OOP. It then goes into more detail explaining how to build and manipulate objects. While this course does not provide an exhaustive discussion of OOP in Python, by the end of the course attendees should be able to build sophisticated objects to aid analysis and research.
Course 2. Introduction to Unix. An increasing amount of bioinformatics work is done in a command-line Unix environment. Most large-scale processing applications are written for Unix and most large-scale compute environments are also based on it. This course provides an introduction to the concepts of Unix and a practical introduction to working in this environment. Internally we link this course to a more specific course illustrating the use of our internal cluster environment, and this part of the course could be adapted for other sites with different compute infrastructure.
Course 3. Learning to Program with Perl. For a long time, Perl has been a popular language among those starting out with programming. Although it is a powerful language, many of its features make it especially suited to first time programmers as it reduces the complexity found in many other languages. Perl is also one of the world's most popular languages which means there are a huge number of resources available to anyone setting out to learn it. This course aims to introduce the basic features of the Perl language. At the end you should have everything you need to write moderately complicated programs, and enough pointers to other resources to get you started on bigger projects. The course tries to provide a grounding in the basic theory you'll need to write programs in any language, as well as an appreciation for the right way to do things in Perl.
Course 4. Introduction to Machine Learning. This course provides a theoretical and practical introduction to the use of machine learning on biological datasets. For the final section of the course we will introduce the tidymodels framework for machine learning in R, so it will be helpful to have attended our introductory and advanced R courses, or to have had equivalent experience, although this is not a prerequisite to attend the course.
                    
            True Image Deconvolution, Restoration and Analysis Workshop 250313
                        
If you are interested in producing high-quality microscopy images and
 obtaining reliable analysis results, this workshop may be of interest 
to you. Topics covered will include diffraction, acquisition pitfalls, 
spherical aberration, photon noise, Point Spread Function, 
Nyquist-Shannon Sampling Rate, Image Quality Control, crosstalk, 
(colocalization) analysis, and deconvolution. In addition, the Huygens 
Software will be demonstrated. 
This course is valid from March 2025 until March 2026.
                    
            An Introduction to Proteomics - Babraham Institute
                        
This course
provides an introduction to the methods, data and analysis of quantitative
proteomics data. It goes through the background of how the data is acquired and
quantitated and the process of searching the spectra against reference
databases to identify them at the spectrum, peptide and protein level. We look
at quality control of search results to identify problems.
Data
analysis is run using the MSstats package, both via the friendly Shiny
interface, and then in more detail using R. Whilst there are no strict
pre-requisites for this course, a familiarity with R and ggplot would be very
helpful.
                    
            An Introduction to Mathematical Modelling - Babraham Institute
                        
This course
was developed in collaboration with the Le Novère lab at The Babraham
Institute. The course is not currently running and is not supported, but we are
leaving course materials here for reference.
It provides
an introduction to the concepts of modelling biological systems. It is intended
for biologists who have no experience in modelling but would like to know how
it might apply to their area of research. The course provides a complete
background to the history of modelling and the different approaches through
which a biological system can be approximated by mathematical methods. The
course also provides a practical introduction to the COPASI modelling environment.
                    
            An Introduction to Biological Big Data - Babraham Institute
                        
This course
provides both a biological and technical introduction to Biological Big Data.
It is divided into three day-long sessions where participants learn about the
available big data resources, what they mean, and how to use them. There
are extensive practicals to give time for people to familiarise themselves with
the sites they are shown.
                    
            Learning Vim (CRG Staff only) 2025
                        
This course introduces vim and provides
resources for jump-starting your vim journey to learn the motions and to start
customising your environment.
 
                    
            Linux containers 2025
                        
This course is designed to teach the basics of everyday Linux Containers
 usage. Participants will learn what Linux Containers are and why they 
are relevant to today's scientific practice. They will learn hands-on 
Docker, the most popular container technology, and by the end of the 
course, they should be able to build simple container images by 
themselves. They will also be introduced to Singularity/Apptainer, a 
more suitable container software for HPC environments.
                    
            Your first Nextflow pipeline 2025
                        
Learn how to write a Nextflow pipeline from scratch
Join our beginner-friendly course on Nextflow, the powerful workflow 
management system for scalable and reproducible data analysis. This 
course covers the fundamentals of Nextflow, from writing your first 
pipeline to running it efficiently. You'll learn how to make your analyses 
reproducible using containers, how to automate complex analyses, and how to 
optimize workflows for high-performance computing (HPC).
                    
            Introduction to Nextflow 2025
                        
The aim of this course is to give a general overview of Nextflow, focusing on the execution, configuration and deployment of local and publicly available pipelines.
                    
            Workflows for reproducible research 250317
                        
This course is designed to teach the fundamental concepts and practical 
guidelines for ensuring that everyday data generation and management 
tasks fit into reproducible scientific workflows. The course emphasizes 
the importance of open data formats, and recommends using Markdown for 
documentation. Participants will also learn how to use GitLab and 
GitHub, two data collaboration platforms, for tracking and managing data
 and documentation across different interfaces such as command line, 
IDEs and web browser. Git's underlying version control capabilities will
 be covered in detail during the hands-on sessions. 
                    
            Quick introduction to programming in R 250327
                        
This course aims to provide basic notions of R programming to people who have NEVER worked with R and who want to learn how to use it for data analysis and visualization.
The Introduction to R course starts from the very basics of the R language and proceeds through creating scripts, reading and writing files, manipulating different data structures and plotting the results, so that by the end of the course you can do some basic analysis and visualization of your own data. We will combine explanations and examples with lots of hands-on exercises that will allow you to get familiar with basic programming concepts and explore the different possibilities that R offers.