Welcome to the "Complete Cluster Course"!

This course consolidates the material presented in the beginner cluster course and expands on the concepts you need to be aware of when optimizing your use of the cluster.

The main message of the course is to embrace the parallelism available within the cluster: pipelines should be built from many small, independent pieces spread across the cluster rather than large, monolithic, long-running jobs that run on a single node. The course will show why this should be done and how to achieve it.

Topics to be addressed:

  • Video tour of the data centre 
  • What is a cluster 
  • Logging in 
  • Queuing / the scheduler 
  • What resources are available on the CRG cluster
  • Simple batch scripts - directives 
  • Troubleshooting - what happened to my jobs? 
  • Interactive sessions 
  • Supercomputers, Beowulf clusters, horizontal vs. vertical scaling
  • Hardware considerations 
  • Multithreaded jobs, parallelism, Amdahl's Law 
  • Job arrays 
  • Job dependencies 
  • Building a pipeline 
  • Storage issues, treemap 
  • Job stats, resource estimation 
  • Scaling analysis 

Number of course hours: 12h
Dates: 13th, 17th, 20th & 25th February 2025
Level: Medium
Topics Covered:
  • What is a cluster and logging in
  • Queuing / the scheduler 
  • What resources are available on the CRG cluster
  • Simple batch scripts - directives and troubleshooting
  • Supercomputers, Beowulf clusters, horizontal vs. vertical scaling
  • Hardware considerations 
  • Multithreaded jobs, parallelism, Amdahl's Law 
  • Job arrays and job dependencies
  • Building a pipeline / Storage issues, treemap 
  • Job stats, resource estimation / Scaling analysis 
