Python, Perl, Unix and machine learning (ML) are considered core bioinformatics skills. Here we provide a package of courses for learning these skills.
Course 1. Python.
Part 1. Introduction to Python. Python has established itself as one of the most commonly used programming languages. It is a very powerful language, which makes it relatively easy to write programs ranging from simple automation scripts to more fully featured applications. In bioinformatics Python has become widely used both as a language for writing scripts and applications and, via packages such as pandas, NumPy and seaborn, as an environment for data analysis, competing with more focussed languages such as R. In this course we focus on the use of Python to develop simple scripts and larger applications. These can be used for simple data processing and aggregation, for automating repeated tasks, or for writing larger user-facing command-line programs. We start from the ground up and assume no previous programming experience.
Part 2. Advanced Python. This course builds on the basic features of Python 3 introduced in the Introduction to Python course. At the end of this course you should be able to write moderately complicated programs, and be aware of the additional resources and wider capabilities of the language that you can draw on for more substantial projects. The course tries to provide a grounding in the basic theory you'll need to write programs in any language, as well as an appreciation of the right way to do things in Python.
Part 3. Python: Object Oriented Programming. A strength of Python, and a feature that makes the language attractive to so many, is its support for object-oriented programming (OOP). This short course introduces the basic concepts of OOP, then goes into more detail on how to build and manipulate objects. While the course does not provide an exhaustive discussion of OOP in Python, by the end attendees should be able to build sophisticated objects to aid analysis and research.
Course 2. Introduction to Unix. An increasing amount of bioinformatics work is done in a command-line Unix environment. Most large-scale processing applications are written for Unix, and most large-scale compute environments are also based on it. This course introduces the concepts of Unix and provides a practical introduction to working in this environment. Internally we link this course to a more specific course illustrating the use of our internal cluster environment, and this part of the course could be adapted for other sites with different compute infrastructure.
Course 3. Learning to Program with Perl. For a long time, Perl has been a popular language among those starting out with programming. Although it is a powerful language, many of its features make it especially suited to first-time programmers, as they reduce the complexity found in many other languages. Perl is also one of the world's most popular languages, which means there are a huge number of resources available to anyone setting out to learn it. This course aims to introduce the basic features of the Perl language. At the end you should have everything you need to write moderately complicated programs, and enough pointers to other resources to get you started on bigger projects. The course tries to provide a grounding in the basic theory you'll need to write programs in any language, as well as an appreciation of the right way to do things in Perl.
Course 4. Introduction to Machine Learning. This course provides a theoretical and practical introduction to the use of machine learning on biological datasets. For the final section of the course we will introduce the tidymodels framework for machine learning in R, so it will be helpful to have attended our introductory and advanced R courses, or to have equivalent experience, although this is not a prerequisite for attending the course.

Babraham Institute