Introduction to Python
Python has established itself as one of the most commonly used programming
languages. It is a very powerful language, which makes it relatively easy to
write programs ranging from simple automation scripts to more fully featured
applications. In bioinformatics, Python has become widely used both as a
language for writing scripts and applications and, via packages like pandas,
numpy and seaborn, as an environment for data analysis, competing with more
focussed languages such as R. In this course we focus on the use of Python to
develop simple scripts and larger applications. These can be used for simple
data processing and aggregation, for automating repeated tasks or for writing
larger user-facing command line programs. We start from the ground up and
assume no previous programming experience.
Advanced Python
In recent years, the programming language Python has become ever more popular
in the bioinformatics and computational biology communities and, indeed,
learning this language marks many people's first introduction to writing code.
This success of Python is due to a number of factors. Perhaps most importantly
for a beginner, Python is relatively easy to use, being what we term a
"high-level" programming language. Don't let this terminology confuse you,
however: "high-level" simply means that many of the computational tasks are
managed for you, enabling you to write shorter and simpler code to get your
jobs done.
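As a small illustration of what "high-level" means in practice, the sketch
below counts the bases in a short DNA sequence and reports its GC content. The
sequence and variable names are made up for this example; the point is only
that Python's built-in collections.Counter takes care of bookkeeping that a
lower-level language would leave to you.

```python
from collections import Counter

# Hypothetical example sequence, not taken from any real dataset.
sequence = "ATGCGATACGCTTGA"

# Counter tallies each character; no manual loops or memory management needed.
base_counts = Counter(sequence)

# GC content as a fraction of the sequence length.
gc_content = (base_counts["G"] + base_counts["C"]) / len(sequence)

print(base_counts.most_common())
print(f"GC content: {gc_content:.2f}")
```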
This course builds on the basic features of
Python 3 introduced in the Introduction to Python course. At the end of this
course you should be able to write moderately complicated programs, and be
aware of additional resources and wider capabilities of the language to
undertake more substantial projects. The course tries to provide a grounding in
the basic theory you'll need to write programs in any language as well as an
appreciation of the right way to do things in Python.
Python: Object Oriented Programming
A strength of Python, and a feature that makes this language attractive to so
many, is that Python is what is known as an object-oriented programming (OOP)
language.
This is a short course that introduces the basic
concepts of OOP. It then goes into more detail explaining how to build and
manipulate objects. While this course does not provide an exhaustive discussion
of OOP in Python, by the end of the course attendees should be able to build sophisticated
objects to aid analysis and research.
Introduction to Unix
An increasing amount of bioinformatics work is done in a command line Unix
environment. Most large scale processing applications are written for Unix and
most large scale compute environments are also based on it.
This course introduces the concepts of Unix and provides a practical
introduction to working in this environment. Internally, we link this course to
a more specific course illustrating the use of our internal cluster
environment; that part of the course could be adapted for other sites with
different compute infrastructure.
Learning to Program with Perl
For a long time, Perl has been a popular language among those starting out
with programming. Although it is a powerful language, many of its features
make it especially suited to first-time programmers, as they reduce the
complexity found in many other languages. Perl is also one of the world's most
popular languages, which means there are a huge number of resources available
to anyone setting out to learn it.
This course
aims to introduce the basic features of the Perl language. At the end you
should have everything you need to write moderately complicated programs, and
enough pointers to other resources to get you started on bigger projects. The
course tries to provide a grounding in the basic theory you'll need to write
programs in any language, as well as an appreciation for the right way to do
things in Perl.
Introduction to Machine Learning
This course provides a theoretical and practical
introduction to the use of machine learning on biological datasets. For the
final section of the course we will introduce the tidymodels framework for machine
learning in R, so it will be helpful to have attended our introductory and
advanced R courses, or to have had equivalent experience, although this is not
a prerequisite to attend the course.