Complete R courses - Babraham Institute

Course 1. Introduction to Tidyverse
R is a popular language and environment that allows powerful and fast manipulation of data, offering many statistical and graphical options. This course introduces R as a tool for statistics and graphics, with the main aim being to become comfortable with the R environment. As well as introducing core R language concepts, this course also provides the basics of using the Tidyverse for data manipulation, and ggplot for plotting. It will focus on entering and manipulating data in R and producing simple graphs. A few functions for basic statistics will be briefly introduced, but statistical functions will not be covered in detail.

Course 2. Advanced Tidyverse
The 'Tidyverse' is a set of add-in R packages for data loading, modelling, manipulation and plotting. It is an attempt to make data analysis and plotting cleaner, simpler and more consistent by addressing some poor design decisions in the original language. This course follows on from our Introduction to R with tidyverse and focusses on the manipulation and restructuring of data using the tidyverse packages. The course shows how to do complex transformations on large data structures and how to deal efficiently with data which is both large and sometimes not well behaved.

Course 3. Introduction to ggplot
This course is normally taught as part of the R with Tidyverse bootcamp. Ggplot is the most popular plotting extension to R and replicates many of the graph types found in the core plotting libraries. This course provides an introduction to the ggplot2 library and gives a practical guide to using it to create different types of graphs.

Course 4. Introduction to Core R
R is a popular language and environment that allows powerful and fast manipulation of data, offering many statistical and graphical options. This course introduces R as a tool for statistics and graphics, with the main aim being to become comfortable with the R environment.
It will focus on entering and manipulating data in R and producing simple graphs. A few functions for basic statistics will be briefly introduced, but statistical functions will not be covered in detail.

Course 5. Advanced Core R
This course follows on from the introductory course. It goes into more detail on practical guides to filtering and combining complex data sets. It also looks at other core R concepts such as looping with apply statements and using packages. Finally, it looks at how to document your R analyses and generate complete analysis reports.

Course 6. Plotting complex figures with Core R
This course is a comprehensive guide to the use of the built-in R plotting functionality to construct everything from customised simple plots to complex multi-layered figures. It follows on from the material in our introductory R course, and participants are expected to have a basic understanding of R: enough to load datasets and do basic manipulation of them.

Course 7. Introduction to Shiny
Shiny is an R package that enables interactive web applications to be built using R. These applications are a great way of allowing users to explore a dataset and make use of the graphical and statistical functionality of R without having to write any code.

Course 8. Using R Notebooks
This course is designed for people who are already familiar with R and are ready for a more integrated way to perform and report their analyses. It will show the use of R Notebooks for interactive analysis and then demonstrate how to apply this to the production of complete reports.

Course 9. Writing R Packages
R packages are the best way to create robust, re-usable code, either for internal use or for sharing with the wider community. In this course we will look at how to write functions which are robust for use by others. We will then go through the process of authoring function-based R packages with the help of the recommended development tools.

Course 10. Using git and GitHub with RStudio
RStudio has embedded tools to facilitate the use of git with RProjects. This short course explores this functionality.
Bioinformatics - Biostatistics

Python, Perl, Unix, ML - Babraham Institute

Python, Perl, Unix and ML are considered core bioinformatics skills. Here we provide a package for learning these skills.

Course 1. Python

Part 1. Introduction to Python
Python has established itself as one of the most commonly used programming languages. It is a very powerful language, which makes it relatively easy to write programs from simple automation scripts to more fully featured applications. In bioinformatics, Python has become widely used both as a language to write scripts and applications and, via packages like pandas, numpy and seaborn, as an environment for data analysis, competing with more focussed languages such as R. In this course we focus on the use of Python to develop simple scripts and larger applications. These can be used for simple data processing and aggregation, for automating repeated tasks or to write larger user-facing command line programs. We start from the ground up, and make no assumption of any previous programming experience.

Part 2. Advanced Python
This course builds on the basic features of Python 3 introduced in the Introduction to Python course. At the end of this course you should be able to write moderately complicated programs, and be aware of additional resources and wider capabilities of the language to undertake more substantial projects. The course tries to provide a grounding in the basic theory you'll need to write programs in any language as well as an appreciation of the right way to do things in Python.

Part 3. Python: Object Oriented Programming
A strength of Python, and a feature that makes the language attractive to so many, is that it is an object-oriented programming (OOP) language. This is a short course that introduces the basic concepts of OOP. It then goes into more detail explaining how to build and manipulate objects.
While this course does not provide an exhaustive discussion of OOP in Python, by the end of the course attendees should be able to build sophisticated objects to aid analysis and research.

Course 2. Introduction to Unix
Increasing amounts of bioinformatics work are done in a command line Unix environment. Most large scale processing applications are written for Unix and most large scale compute environments are also based on it. This course provides an introduction to the concepts of Unix and a practical introduction to working in this environment. Internally we link this course to a more specific course illustrating the use of our internal cluster environment, and this part of the course could be adapted for other sites with different compute infrastructure.

Course 3. Learning to Program with Perl
For a long time, Perl has been a popular language among those starting out with programming. Although it is a powerful language, many of its features make it especially suited to first time programmers, as they reduce the complexity found in many other languages. Perl is also one of the world's most popular languages, which means there are a huge number of resources available to anyone setting out to learn it. This course aims to introduce the basic features of the Perl language. At the end you should have everything you need to write moderately complicated programs, and enough pointers to other resources to get you started on bigger projects. The course tries to provide a grounding in the basic theory you'll need to write programs in any language, as well as an appreciation for the right way to do things in Perl.

Course 4. Introduction to Machine Learning
This course provides a theoretical and practical introduction to the use of machine learning on biological datasets.
For the final section of the course we will introduce the tidymodels framework for machine learning in R, so it will be helpful to have attended our introductory and advanced R courses, or to have had equivalent experience, although this is not a prerequisite to attend the course.
Bioinformatics - Biostatistics