Building Bioinformatics Solutions: With Perl, R and MySQL

  • 5h 12m
  • Conrad Bessant, Darren Oakley, Ian Shadforth
  • Oxford University Press (US)
  • 2009

Modern bioinformatics encompasses a broad and ever-changing range of activities involved in the management and analysis of data from molecular biology experiments. Despite the diversity of activities and applications, the basic methodology and core tools needed to tackle bioinformatics problems are common to many projects. Building Bioinformatics Solutions provides a comprehensive introduction to this methodology, explaining how to acquire and use the most popular development tools, how to apply them to build processing pipelines, and how to make the results available through visualizations and web-based services for deployment either locally or via the Internet. The main development tools covered in this book are the MySQL database management system, the Perl programming language, and the R language for statistical computing. These industry-standard, open-source tools form the core of many bioinformatics projects, both in academia and industry. The methodologies introduced are platform independent, and all the featured examples have been tested on Windows, Linux and Mac OS.

This advanced textbook is suitable for graduate students and researchers in the life sciences who wish to automate analyses or create their own databases and web-based tools. No prior knowledge of software development is assumed. Having worked through the book, the reader should have the necessary core skills to develop computational solutions for their specific research programmes. The book will also help the reader overcome the inertia associated with penetrating this field, and provide them with the confidence and understanding required to go on to develop more advanced bioinformatics skills.

About the Authors

Conrad Bessant is the founder and head of the Cranfield University Bioinformatics Group, and designed the University's popular MSc in Applied Bioinformatics, which has been running since 2002. Conrad is now primarily focused on research, with collaborators including leading academics and blue chip companies. His research interests include multivariate data analysis, data integration and machine learning, particularly applied to proteomics and metabolomics.

Ian Shadforth holds a degree in Natural Sciences from Cambridge, and an Engineering Doctorate in Bioinformatics from Cranfield University. He is currently a project leader at LifeScan Scotland Ltd, a division of Johnson & Johnson. He is focused on the field of diabetes management, including the provision of new tools and educational programmes as well as applying his research expertise to biomarker discovery and related concepts.

Darren Oakley has a degree in Biochemistry, and a Bioinformatics MSc and PhD from Cranfield University. He is now working as a developer in the High Throughput Gene Targeting group at the Wellcome Trust Sanger Institute, helping to develop and maintain web-based tracking, monitoring and reporting systems for the laboratory.

In this Book

  • Introduction
  • Building biological databases with MySQL
  • Automating processes using Perl
  • Numerical data analysis using R
  • Programming for the Web