Open Source Software in Life Science Research

Practical Solutions to Common Challenges in the Pharmaceutical Industry and Beyond

Book | Hardcover

582 pages

2012
Woodhead Publishing Ltd (Publisher)
978-1-907568-97-8 (ISBN)

Open source software in life science research considers how industry and applied research groups have embraced these resources, discussing practical implementations that address real-world business problems. The book is wide in scope, taking its examples from a variety of sectors and scientific areas.

The free/open source approach has grown from a minor activity to become a significant producer of robust, task-orientated software for a wide variety of situations and applications. To life science informatics groups, these systems present an appealing proposition: high-quality software at a very attractive price.

The book is divided into five parts. Part one looks at laboratory data management and chemical informatics, covering software such as Bioclipse, OpenTox, ImageJ and KNIME. In part two, the focus turns to genomics and bioinformatics tools, with chapters examining GenomicTools and EBI Atlas software, as well as the practicalities of setting up an ‘omics’ platform and managing large volumes of data. Chapters in part three examine information and knowledge management, covering a range of topics including software for web-based collaboration, open source search and visualisation technologies for scientific business applications, and specific software such as Design Tracker and Utopia Documents. Part four looks at semantic technologies such as Semantic MediaWiki, TripleMap and Chem2Bio2RDF, before part five examines clinical analytics, and validation and regulatory compliance of free/open source software. Finally, the book concludes by looking at future perspectives and the economics of free/open source software in industry.

Lee Harland is currently leading the information engineering group at Pfizer, a group tasked with developing cutting-edge software that helps scientists use internal and external information more effectively. He is also a leading member of the pharma-industry pre-competitive group, the Pistoia Alliance, and has 13 years’ experience in bioinformatics, software development and information science within major pharma.

Mark Forster is currently a senior information domain specialist within the Syngenta R&D Information Systems (RDIS) group, supporting R&D scientists in the fields of small molecule discovery and development, plant breeding and biotechnology. He has 15 years of industrial experience in scientific software development, deployment and support in the US and the UK.

Dedication

List of figures and tables

Foreword

About the editors

About the contributors

Introduction

Chapter 1: Building research data handling systems with open source tools

Abstract:

1.1 Introduction

1.2 Legacy

1.3 Ambition

1.4 Path chosen

1.5 The ‘ilities

1.6 Overall vision

1.7 Lessons learned

1.8 Implementation

1.9 Who uses LSP today?

1.10 Organisation

1.11 Future aspirations

Chapter 2: Interactive predictive toxicology with Bioclipse and OpenTox

Abstract:

2.1 Introduction

2.2 Basic Bioclipse-OpenTox interaction examples

2.3 Use Case 1: Removing toxicity without interfering with pharmacology

2.4 Use Case 2: Toxicity prediction on compound collections

2.5 Discussion

2.6 Availability

Chapter 3: Utilizing open source software to facilitate communication of chemistry at RSC

Abstract:

3.1 Introduction

3.2 Project Prospect and open ontologies

3.3 ChemSpider

3.4 ChemDraw Digester

3.5 Learn Chemistry Wiki

3.6 Conclusion

3.7 Acknowledgments

Chapter 4: Open source software for mass spectrometry and metabolomics

Abstract:

4.1 Introduction

4.2 A short mass spectrometry primer

4.3 Metabolomics and metabonomics

4.4 Data types

4.5 Metabolomics data processing

4.6 Metabolomics data processing using the open source workflow engine, KNIME

4.7 Open source software for multivariate analysis

4.8 Performing PCA on metabolomics data in R/KNIME

4.9 Other open source packages

4.10 Perspective

4.11 Acknowledgments

Chapter 5: Open source software for image processing and analysis: picture this with ImageJ

Abstract:

5.1 Introduction

5.2 ImageJ

5.3 ImageJ macros: an overview

5.4 Graphical user interface

5.5 Industrial applications of image analysis

5.6 Summary

Chapter 6: Integrated data analysis with KNIME

Abstract:

6.1 The KNIME platform

6.2 The KNIME success story

6.3 Benefits of 'professional open source'

6.4 Application examples

6.5 Conclusion and outlook

6.6 Acknowledgments

Chapter 7: Investigation-Study-Assay, a toolkit for standardizing data capture and sharing

Abstract:

7.1 The growing need for content curation in industry

7.2 The BioSharing initiative: cooperating standards needed

7.3 The ISA framework – principles for progress

7.4 Lessons learned

7.5 Acknowledgments

Chapter 8: GenomicTools: an open source platform for developing high-throughput analytics in genomics

Abstract:

8.1 Introduction

8.2 Data types

8.3 Tools overview

8.4 C++ API for developers

8.5 Case study: a simple ChIP-seq pipeline

8.6 Performance

8.7 Conclusion

8.8 Resources

Chapter 9: Creating an in-house ’omics data portal using EBI Atlas software

Abstract:

9.1 Introduction

9.2 Leveraging ’omics data for drug discovery

9.3 The EBI Atlas software

9.4 Deploying Atlas in the enterprise

9.5 Conclusion and learnings

9.6 Acknowledgments

Chapter 10: Setting up an ’omics platform in a small biotech

Abstract:

10.1 Introduction

10.2 General changes over time

10.3 The hardware solution

10.4 Maintenance of the system

10.5 Backups

10.6 Keeping up-to-date

10.7 Disaster recovery

10.8 Personnel skill sets

10.9 Conclusion

10.10 Acknowledgements

Chapter 11: Squeezing big data into a small organisation

Abstract:

11.1 Introduction

11.2 Our service and its goals

11.3 Manage the data: relieving the burden of data-handling

11.4 Organising the data

11.5 Standardising to your requirements

11.6 Analysing the data: helping users work with their own data

11.7 Helping biologists to stick to the rules

11.8 Running programs

11.9 Helping the user to understand the details

11.10 Summary

Chapter 12: Design Tracker: an easy to use and flexible hypothesis tracking system to aid project team working

Abstract:

12.1 Overview

12.2 Methods

12.3 Technical overview

12.4 Infrastructure

12.5 Review

12.6 Acknowledgements

Chapter 13: Free and open source software for web-based collaboration

Abstract:

13.1 Introduction

13.2 Application of the FLOSS assessment framework

13.3 Conclusion

13.4 Acknowledgements

Chapter 14: Developing scientific business applications using open source search and visualisation technologies

Abstract:

14.1 A changing attitude

14.2 The need to make sense of large amounts of data

14.3 Open source search technologies

14.4 Creating the foundation layer

14.5 Visualisation technologies

14.6 Prefuse visualisation toolkit

14.7 Business applications

14.8 Other applications

14.9 Challenges and future developments

14.10 Reflections

14.11 Thanks and Acknowledgements

Chapter 15: Utopia Documents: transforming how industrial scientists interact with the scientific literature

Abstract:

15.1 Utopia Documents in industry

15.2 Enabling collaboration

15.3 Sharing, while playing by the rules

15.4 History and future of Utopia Documents

Chapter 16: Semantic MediaWiki in applied life science and industry: building an Enterprise Encyclopaedia

Abstract:

16.1 Introduction

16.2 Wiki-based Enterprise Encyclopaedia

16.3 Semantic MediaWiki

16.4 Conclusion and future directions

16.5 Acknowledgements

Chapter 17: Building disease and target knowledge with Semantic MediaWiki

Abstract:

17.1 The Targetpedia

17.2 The Disease Knowledge Workbench (DKWB)

17.3 Conclusion

17.4 Acknowledgements

Chapter 18: Chem2Bio2RDF: a semantic resource for systems chemical biology and drug discovery

Abstract:

18.1 The need for integrated, semantic resources in drug discovery

18.2 The Semantic Web in drug discovery

18.3 Implementation challenges

18.4 Chem2Bio2RDF architecture

18.5 Tools and methodologies that use Chem2Bio2RDF

18.6 Conclusions

Chapter 19: TripleMap: a web-based semantic knowledge discovery and collaboration application for biomedical research

Abstract:

19.1 The challenge of Big Data

19.2 Semantic technologies

19.3 Semantic technologies overview

19.4 The design and features of TripleMap

19.5 TripleMap Generated Entity Master ('GEM') semantic data core

19.6 TripleMap semantic search interface

19.7 TripleMap collaborative, dynamic knowledge maps

19.8 Comparison and integration with third-party systems

19.9 Conclusions

Chapter 20: Extreme scale clinical analytics with open source software

Abstract:

20.1 Introduction

20.2 Interoperability

20.3 Mirth

20.4 Mule ESB

20.5 Unified Medical Language System (UMLS)

20.6 Open source databases

20.7 Analytics

20.8 Final architectural overview

Chapter 21: Validation and regulatory compliance of free/open source software

Abstract:

21.1 Introduction

21.2 The need to validate open source applications

21.3 Who should validate open source software?

21.4 Validation planning

21.5 Risk management and open source software

21.6 Key validation activities

21.7 Ongoing validation and compliance

21.8 Conclusions

Chapter 22: The economics of free/open source software in industry

Abstract:

22.1 Introduction

22.2 Background

22.3 Open source innovation

22.4 Open source software in the pharmaceutical industry

22.5 Open source as a catalyst for pre-competitive collaboration in the pharmaceutical industry

22.6 The Pistoia Alliance Sequence Services Project

22.7 Conclusion

Index

Series	Woodhead Publishing Series in Biomedicine
Place of publication	Cambridge
Language	English
Dimensions	156 x 234 mm
Weight	1030 g
Subject areas	Computer Science ► Office Programs ► Outlook
	Natural Sciences ► Biology
	Natural Sciences ► Chemistry ► Industrial Chemistry
	Engineering
ISBN-10	1-907568-97-2 / 1907568972
ISBN-13	978-1-907568-97-8 / 9781907568978
Condition	New