Research Software Engineering with Python - Damien Irving, Kate Hertweck, Luke Johnston, Joel Ostblom, Charlotte Wickham

Research Software Engineering with Python

Building software that makes research possible
Book | Hardcover
528 pages
2021
Chapman & Hall/CRC (publisher)
978-0-367-69834-8 (ISBN)
179.95 incl. VAT
Based on the practical experiences of its authors, who collectively have spent several decades teaching software skills to scientists, this book covers everything graduate-level researchers need to automate their workflows, collaborate with colleagues, and ensure that their results are trustworthy.
Writing and running software is now as much a part of science as telescopes and test tubes, but most researchers are never taught how to do either well. As a result, it takes them longer to accomplish simple tasks than it should, and it is harder for them to share their work with others than it needs to be.

This book introduces the concepts, tools, and skills that researchers need to get more done in less time and with less pain. Based on the practical experiences of its authors, who collectively have spent several decades teaching software skills to scientists, it covers everything graduate-level researchers need to automate their workflows, collaborate with colleagues, ensure that their results are trustworthy, and publish what they have built so that others can build on it. The book assumes only a basic knowledge of Python as a starting point, and shows readers how it, the Unix shell, Git, Make, and related tools can give them more time to focus on the research they actually want to do.

Research Software Engineering with Python can be used as the main text in a one-semester course or for self-guided study. A running example shows how to organize a small research project step by step; over a hundred exercises give readers a chance to practice these skills themselves, while a glossary defining over two hundred terms will help readers find their way through the terminology. All of the material can be re-used under a Creative Commons license, and all royalties from sales of the book will be donated to The Carpentries, an organization that teaches foundational coding and data science skills to researchers worldwide.

Dr. Damien B. Irving is a post-doctoral researcher in climate science at the University of New South Wales, living in Hobart, Tasmania. With a strong interest in data science education and open/reproducible research, Damien is involved in The Carpentries community as an instructor, lesson author and Regional Coordinator for Australia, is an Associate Editor with the Journal of Open Research Software, and is currently the Global Coordinator for the Research Bazaar, a worldwide festival promoting the digital literacy emerging at the center of modern research.

Dr. Kate L. Hertweck is a scientist and educator who endeavors to uphold core values like diversity/equity/inclusion, accessibility of information, and learning over knowing. They currently lead training and community efforts to support biomedical researchers at Fred Hutchinson Cancer Research Center in Seattle, Washington. Kate is an instructor and trainer for The Carpentries and has also participated in that group's lesson development/maintenance and community governance.

Dr. Luke Johnston is a diabetes epidemiologist working at the Steno Diabetes Center Aarhus in Denmark. He is passionate about educating researchers on modern computing tools and skills, having taught many Carpentry workshops as well as creating and instructing several intensive courses teaching computing skills and analytic reproducibility to diabetes researchers. When he isn't teaching or doing research, he is building software tools to automate common research workflows and tasks.

Dr. Joel Ostblom is a post-doctoral teaching fellow in the Master's of Data Science program at the University of British Columbia in Vancouver, B.C. He has co-created or led the development of several courses and workshops at the University of Toronto and the University of British Columbia. Joel cares deeply about spreading data literacy and excitement over programmatic data analysis, which is reflected in his contributions to open source projects and data science learning resources.

Dr. Charlotte Wickham is a data scientist and educator who teaches in the Statistics Department at Oregon State University and also operates her own consulting and training business. She loves to help people build their data super powers in the R programming language. She currently lives in Corvallis, Oregon, but originally hails from New Zealand.

Dr. Greg Wilson is a programmer and educator based in Toronto, Ontario, and was the co-founder and first Executive Director of Software Carpentry. A member of the Python Software Foundation, Greg has written or edited over a dozen books and received ACM SIGSOFT's Influential Educator Award in 2020.

Welcome
0.1 The Big Picture
0.2 Intended Audience
0.3 What You Will Learn
0.4 Using this Book
0.5 Contributing and Re-Use
0.6 Acknowledgments

Getting Started
1.1 Project Structure
1.2 Downloading the Data
1.3 Installing the Software
1.4 Summary
1.5 Exercises
1.6 Key Points

The Basics of the Unix Shell
2.1 Exploring Files and Directories
2.2 Moving Around
2.3 Creating New Files and Directories
2.4 Moving Files and Directories
2.5 Copying Files and Directories
2.6 Deleting Files and Directories
2.7 Wildcards
2.8 Reading the Manual
2.9 Summary
2.10 Exercises
2.11 Key Points

Building Tools with the Unix Shell
3.1 Combining Commands
3.2 How Pipes Work
3.3 Repeating Commands on Many Files
3.4 Variable Names
3.5 Redoing Things
3.6 Creating New Filenames Automatically
3.7 Summary
3.8 Exercises
3.9 Key Points

Going Further with the Unix Shell
4.1 Creating New Commands
4.2 Making Scripts More Versatile
4.3 Turning Interactive Work into a Script
4.4 Finding Things in Files
4.5 Finding Files
4.6 Configuring the Shell
4.7 Summary
4.8 Exercises
4.9 Key Points

Building Command-Line Tools with Python
5.1 Programs and Modules
5.2 Handling Command-Line Options
5.3 Documentation
5.4 Counting Words
5.5 Pipelining
5.6 Positional and Optional Arguments
5.7 Collating Results
5.8 Writing Our Own Modules
5.9 Plotting
5.10 Summary
5.11 Exercises
5.12 Key Points

Using Git at the Command Line
6.1 Setting Up
6.2 Creating a New Repository
6.3 Adding Existing Work
6.4 Describing Commits
6.5 Saving and Tracking Changes
6.6 Synchronizing with Other Repositories
6.7 Exploring History
6.8 Restoring Old Versions of Files
6.9 Ignoring Files
6.10 Summary
6.11 Exercises
6.12 Key Points

Going Further with Git
7.1 What’s a Branch?
7.2 Creating a Branch
7.3 What Curve Should We Fit?
7.4 Verifying Zipf’s Law
7.5 Merging
7.6 Handling Conflicts
7.7 A Branch-Based Workflow
7.8 Using Other People’s Work
7.9 Pull Requests
7.10 Handling Conflicts in Pull Requests
7.11 Summary
7.12 Exercises
7.13 Key Points

Working in Teams
8.1 What is a Project?
8.2 Include Everyone
8.3 Establish a Code of Conduct
8.4 Include a License
8.5 Planning
8.6 Bug Reports
8.7 Labeling Issues
8.8 Prioritizing
8.9 Meetings
8.10 Making Decisions
8.11 Make All This Obvious to Newcomers
8.12 Handling Conflict
8.13 Summary
8.14 Exercises
8.15 Key Points

Automating Analyses with Make
9.1 Updating a Single File
9.2 Managing Multiple Files
9.3 Updating Files When Programs Change
9.4 Reducing Repetition in a Makefile
9.5 Automatic Variables
9.6 Generic Rules
9.7 Defining Sets of Files
9.8 Documenting a Makefile
9.9 Automating Entire Analyses
9.10 Summary
9.11 Exercises
9.12 Key Points

Configuring Programs
10.1 Configuration File Formats
10.2 Matplotlib Configuration
10.3 The Global Configuration File
10.4 The User Configuration File
10.5 Adding Command-Line Options
10.6 A Job Control File
10.7 Summary
10.8 Exercises
10.9 Key Points

Testing Software
11.1 Assertions
11.2 Unit Testing
11.3 Testing Frameworks
11.4 Testing Floating-Point Values
11.5 Integration Testing
11.6 Regression Testing
11.7 Test Coverage
11.8 Continuous Integration
11.9 When to Write Tests
11.10 Summary
11.11 Exercises
11.12 Key Points

Handling Errors
12.1 Exceptions
12.2 Writing Useful Error Messages
12.3 Testing Error Handling
12.4 Reporting Errors
12.5 Summary
12.6 Exercises
12.7 Key Points

Tracking Provenance
13.1 Data Provenance
13.2 Code Provenance
13.3 Summary
13.4 Exercises
13.5 Key Points

Creating Packages with Python
14.1 Creating a Python Package
14.2 Virtual Environments
14.3 Installing a Development Package
14.4 What Installation Does
14.5 Distributing Packages
14.6 Documenting Packages
14.7 Software Journals
14.8 Summary
14.9 Exercises
14.10 Key Points

Finale
15.1 Why We Wrote This Book

Appendix
A Solutions
B Learning Objectives
C Key Points
D Project Tree
E Working Remotely
F Writing Readable Code
G Documenting Programs
H YAML
I Anaconda
J Glossary
K References
Index

Publication date
Additional info 55 Line drawings, color; 55 Illustrations, color
Language English
Dimensions 156 x 234 mm
Weight 1143 g
Subject area Mathematics / Computer Science > Computer Science > Software Development
Computer Science > Theory / Studies > Algorithms
ISBN-10 0-367-69834-X / 036769834X
ISBN-13 978-0-367-69834-8 / 9780367698348
Condition New