High Performance Python

Practical Performant Programming for Humans

Micha Gorelick, Ian Ozsvald (Authors)

Book | Softcover

370 pages

2014
O'Reilly Media (Publisher)
978-1-4493-6159-4 (ISBN)

Title is unfortunately out of print;
no new edition

Your Python code may run correctly, but you need it to run faster. By exploring the fundamental theory behind design choices, this practical guide helps you gain a deeper understanding of Python's implementation.

You'll learn how to locate performance bottlenecks and significantly speed up your code in high-data-volume programs.

How can you take advantage of multi-core architectures or clusters?
Or build a system that can scale up and down without losing reliability?

Experienced Python programmers will learn concrete solutions to these and other issues, along with war stories from companies that use high performance Python for social media analytics, productionized machine learning, and other situations.

Topics included:

Get a better grasp of numpy, Cython, and profilers
Learn how Python abstracts the underlying computer architecture
Use profiling to find bottlenecks in CPU time and memory usage
Write efficient programs by choosing appropriate data structures
Speed up matrix and vector computations
Use tools to compile Python down to machine code
Manage multiple I/O and computational operations concurrently
Convert multiprocessing code to run on a local or remote cluster
Solve large problems while using less RAM

Micha Gorelick was the first man on Mars in 2023 and won the Nobel Prize in 2046 for his contributions to time travel. In a moment of rage after seeing the deplorable uses of his new technology, he traveled back in time to 2012 and convinced himself to leave his Physics PhD program and follow his love of data at Bitly. A monument celebrating his life can be found in Central Park, 1857.

Ian Ozsvald is a data scientist and teacher at ModelInsight.io with over ten years of Python experience. He has taught high performance Python at the PyCon and PyData conferences and has been consulting on data science and high performance computing for years in the UK.

Chapter 1: Understanding Performant Python
The Fundamental Computer System
Putting the Fundamental Elements Together
So Why Use Python?
Chapter 2: Profiling to Find Bottlenecks
Profiling Efficiently
Introducing the Julia Set
Calculating the Full Julia Set
Simple Approaches to Timing—print and a Decorator
Simple Timing Using the Unix time Command
Using the cProfile Module
Using runsnakerun to Visualize cProfile Output
Using line_profiler for Line-by-Line Measurements
Using memory_profiler to Diagnose Memory Usage
Inspecting Objects on the Heap with heapy
Using dowser for Live Graphing of Instantiated Variables
Using the dis Module to Examine CPython Bytecode
Unit Testing During Optimization to Maintain Correctness
Strategies to Profile Your Code Successfully
Wrap-Up
Chapter 3: Lists and Tuples
A More Efficient Search
Lists Versus Tuples
Wrap-Up
Chapter 4: Dictionaries and Sets
How Do Dictionaries and Sets Work?
Dictionaries and Namespaces
Wrap-Up
Chapter 5: Iterators and Generators
Iterators for Infinite Series
Lazy Generator Evaluation
Wrap-Up
Chapter 6: Matrix and Vector Computation
Introduction to the Problem
Aren’t Python Lists Good Enough?
Memory Fragmentation
Applying numpy to the Diffusion Problem
numexpr: Making In-Place Operations Faster and Easier
A Cautionary Tale: Verify “Optimizations” (scipy)
Wrap-Up
Chapter 7: Compiling to C
What Sort of Speed Gains Are Possible?
JIT Versus AOT Compilers
Why Does Type Information Help the Code Run Faster?
Using a C Compiler
Reviewing the Julia Set Example
Cython
Shed Skin
Cython and numpy
Numba
Pythran
PyPy
When to Use Each Technology
Foreign Function Interfaces
Wrap-Up
Chapter 8: Concurrency
Introduction to Asynchronous Programming
Serial Crawler
gevent
tornado
AsyncIO
Database Example
Wrap-Up
Chapter 9: The multiprocessing Module
An Overview of the Multiprocessing Module
Estimating Pi Using the Monte Carlo Method
Estimating Pi Using Processes and Threads
Finding Prime Numbers
Verifying Primes Using Interprocess Communication
Sharing numpy Data with multiprocessing
Synchronizing File and Variable Access
Wrap-Up
Chapter 10: Clusters and Job Queues
Benefits of Clustering
Drawbacks of Clustering
Common Cluster Designs
How to Start a Clustered Solution
Ways to Avoid Pain When Using Clusters
Three Clustering Solutions
NSQ for Robust Production Clustering
Other Clustering Tools to Look At
Wrap-Up
Chapter 11: Using Less RAM
Objects for Primitives Are Expensive
Understanding the RAM Used in a Collection
Bytes Versus Unicode
Efficiently Storing Lots of Text in RAM
Tips for Using Less RAM
Probabilistic Data Structures
Chapter 12: Lessons from the Field
Adaptive Lab’s Social Media Analytics (SoMA)
Making Deep Learning Fly with RadimRehurek.com
Large-Scale Productionized Machine Learning at Lyst.com
Large-Scale Social Media Analysis at Smesh
PyPy for Successful Web and Data Processing Systems
Task Queues at Lanyrd.com

Publication date (per publisher)	5 September 2014
Place of publication	Sebastopol
Language	English
Dimensions	178 x 233 mm
Weight	590 g
Binding	Paperback
Subject areas	Computer Science ► Databases ► Data Warehouse / Data Mining
	Computer Science ► Programming Languages / Tools ► Python
	Mathematics / Computer Science ► Computer Science ► Web / Internet
Keywords	Performance • Python
ISBN-10	1-4493-6159-5 / 1449361595
ISBN-13	978-1-4493-6159-4 / 9781449361594
Condition	New