Consumer Depth Cameras for Computer Vision (eBook)
XVI, 210 pages
Springer London (Publisher)
978-1-4471-4640-7 (ISBN)
Dr. Andrea Fossati and Dr. Helmut Grabner are post-doctoral researchers in the Computer Vision Laboratory at ETH Zurich, Switzerland.
Dr. Juergen Gall is a Senior Researcher at the Max Planck Institute for Intelligent Systems, Tübingen, Germany.
Dr. Xiaofeng Ren is a Research Scientist at the Intel Science and Technology Center for Pervasive Computing, Intel Labs, and an Affiliate Assistant Professor at the Department of Computer Science and Engineering of the University of Washington, Seattle, WA, USA.
Dr. Kurt Konolige is a Senior Researcher at Industrial Perception Inc., Palo Alto, CA, USA.
The launch of Microsoft's Kinect, the first high-resolution depth-sensing camera for the consumer market, generated considerable excitement not only among computer gamers, but also within the global community of computer vision researchers. The potential of consumer depth cameras extends well beyond entertainment and gaming, to real-world commercial applications such as virtual fitting rooms, training for athletes, and assistance for the elderly. This authoritative text/reference reviews the scope and impact of this rapidly growing field, describing the most promising Kinect-based research activities, discussing significant current challenges, and showcasing exciting applications.

Topics and features: presents contributions from an international selection of preeminent authorities in their fields, from both academic and corporate research; addresses the classic multi-view geometry problem of correlating images from different viewpoints to simultaneously estimate camera poses and world points; examines human pose estimation using video-rate depth images for gaming, motion capture, 3D human body scans, and hand pose recognition for sign language parsing; provides a review of approaches to various recognition problems, including category and instance learning of objects, and human activity recognition; includes a Foreword by Dr. Jamie Shotton of Microsoft Research, Cambridge, UK.

This broad-ranging overview is a must-read for researchers and graduate students of computer vision and robotics wishing to learn more about the state of the art of this increasingly "hot" topic.
Consumer Depth Cameras for Computer Vision 3
Foreword 5
Working on Human Pose Estimation for Kinect 6
Beyond Entertainment 7
Looking to the Future 8
Preface 9
Contents 12
Acronyms 14
Part I: 3D Registration and Reconstruction 16
Chapter 1: 3D with Kinect 18
1.1 Introduction 18
1.2 Kinect as a 3D Measuring Device 19
1.2.1 IR Image 20
1.2.2 RGB Image 21
1.2.3 Depth Image 21
1.2.4 Depth Resolution 21
1.3 Kinect Geometrical Model 23
1.3.1 Shift Between IR Image and Depth Image 24
1.3.2 Identification of the IR Projector Geometrical Center 25
1.3.3 Identification of Effective Depth Resolutions of the IR Camera and Projector Stereo Pair 26
1.4 Kinect Calibration 29
1.4.1 Learning Complex Residual Errors 30
1.5 Validation 31
1.5.1 Kinect Depth Models Evaluation on a 3D Calibration Object 34
1.5.2 Comparison of Kinect, SLR Stereo and 3D TOF 35
1.5.3 Combining Kinect and Structure from Motion 36
1.6 Conclusion 39
References 39
Chapter 2: Real-Time RGB-D Mapping and 3-D Modeling on the GPU Using the Random Ball Cover 41
2.1 Introduction 42
2.2 Related Work 43
2.3 Methods 45
2.3.1 Data Preprocessing on the GPU 46
Nomenclature 46
Landmark Extraction 47
2.3.2 Photogeometric ICP Framework 47
2.3.3 6-D Nearest Neighbor Search Using RBC 48
2.4 Implementation Details 50
2.4.1 Details Regarding the ICP Framework 50
2.4.2 RBC Construction and Queries on the GPU 51
RBC Construction 51
RBC Nearest Neighbor Queries 53
2.5 Experiments and Results 53
2.5.1 Qualitative Results 53
2.5.2 Performance Study 56
Preprocessing Pipeline 56
ICP Using RBC 56
2.5.3 Approximate RBC 57
2.6 Discussion and Conclusions 59
References 60
Chapter 3: A Brute Force Approach to Depth Camera Odometry 63
3.1 Introduction 63
3.2 Related Work 64
3.3 Proposed Method 65
3.3.1 Algorithm Overview 66
3.3.2 Practical Issues 67
Feature Extraction 67
Score Evaluation 67
3.3.3 Implementation Details 67
3.4 Experimental Results 68
3.4.1 Qualitative Evaluation 68
3.4.2 Precision Analysis 69
3.4.3 Comparison with the ICP Method 72
3.5 Conclusion and Future Work 73
References 73
Part II: Human Body Analysis 75
Chapter 4: Key Developments in Human Pose Estimation for Kinect 77
4.1 Introduction: The Challenge 77
4.2 Body Part Classification-The Natural Markers Approach 78
4.2.1 Generating the Training Data 79
4.2.2 Randomized Forests for Classification 79
4.3 Random Forest Regression-The Voting Approach 80
4.4 Context-Sensitive Pose Estimation-Conditional Regression Forests 81
4.5 One-Shot Model Fitting: The Vitruvian Manifold 82
4.6 Directions for Future Work 83
References 83
Chapter 5: A Data-Driven Approach for Real-Time Full Body Pose Reconstruction from a Depth Camera 85
5.1 Introduction 86
Contributions 87
5.2 Related Work 88
Intensity-Image-Based Tracking 88
Depth-Image-Based Tracking 88
5.3 Acquisition and Data Preparation 90
5.3.1 Depth Data 90
5.3.2 Model of the Actor 91
5.3.3 Pose Database 92
5.3.4 Normalization 93
5.4 Pose Reconstruction Framework 94
5.4.1 Local Optimization 95
5.4.2 Feature Computation 95
5.4.3 Database Lookup 101
5.4.4 Hypothesis Selection 102
5.5 Experiments 103
5.5.1 Feature Extraction 103
5.5.2 Quantitative Evaluation 103
5.5.3 Run Time 105
5.5.4 Qualitative Evaluation 105
5.5.5 Limitations 108
5.6 Conclusions 109
References 109
Chapter 6: Home 3D Body Scans from a Single Kinect 113
6.1 Introduction 114
6.2 Related Work 116
6.3 Sensor and Preprocessing 117
Intrinsic Calibration 118
Stereo Calibration 118
Depth Calibration 118
Ground Plane 118
Segmentation 118
6.4 Body Model and Fitting 119
6.4.1 SCAPE Body Model 119
6.4.2 Pose Initialization 120
6.4.3 Depth Objective 121
6.4.4 Silhouette Objective 121
6.4.5 Optimization 124
6.5 Results 124
From Bodies to Measurements 127
Accuracy Relative to Laser Scans 127
Linear Measurement Accuracy 129
6.6 Conclusions 129
References 130
Chapter 7: Real Time Hand Pose Estimation Using Depth Sensors 132
7.1 Introduction 132
7.1.1 Related Work 134
7.1.1.1 Hand Pose Estimation 134
7.1.1.2 Hand Shape Recognition from Depth 135
7.2 Methodology 135
7.2.1 Data 136
7.2.2 Decision Trees 137
7.2.3 Randomized Decision Forest for Hand Pose Estimation 138
7.2.4 Joint Position Estimation 140
7.3 Experiments 141
7.3.1 Datasets 141
7.3.1.1 Synthetic Dataset 141
7.3.1.2 Real Dataset 141
7.3.2 Effect of Model Parameters 141
7.3.2.1 The Effect of the Forest Size 142
7.3.2.2 The Effect of the Tree Depth 142
7.3.2.3 The Effect of the Feature Space 142
7.3.2.4 The Effect of the Sample Size 143
7.3.2.5 The Effect of the Mean Shift Parameters 144
7.3.3 Hand Pose Estimation Results 145
7.3.4 Proof of Concept: American Sign Language Digit Recognizer 146
7.3.4.1 Hand Shape Classifiers 147
7.3.4.2 Model Selection on the Synthetic Dataset 147
7.3.4.3 ASL Digit Classification Results on Real Data 147
7.4 Conclusion 148
References 149
Part III: RGB-D Datasets 151
Chapter 8: A Category-Level 3D Object Dataset: Putting the Kinect to Work 152
8.1 Introduction 153
8.2 Related Work 156
8.2.1 3D Datasets for Detection 156
RGBD-Dataset of [23] 156
UBC Visual Robot Survey [3, 20] 156
3D Table Top Object Dataset [28] 156
Solutions in Perception Challenge [2] 156
Max Planck Institute Kinect Dataset [8] 156
Indoor Scene Segmentation Dataset [27] 157
Other Datasets 158
8.2.2 3D and 2D/3D Recognition 158
8.3 The Berkeley 3D Object Dataset 159
8.3.1 Data Annotation 159
8.3.2 The Kinect Sensor 160
8.3.3 Smoothing Depth Images 160
8.3.4 Data Statistics 161
8.4 Detection Baselines 163
8.4.1 Sliding Window Detector 163
8.4.2 Evaluation 164
8.4.3 Pruning and Rescoring by Size 166
8.5 A Histogram of Curvature (HOC) 167
8.5.1 Curvature 168
8.5.2 HOC 168
8.5.3 Experimental Setup and Baselines 172
8.5.4 Results 173
8.6 Discussion 174
References 174
Chapter 9: RGB-D Object Recognition: Features, Algorithms, and a Large Scale Benchmark 177
9.1 Introduction 178
9.2 RGB-D Object Dataset Collection 178
9.3 Segmentation 179
9.4 Video Scene Annotation 182
9.5 RGB-D Object Recognition 184
9.5.1 Experimental Setup 185
9.5.2 Distance Learning for RGB-D Object Recognition 185
9.5.2.1 Instance Distance Learning 186
9.5.2.2 RGB-D Feature Set 186
9.5.2.3 Evaluation 187
9.5.3 Kernel Descriptors for RGB-D Object Recognition 189
9.5.3.1 Kernel Descriptors 189
9.5.3.2 Evaluation 191
9.5.4 Joint Object Category, Instance, and Pose Recognition 192
9.5.4.1 Object-Pose Tree 192
9.5.4.2 Evaluation 193
9.6 Object Detection in Scenes Using RGB-D Cameras 194
9.6.1 RGB-D Object Detection 195
9.6.2 Scene Labeling 198
9.7 Discussion 200
References 200
Chapter 10: RGBD-HuDaAct: A Color-Depth Video Database for Human Daily Activity Recognition 203
10.1 Introduction 203
10.2 Related Works 204
10.3 RGBD-HuDaAct: Color-Depth Human Daily Activity Database 205
10.3.1 Related Video Databases 205
10.3.2 Database Construction 207
10.3.3 Database Statistics 207
10.4 Color-Depth Fusion for Activity Recognition 208
10.4.1 Depth-Layered Multi-channel STIPs (DLMC-STIPs) 209
10.4.2 3-Dimensional Motion History Images (3D-MHIs) 211
10.5 Experimental Evaluations 213
10.5.1 Evaluation Schemes 213
10.5.2 DLMC-STIPs vs. STIPs 214
10.5.3 3D-MHIs vs. MHIs 215
10.6 Conclusions 216
References 217
Index 219
Publication date (per publisher) | 3 October 2012
---|---
Series | Advances in Computer Vision and Pattern Recognition
Additional information | XVI, 210 p. 109 illus., 106 illus. in color.
Place of publication | London
Language | English
Subject areas | Computer Science ► Graphics / Design ► Digital Image Processing
 | Mathematics / Computer Science ► Computer Science ► Software Development
 | Computer Science ► Theory / Studies ► Artificial Intelligence / Robotics
Keywords | 3D Point Cloud • computer vision • Consumer Depth Cameras • Kinect • pattern recognition
ISBN-10 | 1-4471-4640-9 / 1447146409
ISBN-13 | 978-1-4471-4640-7 / 9781447146407
Digital Rights Management: no DRM
This eBook contains no DRM or copy protection. However, passing it on to third parties is not legally permitted, since on purchase you acquire only the rights to personal use.
File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suited to reference books with columns, tables, and figures. A PDF can be displayed on almost all devices, but is only of limited suitability for small displays (smartphone, eReader).
System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You will need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read with (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You will need a PDF viewer, e.g. the free Adobe Digital Editions app.
Buying eBooks from abroad
For tax law reasons, we can only sell eBooks within Germany and Switzerland. Regrettably, we cannot fulfil eBook orders from other countries.