Pentium(TM) Processor -  Michael L. Schmit

Pentium(TM) Processor (eBook)

Optimization Tools
eBook Download: PDF | EPUB
2014 | 1st edition
406 pages
Elsevier Science (publisher)
978-1-4832-1425-2 (ISBN)
54.95 incl. VAT
  • Download available immediately
Pentium Processor: Optimization Tools covers advanced program optimization techniques for the Intel 80x86 family of chips, including the Pentium. The book opens with a review and historical context, then discusses the 80x86 family background, the Pentium and its tools, and superscalar Pentium programming. It also covers the operation of the floating-point unit, techniques for including assembly language routines in C or C++ programs, and protected-mode programming. The final chapters tackle optimizations and code alignment, as well as the background and technical capabilities of the PowerPC versus the Pentium and their future technical directions. Computer programmers and students taking related courses will find the book invaluable.

Front Cover
Pentium™ Processor: Optimization Tools
Copyright Page
Table of Contents
Dedication
Introduction
WHO IS THIS BOOK FOR?
WHAT IS ON THE DISK?
WHY LEARN ASSEMBLY LANGUAGE FOR THE PENTIUM?
HOW TO PROCEED
ACKNOWLEDGMENTS
SECTION I: Review and Historical Context
CHAPTER 1. Number Systems
HEX
SIGNED NUMBERS
NUMERIC OVERFLOW
DATA SIZES
LITTLE ENDIAN VS. BIG ENDIAN
CHAPTER 2. What Is Assembly?
INTRODUCTION
Tools and Terminology
WHAT ARE COMPILERS, INTERPRETERS AND ASSEMBLERS?
CHAPTER 3. The 8086 Family History and Architecture
COMPATIBILITY LESSONS
MATH CO-PROCESSORS
THE 80286
32-BIT 80386
RISCY 80486
THE 80586
THE COMPETITION
THE P6
SECTION II: 80x86 Family Background
CHAPTER 4. 8086 Architecture and Instruction Set
8088 Architecture
The 8088 Instruction Set
Shifts and Rotates
Program Control And Branching
Flag Manipulations
Multiply and Divide
BCD Instructions
String Instructions
Interrupts
Miscellaneous Instructions
Flag Summary
CHAPTER 5. Writing Beginning Programs
ASSEMBLER DIRECTIVES
WHAT DO ALL THOSE STATEMENTS MEAN?
LABELS AND IDENTIFIERS
PROCEDURES
@DATA
DEFINING DATA ITEMS
USING DOS SYSTEM FUNCTIONS
THE END DIRECTIVE
MEMORY MODELS
CHAPTER 6. Assembly Tools
EDITING
ASSEMBLING
LINKING
DEBUGGING
DEBUG32
CHAPTER 7. The Instruction Set Evolves: The 186 to the 386
THE LOST BROTHER, THE 80186
The 80286
The 80386
NEW 386 ADDRESSING MODES
NEW 386 INSTRUCTIONS
PROTECTED MODE
SECTION III: Introduction to Pentium and Tools
CHAPTER 8. The 80486 and Pentium
Pentium
BIGGER CACHE
NEW PENTIUM INSTRUCTIONS
Summary
CHAPTER 9. Superscalar Programming
Dual Integer Pipelines
Branch Prediction Logic
Optimized Cycle Times
CHAPTER 10. Integer and Floating-Point Pipeline Operation
INSTRUCTION FETCHING
THE MEMORY CACHE
PIPELINES
ADDRESS GENERATION INTERLOCK (AGI)
PAIRED PIPELINES
486 PIPELINE DELAYS
PENTIUM PIPELINE DELAYS
Pentium Floating-Point Pipeline
FPU PIPELINE DELAYS
CONCURRENT INTEGER AND FPU PROCESSING
CHAPTER 11. Using the Pentium Optimizer Program
HOW IT WORKS
ADDRESS GENERATION INTERLOCKS
CHAPTER 12. Timing with a Software Timer
ICE
BUILT-IN PENTIUM TIMER
SOFTWARE TIMER
TIMER SOFTWARE FUNCTION REFERENCE
Percent Speed Changes
SECTION IV: Superscalar Pentium Programming
CHAPTER 13. Optimization Warm-ups
STRING INSTRUCTION OPTIMIZATIONS
CHAPTER 14. String Search and Translate
STRING SEARCH
String Translations
Atomic Programming
CODING CHALLENGE
REALITY CHECK
Case-Independent String Searching
CASE-INDEPENDENT STRING SCAN
CASE-INDEPENDENT STRING COMPARE
CONCLUSIONS
CHAPTER 15. Checksums and Extended Precision Addition
STEP 1
STEP 2
STEP 3
STEP 4
STEP 5
STEP 6
COMING COMPLETELY UNDONE
SUMMARY
FALSE STEPS
Extended Precision Addition
SECTION V: Advanced Topics
CHAPTER 16. Floating-Point Math
FPU BASICS
FPU MATRIX OPTIMIZATIONS
WHICH ARRAY DECLARATION IS BEST?
OPTIMIZING WITH ASSEMBLY
CHAPTER 17. Interfacing to C
INLINE ASSEMBLY
INLINE ASSEMBLY EXAMPLE
LINKING SEPARATE MODULES
CALLING CONVENTIONS
FULL C-TO-ASSEMBLY TEMPLATES
EXAMPLES OF CALLING ASSEMBLY ROUTINES FROM C
USING THE EXTENDED PROC DIRECTIVE
FASTCALL
FASTCALL REGISTERS
TIMING C CODE
CHAPTER 18. Protected-Mode Programming
INTRODUCTION TO PROTECTED MODE
DPMI, DOS PROTECTED-MODE INTERFACE
PROTECTED-MODE SEGMENTS
CONVERTING CODE TO PROTECTED MODE
MIXED 16-BIT AND 32-BIT PROTECTED-MODE PROGRAMMING
FULL SEGMENT DEFINITIONS
PROTECTED MODE TIMING
32-BIT PROTECTED-MODE CODE TEMPLATE
LARGE DATA SEGMENTS
TIMING 32-BIT CODE
Cloaking Developers Toolkit
CHAPTER 19. Final Notes and Optimizations
SPEED VS. CODE SIZE
LEA, THE MULTI-PURPOSE INSTRUCTION
CODE AND DATA ALIGNMENT
LOCAL STACK VARIABLES
MEASURING AND CORRECTING THE DATA MISALIGNMENT PENALTY
CODE ALIGNMENT
Further Reading
Where We've Been
SECTION VI: PowerPC vs. Pentium
CHAPTER 20. PowerPC vs. Pentium
WHAT IS RISC?
WHAT IS CISC?
WHAT IS RISC, REALLY?
WHICH IS BETTER, RISC OR CISC?
IS THE PENTIUM RISC OR CISC?
SUPERSCALAR PROCESSORS
SUPERSCALAR TECHNIQUES AND TERMINOLOGY
WHAT IS IN THE POWERPC?
IS THE POWERPC LESS EXPENSIVE?
FUTURE PROCESSOR DESIGNS
APPENDICES
APPENDIX A: Instruction Set Reference
APPENDIX B: Optimization Cross-Reference by Instruction
APPENDIX C: Optimization Guidelines by CPU
APPENDIX D: Simple Instructions for Pentium Pairing
APPENDIX E: Instruction Pairing Rules for Pentium
APPENDIX F: Single-Byte Instructions
APPENDIX G: Quick Reference for Important Instruction Timings
APPENDIX H: Undocumented Pentium Registers
APPENDIX I: DEBUG32 Command Summary
APPENDIX J: Improving Performance
APPENDIX K: Glossary of Terms
APPENDIX L: Products Mentioned
Index

CHAPTER 1

Number Systems


Publisher Summary


This chapter reviews binary, hexadecimal, and decimal number systems. Decimal numbers are used for money, time, measurements, and even television channels. Everything is based on decimal except the internals of computers and other electronic devices. The binary number system is used internally in every computer. Binary or base two has two digits, 0 and 1. Decimal or base 10 has 10 digits, 0 through 9. Computers use binary because the electronic circuits can have only two states, on or off. Different devices may use different physical properties; a magnetic disk may store binary digits as magnetized or not magnetized or as north or south but the effect is the same—on or off. Decimal numbers are formed by combining a number of digits.

“It’s a poor sort of memory that only works backwards,” the Queen remarked.

Lewis Carroll, from Through the Looking-Glass

In this chapter we’ll review binary, hexadecimal and decimal number systems. If you have a working knowledge of binary and hexadecimal, then skip to Chapter 2. If you’ve never programmed a computer or used a higher-level language such as C, BASIC or Pascal, you may be familiar only with the concept of decimal (base 10) numbers. We all grew up with decimal numbers—for money, time, measurements and even television channels. Everything is based on decimal—except the internals of computers and other electronic devices. Decimal is easy for us because we grew up with it. And, of course, we have 10 fingers.

The binary number system is used internally in every computer. Binary, or base two, has two digits, 0 and 1. Decimal, or base 10, has 10 digits, 0 through 9. Computers use binary because the electronic circuits can have only two states, “on” or “off.” Different devices may use different physical properties (a magnetic disk may store binary digits as magnetized or not magnetized or as north or south) but the effect is the same–on or off.

To become familiar with binary, we’ll start by looking at whole integer decimal numbers (i.e., 0, 1, 2, 3 …). We form decimal numbers by combining a number of digits. Each digit has two factors that are multiplied together. The first factor is the digit (0 through 9).


Figure 1.1 Example Decimal Number

The second factor varies based on the digit's position within the whole number. The far right digit has a multiplier of 1. The next digit to the left has a multiplier of 10, the next 100, and so on. Moving one position to the left increases the multiplier by a factor of 10, which is what makes these numbers base 10.

For example, the number 3406 breaks down as follows (positions are counted from the right, starting at 0):

  6 ×    1 =    6    (position 0)
+ 0 ×   10 =    0    (position 1)
+ 4 ×  100 =  400    (position 2)
+ 3 × 1000 = 3000    (position 3)
             ----
             3406

Of course, we already knew the value of 3406 was 3406. The real issue is how to convert numbers from one base to another. This same process can be performed for numbers in any base. Also notice that the multiplier is the base raised to the power of the position. In the example above, the 4 is multiplied by 100, and 100 is 10^2 (10 raised to the power of 2).
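
The positional expansion described above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the book; the function name `positional_value` and its list-of-digits input are assumptions chosen for this example.

```python
def positional_value(digits, base):
    """Sum digit * base**position, with position 0 at the far right."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if not 0 <= digit < base:
            raise ValueError(f"digit {digit} is not valid in base {base}")
        total += digit * base ** position
    return total


# The worked example from the text: 3406 in base 10.
print(positional_value([3, 4, 0, 6], 10))  # 3406
```

Passing a different `base` applies the same rule to any other number system, which is the point the paragraph makes.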

Each digit in a binary number is called a bit, short for binary digit. So when we have a binary number, say with 4 bits, the value of each bit is progressively larger. The first bit (bit 0) has a value of 1, or 2^0. The next bit has a value of 2, or 2^1. The next bit has a value of 4, or 2^2. The last bit has a value of 8, or 2^3. Each bit has a value that is two times the value of the previous bit. In decimal, each digit has a value of 10 times the value of the previous digit to its right.

A byte is a binary number that contains 8 bits. If all the bits in a byte are 1 (the largest possible number), the base 10 value of the byte is 255. So a byte can have a value from 0 to 255.
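
As a quick check of the bit values and the 0 to 255 range, the following sketch lists the weight of each bit in a byte and adds them up. The helper name `bit_weights` is hypothetical and used only for this illustration.

```python
def bit_weights(width):
    """Return the value of each bit position, bit 0 first."""
    return [2 ** n for n in range(width)]


weights = bit_weights(8)
print(weights)       # [1, 2, 4, 8, 16, 32, 64, 128]
print(sum(weights))  # 255, the value of a byte with every bit set
```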


Figure 1.2 Example Binary Number

In the following example we’ll convert the binary number 1101 to base 10.

  1 × 1 = 1    (bit 0)
+ 0 × 2 = 0    (bit 1)
+ 1 × 4 = 4    (bit 2)
+ 1 × 8 = 8    (bit 3)
         --
         13

Addition in binary is very easy because there are only four possible combinations of numbers to add. By contrast, in base 10 there are 100 combinations. The following is a list of all possible single bit additions.

bit 1   bit 2   result
  0       0       0
  0       1       1
  1       0       1
  1       1       0 with a carry

To add two numbers in binary, we follow the same procedure as when adding in decimal. We’ll add 01101001 to 00010001:

binary      decimal
01101001      105
00010001       17
--------      ---
01111010      122
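
The column-by-column procedure can be sketched as follows. This is an illustrative helper, not code from the book; the name `add_binary` and the choice to represent numbers as strings of '0' and '1' are assumptions made for the example.

```python
def add_binary(a, b):
    """Add two equal-length bit strings column by column, right to left."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of bits")
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        result.append(str(total % 2))  # bit written in this column
        carry = total // 2             # 1 when the column overflows
    return "".join(reversed(result)), carry


total, carry_out = add_binary("01101001", "00010001")
print(total, carry_out)  # 01111010 0
print(int(total, 2))     # 122
```

The second return value is the carry out of the leftmost column; it is 0 here because 122 still fits in 8 bits.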

HEX


Hexadecimal (usually shortened to “hex”) is a number system based on 16 digits (base 16). Because there are only 10 symbols (0–9) for digits, this poses a small problem when working with number systems requiring more than 10 symbols. There could be many solutions to this problem. One would be to use the first 16 letters of the alphabet. Another would be to make up an entirely new set of symbols. However, the convention in general use is to use the digits 0 to 9 and then use the letters A to F for the values 10 to 15.

Here we’ll convert a hex number to decimal and to binary. The hex number is 3A21h.

  1 ×    1 =     1    (position 0)
+ 2 ×   16 =    32    (position 1)
+ A ×  256 =  2560    (position 2; A = 10)
+ 3 × 4096 = 12288    (position 3)
             -----
             14881
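
The same hex-to-decimal conversion can be sketched in Python. The names `HEX_DIGITS` and `hex_to_decimal` are assumptions for this example, and the input is given without the trailing h. Note that the loop accumulates from the leftmost digit, multiplying the running total by 16 at each step; this is an equivalent rearrangement of the position-by-position sum shown above, not a different method.

```python
HEX_DIGITS = "0123456789ABCDEF"  # a symbol's index is its value, so A..F are 10..15


def hex_to_decimal(text):
    """Convert a hex string such as '3A21' to an integer."""
    value = 0
    for symbol in text.upper():
        value = value * 16 + HEX_DIGITS.index(symbol)
    return value


print(hex_to_decimal("3A21"))  # 14881
```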

Converting a hex number to binary is a completely different and much easier process. Each hex digit is just another representation of a combination of four binary digits (or bits). Hex is commonly used precisely because each digit stands for exactly four bits. Think of hex as a “shorthand” for binary. Table 1.1 is the conversion table.

Table 1.1

Decimal, Hex and Binary Equivalents

decimal hex binary
0 0 0000
1 1 0001
2 2 0010
3 3 0011
4 4 0100
5 5 0101
6 6 0110
7 7 0111
8 8 1000
9 9 1001
10 A 1010
11 B 1011
12 C 1100
13 D 1101
14 E 1110
15 F 1111

So, to convert to binary, each hex digit is converted in sequence. Converting 3A21h goes as follows:

hex      3    A    2    1
binary   0011 1010 0010 0001
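
The digit-by-digit translation can be sketched in Python as well. The function name `hex_to_binary` is hypothetical; it relies only on the built-ins `int` and `format`, and it separates the four-bit groups with spaces for readability.

```python
def hex_to_binary(text):
    """Translate each hex digit to its 4-bit group, left to right."""
    return " ".join(format(int(symbol, 16), "04b") for symbol in text)


print(hex_to_binary("3A21"))  # 0011 1010 0010 0001
```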

For easier reading, most examples in this book will use hex and decimal. Hex numbers are usually followed by the letter H for clarity. Memory addresses are always in hex and may be shown in the segment:offset format (4 hex digits, followed by a colon and then followed by 4 more hex digits). Later chapters will discuss the meaning of the segment:offset...

Publication date (per publisher) 28.6.2014
Language English
Subject area Mathematics / Computer Science > Computer Science > Theory / Studies
Technology > Civil Engineering
ISBN-10 1-4832-1425-7 / 1483214257
ISBN-13 978-1-4832-1425-2 / 9781483214252
PDF (Adobe DRM)
Size: 23.8 MB

Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook against misuse. The eBook is authorized to your personal Adobe ID at download time. You can then read it only on devices that are also registered to your Adobe ID.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suited to technical books with columns, tables, and figures. A PDF can be displayed on almost all devices but is only of limited use on small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need an Adobe ID and the free Adobe Digital Editions software. We advise against using the OverDrive Media Console; in our experience it frequently causes problems with Adobe DRM.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/tablet: Whether Apple or Android, you can read this eBook. You need an Adobe ID and a free app.

Buying eBooks from abroad
For tax law reasons we can sell eBooks only within Germany and Switzerland. Regrettably we cannot fulfill eBook orders from other countries.

EPUB (Adobe DRM)
Size: 13.6 MB

File format: EPUB (Electronic Publication)
EPUB is an open standard for eBooks and is particularly suited to fiction and non-fiction. The running text adapts dynamically to the display and font size, so EPUB also works well on mobile reading devices.

Copy protection, system requirements, and sales restrictions are the same as for the PDF edition.
