| Structure at a Glance |  | vi |  | 
| Preface |  | xv |  | 
| 
|  | PART 1 BACKGROUND AND MOTIVATION |  |  | 1 | (80) | 
| 
|  | Combinational Digital Circuits |  |  | 3 | (18) | 
| 
|  | Signals, Logic Operators, and Gates |  |  | 3 | (4) | 
| 
|  | Boolean Functions and Expressions |  |  | 7 | (1) | 
|  |  | 8 | (3) | 
| 
|  | Useful Combinational Parts |  |  | 11 | (2) | 
| 
|  | Programmable Combinational Parts |  |  | 13 | (2) | 
| 
|  | Timing and Circuit Considerations |  |  | 15 | (6) | 
|  |  | 17 | (2) | 
| 
|  | References and Further Readings |  |  | 19 | (2) | 
| 
|  | Digital Circuits with Memory |  |  | 21 | (17) | 
| 
|  | Latches, Flip-Flops, and Registers |  |  | 21 | (3) | 
|  |  | 24 | (1) | 
| 
|  | Designing Sequential Circuits |  |  | 25 | (3) | 
|  |  | 28 | (2) | 
| 
|  | Programmable Sequential Parts |  |  | 30 | (2) | 
| 
|  | Clocks and Timing of Events |  |  | 32 | (6) | 
|  |  | 34 | (3) | 
| 
|  | References and Further Readings |  |  | 37 | (1) | 
| 
|  | Computer System Technology |  |  | 38 | (21) | 
| 
|  | From Components to Applications |  |  | 39 | (2) | 
| 
|  | Computer Systems and Their Parts |  |  | 41 | (4) | 
|  |  | 45 | (3) | 
| 
|  | Processor and Memory Technologies |  |  | 48 | (3) | 
| 
|  | Peripherals, I/O, and Communications |  |  | 51 | (3) | 
| 
|  | Software Systems and Applications |  |  | 54 | (5) | 
|  |  | 56 | (2) | 
| 
|  | References and Further Readings |  |  | 58 | (1) | 
|  |  | 59 | (22) | 
| 
|  | Cost, Performance, and Cost/Performance |  |  | 59 | (3) | 
| 
|  | Defining Computer Performance |  |  | 62 | (3) | 
| 
|  | Performance Enhancement and Amdahl's Law |  |  | 65 | (2) | 
| 
|  | Performance Measurement vs. Modeling |  |  | 67 | (5) | 
| 
|  | Reporting Computer Performance |  |  | 72 | (2) | 
| 
|  | The Quest for Higher Performance |  |  | 74 | (7) | 
|  |  | 76 | (3) | 
| 
|  | References and Further Readings |  |  | 79 | (2) | 
| 
|  | PART 2 INSTRUCTION-SET ARCHITECTURE |  |  | 81 | (76) | 
| 
|  | Instructions and Addressing |  |  | 83 | (20) | 
| 
|  | Abstract View of Hardware |  |  | 83 | (3) | 
|  |  | 86 | (3) | 
| 
|  | Simple Arithmetic and Logic Instructions |  |  | 89 | (2) | 
| 
|  | Load and Store Instructions |  |  | 91 | (2) | 
| 
|  | Jump and Branch Instructions |  |  | 93 | (4) | 
|  |  | 97 | (6) | 
|  |  | 99 | (3) | 
| 
|  | References and Further Readings |  |  | 102 | (1) | 
|  |  | 103 | (20) | 
|  |  | 103 | (3) | 
| 
|  | Using the Stack for Data Storage |  |  | 106 | (2) | 
|  |  | 108 | (2) | 
|  |  | 110 | (3) | 
|  |  | 113 | (3) | 
|  |  | 116 | (7) | 
|  |  | 120 | (2) | 
| 
|  | References and Further Readings |  |  | 122 | (1) | 
| 
|  | Assembly Language Programs |  |  | 123 | (16) | 
| 
|  | Machine and Assembly Languages |  |  | 123 | (3) | 
|  |  | 126 | (1) | 
|  |  | 127 | (3) | 
|  |  | 130 | (1) | 
|  |  | 131 | (2) | 
| 
|  | Running Assembler Programs |  |  | 133 | (6) | 
|  |  | 136 | (2) | 
| 
|  | References and Further Readings |  |  | 138 | (1) | 
| 
|  | Instruction-Set Variations |  |  | 139 | (18) | 
|  |  | 139 | (2) | 
| 
|  | Alternative Addressing Modes |  |  | 141 | (4) | 
| 
|  | Variations in Instruction Formats |  |  | 145 | (2) | 
| 
|  | Instruction-Set Design and Evolution |  |  | 147 | (1) | 
|  |  | 148 | (3) | 
|  |  | 151 | (6) | 
|  |  | 154 | (2) | 
| 
|  | References and Further Readings |  |  | 156 | (1) | 
| 
|  | PART 3 THE ARITHMETIC/LOGIC UNIT |  |  | 157 | (84) | 
|  |  | 159 | (19) | 
| 
|  | Positional Number Systems |  |  | 159 | (3) | 
|  |  | 162 | (3) | 
|  |  | 165 | (1) | 
|  |  | 166 | (3) | 
|  |  | 169 | (2) | 
|  |  | 171 | (7) | 
|  |  | 174 | (2) | 
| 
|  | References and Further Readings |  |  | 176 | (2) | 
|  |  | 178 | (19) | 
|  |  | 178 | (2) | 
| 
|  | Carry Propagation Networks |  |  | 180 | (3) | 
| 
|  | Counting and Incrementation |  |  | 183 | (2) | 
|  |  | 185 | (3) | 
| 
|  | Logic and Shift Operations |  |  | 188 | (3) | 
|  |  | 191 | (6) | 
|  |  | 193 | (3) | 
| 
|  | References and Further Readings |  |  | 196 | (1) | 
|  |  | 197 | (22) | 
|  |  | 197 | (4) | 
|  |  | 201 | (3) | 
| 
|  | Programmed Multiplication |  |  | 204 | (2) | 
|  |  | 206 | (4) | 
|  |  | 210 | (3) | 
|  |  | 213 | (6) | 
|  |  | 215 | (3) | 
| 
|  | References and Further Readings |  |  | 218 | (1) | 
| 
|  | Floating-Point Arithmetic |  |  | 219 | (22) | 
|  |  | 219 | (5) | 
| 
|  | Special Values and Exceptions |  |  | 224 | (2) | 
|  |  | 226 | (3) | 
| 
|  | Other Floating-Point Operations |  |  | 229 | (1) | 
| 
|  | Floating-Point Instructions |  |  | 230 | (3) | 
| 
|  | Result Precision and Errors |  |  | 233 | (8) | 
|  |  | 237 | (2) | 
| 
|  | References and Further Readings |  |  | 239 | (2) | 
| 
|  | PART 4 DATA PATH AND CONTROL |  |  | 241 | (74) | 
| 
|  | Instruction Execution Steps |  |  | 243 | (15) | 
| 
|  | A Small Set of Instructions |  |  | 244 | (2) | 
| 
|  | The Instruction Execution Unit |  |  | 246 | (1) | 
|  |  | 247 | (2) | 
|  |  | 249 | (1) | 
| 
|  | Deriving the Control Signals |  |  | 250 | (3) | 
| 
|  | Performance of the Single-Cycle Design |  |  | 253 | (5) | 
|  |  | 255 | (2) | 
| 
|  | References and Further Readings |  |  | 257 | (1) | 
|  |  | 258 | (19) | 
| 
|  | A Multicycle Implementation |  |  | 258 | (3) | 
| 
|  | Clock Cycle and Control Signals |  |  | 261 | (3) | 
| 
|  | The Control State Machine |  |  | 264 | (2) | 
| 
|  | Performance of the Multicycle Design |  |  | 266 | (1) | 
|  |  | 267 | (4) | 
|  |  | 271 | (6) | 
|  |  | 273 | (3) | 
| 
|  | References and Further Readings |  |  | 276 | (1) | 
|  |  | 277 | (20) | 
|  |  | 277 | (4) | 
| 
|  | Pipeline Stalls or Bubbles |  |  | 281 | (3) | 
| 
|  | Pipeline Timing and Performance |  |  | 284 | (2) | 
| 
|  | Pipelined Data Path Design |  |  | 286 | (3) | 
|  |  | 289 | (2) | 
|  |  | 291 | (6) | 
|  |  | 293 | (3) | 
| 
|  | References and Further Readings |  |  | 296 | (1) | 
| 
|  | Pipeline Performance Limits |  |  | 297 | (18) | 
| 
|  | Data Dependencies and Hazards |  |  | 297 | (3) | 
|  |  | 300 | (2) | 
|  |  | 302 | (2) | 
|  |  | 304 | (2) | 
|  |  | 306 | (3) | 
|  |  | 309 | (6) | 
|  |  | 310 | (3) | 
| 
|  | References and Further Readings |  |  | 313 | (2) | 
| 
|  | PART 5 MEMORY SYSTEM DESIGN |  |  | 315 | (76) | 
|  |  | 317 | (18) | 
| 
|  | Memory Structure and SRAM |  |  | 317 | (3) | 
|  |  | 320 | (3) | 
|  |  | 323 | (2) | 
| 
|  | Pipelined and Interleaved Memory |  |  | 325 | (2) | 
|  |  | 327 | (2) | 
| 
|  | The Need for a Memory Hierarchy |  |  | 329 | (6) | 
|  |  | 331 | (3) | 
| 
|  | References and Further Readings |  |  | 334 | (1) | 
| 
|  | Cache Memory Organization |  |  | 335 | (18) | 
|  |  | 335 | (3) | 
|  |  | 338 | (3) | 
|  |  | 341 | (1) | 
|  |  | 342 | (3) | 
|  |  | 345 | (1) | 
| 
|  | Improving Cache Performance |  |  | 346 | (7) | 
|  |  | 348 | (4) | 
| 
|  | References and Further Readings |  |  | 352 | (1) | 
|  |  | 353 | (18) | 
|  |  | 353 | (3) | 
|  |  | 356 | (3) | 
|  |  | 359 | (1) | 
|  |  | 360 | (1) | 
|  |  | 361 | (4) | 
| 
|  | Other Types of Mass Memory |  |  | 365 | (6) | 
|  |  | 367 | (3) | 
| 
|  | References and Further Readings |  |  | 370 | (1) | 
| 
|  | Virtual Memory and Paging |  |  | 371 | (20) | 
| 
|  | The Need for Virtual Memory |  |  | 371 | (2) | 
| 
|  | Address Translation in Virtual Memory |  |  | 373 | (3) | 
| 
|  | Translation Lookaside Buffer |  |  | 376 | (3) | 
| 
|  | Page Replacement Policies |  |  | 379 | (3) | 
|  |  | 382 | (1) | 
| 
|  | Improving Virtual Memory Performance |  |  | 383 | (8) | 
|  |  | 386 | (3) | 
| 
|  | References and Further Readings |  |  | 389 | (2) | 
| 
|  | PART 6 INPUT/OUTPUT AND INTERFACING |  |  | 391 | (74) | 
|  |  | 393 | (18) | 
| 
|  | Input/Output Devices and Controllers |  |  | 393 | (2) | 
|  |  | 395 | (2) | 
|  |  | 397 | (3) | 
| 
|  | Hard-Copy Input/Output Devices |  |  | 400 | (4) | 
| 
|  | Other Input/Output Devices |  |  | 404 | (2) | 
| 
|  | Networking of Input/Output Devices |  |  | 406 | (5) | 
|  |  | 408 | (2) | 
| 
|  | References and Further Readings |  |  | 410 | (1) | 
|  |  | 411 | (18) | 
| 
|  | I/O Performance and Benchmarks |  |  | 411 | (2) | 
|  |  | 413 | (3) | 
|  |  | 416 | (1) | 
| 
|  | Demand-Based I/O: Interrupts |  |  | 417 | (1) | 
| 
|  | I/O Data Transfer and DMA |  |  | 418 | (3) | 
| 
|  | Improving I/O Performance |  |  | 421 | (8) | 
|  |  | 425 | (3) | 
| 
|  | References and Further Readings |  |  | 428 | (1) | 
| 
|  | Buses, Links, and Interfacing |  |  | 429 | (20) | 
| 
|  | Intra- and Intersystem Links |  |  | 429 | (4) | 
|  |  | 433 | (2) | 
| 
|  | Bus Communication Protocols |  |  | 435 | (3) | 
| 
|  | Bus Arbitration and Performance |  |  | 438 | (2) | 
|  |  | 440 | (1) | 
|  |  | 441 | (8) | 
|  |  | 445 | (2) | 
| 
|  | References and Further Readings |  |  | 447 | (2) | 
| 
|  | Context Switching and Interrupts |  |  | 449 | (16) | 
|  |  | 449 | (2) | 
| 
|  | Interrupts, Exceptions, and Traps |  |  | 451 | (2) | 
| 
|  | Simple Interrupt Handling |  |  | 453 | (3) | 
|  |  | 456 | (2) | 
| 
|  | Types of Context Switching |  |  | 458 | (2) | 
| 
|  | Threads and Multithreading |  |  | 460 | (5) | 
|  |  | 462 | (2) | 
| 
|  | References and Further Readings |  |  | 464 | (1) | 
| 
|  | PART 7 ADVANCED ARCHITECTURES |  |  | 465 | (83) | 
| 
|  | Road to Higher Performance |  |  | 467 | (23) | 
| 
|  | Past and Current Performance Trends |  |  | 467 | (3) | 
| 
|  | Performance-Driven ISA Extensions |  |  | 470 | (3) | 
| 
|  | Instruction-Level Parallelism |  |  | 473 | (3) | 
| 
|  | Speculation and Value Prediction |  |  | 476 | (3) | 
| 
|  | Special-Purpose Hardware Accelerators |  |  | 479 | (3) | 
| 
|  | Vector, Array, and Parallel Processing |  |  | 482 | (8) | 
|  |  | 485 | (3) | 
| 
|  | References and Further Readings |  |  | 488 | (2) | 
| 
|  | Vector and Array Processing |  |  | 490 | (18) | 
|  |  | 491 | (2) | 
| 
|  | Vector Processor Implementation |  |  | 493 | (4) | 
| 
|  | Vector Processor Performance |  |  | 497 | (2) | 
|  |  | 499 | (2) | 
| 
|  | Array Processor Implementation |  |  | 501 | (2) | 
| 
|  | Array Processor Performance |  |  | 503 | (5) | 
|  |  | 504 | (3) | 
| 
|  | References and Further Readings |  |  | 507 | (1) | 
| 
|  | Shared-Memory Multiprocessing |  |  | 508 | (20) | 
| 
|  | Centralized Shared Memory |  |  | 508 | (4) | 
| 
|  | Multiple Caches and Cache Coherence |  |  | 512 | (2) | 
| 
|  | Implementing Symmetric Multiprocessors |  |  | 514 | (3) | 
| 
|  | Distributed Shared Memory |  |  | 517 | (2) | 
| 
|  | Directories to Guide Data Access |  |  | 519 | (2) | 
| 
|  | Implementing Asymmetric Multiprocessors |  |  | 521 | (7) | 
|  |  | 524 | (3) | 
| 
|  | References and Further Readings |  |  | 527 | (1) | 
| 
|  | Distributed Multicomputing |  |  | 528 | (20) | 
| 
|  | Communication by Message Passing |  |  | 528 | (4) | 
|  |  | 532 | (3) | 
| 
|  | Message Composition and Routing |  |  | 535 | (2) | 
| 
|  | Building and Using Multicomputers |  |  | 537 | (3) | 
| 
|  | Network-Based Distributed Computing |  |  | 540 | (2) | 
| 
|  | Grid Computing and Beyond |  |  | 542 | (6) | 
|  |  | 543 | (4) | 
| 
|  | References and Further Readings |  |  | 547 | (1) | 
| Index |  | 548 |  |