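Editor's Recommendation
"This book remains a timeless classic because each new edition is not merely an update but a thorough revision, offering the most current information and the most original insight into this exciting, fast-changing field. For me, even after more than twenty years in the profession, rereading it still reminds me that there is always more to learn, and leaves me in admiration of the broad learning and deep expertise of two outstanding masters."
-- Luiz André Barroso, Google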
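About the Book
This book is widely regarded as the "bible" of computer architecture and is essential reading for students and practitioners of computer design. It systematically covers the fundamentals of computer system design, memory hierarchy design, instruction-level parallelism and its exploitation, data-level parallelism, GPU architectures, thread-level parallelism, and warehouse-scale computers.
The computing world is in the midst of a transformation: mobile clients and cloud computing are becoming the dominant paradigms driving programming and hardware innovation. This latest edition takes that shift into account and focuses on new platforms (personal mobile devices and warehouse-scale computers) and new architectures (multicore and GPUs). It introduces new material on mobile computing and cloud computing, and it also discusses design factors such as cost, performance, power, and reliability. Each chapter includes two real-world examples, one drawn from mobile phones and the other from datacenters, to reflect the revolutionary change under way in computing.
The book is rich in content: it presents the latest research results in computer architecture and draws on a wealth of practical experience in designing and building computer systems. Each chapter also ends with a large set of exercises and references. The book can serve as a textbook or reference for senior undergraduates and graduate students taking a "Computer Architecture" course, and as a reference for computing professionals.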
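Features
• Updated content covers the mobile computing revolution and emphasizes the two most important topics in architecture today: memory hierarchy and parallelism in all its forms.
• The "Putting It All Together" section in each chapter looks at recent industry technology, including the ARM Cortex-A8, the Intel Core i7, the NVIDIA GTX-280 and GTX-480 GPUs, and a Google warehouse-scale computer.
• Common themes run through every chapter: power, performance, cost, dependability, protection, programming models, and emerging trends.
• The printed book includes 3 appendices; another 8 appendices are available online from the original publisher's website.
• Each chapter ends with updated case studies contributed by experts from industry and academia, together with brand-new accompanying exercises.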
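John L. Hennessy is President of Stanford University, a Fellow of the IEEE and the ACM, a member of the National Academy of Engineering, and a Fellow of the American Academy of Arts and Sciences. Professor Hennessy received the 2001 Eckert-Mauchly Award for his outstanding contributions to RISC technology and the 2001 Seymour Cray Computer Engineering Award, and he shared the 2000 John von Neumann Award with this book's co-author, David A. Patterson.
David A. Patterson is a professor and chair of the Computer Science Division at the University of California, Berkeley, a member of the National Academy of Engineering, and a Fellow of the IEEE and the ACM. The IEEE awarded him the James H. Mulligan, Jr. Education Medal for his successful, inspiring approach to teaching. He received the 1995 IEEE Technical Achievement Award for his contributions to RISC technology, and his work on RAID earned him the 1999 IEEE Reynold Johnson Information Storage Award. In 2000 he shared the John von Neumann Award with John L. Hennessy.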
Foreword
Preface
Acknowledgments
Chapter 1 Fundamentals of Quantitative Design and Analysis
1.1 Introduction
1.2 Classes of Computers
1.3 Defining Computer Architecture
1.4 Trends in Technology
1.5 Trends in Power and Energy in Integrated Circuits
1.6 Trends in Cost
1.7 Dependability
1.8 Measuring, Reporting, and Summarizing Performance
1.9 Quantitative Principles of Computer Design
1.10 Putting It All Together: Performance, Price, and Power
1.11 Fallacies and Pitfalls
1.12 Concluding Remarks
1.13 Historical Perspectives and References
Case Studies and Exercises by Diana Franklin
Chapter 2 Memory Hierarchy Design
2.1 Introduction
2.2 Ten Advanced Optimizations of Cache Performance
2.3 Memory Technology and Optimizations
2.4 Protection: Virtual Memory and Virtual Machines
2.5 Crosscutting Issues: The Design of Memory Hierarchies
2.6 Putting It All Together: Memory Hierarchies in the ARM Cortex-A8 and Intel Core i7
2.7 Fallacies and Pitfalls
2.8 Concluding Remarks: Looking Ahead
2.9 Historical Perspective and References
Case Studies and Exercises by Norman P. Jouppi, Naveen Muralimanohar, and Sheng Li
Chapter 3 Instruction-Level Parallelism and Its Exploitation
3.1 Instruction-Level Parallelism: Concepts and Challenges
3.2 Basic Compiler Techniques for Exposing ILP
3.3 Reducing Branch Costs with Advanced Branch Prediction
3.4 Overcoming Data Hazards with Dynamic Scheduling
3.5 Dynamic Scheduling: Examples and the Algorithm
3.6 Hardware-Based Speculation
3.7 Exploiting ILP Using Multiple Issue and Static Scheduling
3.8 Exploiting ILP Using Dynamic Scheduling, Multiple Issue, and Speculation
3.9 Advanced Techniques for Instruction Delivery and Speculation
3.10 Studies of the Limitations of ILP
3.11 Cross-Cutting Issues: ILP Approaches and the Memory System
3.12 Multithreading: Exploiting Thread-Level Parallelism to Improve Uniprocessor Throughput
3.13 Putting It All Together: The Intel Core i7 and ARM Cortex-A8
3.14 Fallacies and Pitfalls
3.15 Concluding Remarks: What's Ahead?
3.16 Historical Perspective and References
Case Studies and Exercises by Jason D. Bakos and Robert P. Colwell
Chapter 4 Data-Level Parallelism in Vector, SIMD, and GPU Architectures
4.1 Introduction
4.2 Vector Architecture
4.3 SIMD Instruction Set Extensions for Multimedia
4.4 Graphics Processing Units
4.5 Detecting and Enhancing Loop-Level Parallelism
4.6 Crosscutting Issues
4.7 Putting It All Together: Mobile versus Server GPUs and Tesla versus Core i7
4.8 Fallacies and Pitfalls
4.9 Concluding Remarks
4.10 Historical Perspective and References
Case Study and Exercises by Jason D. Bakos
Chapter 5 Thread-Level Parallelism
5.1 Introduction
5.2 Centralized Shared-Memory Architectures
5.3 Performance of Symmetric Shared-Memory Multiprocessors
5.4 Distributed Shared-Memory and Directory-Based Coherence
5.5 Synchronization: The Basics
5.6 Models of Memory Consistency: An Introduction
5.7 Crosscutting Issues
5.8 Putting It All Together: Multicore Processors and Their Performance
5.9 Fallacies and Pitfalls
5.10 Concluding Remarks
5.11 Historical Perspectives and References
Case Studies and Exercises by Amr Zaky and David A. Wood
Chapter 6 Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism
6.1 Introduction
6.2 Programming Models and Workloads for Warehouse-Scale Computers
6.3 Computer Architecture of Warehouse-Scale Computers
6.4 Physical Infrastructure and Costs of Warehouse-Scale Computers
6.5 Cloud Computing: The Return of Utility Computing
6.6 Crosscutting Issues
6.7 Putting It All Together: A Google Warehouse-Scale Computer
6.8 Fallacies and Pitfalls
6.9 Concluding Remarks
6.10 Historical Perspectives and References
Case Studies and Exercises by Parthasarathy Ranganathan
Appendix A Instruction Set Principles
A.1 Introduction
A.2 Classifying Instruction Set Architectures
A.3 Memory Addressing
A.4 Type and Size of Operands
A.5 Operations in the Instruction Set
A.6 Instructions for Control Flow
A.7 Encoding an Instruction Set
A.8 Crosscutting Issues: The Role of Compilers
A.9 Putting It All Together: The MIPS Architecture
A.10 Fallacies and Pitfalls
A.11 Concluding Remarks
A.12 Historical Perspective and References
Exercises by Gregory D. Peterson
Appendix B Review of Memory Hierarchy
B.1 Introduction
B.2 Cache Performance
B.3 Six Basic Cache Optimizations
B.4 Virtual Memory
B.5 Protection and Examples of Virtual Memory
B.6 Fallacies and Pitfalls
B.7 Concluding Remarks
B.8 Historical Perspective and References
Exercises by Amr Zaky
Appendix C Pipelining: Basic and Intermediate Concepts
C.1 Introduction
C.2 The Major Hurdle of Pipelining--Pipeline Hazards
C.3 How Is Pipelining Implemented?
C.4 What Makes Pipelining Hard to Implement?
C.5 Extending the MIPS Pipeline to Handle Multicycle Operations
C.6 Putting It All Together: The MIPS R4000 Pipeline
C.7 Crosscutting Issues
C.8 Fallacies and Pitfalls
C.9 Concluding Remarks
C.10 Historical Perspective and References
Updated Exercises by Diana Franklin
Online Appendices
Appendix D Storage Systems
Appendix E Embedded Systems
by Thomas M. Conte
Appendix F Interconnection Networks
Revised by Timothy M. Pinkston and José Duato
Appendix G Vector Processors in More Depth
Revised by Krste Asanović
Appendix H Hardware and Software for VLIW and EPIC
Appendix I Large-Scale Multiprocessors and Scientific Applications
Appendix J Computer Arithmetic
by David Goldberg
Appendix K Survey of Instruction Set Architectures
Appendix L Historical Perspectives and References
References
Index
A natural question is whether WSCs are similar to modern clusters for high-performance computing. Although some have similar scale and cost...
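Core computer science courses such as computer architecture, operating systems, compiler principles, and database systems are all about how to build a computer; they are the bread-and-butter skills of a computer science student.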
Measuring performance of multiprocessors by linear speedup versus execution time.
When a design trade-off has to be made, always favor the more frequent case.