
1 Computer abstractions and Technology


Von Neumann architecture

  • Computation is separated from storage
  • Data and instructions are stored in the same memory
  • Input and output mechanisms (I/O)
  • Instruction set architecture

1.1 Computer Organization

Hardware

CPU(Processor) : active part of the computer, which contains the datapath and control and which adds numbers, tests numbers, signals I/O devices to activate, and so on.

  • Datapath: performs arithmetic operations
  • Control: commands the datapath, memory, and I/O devices according to the instructions of the program

Memory : the storage area in which programs are kept while they are running and that contains the data needed by the running programs

  • Main memory: volatile; used to hold programs while they are running (e.g. DRAM in computers)
  • Secondary memory: nonvolatile; used to store programs and data between runs (e.g. flash in PMDs, magnetic disks)

Memory types by volatility:

Volatile

  • DRAM (Dynamic Random-Access Memory)
  • SRAM (Static Random-Access Memory)

Nonvolatile

  • Solid-state memory (flash memory)
  • Magnetic disk (hard disk)

Software

1.2 Computer design: performance and idea

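Response time / execution time: the time taken to complete a task

Throughput (bandwidth): the number of tasks completed per unit time

How are response time and throughput affected by the following changes?

  1. Replacing the processor with a faster version: both response time and throughput improve.
  2. Adding more processors: the execution time of a single task does not improve, but throughput increases.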

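Performance = 1 / Execution Time

Relative performance: \(\text{Performance}_X/\text{Performance}_Y=\text{Execution Time}_Y/\text{Execution Time}_X\), read as "X is that many times faster than Y"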
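As a quick illustration of the definition, the sketch below compares two machines. The function name `relative_performance` and the 10 s / 15 s timings are invented for this example.

```python
def relative_performance(time_x: float, time_y: float) -> float:
    """Return Performance_X / Performance_Y for the same workload."""
    # Performance is the reciprocal of execution time, so the ratio of
    # performances is the inverse ratio of the execution times.
    return time_y / time_x


# Hypothetical timings: X finishes in 10 s, Y needs 15 s.
ratio = relative_performance(10.0, 15.0)
print(f"X is {ratio} times faster than Y")
```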

Elapsed time (wall-clock time)

  • Total response time, including all aspects (Processing, I/O, overhead, idle time)
  • Determines system performance

CPU time

  • Time spent processing a given job (Discounts I/O time, other jobs’ shares)
  • Comprises user CPU time and system CPU time
  • Different programs are affected differently by CPU and system performance
\[ \text{CPU Time}=\text{CPU Clock Cycles} \times \text{Clock Cycle Time}=\frac{\text{CPU Clock Cycles}}{\text{Clock Rate}} \]

\(\text{Clock Rate}\) is the number of \(\text{Cycles}\) the CPU completes per unit time.

CPU Time Example

Computer A: 2 GHz (\(2\times 10^9\ \text{Hz}\)) clock, 10 s CPU time

Designing Computer B

  • Aim for 6 s CPU time

  • A faster clock is possible, but the design then needs 1.2 × as many clock cycles

How fast must Computer B's clock be?
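The question can be answered directly from \(\text{CPU Time}=\text{CPU Clock Cycles}/\text{Clock Rate}\). The sketch below works through the arithmetic; the function name `required_clock_rate` and its parameter names are chosen here for illustration only.

```python
def required_clock_rate(clock_rate_a: float, cpu_time_a: float,
                        cpu_time_b: float, cycle_factor: float) -> float:
    """Clock rate (Hz) that B needs to hit its target CPU time."""
    # Cycles executed on A: CPU time x clock rate.
    cycles_a = cpu_time_a * clock_rate_a
    # B needs cycle_factor times as many cycles for the same program.
    cycles_b = cycle_factor * cycles_a
    # Rearranged CPU-time equation: rate = cycles / time.
    return cycles_b / cpu_time_b


# A: 2 GHz, 10 s. B: target 6 s, 1.2x the clock cycles.
rate_b = required_clock_rate(2e9, 10.0, 6.0, 1.2)
print(f"Computer B needs roughly {rate_b / 1e9:.1f} GHz")
```

Computer A executes \(10\,\text{s}\times 2\,\text{GHz}=20\times 10^9\) cycles, so B executes \(24\times 10^9\) cycles and needs a clock of \(24\times 10^9/6\,\text{s}=4\,\text{GHz}\).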

Instruction Count (IC) and CPI (Cycles per Instruction)

\[ \text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction} \]
\[ \text{CPU Time}=\text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time}= \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}} \]

Instruction Count for a program : determined by the program, the ISA (instruction set architecture) and the compiler

CPI example

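  • Computer A: Cycle Time = 250 ps, CPI = 2.0
  • Computer B: Cycle Time = 500 ps, CPI = 1.2
  • Same ISA, so both execute the same instruction count \(I\)

Which is faster, and by how much?

\[ \text{CPU Time}_A=I\times 2.0\times 250\text{ps}=I\times 500\text{ps} \]
\[ \text{CPU Time}_B=I\times 1.2\times 500\text{ps}=I\times 600\text{ps} \]
\[ \frac{\text{CPU Time}_B}{\text{CPU Time}_A}=\frac{I\times 600\text{ps}}{I\times 500\text{ps}}=1.2 \]

Computer A is faster, by a factor of 1.2.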
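The same comparison can be checked numerically. This is a minimal sketch; `cpu_time` is a helper named here for illustration, and the instruction count is an arbitrary placeholder because it cancels out of the ratio.

```python
def cpu_time(instruction_count: float, cpi: float, cycle_time: float) -> float:
    """CPU time = instruction count x CPI x clock cycle time (seconds)."""
    return instruction_count * cpi * cycle_time


# Arbitrary instruction count; both machines share the ISA, so it is
# the same on A and B and drops out of the ratio.
instructions = 1_000_000

time_a = cpu_time(instructions, cpi=2.0, cycle_time=250e-12)
time_b = cpu_time(instructions, cpi=1.2, cycle_time=500e-12)

print(f"B takes {time_b / time_a:.2f}x as long as A")
```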

CPI in more details

  • Different instruction classes can take different numbers of cycles, so each class \(i\) has its own \(\text{CPI}_i\)
\[ \text{Clock Cycles}=\sum_{i=1}^n(\text{CPI}_i\times \text{Instruction Count}_i) \]
  • Weighted average CPI, where each class is weighted by its share of the instruction count
\[ \text{CPI}=\frac{\text{Clock cycles}}{\text{Instruction Count}}=\sum_{i=1}^n(\text{CPI}_i\times\frac{\text{Instruction Count}_i}{\text{Instruction Count}}) \]
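To make the weighted average concrete, the sketch below evaluates both formulas for two code sequences. The instruction classes `A`, `B`, `C`, their per-class CPIs, and the instruction counts are illustrative values, not measurements.

```python
def total_cycles(cpi_by_class: dict, count_by_class: dict) -> float:
    """Sum of CPI_i x Instruction Count_i over all instruction classes."""
    return sum(cpi_by_class[c] * n for c, n in count_by_class.items())


def average_cpi(cpi_by_class: dict, count_by_class: dict) -> float:
    """Weighted average CPI = total clock cycles / total instruction count."""
    return total_cycles(cpi_by_class, count_by_class) / sum(count_by_class.values())


# Illustrative per-class CPIs and two alternative code sequences.
cpi = {"A": 1, "B": 2, "C": 3}
sequence_1 = {"A": 2, "B": 1, "C": 2}  # 5 instructions
sequence_2 = {"A": 4, "B": 1, "C": 1}  # 6 instructions

for name, seq in (("sequence 1", sequence_1), ("sequence 2", sequence_2)):
    print(name, total_cycles(cpi, seq), "cycles, average CPI", average_cpi(cpi, seq))
```

With these numbers, sequence 2 executes more instructions but needs fewer cycles (9 versus 10), which is why instruction count alone does not determine CPU time.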

CPI example 2

  • \(\text{Power} = \text{Capacitive load}\times \text{Voltage}^2\times \text{Frequency}\)
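Because the relation is a simple product, relative changes multiply. The sketch below assumes a hypothetical redesign that cuts capacitive load, voltage, and frequency to 85% of their old values; `dynamic_power` is a helper named here for illustration and works in normalized units.

```python
def dynamic_power(capacitive_load: float, voltage: float, frequency: float) -> float:
    """Power = capacitive load x voltage^2 x frequency."""
    return capacitive_load * voltage ** 2 * frequency


# Normalized old design, and a hypothetical new design at 85% of each quantity.
old_power = dynamic_power(1.0, 1.0, 1.0)
new_power = dynamic_power(0.85, 0.85, 0.85)

power_ratio = new_power / old_power
print(f"New design uses about {power_ratio:.1%} of the old power")
```

The voltage term is squared, so the ratio is \(0.85^4\approx 0.52\): lowering the voltage has a larger effect on power than the same relative change in load or frequency.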

Summary

\[ \text{CPU Time}=\frac{\text{Instructions}}{\text{Program}}\times\frac{\text{Clock cycles}}{\text{Instruction}}\times\frac{\text{Seconds}}{\text{Clock cycle}} \]

Performance depends on

  • Algorithm: affects \(IC\), possibly \(CPI\)(average)
  • Programming language: affects \(IC, CPI\)
  • Compiler: affects \(IC, CPI\)
  • Instruction set architecture: affects \(IC, CPI, T_c(\text{cycle time})\)

1.3 Evaluating Computer Performance

1.3.1 SPEC

SPEC: Standard Performance Evaluation Corporation

SPEC CPU Benchmark

Programs used to measure performance

  • Measures the elapsed time to execute a selection of programs; I/O is negligible, so the result focuses on CPU performance

  • Normalize relative to reference machine

  • Summarize as geometric mean of performance ratios

\[ \sqrt[n]{\prod_{i=1}^n\text{Execution time ratio}_i} \]
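A minimal sketch of the geometric mean follows. The three ratios are made-up values standing in for per-benchmark execution time ratios, and `geometric_mean` is a helper written for this example rather than part of any SPEC tooling.

```python
import math


def geometric_mean(ratios: list) -> float:
    """n-th root of the product of n positive ratios."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty list of positive numbers")
    # Averaging the logarithms avoids overflow when many ratios are multiplied.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical ratios (reference time / measured time) for three benchmarks.
spec_ratios = [2.0, 4.0, 8.0]
print(f"Geometric mean: {geometric_mean(spec_ratios):.3f}")
```

For these values the product is 64 and its cube root is 4, whereas the arithmetic mean would be about 4.67.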

SPEC Power Benchmark

  • Performance: ssj_ops/sec
  • Power: Watts(Joules/sec)
\[ \text{Overall ssj\_ops per Watt}=\frac{\sum_{i=0}^{10}\text{ssj\_ops}_i}{\sum_{i=0}^{10}\text{power}_i} \]
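The formula divides summed throughput by summed power across the eleven measurement points \(i=0,\dots,10\), rather than averaging the per-point ratios. The sketch below uses invented readings, assuming the points correspond to load levels from 100% down to 0% in 10% steps; `overall_ssj_ops_per_watt` is a name chosen for this example.

```python
def overall_ssj_ops_per_watt(ssj_ops: list, power_watts: list) -> float:
    """Sum of ssj_ops over all load levels divided by sum of power."""
    if len(ssj_ops) != len(power_watts):
        raise ValueError("need one power reading per load level")
    return sum(ssj_ops) / sum(power_watts)


# Invented readings for 11 load levels (100%, 90%, ..., 0%).
levels = range(10, -1, -1)
ops = [1000 * k for k in levels]        # throughput falls with load
watts = [100 + 10 * k for k in levels]  # 100 W drawn even when idle

print(f"{overall_ssj_ops_per_watt(ops, watts):.2f} ssj_ops per watt")
```

In this made-up data the idle level contributes power but no operations, so it lowers the overall figure.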

1.3.2 Some pitfalls

Amdahl’s Law

Improving only one part of a system puts an upper bound on the overall improvement.

Pitfall: improving an aspect of a computer and expecting a proportional improvement in overall performance

\[ T_{\text{improved}}=\frac{T_{\text{affected}}}{\text{improvement factor}}+T_{\text{unaffected}} \]

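Example: multiplication accounts for 80 s of a 100 s run.

How much must multiply performance improve to get a 5× overall speedup?

\(20=\frac{80}{n}+20 \Rightarrow\) can't be done: a 5× speedup requires a total time of 100/5 = 20 s, but the unaffected 20 s already uses all of it, so \(\frac{80}{n}\) would have to be 0.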
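The limit can also be seen numerically. This sketch applies the formula for \(T_{\text{improved}}\) to the 80 s / 20 s split with increasingly large improvement factors; the function names are chosen here for illustration.

```python
def improved_time(t_affected: float, t_unaffected: float, factor: float) -> float:
    """T_improved = T_affected / improvement factor + T_unaffected."""
    return t_affected / factor + t_unaffected


def speedup_limit(t_affected: float, t_unaffected: float) -> float:
    """Upper bound on overall speedup as the improvement factor grows without bound."""
    return (t_affected + t_unaffected) / t_unaffected


# Multiply takes 80 s of a 100 s run; the other 20 s is unaffected.
for factor in (2, 10, 100, 1_000_000):
    t_new = improved_time(80.0, 20.0, factor)
    print(f"factor {factor:>9}: {t_new:8.4f} s, overall speedup {100.0 / t_new:.4f}")

print("limit:", speedup_limit(80.0, 20.0))
```

The overall speedup approaches 5 but never reaches it, however large the improvement factor becomes.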

MIPS as a Performance Metric

MIPS : Millions of Instructions Per Second

MIPS uses the number of instructions executed per second as the measure of performance.

As a performance metric, MIPS does not account for:

  • Differences in ISAs between computers
  • Differences in complexity between instructions
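The sketch below shows how this can mislead. It uses the standard definition \(\text{MIPS}=\text{Instruction Count}/(\text{Execution Time}\times 10^6)\) and two hypothetical compiled versions of the same program on a 1 GHz machine; all numbers and function names are invented for illustration.

```python
def execution_time(instruction_count: float, cpi: float, clock_rate: float) -> float:
    """CPU time in seconds = instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate


def mips(instruction_count: float, seconds: float) -> float:
    """Millions of instructions executed per second."""
    return instruction_count / (seconds * 1e6)


CLOCK_RATE = 1e9  # hypothetical 1 GHz machine

# Version 1: fewer, more complex instructions. Version 2: more, simpler ones.
time_1 = execution_time(5e9, cpi=2.0, clock_rate=CLOCK_RATE)
time_2 = execution_time(12e9, cpi=1.0, clock_rate=CLOCK_RATE)
mips_1 = mips(5e9, time_1)
mips_2 = mips(12e9, time_2)

print(f"version 1: {time_1:.1f} s at {mips_1:.0f} MIPS")
print(f"version 2: {time_2:.1f} s at {mips_2:.0f} MIPS")
```

Version 2 reports twice the MIPS rating yet takes longer to finish, because its instructions each do less work.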

1.4 Eight Great Ideas

  • Design for Moore’s Law

Design for where the technology will be when the design is finished, rather than for where it is when the design starts.

Example

Adding electromagnetic aircraft catapults (which are electrically powered as opposed to current steam-powered models), allowed by the increased power generation offered by the new reactor technology

  • Use Abstraction to Simplify Design

Lower-level details are hidden to offer a simple model at higher level

Example

Building self-driving cars whose control systems partially rely on existing sensor systems already installed in the base vehicle, such as lane departure systems and smart cruise control systems (the sensor systems are treated as an abstraction)

  • Make the Common Case Fast

Making the common case fast will tend to enhance performance better than optimizing the rare case

Example

Express elevators in buildings

  • Performance via Parallelism

Get more performance by performing operations in parallel

Example

Increasing the gate area on a CMOS transistor to decrease its switching time

  • Performance via Pipelining

  • Performance via Prediction

Guess and start working rather than waiting until you know for sure

Example

Aircraft and marine navigation systems that incorporate wind information

  • Hierarchy of Memories

With the fastest, smallest, and most expensive memory per bit at the top of the hierarchy and the slowest, largest, and cheapest per bit at the bottom

Example

Library reserve desk

  • Dependability via Redundancy

Example

Suspension bridge cables
