Chapter 8 Physical Storage System

8.1 Classification of Physical Storage Media

Storage media can be classified as:

  • volatile storage: loses its contents when power is switched off
  • non-volatile storage: contents persist even when power is switched off. Includes secondary and tertiary storage, as well as battery-backed main memory.

Speed with which data can be accessed

Cost per unit of data

Reliability

  • data loss on power failure or system crash
  • physical failure of the storage device

8.2 Storage Hierarchy

Primary storage: Fastest media but volatile (cache, main memory).

Secondary storage: the next level in the hierarchy; non-volatile, with moderately fast access time. Also called on-line storage. E.g. flash memory, magnetic disks.

Tertiary storage: the lowest level in the hierarchy; non-volatile, with slow access time. Also called off-line storage. E.g. optical storage, magnetic tape.

8.3 Magnetic Hard Disk Mechanism

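A disk platter has on the order of a hundred thousand tracks, and each track has on the order of a thousand sectors. A sector is the smallest unit of data exchanged between the computer and the disk.

The arm assembly performs the seek: all read-write heads move together to the track that holds the requested data.

Data transfer begins only once the target sector has rotated under the read-write head. The tracks at the same position on all platters form a cylinder. A large file is best stored within a single cylinder, because all of its tracks can then be reached without moving the arm.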

Read-write head

The surface of each platter is divided into circular tracks.

Each track is divided into sectors.

To read/write a sector

  • disk arm swings to position head on right track
  • platter spins continually; data is read/written as sector passes under head

Cylinder i consists of the i-th track of all the platters.

Disk controller – interfaces between the computer system and the disk drive hardware.

8.3.1 Performance Measures of Disks

Access time – the time from when a read or write request is issued to when data transfer begins. It consists of:

(1) Seek time – the time it takes to reposition the arm over the correct track.

  • Average seek time is ½ the worst-case seek time.
  • 4 to 10 milliseconds on typical disks

(2) Rotational latency – the time it takes for the sector to be accessed to appear under the head.

  • A full rotation takes 4 to 11 milliseconds on typical disks (5400 to 15000 r.p.m.); this is the worst-case latency.
  • Average latency is ½ of the worst-case latency, i.e. about 2 to 5.5 milliseconds.
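The rotational figures follow directly from the spindle speed: one rotation takes 60/rpm seconds, and on average the head waits half a rotation. The sketch below works this out; the function names are illustrative and not part of the notes.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm


def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: half of a full rotation."""
    return rotation_time_ms(rpm) / 2


def avg_access_time_ms(avg_seek_ms: float, rpm: float) -> float:
    """Average access time = average seek time + average rotational latency."""
    return avg_seek_ms + avg_rotational_latency_ms(rpm)


for rpm in (5400, 7200, 15000):
    print(rpm, round(rotation_time_ms(rpm), 2), round(avg_rotational_latency_ms(rpm), 2))
```

At 15000 r.p.m. a rotation takes 4 ms and at 5400 r.p.m. about 11.1 ms, which matches the 4 to 11 millisecond range quoted for typical disks.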

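Data-transfer rate – the rate at which data can be retrieved from or stored to the disk.

Data is transferred between disk and memory in units of blocks. Even to access a single byte, the entire 4 KB block containing that byte must be read into memory.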

Disk block is a logical unit for storage allocation and retrieval

  • Smaller blocks: more transfers from disk
  • Larger blocks: more space wasted due to partially filled blocks
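The block-size trade-off can be made concrete with a little arithmetic. This is a minimal sketch assuming a single 10,000-byte file (an arbitrary illustrative size) stored in whole blocks; the helper names are hypothetical.

```python
def blocks_needed(file_bytes: int, block_bytes: int) -> int:
    """Number of whole blocks required to hold the file (integer ceiling)."""
    return -(-file_bytes // block_bytes)


def wasted_bytes(file_bytes: int, block_bytes: int) -> int:
    """Unused space in the last, partially filled block."""
    return blocks_needed(file_bytes, block_bytes) * block_bytes - file_bytes


FILE_SIZE = 10_000  # assumed example size, in bytes
for block_size in (512, 4096, 16384):
    print(block_size,
          blocks_needed(FILE_SIZE, block_size),
          wasted_bytes(FILE_SIZE, block_size))
```

With 512-byte blocks the file needs 20 transfers and wastes 240 bytes; with 16 KB blocks it needs one transfer but wastes 6,384 bytes.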

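Sequential access pattern: successive requests are for consecutive blocks, so only the first request has to pay the cost of a seek.

Random access pattern: slow, because every request may require a seek, so sequential access is preferred wherever possible. For example, pending modifications can first be recorded in a log and applied later, replacing random access with sequential access as far as possible.

I/O operations per second (IOPS): the number of random block reads that a disk can support per second.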
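A rough IOPS estimate falls out of the access-time components: if each random read costs one seek plus one rotational delay, the disk can serve about 1000 / (access time in ms) reads per second. The sketch below assumes requests are served one at a time and ignores transfer time unless it is passed in; the sample numbers are illustrative.

```python
def estimate_iops(avg_seek_ms: float, avg_rot_latency_ms: float,
                  transfer_ms: float = 0.0) -> float:
    """Approximate random reads per second for a single disk arm."""
    return 1000.0 / (avg_seek_ms + avg_rot_latency_ms + transfer_ms)


# Assumed fast disk: 4 ms seek, 2 ms average latency (15000 r.p.m.)
print(round(estimate_iops(4.0, 2.0)))
# Assumed slow disk: 10 ms seek, 5.5 ms average latency (5400 r.p.m.)
print(round(estimate_iops(10.0, 5.5)))
```

Both results land in the range of tens to a couple of hundred operations per second, which is why random access on magnetic disks is so much slower than sequential access.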

Mean time to failure (MTTF) – the average time the disk is expected to run continuously without any failure.
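MTTF is a per-disk average, so a large installation sees failures far more often than the figure suggests. The sketch below assumes independent failures at a constant rate, under which a pool of n disks fails on average once every MTTF / n hours; the disk count is an illustrative assumption.

```python
def hours_between_failures(mttf_hours: float, n_disks: int) -> float:
    """Expected time between failures across n independent disks."""
    return mttf_hours / n_disks


hours = hours_between_failures(1_200_000, 1000)
print(hours, "hours, i.e.", hours / 24, "days")
```

With 1000 disks rated at 1,200,000 hours each, a failure is expected roughly every 1200 hours, or about every 50 days.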

8.3.2 Optimization of Disk-Block Access

Buffering: in-memory buffer to cache disk blocks

Read-ahead (prefetch): read extra blocks from a track in anticipation that they will be requested soon

Disk-arm-scheduling algorithms re-order block requests so that disk arm movement is minimized

  • elevator algorithm
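The elevator algorithm can be sketched as follows: the arm keeps moving in one direction, serving every pending request on the way, and reverses only when no requests remain in that direction. The function names and the sample request queue are assumptions for illustration; this variant reverses at the last pending request rather than travelling to the edge of the disk.

```python
def elevator_order(requests, head, direction=1):
    """Order in which pending track requests are served.

    direction > 0: arm initially moves toward higher track numbers,
    otherwise toward lower track numbers.
    """
    here = [t for t in requests if t == head]
    higher = sorted(t for t in requests if t > head)
    lower = sorted((t for t in requests if t < head), reverse=True)
    if direction > 0:
        return here + higher + lower
    return here + lower + higher


def total_arm_movement(order, head):
    """Total number of tracks the arm crosses when serving `order`."""
    moved, pos = 0, head
    for track in order:
        moved += abs(track - pos)
        pos = track
    return moved


pending = [98, 183, 37, 122, 14, 124, 65, 67]   # assumed request queue
order = elevator_order(pending, head=53)
print(order)                                     # [65, 67, 98, 122, 124, 183, 37, 14]
print(total_arm_movement(order, 53))             # 299
print(total_arm_movement(pending, 53))           # 640 when served in arrival order
```

For this queue, re-ordering cuts arm movement from 640 tracks (arrival order) to 299.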

File organization

  • Allocate blocks of a file in as contiguous a manner as possible
  • Allocation in units of extents
  • Files may get fragmented

Nonvolatile write buffers – speed up disk writes by writing blocks to a non-volatile RAM buffer immediately

  • Non-volatile RAM: battery-backed RAM or flash memory. Even if power fails, the data is safe and will be written to disk when power returns.

Log disk – a disk devoted to writing a sequential log of block updates
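The idea behind a log disk can be sketched with an in-memory model: every update is appended to a sequential log, and the updates are later applied to their home locations in block order. The class and method names are hypothetical, and Python lists and dictionaries stand in for the two disks.

```python
class LogDisk:
    """Toy model: sequential log in front of a randomly accessed data disk."""

    def __init__(self):
        self.log = []        # append-only list of (block_no, data)
        self.data_disk = {}  # block_no -> data at its home location

    def write(self, block_no, data):
        # Appending is a sequential write, so no seek is modelled here.
        self.log.append((block_no, data))

    def read(self, block_no):
        # The most recent logged version wins over the home location.
        for logged_block, data in reversed(self.log):
            if logged_block == block_no:
                return data
        return self.data_disk.get(block_no)

    def flush(self):
        """Apply logged updates in block order; return blocks written."""
        latest = {}
        for block_no, data in self.log:
            latest[block_no] = data          # keep only the last update per block
        for block_no in sorted(latest):      # sorted order reduces arm movement
            self.data_disk[block_no] = latest[block_no]
        self.log.clear()
        return len(latest)


disk = LogDisk()
disk.write(90, "a")
disk.write(5, "b")
disk.write(90, "c")
print(disk.read(90))   # c
print(disk.flush())    # 2
```

Three random updates became three sequential log appends, and only two blocks had to be written back to their home locations.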

8.4 Flash Storage

NAND flash - widely used for storage, cheaper than NOR flash

  • requires page-at-a-time reads (page: 512 bytes to 4 KB); there is not much difference between sequential and random reads
  • a page can be written only once; it must be erased before it can be rewritten

SSDs (solid-state disks) - use standard block-oriented disk interfaces, but store data on multiple flash storage devices internally

| Feature | Magnetic Disk | Solid State Disk |
|---|---|---|
| Retrieve a page | 5-10 milliseconds | 20-100 microseconds |
| Random access | 50 to 200 IOPS | Reads: 10,000 IOPS; Writes: 40,000 IOPS |
| Data transfer rate | 200 MB/s | 500 MB/s (SATA), 3 GB/s (NVMe) |
| Power consumption | Higher | Lower |
| Update mode | In place | Erase ➔ Rewrite |
| Reliability | MTTF: 500,000 to 1,200,000 hours | Erase blocks: 100,000 to 1,000,000 erases |
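To get a feel for the page-retrieval figures, the sketch below estimates how long 1,000 random page reads take on each device. It assumes the reads are issued one after another with no parallelism or caching; the function name and page count are illustrative.

```python
def random_read_time_s(n_pages: int, per_page_latency_s: float) -> float:
    """Total time for n_pages random reads served strictly one at a time."""
    return n_pages * per_page_latency_s


N = 1000  # assumed number of random page reads
print("disk:", random_read_time_s(N, 5e-3), "to", random_read_time_s(N, 10e-3), "s")
print("SSD: ", random_read_time_s(N, 20e-6), "to", random_read_time_s(N, 100e-6), "s")
```

Under these assumptions the magnetic disk needs 5 to 10 seconds, while the SSD needs 0.02 to 0.1 seconds.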

Erase happens in units of erase block

Remapping of logical page addresses to physical page addresses avoids waiting for erase

Flash translation table tracks mapping

  • the mapping is also stored in a label field of each flash page
  • remapping carried out by flash translation layer
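Remapping can be sketched as a small translation table: an update to a logical page is written to a fresh physical page, the table is pointed at the new location, and the old physical page is merely marked invalid, so the write never waits for an erase. `TinyFTL` and its fields are hypothetical names, and the sketch omits erase blocks and garbage collection.

```python
class TinyFTL:
    """Toy flash translation layer: logical page -> physical page."""

    def __init__(self, n_physical_pages: int):
        self.pages = [None] * n_physical_pages       # physical page contents
        self.free = list(range(n_physical_pages))    # pages not yet written
        self.invalid = set()                         # stale pages awaiting erase
        self.table = {}                              # logical -> physical

    def write(self, logical: int, data) -> int:
        if not self.free:
            raise RuntimeError("no free page: an erase is needed first")
        phys = self.free.pop(0)                      # pages are written only once
        self.pages[phys] = data
        old = self.table.get(logical)
        if old is not None:
            self.invalid.add(old)                    # old copy becomes garbage
        self.table[logical] = phys
        return phys

    def read(self, logical: int):
        return self.pages[self.table[logical]]


ftl = TinyFTL(4)
ftl.write(7, "v1")
ftl.write(7, "v2")        # rewrite goes to a new physical page
print(ftl.read(7))        # v2
print(ftl.table, ftl.invalid)
```

A real device would later erase blocks whose pages are all invalid and return them to the free list.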

wear leveling - distribute erase operations evenly across physical blocks
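One simple way to level wear is to always reuse the block that has been erased the fewest times. The sketch below compares that policy with always erasing the same block; it is a simplified illustration with hypothetical names, not how any particular controller works.

```python
def pick_least_worn(erase_counts, candidates):
    """Choose the candidate block with the fewest erases (lowest index on ties)."""
    return min(candidates, key=lambda b: (erase_counts[b], b))


def simulate(n_blocks: int, n_erases: int, leveled: bool = True):
    """Return per-block erase counts after n_erases erase operations."""
    counts = [0] * n_blocks
    for _ in range(n_erases):
        block = pick_least_worn(counts, range(n_blocks)) if leveled else 0
        counts[block] += 1
    return counts


print(simulate(4, 100, leveled=True))    # [25, 25, 25, 25]
print(simulate(4, 100, leveled=False))   # [100, 0, 0, 0]
```

Without leveling, one block absorbs every erase and reaches its erase limit four times sooner in this example.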

8.5 Storage Class Memory (NVM)

| | DRAM | NVM | SSD | HDD |
|---|---|---|---|---|
| Read latency | 1x | 2-4x | 500x | 10^5x |
| Write latency | 1x | 2-8x | 5000x | 10^5x |
| Persistence | No | Yes | Yes | Yes |
| Byte-addressable | Yes | Yes | No | No |
| Endurance | Yes | No | No | Yes |
