
  1. Introduction
  2. DRAM Organization & Operation
  3. Memory Hierarchy: Principle of Locality
  4. Caches
  5. Cache Block Size
  6. Main Memory & Caches
  7. Write Policies
  8. Cache Performance
  9. Associative Caches
  10. Multilevel Caches
  11. Cache’s Interaction with Advanced CPUs and Software

Introduction

memory hierarchy: A structure that uses multiple levels of memories; as the distance from the processor increases, the size of the memories and the access time both increase while the cost per bit decreases.


cf. *Why* is **memory hierarchy** crucial? ![Alt text](/images/ca5_processor-memory-improvement.png) Even though processor speed and memory latency keep improving, a single DRAM access still costs around 100+ CPU cycles. The hierarchical structure is the key to bridging this gap.

DRAM Organization & Operation

DRAM configuration

  • strobe: A signal output together with the data, used to time reads.
  • Block (line): The unit of data present in a cache.

Note: at the CPU register level, the corresponding ‘line’ is a word (4 B).

  • Large capacity
    • Arranged as 2D matrix
  • Narrow data interface: 1-16 bits (x1, x4, x8, x16)
    • Cheap packages => Few bus pins.
    • Pins are expensive.
  • Narrow address interface
    • Row and column addresses are multiplexed over the same address lines (see the sketch below)
    • Signaled by RAS# and CAS#, respectively. (RAS: Row Address Strobe, CAS: Column Address Strobe)

[Figure: grid-like diagram showing how the DRAM array is organized as a 2D matrix of cells]
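
To make the multiplexing concrete, here is a minimal C sketch of how a single cell address could be split into a row part (sent with RAS#) and a column part (sent with CAS#). The field widths and the address value are made-up assumptions for illustration, not any specific DRAM part:

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical geometry: 2^14 rows x 2^10 columns inside one bank.
 * Real chips differ; these widths are purely illustrative. */
#define COL_BITS 10
#define ROW_BITS 14

int main(void) {
    uint32_t addr = 0x2A9F3;  /* linear address of a cell within the bank (made up) */

    uint32_t col = addr & ((1u << COL_BITS) - 1);               /* sent with CAS# */
    uint32_t row = (addr >> COL_BITS) & ((1u << ROW_BITS) - 1); /* sent with RAS# */

    /* The same narrow pins carry the row address first (RAS#), then the column (CAS#). */
    printf("row = %u (RAS#), col = %u (CAS#)\n", row, col);
    return 0;
}
```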

DRAM Organization

  • DRAM: a set of banks
  • Bank: processes memory requests independently of the other banks.
    • # of banks == degree of parallelism.
  • Row buffer: a bank-private buffer that caches the currently activated row (=> faster to access than the DRAM array).
    • 4-16 KB in size
    • Shared by all CPU cores
  • (cf. Rank = 4-16 banks)


“If each bank can be accessed concurrently, spreading sequential data across multiple banks is better.” (Bank interleaving; see the sketch below.)

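A small sketch of that idea (low-order bank interleaving). The bank count and block size here are assumptions for illustration, not taken from a specific DIMM:

```c
#include <stdio.h>
#include <stdint.h>

#define NUM_BANKS  8      /* assumed; ranks commonly have 4-16 banks */
#define BLOCK_SIZE 64     /* assumed block (line) size in bytes */

/* Low-order bank interleaving: the bank bits sit right above the block offset,
 * so consecutive blocks land in different banks and can be serviced in parallel. */
static unsigned bank_of(uint64_t addr) {
    return (addr / BLOCK_SIZE) % NUM_BANKS;
}

int main(void) {
    /* Eight consecutive 64-byte blocks -> eight different banks. */
    for (uint64_t addr = 0; addr < 8 * BLOCK_SIZE; addr += BLOCK_SIZE)
        printf("addr 0x%03llx -> bank %u\n", (unsigned long long)addr, bank_of(addr));
    return 0;
}
```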

DRAM Operations

Under column-first scheduling, every access falls into one of two cases:

1) row-hit access 2) row-miss access

DRAM access scenario

  • Three column reads to row 0, one col read to row 1.


Access scheduling sequence

| # | Row (with RAS#) | Column (with CAS#) | Result   |
|---|-----------------|--------------------|----------|
| 1 | 0               | 0                  | row-miss |
| 2 | 0               | 1                  | row-hit  |
| 3 | 1               | 2                  | row-miss |
| 4 | 0               | 2                  | row-miss |

Explanation

    1. Row miss: ACT (row activation).
      • The very first access to a row always misses.
      • Right after the miss, the activated row is filled into the row buffer.
      • Here, row 0 of the DRAM array is accessed for the first time.
    2. Row hit: row 0 was already activated by access #1, so the row buffer still holds row 0 and the column read is served directly from it.
    3. Row miss: ACT, because row 1’s data has not been loaded into the row buffer yet.
      • The row buffer is then filled with row 1.
    4. Row miss: ACT again. The access goes back to row 0, but the row buffer now holds row 1, so row 0 must be re-activated (precharge, then ACT).

In case of a row miss, the row buffer is refilled with the requested row from the DRAM array.
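
To make the reasoning above mechanical, here is a toy single-bank row-buffer model in C that replays the four accesses; the row/column values follow the scenario as reconstructed above:

```c
#include <stdio.h>

int main(void) {
    int open_row = -1;  /* -1: no row currently activated (row buffer empty) */

    /* The scenario above: three column reads to row 0, one column read to row 1. */
    int rows[] = {0, 0, 1, 0};
    int cols[] = {0, 1, 2, 2};

    for (int i = 0; i < 4; i++) {
        if (rows[i] == open_row) {
            printf("#%d row %d col %d: row-hit (read directly from the row buffer)\n",
                   i + 1, rows[i], cols[i]);
        } else {
            printf("#%d row %d col %d: row-miss (PRE + ACT row %d, then read)\n",
                   i + 1, rows[i], cols[i], rows[i]);
            open_row = rows[i];  /* row buffer now holds the newly activated row */
        }
    }
    return 0;
}
```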


Memory Hierarchy: Principle of Locality

terms:

  • temporal locality
    The principle stating that if a data location is referenced, then it will tend to be referenced again soon.
  • spatial locality
    The principle stating that if a data location is referenced, data locations with nearby addresses will tend to be referenced soon.
Temporal Locality

Recently accessed items are likely to be accessed again soon.

e.g. looping variables (induction variables), instructions in a loop

Spatial Locality

Items near recently accessed one are likely to be referenced/accessed soon.

e.g. sequential instruction accesses (consecutive 4-byte instructions), array data (see the sketch below)
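
A tiny C example showing both kinds of locality (the names are arbitrary): the loop variable and the accumulator are reused every iteration, and the loop body instructions are re-fetched repeatedly (temporal locality), while `a[i]` walks consecutive addresses (spatial locality):

```c
#include <stdio.h>

#define N 1024

int main(void) {
    int a[N];
    for (int i = 0; i < N; i++)   /* i, N reused every iteration -> temporal locality */
        a[i] = i;

    long sum = 0;
    for (int i = 0; i < N; i++)   /* loop instructions re-fetched -> temporal locality */
        sum += a[i];              /* a[0], a[1], ...: consecutive addresses -> spatial locality */

    printf("sum = %ld\n", sum);
    return 0;
}
```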

Taking advantage of this,

  1. Store everything on disk.
  2. Copy recently accessed (and nearby, for spatial locality) items from disk to a smaller DRAM memory
    • = main memory
  3. Copy the most recently accessed (and nearby) items from DRAM to an even smaller SRAM memory
    • = cache memory attached to the CPU

This layered copying is the memory hierarchy.

Access Hits and Misses

  • Block (line): The unit of data present in a cache.
  • Hit: the accessed data is present in the upper level (the memory access is found in that level of the memory hierarchy).
    • Hit ratio: hits / accesses.
  • Miss: the accessed data is absent from the upper level (the memory access is not found in that level of the memory hierarchy).
    • => Copy the block from the lower level into the upper level, then serve the access from the upper level.
    • Miss penalty: the time this takes, i.e. the access to the lower level plus the access to the upper level.
    • Miss ratio: misses / accesses
      • = 1 - hit ratio (see the numeric sketch after this list)
  • The accessed data is then supplied from the upper level.
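
A quick numeric sketch of the ratios above, combined into the usual average-memory-access-time formula (AMAT = hit time + miss ratio × miss penalty); the counts and latencies are made-up example values, not measurements:

```c
#include <stdio.h>

int main(void) {
    long accesses = 1000, misses = 40;           /* example counts, not measured data */
    double hit_time = 1.0, miss_penalty = 100.0; /* in CPU cycles; illustrative values */

    double miss_ratio = (double)misses / accesses;  /* misses / accesses */
    double hit_ratio  = 1.0 - miss_ratio;           /* = hits / accesses */

    /* Every access pays the hit time; a miss additionally pays the miss penalty
     * (the time to fetch the block from the lower level). */
    double amat = hit_time + miss_ratio * miss_penalty;

    printf("hit ratio = %.2f, miss ratio = %.2f, AMAT = %.1f cycles\n",
           hit_ratio, miss_ratio, amat);
    return 0;
}
```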

Caches

Cache memory: the level of the memory hierarchy closest to the CPU (not counting the registers).

Given an access, we have 2 questions:

  • Where do we look?
  • How do we know whether the data is present?

Cache organization

(My own summary:) a cache consists of a Tag array and a Data array, and the address is decomposed into fields in order to search them (see the sketch below).

  • Address composition (fields)
    • tag
    • (cache) index
    • block offset
    • word/byte offset
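
A minimal sketch of how those fields could be carved out of an address, assuming an illustrative geometry (direct-mapped, 64-byte blocks, 256 indexes); the constants are assumptions, not a real cache configuration. The index says where to look, and the stored tag is compared against the address tag to know whether the data is present:

```c
#include <stdio.h>
#include <stdint.h>

/* Assumed cache geometry, only for illustration. */
#define BLOCK_BYTES 64    /* block (line) size -> 6 offset bits */
#define NUM_SETS    256   /* number of indexes -> 8 index bits  */
#define OFFSET_BITS 6
#define INDEX_BITS  8

int main(void) {
    uint32_t addr = 0x12345678;

    uint32_t offset = addr & (BLOCK_BYTES - 1);               /* byte within the block       */
    uint32_t index  = (addr >> OFFSET_BITS) & (NUM_SETS - 1); /* which row of the arrays     */
    uint32_t tag    = addr >> (OFFSET_BITS + INDEX_BITS);     /* compared to detect a hit    */

    printf("addr 0x%08x -> tag 0x%x, index %u, offset %u\n", addr, tag, index, offset);
    return 0;
}
```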


5.2. Memory Technologies

5.3. The Basics of Caches

5.4. Measuring and Improving Cache Performance

5.5. Dependable Memory Hierarchy