Memory System

Basic Concepts

The maximum size of the memory that can be used in any computer is determined by the addressing scheme.

16-bit addresses = 2^16 = 64K memory locations

Most modern computers are byte addressable.

Big-endian & Little-endian Assignment

Traditional Architecture

Figure . Connection of the memory to the processor.

Some Basic Concepts

“Block transfer” – bulk data transfer

Memory access time

Memory cycle time

RAM – any location can be accessed for a Read or Write operation in some fixed amount of time that is independent of the location’s address.

Cache memory

Virtual memory, memory management unit

Internal Organization of Memory Chips

Figure :Organization of bit cells in a memory chip.

Semiconductor RAM Memories

A Memory Chip

Figure . Organization of a 1K × 1 memory chip.

Static Memories

•The circuits are capable of retaining their state as long as power is applied.

Figure : A static RAM cell.

• CMOS cell: low power consumption

Figure: An example of a CMOS memory cell.

Asynchronous DRAMs

Static RAMs are fast, but their cells take more chip area, which makes them more expensive. Dynamic RAMs (DRAMs) are cheap and area-efficient, but they cannot retain their state indefinitely and need to be refreshed periodically.

Figure : A single-transistor dynamic memory cell

A Dynamic Memory Chip

Figure: Internal organization of a 2M × 8 dynamic memory chip.

Synchronous DRAMs

•The operations of SDRAM are controlled by a clock signal.

Figure : Synchronous DRAM.

Figure : Burst read of length 4 in an SDRAM.

• No CAS pulses are needed in burst operation.

• Refresh circuits are included (a typical refresh period is 64 ms).

• Clock frequency > 100 MHz

• Intel PC100 and PC133

• The choice of a RAM chip for a given application depends on several factors: cost, speed, power, size…

• SRAMs are faster and more expensive, and offer smaller capacity.
• DRAMs are slower and cheaper, and offer larger capacity.
• Which one for cache and main memory, respectively?

• Refresh overhead: suppose an SDRAM has its cells arranged in 8K (8192) rows and 4 clock cycles are needed to access each row. Refreshing all rows then takes 8192 × 4 = 32,768 cycles. At a clock rate of 133 MHz this takes 32,768 / (133 × 10^6) = 246 × 10^-6 seconds, or 0.246 ms. With a typical refresh period of 64 ms, the refresh overhead is 0.246 / 64 = 0.0038, which is less than 0.4% of the total time available for accessing the memory.
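The refresh-overhead arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `refresh_overhead` and its parameter names are made up for illustration, and the numbers are the ones used in the example (8192 rows, 4 cycles per row, 133 MHz, 64 ms).

```python
def refresh_overhead(rows, cycles_per_row, clock_hz, refresh_period_s):
    """Return (cycles, seconds, fraction of time) spent refreshing all rows once."""
    refresh_cycles = rows * cycles_per_row
    refresh_time_s = refresh_cycles / clock_hz
    fraction = refresh_time_s / refresh_period_s
    return refresh_cycles, refresh_time_s, fraction


# Values from the example: 8K rows, 4 cycles per row, 133 MHz clock, 64 ms period.
cycles, time_s, fraction = refresh_overhead(8192, 4, 133e6, 64e-3)
print(cycles)                       # 32768
print(f"{time_s * 1e6:.0f} us")     # 246 us
print(f"{fraction:.4f}")            # 0.0038
```

Changing the clock rate or the row count shows how quickly the overhead grows for larger or slower parts.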

Memory Controller

Figure: Use of a memory controller.

Read-Only-Memory

Figure : A ROM cell.

•Volatile / non-volatile memory

•ROM

•PROM: programmable ROM

•EPROM: erasable, reprogrammable ROM

•EEPROM: can be programmed and erased electrically

Flash Memory

•Similar to EEPROM

•Difference: only possible to write an entire block of cells instead of a single cell

•Low power

•Use in portable equipment

•Implementation of such modules

–Flash cards

–Flash drives

Memory Hierarchy

•Fastest access is to the data held in processor registers. Registers are at the top of the memory hierarchy.

•A relatively small amount of memory can be implemented on the processor chip. This is the processor cache.

•Two levels of cache. Level 1 (L1) cache is on the processor chip. Level 2 (L2) cache is in between main memory and processor.

•Next level is main memory, implemented as SIMMs. Much larger, but much slower than cache memory.

•Next level is magnetic disks. Huge amount of inexpensive storage.

Cache Memory

The cache memory stores a reasonable number of blocks at any given time, but this number is small compared to the total number of blocks in main memory.

When the cache is full, the cache control hardware decides which block should be removed to create space for the new block that contains the referenced word.

The collection of rules for making this decision is called the replacement algorithm.

The cache control circuit determines whether the requested word currently exists in the cache.

If the data is in the cache, a Read hit or Write hit occurs.

If the data is not present in the cache, a Read miss or Write miss occurs.

Cache Memories

•Effectiveness of cache is based on a property of computer programs called locality of reference.

•Most of a program's time is spent in loops or in procedures that are called repeatedly. The remainder of the program is accessed infrequently.

•Temporal locality – a recently executed instruction is likely to be executed again soon.

•Spatial locality – instructions in close proximity to a recently executed instruction are likely to be executed soon.

•Based on locality of reference
–Temporal
•Recently executed instructions are likely to be executed again soon.
–Spatial
•Instructions in close proximity to a recently executed instruction (with respect to an address) are also likely to be executed soon.
•Cache Block – a set of contiguous address locations (cache block = cache line)

Conceptual Operation of Cache

•Memory control circuitry is designed to take advantage of locality of reference.
•Temporal
–Whenever an item of information (instruction or data) is first needed, it should be brought into the cache, where it will hopefully remain until it is needed again.
•Spatial
–Instead of fetching just one item from the main memory to the cache, it is useful to fetch several items that reside at adjacent addresses as well.
•A set of contiguous addresses is called a block
–cache block or cache line
•Write through Protocol
–Cache and main memory are updated simultaneously
•Write Back Protocol
–Only the cache location is updated, and it is marked with an associated flag bit (dirty or modified bit)
–Main memory is updated later, when the block containing this marked word is to be removed from cache to make room for a new block.

Write Protocols

•Write through
–Simpler, but results in unnecessary Write operations in main memory when a cache word is updated several times during its cache residency.
•Write back
–Can also result in unnecessary Write operations, because when a cache block is written back to the memory all words of the block are written back, even if only a single word has been changed while the block was in the cache.
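The trade-off between the two write protocols can be illustrated by counting the words written to main memory while a single block is resident in the cache. This is a simplified sketch, not a cache simulator: the function name `words_written_to_memory` is hypothetical, the block size of 16 words is an assumption, and the block is assumed to be evicted exactly once after the writes.

```python
def words_written_to_memory(policy, writes_to_block, words_per_block=16):
    """Count words written to main memory during one residency of a cache block."""
    if policy == "write-through":
        # Every processor write updates the cache and main memory together.
        return writes_to_block
    if policy == "write-back":
        # The first write sets the dirty bit; the whole block is written
        # back once, when it is removed from the cache.
        return words_per_block if writes_to_block > 0 else 0
    raise ValueError(f"unknown policy: {policy}")


for writes in (1, 100):
    print(writes,
          words_written_to_memory("write-through", writes),
          words_written_to_memory("write-back", writes))
# prints "1 1 16" and then "100 100 16"
```

With a single write, write-back moves 16 words where write-through moves one. With 100 writes to the same block, write-through performs 100 memory writes where write-back still performs 16.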

Mapping Algorithms

•Processor does not need to know explicitly that there is a cache.
•Based on R/W operations, the cache control circuitry determines whether the requested word currently exists in the cache. (Hit)
•If information is in cache for a read, main memory is not involved. For write operations, system can either use write-through protocol or write-back protocol

Mapping Functions

•Specification of correspondence between the main memory blocks and those in cache.
•Hit or Miss

–Write through Protocol

–Write back protocol (uses dirty bit)

–Read miss

–Load through or early restart on read miss

–Write Miss

Read Protocols

•Read miss
–Addressed word is not in cache

–Block of words containing requested word is written from main memory to cache.

–After entire block is written to cache, particular word is forwarded to processor.

–Alternatively, the word may be sent to the processor as soon as it is read from main memory (load-through or early restart). This reduces the processor's wait time but requires more complex circuitry.

Write Miss

•If addressed word is not in cache for a write operation, write miss occurs.
•write-through
– information is written directly into main memory.
•Write-back
– block containing word is brought into cache, then the desired word in the cache is overwritten with the new information.

Mapping Function

•Direct Mapping

•Associative Mapping

•Set-Associative Mapping

Direct Mapping

•Block j of the main memory maps to block j modulo 128 of the cache.

•More than one memory block is mapped onto the same position in the cache.

•Resolve the contention by allowing new block to replace the old block

•Memory address is divided into three fields:

- Low order 4 bits determine one of the 16 words in a block.

- When a new block is brought into the cache, the next 7 bits determine which cache block this new block is placed in.

- High order 5 bits determine which of the possible 32 blocks is currently present in the cache. These are tag bits.

•Simple to implement but not very flexible
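The field widths above imply a 16-bit address, a cache of 128 blocks, and 16 words per block. Under those assumptions, a minimal Python sketch of the address split looks like this (the names `split_direct` and `cache_block_for` are chosen for illustration only):

```python
WORD_BITS = 4    # 16 words per block
BLOCK_BITS = 7   # 128 blocks in the cache
TAG_BITS = 5     # 32 main memory blocks compete for each cache block


def split_direct(address):
    """Split a 16-bit address into (tag, cache block, word) fields."""
    word = address & ((1 << WORD_BITS) - 1)
    block = (address >> WORD_BITS) & ((1 << BLOCK_BITS) - 1)
    tag = address >> (WORD_BITS + BLOCK_BITS)
    return tag, block, word


def cache_block_for(memory_block, cache_blocks=128):
    """Block j of main memory maps to block j modulo 128 of the cache."""
    return memory_block % cache_blocks


print(split_direct(0xABCD))                             # (21, 60, 13)
print([cache_block_for(j) for j in (0, 128, 256, 129)])  # [0, 0, 0, 1]
```

Memory blocks 0, 128 and 256 all land in cache block 0, which is the contention the tag bits are there to resolve.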

Associative Mapping

•Main memory block can be placed into any cache position.

•Memory address is divided into two fields:

- Low order 4 bits identify the word within a block.

- High order 12 bits or tag bits identify a memory block when it is resident in the cache.

•Flexible, and uses cache space efficiently.

•Replacement algorithms can be used to replace an existing block in the cache when the cache is full.

•Cost is higher than direct-mapped cache
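A small sketch of an associative lookup, assuming the same 16-bit address with a 4-bit word field and a 12-bit tag. The names `split_associative` and `lookup` are hypothetical. Real hardware compares the tag against all cache positions in parallel, which is where the extra cost comes from; the loop below only models the result of that comparison.

```python
WORD_BITS = 4  # 16 words per block; the remaining 12 bits are the tag


def split_associative(address):
    """Split a 16-bit address into (tag, word) fields."""
    return address >> WORD_BITS, address & ((1 << WORD_BITS) - 1)


def lookup(cache_tags, address):
    """Return the cache position holding the addressed block, or None on a miss."""
    tag, _word = split_associative(address)
    for position, stored_tag in enumerate(cache_tags):
        if stored_tag == tag:
            return position
    return None


# A tiny 3-position cache; None marks an empty position.
cache_tags = [0x123, 0xABC, None]
print(split_associative(0xABCD))   # (2748, 13)
print(lookup(cache_tags, 0xABCD))  # 1
print(lookup(cache_tags, 0x5550))  # None
```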

Set-Associative Mapping

•Blocks of cache are grouped into sets.

•Mapping function allows a block of the main memory to reside in any block of a specific set.

•Memory address is divided into three fields:

- Low order 4 bits identify the word within a block.

- 6 bit field determines the set number.

- High order 6 bits are compared to the tag fields of the two blocks in a set.

•Set-associative mapping is a combination of direct and associative mapping.

•Number of blocks per set is a design parameter.

- One extreme is to have all the blocks in one set, requiring no set bits (fully associative mapping).

- Other extreme is to have one block per set, which is the same as direct mapping.
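The relationship between the number of blocks per set and the address fields can be sketched as follows. The sketch assumes a 16-bit address, a 128-block cache, and the same 4-bit word field used in the other mappings; the names `field_widths` and `split_set_associative` are made up for illustration.

```python
def field_widths(blocks_per_set, cache_blocks=128, address_bits=16, word_bits=4):
    """Return (tag bits, set bits, word bits) for a given number of blocks per set."""
    sets = cache_blocks // blocks_per_set
    set_bits = sets.bit_length() - 1  # log2(sets); sets is a power of two here
    tag_bits = address_bits - word_bits - set_bits
    return tag_bits, set_bits, word_bits


def split_set_associative(address, blocks_per_set=2):
    """Split a 16-bit address into (tag, set number, word) fields."""
    _tag_bits, set_bits, word_bits = field_widths(blocks_per_set)
    word = address & ((1 << word_bits) - 1)
    set_number = (address >> word_bits) & ((1 << set_bits) - 1)
    tag = address >> (word_bits + set_bits)
    return tag, set_number, word


print(field_widths(2))                 # (6, 6, 4)   two blocks per set
print(field_widths(1))                 # (5, 7, 4)   same fields as direct mapping
print(field_widths(128))               # (12, 0, 4)  same fields as associative mapping
print(split_set_associative(0xABCD))   # (42, 60, 13)
```

The two extremes reproduce the field layouts of direct mapping and fully associative mapping.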

Replacement Algorithms

•Difficult to determine which blocks to kick out
•Least Recently Used (LRU) block
•The cache controller tracks references to all blocks as computation proceeds.
•Increase / clear track counters when a hit/miss occurs
•For associative and set-associative caches: which location should be emptied when the cache is full and a miss occurs?

– First In First Out (FIFO)

– Least Recently Used (LRU)

•Distinguish an Empty location from a Full one

– Valid Bit

FIFO- Replacement Algorithms

LRU-Replacement Algorithms
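The FIFO and LRU policies can be compared by counting misses on a short sequence of block references. This is a minimal sketch that treats the cache as fully associative with a fixed number of positions. The function names and the reference sequence are invented for illustration, and the Python containers only stand in for the counters that real hardware would keep.

```python
from collections import OrderedDict, deque


def count_misses_fifo(references, capacity):
    """FIFO: on a miss with a full cache, evict the block that was loaded first."""
    resident = deque()
    misses = 0
    for block in references:
        if block in resident:
            continue  # hit: FIFO order is not affected by hits
        misses += 1
        if len(resident) == capacity:
            resident.popleft()
        resident.append(block)
    return misses


def count_misses_lru(references, capacity):
    """LRU: on a miss with a full cache, evict the least recently used block."""
    resident = OrderedDict()
    misses = 0
    for block in references:
        if block in resident:
            resident.move_to_end(block)  # hit: mark as most recently used
            continue
        misses += 1
        if len(resident) == capacity:
            resident.popitem(last=False)  # drop the least recently used block
        resident[block] = True
    return misses


references = [1, 2, 3, 1, 4, 1, 2]
print(count_misses_fifo(references, capacity=3))  # 6
print(count_misses_lru(references, capacity=3))   # 5
```

On this sequence FIFO evicts block 1 just before it is needed again, while LRU keeps it because it was referenced recently.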

Consecutive words in a module

Figure : Consecutive words in a module

Consecutive words are placed in a module.

High-order k bits of a memory address determine the module.

Low-order m bits of a memory address determine the word within a module.

When a block of words is transferred from main memory to cache, only one module is busy at a time.

Figure : Consecutive words in consecutive modules

Consecutive words in Consecutive modules

• Consecutive words are located in consecutive modules.

• Consecutive addresses can be located in consecutive modules.

• While transferring a block of data, several memory modules can be kept busy at the same time.
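The two address layouts can be compared with a short sketch. For the first scheme the high-order k bits select the module, as stated above. For the second scheme the sketch assumes the usual interleaved arrangement in which the low-order k bits select the module. The function names and the sizes (4 modules of 16 words) are assumptions for illustration.

```python
def consecutive_in_module(address, module_bits, word_bits):
    """High-order bits select the module; low-order bits select the word."""
    module = (address >> word_bits) & ((1 << module_bits) - 1)
    word = address & ((1 << word_bits) - 1)
    return module, word


def consecutive_in_consecutive_modules(address, module_bits, word_bits):
    """Low-order bits select the module; high-order bits select the word."""
    module = address & ((1 << module_bits) - 1)
    word = (address >> module_bits) & ((1 << word_bits) - 1)
    return module, word


# Four consecutive addresses, 4 modules (k = 2) of 16 words each (m = 4).
block = range(4)
print([consecutive_in_module(a, 2, 4)[0] for a in block])               # [0, 0, 0, 0]
print([consecutive_in_consecutive_modules(a, 2, 4)[0] for a in block])  # [0, 1, 2, 3]
```

In the first layout a block transfer keeps hitting the same module; in the second, the four words come from four different modules, which can therefore work at the same time.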

Virtual Memory

Techniques that automatically move program and data blocks into the physical main memory when they are required for execution are called virtual-memory techniques.

The binary addresses that the processor issues for either instructions or data are called virtual or logical addresses.

When the desired data are in the main memory, they are accessed immediately.

If the data are not in the main memory, the MMU (memory management unit) causes the operating system to bring the data into memory from the disk.

Assume that program and data are composed of fixed-length units called pages.
A page consists of a block of words that occupy contiguous locations in the main memory.
A page is the basic unit of information that is transferred between secondary storage and main memory.
The size of a page commonly ranges from 2K to 16K bytes.
Pages should not be too small, because the access time of a secondary storage device is much larger than that of the main memory.
Pages should not be too large, else a large portion of the page may not be used, and it will occupy valuable space in the main memory.
Each virtual or logical address generated by a processor is interpreted as a virtual page number (high-order bits) plus an offset (low-order bits) that specifies the location of a particular byte within that page.
Information about the main memory location of each page is kept in the page table:
   -Main memory address where the page is stored.
   -Current status of the page.
An area of the main memory that can hold a page is called a page frame.
The starting address of the page table is kept in a page table base register.
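The translation from a virtual address to a physical address can be sketched as below. The sketch assumes 4K-byte pages (within the 2K to 16K range mentioned above) and models the page table as a Python dictionary from virtual page number to page frame number, with a missing entry standing for a page that is not in main memory. The names `translate` and `PageFault` are hypothetical.

```python
OFFSET_BITS = 12               # assumed page size: 2^12 = 4K bytes
PAGE_SIZE = 1 << OFFSET_BITS


class PageFault(Exception):
    """Raised when the page is not in main memory and must be brought in from disk."""


def translate(virtual_address, page_table):
    """Translate a virtual address using a {virtual page: page frame} table."""
    page = virtual_address >> OFFSET_BITS       # high-order bits
    offset = virtual_address & (PAGE_SIZE - 1)  # low-order bits
    frame = page_table.get(page)
    if frame is None:
        raise PageFault(page)
    return (frame << OFFSET_BITS) | offset


page_table = {0: 5, 1: 2}  # virtual page 0 is in frame 5, page 1 in frame 2
print(hex(translate(0x1ABC, page_table)))  # 0x2abc
```

The offset passes through unchanged; only the page number is replaced by the frame number. An access to a page without an entry, such as `translate(0x3000, page_table)`, raises `PageFault`, which corresponds to the operating system being asked to load the page.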

Virtual address from processor

Associative-mapped TLB

• High-order bits of the virtual address generated by the processor select the virtual page.

•These bits are compared to the virtual page numbers in the TLB (translation lookaside buffer).

•If there is a match, a hit occurs and the corresponding address of the page frame is read.

•If there is no match, a miss occurs and the page table within the main memory must be consulted.

•Set-associative mapped TLBs are found in commercial processors.
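The hit and miss paths of a TLB lookup can be sketched as follows. This is a simplified model: the TLB and the page table are both Python dictionaries from virtual page number to page frame number, 4K-byte pages are assumed, the function name `translate_with_tlb` is hypothetical, and copying the entry into the TLB after a miss is the usual behaviour rather than something stated above. TLB capacity and replacement are not modeled.

```python
OFFSET_BITS = 12  # assumed page size: 4K bytes


def translate_with_tlb(virtual_address, tlb, page_table):
    """Return (physical address, hit) for a virtual address."""
    page = virtual_address >> OFFSET_BITS
    offset = virtual_address & ((1 << OFFSET_BITS) - 1)
    if page in tlb:
        frame, hit = tlb[page], True          # TLB hit
    else:
        frame, hit = page_table[page], False  # TLB miss: consult the page table
        tlb[page] = frame                     # assumed: keep the entry for next time
    return (frame << OFFSET_BITS) | offset, hit


page_table = {0: 5, 1: 2}
tlb = {}
first = translate_with_tlb(0x1ABC, tlb, page_table)
second = translate_with_tlb(0x1010, tlb, page_table)
print(hex(first[0]), first[1])    # 0x2abc False
print(hex(second[0]), second[1])  # 0x2010 True
```

The second access to the same page is served from the TLB without touching the page table in main memory.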
