
This lecture…


Presentation Transcript


  1. This lecture… • Options for managing memory: • Paging • Segmentation • Multi-level translation • Paged page tables • Inverted page tables • Comparison among options

  2. Hardware Translation Overview • Think of memory in two ways: • View from the CPU – what program sees, virtual memory • View from memory – physical memory • Translation implemented in hardware; controlled in software. • There are many kinds of hardware translation schemes. • Start with the simplest! [Figure: the CPU sends a virtual address to the translation box (MMU), which sends a physical address to physical memory; data reads and writes are untranslated.]

  3. Base and Bounds • Each program loaded into contiguous regions of physical memory, but with protection between programs. • First built in the Cray-1. • relocation: physical addr = virtual addr + base register • protection: check that address falls in (base, base+bound) [Figure: inside the MMU, the CPU's virtual address is compared against bounds (<); if yes, base is added to form the physical address sent to memory; if no, error.]
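
The relocation and protection rules on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a model of any particular hardware; the names `translate_base_bounds` and `ProtectionFault` are made up for the example, and the bounds check is assumed to be `0 <= vaddr < bounds`.

```python
class ProtectionFault(Exception):
    """Raised when a virtual address falls outside the bounds register."""


def translate_base_bounds(vaddr: int, base: int, bounds: int) -> int:
    """Translate a virtual address using one base/bounds register pair."""
    # Protection: the virtual address must lie in [0, bounds).
    if not 0 <= vaddr < bounds:
        raise ProtectionFault(f"address {vaddr:#x} outside bounds {bounds:#x}")
    # Relocation: physical address = virtual address + base register.
    return vaddr + base
```

Because the program only ever sees virtual addresses, the OS can copy the region elsewhere and change `base` without the program noticing.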

  4. Base and Bounds • Program has illusion it is running on its own dedicated machine, with memory starting at 0 and going up to size = bounds. • Like linker-loader, program gets contiguous region of memory. • But unlike linker-loader, protection: program can only touch locations in physical memory between base and base + bounds. [Figure: virtual memory (code, data, stack) runs from 0 to bound and maps to physical memory from 6250 to 6250 + bound.]

  5. Base and Bounds • Provides level of indirection: OS can move bits around behind the program’s back, for instance, if program needs to grow beyond its bounds, or if need to coalesce fragments of memory. • Stop program, copy bits, change base and bounds registers, restart. • Only the OS gets to change the base and bounds! Clearly, user program can’t, or else lose protection.

  6. Base and Bounds • With base&bounds system, what gets saved/restored on a context switch? • Everything from before + base/limit values • Complete contents of memory out to disk (Called “Swapping”) • Hardware cost: • 2 registers, Adder, Comparator • Plus, slows down hardware because need to take time to do add/compare on every memory reference.

  7. Base and bound tradeoffs • Pros: • Simple, fast • Cons: • Hard to share between programs • For example, suppose two copies of “vi” • Want to share code • Want data and stack to be different • Can’t do this with base and bounds! • Complex memory allocation • Doesn’t allow heap, stack to grow dynamically – want to put these as far apart as possible in virtual memory, so that they can grow to whatever size is needed.

  8. Base and bound: Cons (complex allocation) • Variable-sized partitions • Hole – block of available memory; holes of various size are scattered throughout memory. • New process allocated memory from hole large enough to fit it • Operating system maintains information about: a) allocated partitions b) free partitions (holes) [Figure: snapshots of memory holding the OS and processes 5, 8, and 2, labelled with the events "8 done", "9 arrive", "10 arrive", and "5 done".]

  9. Dynamic Storage-Allocation Problem • How to satisfy a request of size n from a list of free holes? • First-fit: Allocate the first hole that is big enough. • Best-fit: Allocate the smallest hole that is big enough; must search entire list, unless ordered by size. Produces the smallest leftover hole. • Worst-fit: Allocate the largest hole; must also search entire list. Produces the largest leftover hole. • First-fit and best-fit better than worst-fit in terms of speed and storage utilization. • All three are particularly bad if we want an address space to grow dynamically (e.g., the heap).
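
The three placement policies differ only in which qualifying hole they pick. The sketch below assumes the free list is a Python list of `(start, size)` tuples and returns the index of the chosen hole; the function names and the sample hole sizes are illustrative, not from the slides.

```python
from typing import List, Optional, Tuple

Hole = Tuple[int, int]  # (start address, size in bytes)


def first_fit(holes: List[Hole], n: int) -> Optional[int]:
    """Index of the first hole with size >= n, or None."""
    for i, (_, size) in enumerate(holes):
        if size >= n:
            return i
    return None


def best_fit(holes: List[Hole], n: int) -> Optional[int]:
    """Index of the smallest hole with size >= n, or None."""
    candidates = [(size, i) for i, (_, size) in enumerate(holes) if size >= n]
    return min(candidates)[1] if candidates else None


def worst_fit(holes: List[Hole], n: int) -> Optional[int]:
    """Index of the largest hole with size >= n, or None."""
    candidates = [(size, i) for i, (_, size) in enumerate(holes) if size >= n]
    return max(candidates)[1] if candidates else None


# Hypothetical free list: hole sizes 100, 500, 200, 300, 600.
holes = [(0, 100), (200, 500), (800, 200), (1100, 300), (1500, 600)]
```

For a request of 212 bytes against this list, first-fit picks the 500-byte hole, best-fit the 300-byte hole, and worst-fit the 600-byte hole. Best-fit and worst-fit scan the whole list here because it is not kept sorted by size.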

  10. Internal Fragmentation • Internal Fragmentation – allocated memory may be slightly larger than requested memory; the extra space is allocated but not used. [Figure: process 4 requests 18,462 bytes and is given a hole of 18,464 bytes, leaving an internal fragment of 2 bytes.]

  11. External Fragmentation • External Fragmentation - total memory space exists to satisfy request but it is not contiguous • 50-percent rule: one-third of memory may be unusable. • Given N allocated blocks, another 0.5N blocks will be lost due to fragmentation. [Figure: memory holding the OS and processes 3, 8, and 2, with holes of 50k and 100k; process 9 (125k) cannot be placed.]

  12. Compaction • Shuffle memory contents to place all free memory together in one large block • Only if relocation dynamic! • Same I/O DMA problem [Figure: three memory layouts of the OS and processes 3, 8, and 2, showing scattered holes being gathered into one block large enough for process 9 (125k).]

  13. Segmentation • A segment is a region of logically contiguous memory. • Idea is to generalize base and bounds, by allowing a table of base&bound pairs. • Virtual address: <segment-number, offset> • Segment table – maps two-dimensional user defined address into one-dimensional physical address • base - starting physical location • limit - length of segment • Hardware support • Segment Table Base Register • Segment Table Length Register

  14. Segmentation

  15. Segmentation example • Assume 14 bit addresses divided up as: • 2 bit segment ID (1st hex digit), and a 12 bit segment offset (last 3 hex digits). • Segment table:
      Seg  Name   Base    Limit
      0    Code   0x4000  0x700
      1    Data   0x0000  0x500
      2    -      -       -
      3    Stack  0x2000  0x1000
      • Virtual memory regions in use: 0x0000-0x06ff, 0x1000-0x14ff, 0x3000-0x3fff. • Physical memory regions in use: 0x0000-0x04ff, 0x2000-0x2fff, 0x4000-0x46ff. • Where is 0x0240? 0x1108? 0x265c? 0x3002? 0x1600?
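
The questions on this slide can be worked through mechanically. The sketch below hard-codes the slide's segment table and assumes the simplest rule for every segment, including the stack: physical address = base + offset, valid only if offset < limit. The names `translate_segment` and `SegmentFault` are illustrative.

```python
class SegmentFault(Exception):
    """Raised for an unmapped segment or an offset past the segment limit."""


OFFSET_BITS = 12  # 14-bit address = 2-bit segment ID + 12-bit offset

# Segment ID -> (base, limit), copied from the slide's segment table.
SEGMENT_TABLE = {
    0: (0x4000, 0x700),   # code
    1: (0x0000, 0x500),   # data
    3: (0x2000, 0x1000),  # stack (segment 2 is unused)
}


def translate_segment(vaddr: int) -> int:
    """Translate a 14-bit virtual address through the segment table."""
    seg = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    entry = SEGMENT_TABLE.get(seg)
    if entry is None:
        raise SegmentFault(f"segment {seg} is not mapped")
    base, limit = entry
    if offset >= limit:
        raise SegmentFault(f"offset {offset:#x} exceeds limit {limit:#x}")
    return base + offset
```

Under these assumptions, 0x0240 maps to 0x4240, 0x1108 to 0x0108, and 0x3002 to 0x2002, while 0x265c (unused segment 2) and 0x1600 (offset 0x600 past the data limit 0x500) both fault.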

  16. Observations about Segmentation • This should seem a bit strange: the virtual address space has gaps in it! • Each segment gets mapped to contiguous locations in physical memory, but may be gaps between segments. • But a correct program will never address gaps; if it does, trap to kernel and then core dump. • Minor exception: stack, heap can grow. • In UNIX, sbrk() increases size of heap segment. • For stack, just take fault, system automatically increases size of stack.

  17. Observations about Segmentation cont’d • Detail: Need protection mode in segmentation table. • For example, code segment would be read-only (only execution and loads are allowed). • Data and stack segment would be read-write (stores allowed). • What must be saved/restored on context switch? • Typically, segment table stored in CPU, not in memory, because it’s small. • Might store all of a process’s memory onto disk when switched (called “swapping”)

  18. Segment Translation Example • Example: What happens with the segment table shown earlier, with the following as virtual memory contents? Code does: strlen(x); • Initially PC = 240
      Virtual memory:
        Main:   240   store 1108, r2
                244   store pc + 8, r31
                248   jump 360
                24c   …
                …
        Strlen: 360   loadbyte (r2), r3
                …
                420   jump (r31)
                …
        x:      1108  a b c \0
      Physical memory:
        x:      108   666 …
        Main:   4240  store 1108, r2
                4244  store pc + 8, r31
                4248  jump 360
                424c  …
                ...
        Strlen: 4360  loadbyte (r2), r3
                …
                4420  jump (r31)

  19. Segmentation Tradeoffs • Pro: • Efficient for sparse address spaces • Multiple segments per process • Easy to share whole segments (for example, code segment) • Don’t need entire process in memory!!! • Con: • Complex memory allocation • Extra layer of translation speed = hardware support • Still need first fit, best fit, etc., and re-shuffling to coalesce free fragments, if no single free space is big enough for a new segment. • How do we make memory allocation simple and easy?

  20. Paging • Logical address space can be noncontiguous; process is allocated physical memory whenever available. • Divide physical memory into fixed-sized blocks called frames. • Divide logical memory into blocks of same size called pages (page size is power of 2, 512 bytes to 16 MB). • Simpler, because allows use of a bitmap. What’s a bitmap? 001111100000001100 • Each bit represents one page of physical memory – 1 means allocated, 0 means unallocated. • Much simpler than base&bounds or segmentation
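
A bitmap allocator for fixed-size frames is short enough to show in full. This is a minimal sketch using a Python list of 0/1 values rather than packed bits; the class name `FrameBitmap` and its methods are illustrative.

```python
class FrameBitmap:
    """Tracks physical frames: 1 means allocated, 0 means unallocated."""

    def __init__(self, num_frames: int) -> None:
        self.bits = [0] * num_frames

    def allocate(self) -> int:
        """Claim the lowest-numbered free frame and return its number."""
        for frame, bit in enumerate(self.bits):
            if bit == 0:
                self.bits[frame] = 1
                return frame
        raise MemoryError("no free frames")

    def free(self, frame: int) -> None:
        """Mark a frame as unallocated again."""
        self.bits[frame] = 0
```

Since every frame is the same size, any free frame satisfies any request, so there is no first-fit versus best-fit decision to make.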

  21. Address Translation Architecture • Operating system controls mapping: any page of virtual memory can go anywhere in physical memory. [Figure: the CPU issues a virtual address (virtual page # p, offset d); p is checked against the page table size (error if out of range) and used to index the page table located by the PTBR; the resulting physical frame # f is combined with the offset to form the physical address sent to physical memory.]

  22. Paging Example

  23. Paging Example

  24. Paging Tradeoffs • What needs to be saved/restored on a context switch? • Page table pointer and limit • Advantages • no external fragmentation (no compaction) • relocation (now pages, before were processes) • Disadvantages • internal fragmentation • consider: 2048 byte pages, 72,766 byte proc • 35 pages + 1086 bytes = 962 bytes fragment • avg: 1/2 page per process • small pages! • overhead • page table / process (context switch + space) • lookup (especially if page to disk)
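
The internal-fragmentation arithmetic on this slide can be checked with a small helper. The function name `internal_fragmentation` is illustrative; it returns the number of pages needed and the bytes wasted in the last page.

```python
from typing import Tuple


def internal_fragmentation(process_bytes: int, page_bytes: int) -> Tuple[int, int]:
    """Return (pages needed, bytes wasted in the final page)."""
    full_pages, remainder = divmod(process_bytes, page_bytes)
    # A partially filled last page still occupies a whole page.
    pages_needed = full_pages + (1 if remainder else 0)
    wasted = pages_needed * page_bytes - process_bytes
    return pages_needed, wasted
```

For the slide's numbers (2048-byte pages, 72,766-byte process) this gives 36 pages with 962 bytes wasted: 35 full pages plus a last page holding 1086 bytes.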

  25. Free Frames • Frame table: keeps track of which frames are allocated and which are free. Free frames (a) before allocation (b) After allocation

  26. Implementation of Page Table • Page table kept in registers • Fast! • Only good when number of frames is small • Expensive! • Instructions to load or modify the page-table registers are privileged. Registers Memory Disk

  27. Implementation of Page Table • Page table kept in main memory • Page Table Base Register (PTBR) • Page Table Length • Two memory accesses per data/inst access. • Solution? Associative registers or translation look-aside buffers (TLBs). [Figure: the PTBR points to a page table in memory that maps pages 0 and 1 of virtual memory to frames of physical memory.]

  28. Associative Register • Associative memory – parallel search • Address translation (A´, A´´) • If A´ is in associative register, get frame # out. • Otherwise get frame # from page table in memory • TLB full – replace one (LRU, random, etc.) • Address-space identifiers (ASIDs): identifies each process, used for protection, many processes in TLB Page # Frame #

  29. Translation Look-aside Buffer (TLB)

  30. Paging Hardware With TLB 10-20% mem time (Intel P3 has 32 entries) (Intel P4 has 128 entries)

  31. Effective Access Time • Associative lookup = $\varepsilon$ time units • Assume memory cycle time is 1 microsecond • Hit ratio – percentage of times that a page number is found in the associative registers; ratio related to number of associative registers. • Hit ratio = $\alpha$ • Effective Access Time (EAT): $\mathrm{EAT} = (1 + \varepsilon)\alpha + (2 + \varepsilon)(1 - \alpha) = 2 + \varepsilon - \alpha$ • Example: • 80% hit ratio, $\varepsilon$ = 20 nanoseconds, memory access time = 100 nanoseconds • EAT = 0.8 x 120 + 0.20 x 220 = 140 nanoseconds
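
The effective access time is a weighted average of the hit and miss costs. The sketch below generalizes the slide's formula to an arbitrary memory access time; the function and parameter names are illustrative.

```python
def effective_access_time(hit_ratio: float, tlb_time: float, mem_time: float) -> float:
    """Weighted average of TLB-hit and TLB-miss memory access costs."""
    # Hit: one TLB lookup plus one memory access for the data.
    hit_cost = tlb_time + mem_time
    # Miss: TLB lookup, one access for the page table, one for the data.
    miss_cost = tlb_time + 2 * mem_time
    return hit_ratio * hit_cost + (1 - hit_ratio) * miss_cost
```

With the slide's example (hit ratio 0.8, 20 ns lookup, 100 ns memory access) the result is approximately 140 ns, matching 0.8 x 120 + 0.2 x 220. Setting `mem_time` to 1 recovers the closed form 2 + ε – α.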

  32. Memory Protection • Protection bits with each frame • “valid” - page in process’ logical address space • “invalid” - page not in process’ logical address space. • Store in page table • Expand to more perms • 14-bit address space – • 0 to 16,383 • Program’s addresses – • 0 to 10,468 • beyond 10,468 is illegal • Page 5 classified as valid • Due to 2K page size, • internal fragmentation

  33. Multilevel Paging • Most modern operating systems support a very large logical address space ($2^{32}$ or $2^{64}$). • Example • logical address space = 32 bit • suppose page size = 4K bytes ($2^{12}$) • page table = 1 million entries ($2^{32}/2^{12} = 2^{20}$) • each entry is 4 bytes, space required for page table = 4 MB • Do not want to allocate the page table contiguously in main memory. • Solution • divide the page table into smaller pieces (Page the page table)
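
The size of a flat, single-level page table follows directly from the address width and page size. A small helper (the name `flat_page_table_size` is illustrative) reproduces the slide's numbers:

```python
from typing import Tuple


def flat_page_table_size(address_bits: int, page_bytes: int, entry_bytes: int) -> Tuple[int, int]:
    """Return (number of entries, total bytes) for a single-level page table."""
    entries = 2 ** address_bits // page_bytes  # one entry per virtual page
    return entries, entries * entry_bytes
```

For a 32-bit address space with 4 KB pages and 4-byte entries this yields 1,048,576 entries and 4,194,304 bytes (4 MB) per process, which is the motivation for paging the page table.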

  34. Two-Level Paging • The 32-bit virtual address is split into a page number (p1, p2) and a page offset (d):
      Field:  p1   p2   d
      Bits:   10   10   12
      • p1 – index into outer page table • p2 – displacement within the page of the page table • On context-switch: save single PageTablePtr register
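
The 10/10/12 split can be expressed with shifts and masks. In the sketch below the outer page table is modelled as a dict of dicts, which is purely an assumption for illustration (real hardware walks tables in physical memory); the function names are likewise made up.

```python
from typing import Dict, Tuple


def split_two_level(vaddr: int) -> Tuple[int, int, int]:
    """Split a 32-bit virtual address into (p1, p2, d) with 10/10/12 bits."""
    d = vaddr & 0xFFF            # low 12 bits: page offset
    p2 = (vaddr >> 12) & 0x3FF   # next 10 bits: index into inner table
    p1 = (vaddr >> 22) & 0x3FF   # top 10 bits: index into outer table
    return p1, p2, d


def translate_two_level(vaddr: int, outer: Dict[int, Dict[int, int]]) -> int:
    """Walk outer[p1][p2] to find the frame number, then append the offset."""
    p1, p2, d = split_two_level(vaddr)
    inner = outer.get(p1)
    if inner is None:
        raise LookupError(f"no inner page table for p1={p1}")
    frame = inner.get(p2)
    if frame is None:
        raise LookupError(f"page p2={p2} is not mapped")
    return (frame << 12) | d
```

Inner tables exist only for the `p1` values that are actually in use, which is what saves space compared with one flat table.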

  35. Address-Translation Scheme • Address-translation scheme for a two-level 32-bit paging architecture

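  36. Paging + segmentation: best of both? • simple memory allocation, • easy to share memory, and • efficient for sparse address spaces [Figure: the virtual address (virt seg #, virt page #, offset) is translated in two steps: the segment table supplies a page-table base and page-table size; the virtual page # is checked against the size (error if too large) and added to the base to index the page table, which yields the phys frame #; phys frame # and offset form the physical address.]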

  37. Paging + segmentation • Questions: • What must be saved/restored on context switch? • How do we share memory? Can share entire segment, or a single page. • Example: 24 bit virtual addresses = 4 bits of segment #, 8 bits of virtual page #, and 12 bits of offset.
      Segment table:
        Page-table base  Page-table size
        0x2000           0x14
        –                –
        0x1000           0xD
        –                –
      Physical memory (portions of the page tables for the segments):
        at 0x1000: 0x6, 0xb, 0x4, …
        at 0x2000: 0x13, 0x2a, 0x3, …
      • What do the following addresses translate to? 0x002070? 0x201016? 0x14c684? 0x210014?
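
The example addresses can be traced with a short sketch. It assumes the segment table rows are segments 0 to 3 in order (so segments 1 and 3 are unmapped), that the page-table size counts entries, and that each page-table entry is simply a frame number. Only the page-table entries shown on the slide are included; all names are illustrative.

```python
class TranslationFault(Exception):
    """Raised for an unmapped segment or a page number past the table size."""


# Segment # -> (page-table base, page-table size), per the slide.
SEGMENT_TABLE = {0: (0x2000, 0x14), 2: (0x1000, 0xD)}

# Page-table base -> the first few entries shown on the slide.
PAGE_TABLES = {0x1000: [0x6, 0xB, 0x4], 0x2000: [0x13, 0x2A, 0x3]}


def translate_paged_segment(vaddr: int) -> int:
    """Translate a 24-bit address: 4-bit segment, 8-bit page, 12-bit offset."""
    seg = (vaddr >> 20) & 0xF
    page = (vaddr >> 12) & 0xFF
    offset = vaddr & 0xFFF
    entry = SEGMENT_TABLE.get(seg)
    if entry is None:
        raise TranslationFault(f"segment {seg} is not mapped")
    pt_base, pt_size = entry
    if page >= pt_size:
        raise TranslationFault(f"page {page:#x} exceeds table size {pt_size:#x}")
    shown = PAGE_TABLES[pt_base]
    if page >= len(shown):
        # Valid page, but its entry is elided ("…") on the slide.
        raise LookupError(f"entry {page:#x} is not shown on the slide")
    return (shown[page] << 12) | offset
```

Under these assumptions 0x002070 translates to 0x003070 and 0x201016 to 0x00b016, while 0x14c684 faults (segment 1 is unmapped) and 0x210014 faults (page 0x10 is past the table size 0xD).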

  38. Multilevel translation • What must be saved/restored on context switch? • Contents of top-level segment registers (for this example) • Pointer to top-level table (page table) • Pro: • Only need to allocate as many page table entries as we need. • In other words, sparse address spaces are easy. • Easy memory allocation • Share at segment or page level (need additional reference counting) • Cons: • Pointer per page (typically 4KB - 16KB pages today) • Page tables need to be contiguous • Two (or more, if > 2 levels) lookups per memory reference

  39. Hashed Page Tables • What is an efficient data structure for doing lookups? Hash table. • Why not use a hash table to translate from virtual address to a physical address. • Common in address spaces > 32 bits. • Each entry in the hash table contains a linked list of elements that hash to the same location (to handle collisions). • Take virtual page #, run hash function on it, index into hash table to find page table entry with physical page frame #.
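
A hashed page table with chained collision handling might look like the sketch below. The hash function is assumed to be a plain modulo on the virtual page number, and Python lists stand in for the linked lists; the class and method names are illustrative.

```python
from typing import List, Optional, Tuple


class HashedPageTable:
    """Maps virtual page numbers to frame numbers via hash buckets with chains."""

    def __init__(self, num_buckets: int = 16) -> None:
        self.buckets: List[List[Tuple[int, int]]] = [[] for _ in range(num_buckets)]

    def _chain(self, vpn: int) -> List[Tuple[int, int]]:
        # Assumed hash function: virtual page number modulo bucket count.
        return self.buckets[vpn % len(self.buckets)]

    def map(self, vpn: int, frame: int) -> None:
        """Insert or update the mapping vpn -> frame."""
        chain = self._chain(vpn)
        for i, (existing_vpn, _) in enumerate(chain):
            if existing_vpn == vpn:
                chain[i] = (vpn, frame)
                return
        chain.append((vpn, frame))

    def lookup(self, vpn: int) -> Optional[int]:
        """Walk the chain for vpn's bucket; None means the page is unmapped."""
        for existing_vpn, frame in self._chain(vpn):
            if existing_vpn == vpn:
                return frame
        return None
```

Space grows with the number of mapped pages rather than with the size of the virtual address space; lookup cost depends on chain length, so the bucket count matters.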

  40. Hashed Page Table

  41. Hashed Page Table • Independent of size of address space, • Pro: • O(1) lookup to do translation • Requires page table space proportional to how many pages are actually being used, not proportional to size of address space – with 64 bit address spaces, this is a big win! • Con: • Overhead of managing hash chains, etc. • Clustered Page Tables • Each entry in the hash table refers to several pages (such as 16) rather than a single page.

  42. Inverted Page Table • One entry for each real (physical) page of memory. • Entry consists of the virtual address of the page stored in that real memory location, with information about the process that owns that page. • Address-space identifier (ASID) stored in each entry maps logical page for a particular process to the corresponding physical page frame.

  43. Inverted Page Table Architecture

  44. Inverted Page Table • Pro: • Decreases memory needed to store each page table • Con: • increases time needed to search the table • Use hash table to limit the search • One virtual memory reference requires at least two real memory reads: one for the hash table entry and one for the page table. • Associative registers can be used to improve performance.
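
An inverted page table has one entry per physical frame, each recording which (ASID, virtual page) lives there. The sketch below shows both the slow linear search and a hash index that limits the search; a Python dict stands in for the hash table, and all names are illustrative.

```python
from typing import Dict, List, Optional, Tuple

Owner = Tuple[int, int]  # (ASID, virtual page number)


class InvertedPageTable:
    """One entry per physical frame, plus a hash index to speed up lookups."""

    def __init__(self, num_frames: int) -> None:
        self.frames: List[Optional[Owner]] = [None] * num_frames
        self.index: Dict[Owner, int] = {}

    def map(self, asid: int, vpn: int, frame: int) -> None:
        """Record that frame now holds page vpn of address space asid."""
        key = (asid, vpn)
        # If this page was mapped elsewhere, release its old frame entry.
        old_frame = self.index.pop(key, None)
        if old_frame is not None:
            self.frames[old_frame] = None
        # If the target frame held another page, drop that page's index entry.
        evicted = self.frames[frame]
        if evicted is not None:
            del self.index[evicted]
        self.frames[frame] = key
        self.index[key] = frame

    def lookup_linear(self, asid: int, vpn: int) -> Optional[int]:
        """Search every frame entry: time proportional to physical memory size."""
        for frame, owner in enumerate(self.frames):
            if owner == (asid, vpn):
                return frame
        return None

    def lookup_hashed(self, asid: int, vpn: int) -> Optional[int]:
        """Use the hash index to go straight to the candidate frame."""
        return self.index.get((asid, vpn))
```

Including the ASID in the key lets two processes use the same virtual page number without colliding, and the table's size depends only on the number of physical frames.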
