Kinesia Online Course
Operating Systems
Kinesia LLC, 2003

  1. Introduction
  2. Processes
  3. Inter Process Communication
  4. Deadlocks
  5. Memory Management
 
  1. File Systems
  2. Protection and Security
  3. I/O Systems


The fruit of silence is PRAYER. The fruit of prayer is
FAITH. The fruit of faith is LOVE. The fruit of love
is SERVICE.  The fruit of service is PEACE.

				MOTHER TERESA
Memory Management
  1. Introduction

    Memory Management without swaping or paging:

  2. Monoprogramming
    e.g. MS-DOS, running one process at a time

    Multiprogramming
    Advantages:

    • easier to program by splitting into 2 or more processes
    • process spend time to wait for I/O to complete

      Example:

      A process spends fraction p of its time in I/O wait.

      if n process in memory

      CPU is not utilized if all processes are waiting

        Probability Pwait = pn

        Thus CPU utilization α = 1 - Pwait = 1 - pn

      n -- degree of multiprogramming

      e.g p = 1/2

      n &alpha
      1 1/2 ( 50% )
      2 3/4 ( 75% )
      3 7/8 ( 87.5% )
      .. ..
      10 99.9%

  3. Swapping

    moving processes from main memory to disk and back

    variable partition

    A
    B
    C
     
    D
    hole
    B
    E
    hole
    external fragmentation

    memory compaction -- combine all holes into one big hole ( it may require a lot of time; mainframe has special hardware to handle compaction )

    consider a hole ~ 10244 bytes

    next process requires 10242 bytes

    2 bytes left, overhead to keep track of the 2 bytes is much larger than 2 bytes

    so better allocate the 2 bytes to the process → internal fragmentation

    Memory allocation with swapping

    Use a linked list to handle dynamic storage allocation

    A set of holes with various sizes scattered throughout memory

  4. first fit -- allocate the first hole that is big enough; broken up into 2 pieces, one for process and one for unused

  5. next fit -- starts searching from where it is left off

  6. best fit -- allocate the smallest hole that is big enough => smallest leftover hole

  7. worst fit -- allocate largest hole => largest leftover hole

  8. quick fit -- maintains separate lists of holes of similar sizes
  9. Virtual Memory

  10. Paging

    • old days, program split into pieces -- overlay, overlay0, overlay1 swapped by OS but program split was done by programmers
    • modern computers have special hardware called a memory management unit (MMU). Whenever the CPU wants to access memory (whether it is to load an instruction or load or store data), it sends the desired memory address to the MMU, which translates it to another address before passing it on the the memory unit.
    • address generated by the CPU -- virtual address,
    • address translated to by the MMU -- physical address
    • virutal address space is broken up into equal-sized units -- pages
    • the corresponding units in physical memory is called page frames
    • page size = page frame size
    • page tables
    • logical address ( virtual address ) is mapped into physical address

      Example

      page     64K virtual
                    address space
       32K main mem   page frame
      0 0
      4K
      2
      1  
      8 K
      1
      2  
      12 K
      6
      3  
      16 K
      0
      4  
      20 K
      4
      5  
      24 K
      3
      6  
      28 K
      X
      7  
      32 K
      X
      8  
      36 K
      X
      9  
      40 K
      5
      10  
      44 K
      X
      11  
      48 K
      7
      12  
      52 K
      X
      13  
      56 K
      X
      14  
      60 K
      X
      15  
      64 K
      X
       
        0
      4K  
      0
         
      8 K
      1
         
      12 K
      2
         
      16 K
      3
         
      20 K
      4
         
      24 K
      5
         
      28 K
      6
         
      32 K
      7

    • MOV register, [ 36K + 10 ]
        page table => page frame 5

        Thus, MMU maps the address to 20K + 10 and put it on the bus

    • MOV register, [52K]
        page fault,

        CPU generates a trap
        put a page frame back on disk
        and put required page to main memory

      e.g. want to access location 8196

      8196 = 8192 + 4 = 8K + 4     so it is in page 2

      Virtual Address
      page # 4K address
      0   0   1   0 0 . . .              . . . 0 1 0 0
      15  0

      page
      #
      page
      frame
      0 0 1 0 1
      1 0 0 1 1
      2 1 1 0 1
      3 0 0 0 1
      4 1 0 0 1
      5 0 1 1 1
      6 0 0 0 0
      7 0 0 0 0
      8 0 0 0 0
      9 1 0 1 1
      10 0 0 0 0
      11 1 1 1 1
      12 0 0 0 0
      13 0 0 0 0
      14 0 0 0 0
      15 0 0 0 0

      valid/invalid
       
       
       
       
       
       
       
       
      1 1 0 0 ... ... 0 1 0 0

    • The tables used by the MMU have a valid bit for each page in the virtual address space. If this bit is set, the translation of virtual addresses on a page proceeds as normal. If it is clear, any attempt by the CPU to access an address on the page generates an interrupt called a page fault trap
    • Heavily used programs such as assembler, compiler, data base systems and so on can be shared among different users. The only condition for it is the code must be reentrant. It is crucial the correct functionality of shared paging scheme that the pages are unchanged. If one user were to change a location, it would be changed for all other users.

  11. Segmentation

    In paging, user's view of memory are separated from actual physical memory

    user's view of memory -- a collection of variable-sized segments, e.g. stacks, data, symbol table, main program, functions, ...

    Segmentation is a memory-management scheme that supports its user view of memory; a logical memory is a collection of segments:

    • A logical address consists of a segment number s and an offset d.
    • The segment number s is used as an index into the segment table.
    • Each entry in the segment table has a segment base, which points where the physical memory begins and a segment limit which points where physical memory ends. Therefore, the offset d must be between 0 and the limit

    Implementation of segment tables:

    • The segment table can be kept either in fast registers or in system memory.
    • kept in registers: can be very quickly referenced
    • kept in system memory: needed when a program has a large number of segments; a Segment Table Base Register STBR which points to the segment table in memory, and a Segment Table Length Register STLR which limits the possible manageable number of segment are implemented.

  12. Page Replacement Algorithms

    • Optimal Page Replacement

      replace the page that will not be used for the longest period of time

      guarantees lowest page fault but almost impossible to implement

      mainly for comparison

    • Not-Recently-Used ( NRU ) Page Replacement

      associate with each page 2 bits:

      R -- referenced bit
      M -- modified bit ( dirty bit )

      R is set on any read/write to the page

      M is set when the page is written

      classfunctions R M
      0not referenced, not modified     0 0
      1not referenced, modified 0 1
      2referenced, not modified 1 0
      3referenced, modified 1 1

      NRU removes a page at random from the lowest nonempty class

      easy to understand

      efficient to implement

    • First-in First-out ( FIFO ) Replacement

      replace the oldest page

      second chance

      reference bit R
      inspect oldest page
      if R = 0, page is replaced immediately
      if R = 1, then set R → 0 and page is put at end of queue and inspect next oldest page

      Belady's anomaly
      page
      reference
      youngest    oldest page
      fault
      -----
      00--Fault
      110-F
      2210F
      3321F
      0032F
      1103F
      4410F
      0410N
      1410N
      2241F
      3324F
      4324N
      9 faults
      3 page frames, 5 pages
       
      page
      reference
      youngest    oldest page
      fault
      ------
      00---Fault
      110--F
      2210-F
      33210F
      03210N
      13210N
      44321F
      00432F
      11043F
      22104F
      33210F
      44321F
      10 faults
      4 page frames, 5 pages

    • Least-Recently-Used ( LRU ) Replacement

      select the page that has not been used for the longest time

      expensive, need a linked list of all pages in memory

      clock algorithm -- an efficient way to approximate LRU

         uses circular queue and reference bits as mechanisms
      
         current   |  new value
         ref. bit  |  ref. bit       action
        -----------+------------------------------------------------------
                   |
             0     |      0       replace this page, move pointer forward
                   |
             1     |      0       skip this page for now, continue search
      
         
      two-handed clock uses both reference and modified bits
      
          current   |  new value
         ref. mod.  |  ref. mod.       actions
        ------------+------------------------------------------------------
                    |
          0    0    |   0    0      replace this page (when any scheduled
                    |                  write back on it is completed), move
                    |                  pointer forward
                    |
          0    1    |   0    0      clean the page (i.e., write back), skip
                    |                  this page for now, continue search
                    |
          1    0    |   0    0      skip this page for now, continue search
                    |
          1    1    |   0    1      skip this page for now, continue search
         
  13. Design Issues for Paging System

  14. The Working Set Model

    demand paging -- pages are loaded on demand only, not in advance; only the pages that are needed to run the program before the program will be swapped out, will be swapped in; Through this, swapping time and amount of physical memory required can be decreased.

    locality -- processes tend to reference storage in nonuniform highly localized pattern

    • temporal ( time ) locality -- storage location referenced recently are likely to be referenced in near future, e.g.
    • stacks
    • subroutines
    • looping
    • spatial locality -- storage references tend to be clustered so that once a location is referenced, it is highly likely that nearby locations will be referenced e.g.
    • array traversals
    • sequatial code execution

    Working set -- the set of pages that a process is currently using

    Denning

    W( Δ , t ) -- at time t, the set consists of all pages used by most recent memory reference

    Δ -- working set window

    if two many processes are running, page faults may occur in every few instructions; this occurs when the total size of all working set exceeds the number of frames → thrashing

    to avoid thrashing, we can keep track of each process's working set and make sure that it is in memory before letting the process in ( prepaging )
    A process will never be executed unless its working set is resident in main memory. Pages outside the working set may be discarded at any time

    use the recent needs of a process to predict its future needs

    aging bit -- if a page is not referenced for a certain amount of time, it will be dropped from the working set

    Total number of frames D = ∑iWi
    if D > m ( number of frames ), thrashing occurs

  15. Local Users' Global Allocation

    local -- each process process is allocated a fixed amount of memory

    global -- dynamically allocate page frames among the runnable processes

    in general global works better

    figure

  16. Page Size

    on average, half of the final page is wasted ( internal fragmentation )

    if n segments, page size = p bytes,

      n x p / 2   bytes wasted
    this may lead to the conclusion of smaller page size is better

    But small pages => large page table

    suppose

      each table entry requires e bytes

      page size = p bytes

      wasted memory in each page = p / 2

      average process size = s bytes

      number of pages needed per process = s / p

      total table space = s x e / p

      total overhead H = s x e
      p
      + p
      2

      Minimize

      0 = d H
      d p
      = - s x e
      p2
      + 1
      2

      => p = ( 2se ) ½

      if s = 32K, e = 8 bytes, p ~ ( 2 x 32K x 8 ) ½ ~ 724 ( bytes )

      ~ 512, 1 K, 2K, 4K