CIS 415 Operating Systems - Spring 2003

Discussion: Week 10 session 1

Solutions

Memory Management

Virtual memory relies on relocation: any page can be placed in any free frame of memory. Suppose the system does not support relocation, so each page must be loaded back into the frame it originally occupied. A price is paid in memory efficiency because of the "Birthday" phenomenon: pages belonging to different processes collide in the same frames. Trace the following sequence of page references through both systems (with and without relocation) and show where the page faults occur.

  • Memory capacity: 16 frames
  • Process P1: 3 pages, initially loaded in frames 0..2
  • Process P2: 10 pages, initially loaded in frames 3..12
  • Process P3: 5 pages, initially loaded in frames 2..6 but completely swapped out for now.
Reference string (pages): P1.0 P2.0 P3.1 P2.0 P2.1 P3.0 P1.2 P3.1 P3.2

An X marks a reference to a page already in memory. A * in the top row marks a page fault, with the page loaded into each frame shown in the column beneath.
With relocation:

              P1.0 P2.0 P3.1 P2.0 P2.1 P3.0 P1.2 P3.1 P3.2
                        *              *              *
 0  P1.0      X
 1  P1.1
 2  P1.2                                    X
 3  P2.0           X         X
 4  P2.1                          X
 5  P2.2
 6  P2.3
 7  P2.4
 8  P2.5
 9  P2.6
10  P2.7
11  P2.8
12  P2.9
13  -                   P3.1                     X
14  -                                  P3.0
15  -                                                 P3.2
Without relocation:
              P1.0 P2.0 P3.1 P2.0 P2.1 P3.0 P1.2 P3.1 P3.2
                        *    *         *    *    *    *
 0  P1.0      X
 1  P1.1
 2  P1.2                               P3.0 P1.2
 3  P2.0           X    P3.1 P2.0                P3.1
 4  P2.1                          X                   P3.2
 5  P2.2
 6  P2.3
 7  P2.4
 8  P2.5
 9  P2.6
10  P2.7
11  P2.8
12  P2.9
13  -
14  -
15  -
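The two traces can be checked mechanically. The sketch below (illustrative Python, with the frame layout taken from the setup above) counts faults both ways: with relocation, a faulting page goes into any free frame; without relocation, a page is wired to the frame it originally occupied and must evict whatever sits there.

```python
# Reference string and initial frame layout from the exercise.
REFS = [("P1", 0), ("P2", 0), ("P3", 1), ("P2", 0), ("P2", 1),
        ("P3", 0), ("P1", 2), ("P3", 1), ("P3", 2)]
HOME_BASE = {"P1": 0, "P2": 3, "P3": 2}   # original load addresses

def count_faults(relocate):
    frames = [None] * 16
    for i in range(3):
        frames[i] = ("P1", i)              # P1 in frames 0..2
    for i in range(10):
        frames[3 + i] = ("P2", i)          # P2 in frames 3..12; P3 swapped out
    faults = 0
    for proc, page in REFS:
        ref = (proc, page)
        if relocate:
            if ref not in frames:          # fault: place in any free frame
                faults += 1
                frames[frames.index(None)] = ref
        else:
            home = HOME_BASE[proc] + page  # page is wired to one frame
            if frames[home] != ref:        # fault: evict the occupant
                faults += 1
                frames[home] = ref
    return faults

print(count_faults(True), count_faults(False))   # faults with / without relocation
```

This reproduces the counts in the charts: 3 faults with relocation, 6 without.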

Processes and Threads

Suppose you are writing a set of programs to be run concurrently and operate on some shared global data structure. The programs are identical, having 500K of code and operating on 1MB of global data. Each instance of the program also allocates 1MB of private data from the heap during its execution. There will be 10 instances of this program running at once on a uniprocessor system.

How much total memory is necessary if each instance is a separate process?
Each process has its own private code/data memory, but with one shared global portion for the data structure.
10*(1MB+500K) + 1MB = 16MB

How much is needed if each instance is a thread, and all instances are part of a single process? What other benefits does this approach give?
All threads share the same code memory.
10*1MB + 1MB + 500K = 11.5MB
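Both totals follow directly; a quick check (sizes in MB, treating 500K as 0.5MB as the handout does):

```python
n, code, shared_global, private = 10, 0.5, 1.0, 1.0   # sizes in MB

as_processes = n * (code + private) + shared_global   # code + heap duplicated per process
as_threads = n * private + shared_global + code       # one shared copy of the code

print(as_processes, as_threads)   # → 16.0 11.5
```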

If each instance periodically calls for I/O and the system's threads implementation is User-Level Threads, which of the two implementations would you choose?
Since in User-Level Threading I/O requests block the entire process, an efficient program would have to use multiple processes. The caveat: the OS must support sharing a portion of memory among the processes.

Scheduling

Trace the following processes through the given scheduling policies, showing their time on the processor in a Gantt chart:
Process   Arrival Time   Service Time
A         1              5
B         3              7
C         7              1
D         8              3
E         10             6

[The Gantt-chart grid did not survive extraction. The original chart has one row per process (A-E) for each policy: FCFS, SRT, SPN, HRRN, RR with q=1, RR with q=2, FB with q=1, and FB with q=2^i.]
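The charts can be filled in (or checked) with a small per-tick simulator. The sketch below is illustrative, not the course's reference solution: it covers the nonpreemptive policies (FCFS, SPN, HRRN) via a selection key and SRT via preemption; RR and FB need an explicit ready queue and are omitted for brevity. Ties are broken by arrival time, then by name.

```python
# Workload from the table above: name -> (arrival time, service time).
PROCS = {"A": (1, 5), "B": (3, 7), "C": (7, 1), "D": (8, 3), "E": (10, 6)}

def simulate(procs, key, preemptive):
    """One-time-unit-per-tick simulation; returns finish time per process.

    key(p, rem) ranks the ready processes (lowest wins). A nonpreemptive
    policy keeps the running process until it completes.
    """
    rem = {p: svc for p, (_, svc) in procs.items()}   # remaining service
    t, finish, current = 0, {}, None
    while rem:
        ready = [p for p in rem if procs[p][0] <= t]
        if not ready:
            t += 1                       # CPU idles until the first arrival
            continue
        if preemptive or current not in rem:
            current = min(ready, key=lambda p: (key(p, rem), procs[p][0], p))
        rem[current] -= 1
        t += 1
        if rem[current] == 0:
            finish[current] = t
            del rem[current]
    return finish

fcfs = simulate(PROCS, lambda p, rem: PROCS[p][0], preemptive=False)
srt  = simulate(PROCS, lambda p, rem: rem[p],      preemptive=True)
print("FCFS finish times:", fcfs)
print("SRT finish times: ", srt)
```

For this workload FCFS finishes A, B, C, D, E at times 6, 13, 14, 17, 23; SRT finishes them at 6, 17, 8, 11, 23 (C and D preempt B).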

Process States

What does it mean for a process to be:

  • Ready?
    The process is not waiting for any event and may be executed at any time.

  • Blocked?
    The process is waiting for some event (such as I/O completion) to trigger an interrupt, and cannot run until that event occurs.

  • Suspended?
    None of the pages of the process are in main memory, and thus it cannot execute until some pages are swapped back in.

Are they mutually exclusive, or can a process have more than one of these states at a time?
They are not mutually exclusive: the seven-state process model includes combined "Blocked/Suspended" and "Ready/Suspended" states.

DMA, Caching

Consider a system with a 32-bit 100MHz memory bus. What is the effective memory bandwidth for random RAM-to-processor transfers, using 1 cycle to send the request and 1 cycle to read the data? (i.e. how much data can be transferred per second?)
Every transfer takes 2 cycles, so we can read four bytes 50 million times a second, for a total of 200MB/sec. Block transfers are not used because this is random access.

Now suppose that the DMA controller uses 2 out of every 6 bus cycles for its purposes. When in use, what is the effective memory bandwidth for random RAM-to-processor transfers?
The CPU can now use only 2/3 of the bus cycles, so its effective bandwidth is 2/3 * 200MB/sec = 133.33MB/sec.
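In numbers (decimal MB, as above):

```python
bus_hz, bus_bytes = 100_000_000, 4     # 100MHz bus, 32 bits wide
cycles_per_xfer = 2                    # 1 request cycle + 1 data cycle

peak = bus_hz // cycles_per_xfer * bus_bytes   # random-access bandwidth, bytes/sec
with_dma = peak * 4 // 6                       # CPU keeps 4 of every 6 cycles

print(peak, with_dma)   # → 200000000 133333333
```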

Now suppose the processor would make 10 memory requests per bus cycle if it didn't have to wait for the memory transfer to complete. If the processor cache has a 99% hit rate, then how many bus cycles will be left idle while DMA is in use?
The processor generates an external memory access (a cache miss) once every 10 bus cycles, and each access takes 2 cycles, or 1/5 of the available bus bandwidth. The DMA takes 1/3 of the bus bandwidth. As a first estimate, the bus is idle (1 - (1/5 + 1/3)) of the time.
This works out to (1 - 8/15) = 7/15, just under half of the cycles.

What really happens is that the DMA uses its cycles at regular intervals:
|- - - - D D|- - - - D D|...
When we add in the CPU memory requests, we see that the second memory request falls during the DMA access, so it must wait until afterward:
|M M - - D D|- - - - D D|M M...
This creates a repeating pattern of 12 cycles with 6 free, so exactly half of the cycles are idle.
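The repeating pattern can be confirmed by simulating the bus cycle by cycle. The sketch below uses the same assumptions as the answer above: the DMA owns a fixed 2 of every 6 cycles, a cache miss occupies 2 consecutive free cycles, and the next miss arrives 10 cycles after the previous one completes.

```python
def idle_fraction(total=1200):           # choose total divisible by 12
    dma = lambda t: t % 6 in (4, 5)      # DMA's fixed 2-of-6 cycles
    cpu_busy = set()
    due = 0                              # time of the first cache miss
    while due < total:
        t = due
        while dma(t) or dma(t + 1):      # need 2 consecutive free cycles
            t += 1                       # wait out the DMA access
        cpu_busy.update((t, t + 1))      # request cycle + data cycle
        due = t + 2 + 10                 # next miss 10 cycles after completion
    dma_cycles = sum(dma(t) for t in range(total))
    busy = dma_cycles + len(cpu_busy & set(range(total)))
    return (total - busy) / total

print(idle_fraction())                   # → 0.5
```

The simulation reproduces the |M M - - D D|- - - - D D| pattern: 2 CPU cycles and 4 DMA cycles per 12, leaving exactly half idle.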

Semaphores

Suppose Reza and his two GTFs hold their office hours at the same time, and whoever is not busy serves the next student. There are three general questions that students need help with, and each instructor can help up to three students with the same question, but the instructor does not learn the question until the first student in the group is selected. Assume an endless supply of students with each question.

Implement this system using semaphores in pseudocode.

Semaphores (you may add to these or ignore these at your peril/convenience):

  • BinarySemaphore instructorAvailable; //initialized to 0
  • CountingSemaphore ask[3]; //initialized to 3, 1 per question
  • CountingSemaphore ready[3]; //initialized to 0, 1 per instructor
  • CountingSemaphore release[3]; //initialized to 0, 1 per instructor
  • CountingSemaphore leave[3]; //initialized to 0, 1 per instructor
  • BinarySemaphore studentMutex[3]; //initialized to 1, 1 per question
  • BinarySemaphore instructorMutex; //initialized to 1
boolean group[3] = {false, false, false};
int asking[3] = {0, 0, 0};
int instructor;
Student(q) {
   Wait(ask[q]);
   Wait(studentMutex[q]);
   if(!group[q]) {
      Wait(instructorAvailable);
      group[q] = true;
   }
   Signal(studentMutex[q]);
   int i = instructor;
   asking[i] = q;
   Signal(ready[i]);
   
   // go to office and listen

   Wait(release[i]);
   Signal(leave[i]);
}
Instructor(id) {
   while(true) {
      Wait(instructorMutex);
      instructor = id;
      Signal(instructorAvailable);
      for(int i = 0; i < 3; i++)
        Wait(ready[id]); // wait for all three students in this group
      int q = asking[id];
      group[q] = false;
      for(int j = 0; j < 3; j++)
        Signal(ask[q]);
      Signal(instructorMutex);

      // go to office and teach

      for(int k = 0; k < 3; k++)
        Signal(release[id]);
      for(int l = 0; l < 3; l++)
        Wait(leave[id]);
   }
}
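The protocol can be exercised in real code. The following Python sketch scales the problem down to one instructor and a single group of three students with the same question, so a run is deterministic and terminates; the semaphore roles mirror the pseudocode above, with the per-question and per-instructor arrays collapsed to single semaphores.

```python
import threading

N = 3                                     # students per group
ask = threading.Semaphore(N)              # admits one group's worth of students
ready = threading.Semaphore(0)            # student -> instructor: "I'm in the group"
release = threading.Semaphore(0)          # instructor -> student: "you may leave"
leave = threading.Semaphore(0)            # student -> instructor: "I have left"
instructor_available = threading.Semaphore(0)
student_mutex = threading.Lock()

group_formed = False
served = []                               # questions answered, for checking

def student(q):
    global group_formed
    ask.acquire()
    with student_mutex:
        if not group_formed:
            instructor_available.acquire()   # first student claims the instructor
            group_formed = True
    ready.release()                          # join the group
    release.acquire()                        # listen, then wait to be dismissed
    served.append(q)
    leave.release()

def instructor():
    global group_formed
    instructor_available.release()           # announce availability
    for _ in range(N):
        ready.acquire()                      # wait for a full group
    group_formed = False
    for _ in range(N):
        release.release()                    # dismiss the group
    for _ in range(N):
        leave.acquire()                      # wait until everyone has left

threads = [threading.Thread(target=instructor)]
threads += [threading.Thread(target=student, args=(1,)) for _ in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(served)
```

Note that with multiple concurrent instructors the pseudocode has a subtle race: students who are not first in their group read the shared `instructor` variable without holding any lock, so they can observe a stale instructor id; the single-instructor demo above sidesteps this.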

Created by: Tim Singer June 3, 2003