uocis  CIS 415 Operating Systems - Spring 2003

Discussion: Week 9 - Solution

File Organization

Consider the following types of files and classify them according to the given parameters.
In particular:
  • Is the file unsorted (U) or sorted by key field (K), chronologically (C) or by hash function on a key field (H)?
  • Does the file use fixed-length (F) or possibly variable-length (V) records?
  • How many indexes (N=none, S=single, M=multiple) are embedded in the file?
  • What type of index (P=partial, X=exhaustive, B=both) is most commonly used?
Sort[UKCH]Records[FV]Index(es)[NSM][PXB]
Pile CVN
Sequential KFN
Indexed SequentialKFSP
Indexed UVMB
Hashed H?N

What would be the best file organization to use for the following applications and why
There are many correct answers. These are what I would use.

  • Output from a simulator
    Pile file - Data can be sorted/indexed later.

  • Large financial database
    Indexed - Need fast searces on multiple fields.

  • Phone book (common use - i.e. look up only by name)
    Indexed Sequential - Fast searches on one key field.

  • Spell check dictionary (look up entries by how the word "sounds")
    Either Indexed or Hashed might work, but none of these file types are especially efficient for this application.


Just as a note, non-database files usually have a mix between pile and indexed organizations. The records are usually all of varying length and type (i.e. having different fields). The first few "header" records sometimes act as indexes to the other data records, but in other situations the records have a fixed order to them.

In this "binary" type of file, the OS does not know about the individual records and so just stores the file as one big, long string of 1's and 0's (similar to variable blocking with spanning).

File Storage

Consider a file whose records contain varying amounts of data. In this file, every even-numbered record (beginning with record 0) contains 100 bytes of data and every odd-numbered record contains 75 bytes of data. The file contains 100 records.

Suppose that the blocks on the hard disk are 550 bytes long.

How many blocks does this file occupy if Fixed Blocking is used?
With fixed blocking, it is necessary for all records to be the same length, namely 100 bytes long. That means we can use 5 records per block, so we have 100/5 = 20 blocks, with 50*20 = 1000 bytes of wasted space.

...if Spanned Variable Blocking is used?
There are 50*(100+75) = 8750 bytes in the file. This can fit into (8750/550) = 15.91 blocks. 16 blocks are thus necessary to hold this file with (16*550 - 8750) = 50 bytes of wasted space.

...if Unspanned Variable Blocking is used?
Each block can hold three 100-byte blocks and three 75-byte blocks. The file can fit into (100/6) = 16.67 blocks, so 17 blocks are necessary. There are (17*550 - 8750) = 600 bytes of wasted space.


Created by: Tim Singer June 3, 2003