[Home | Assignments | Links | Grades | CIS 313]

CIS 323 - Assignment 2

Binary Search Trees


Due Thursday, February 4, 2010

Description

You must write your code in C++ for this assignment.

The goal of the assignment is to get more familiar with implementing data structures and analyzing running time.

Task

  1. Implement a binary search tree data structure with insert and find operations.
  2. Expand your Linked List implementation from last week to include a find operation.
  3. Compare the running time of the find operation for the binary search tree and the linked list for selected inputs.
You can find the relevant data structures and algorithms in Cormen, chapter 10 and 12. Do not sort the elements. Just use the standard implementations.

What to turn in

  1. Your linked list implementation, conforming to the input/output specification shown below.
  2. Your binary search tree implementation, conforming to the input/output specification shown below.
  3. A small discussion of the running time for your implementations, using the supplied testcases. This should include:
    • Which data structure performs best in the worst case?
    • What is the difference between the testcases that have "worst" in the name and the ones that have "better" and why might "better" be better? (hint: See Cormen section 12.4)
    • How does your tests demonstrate that find takes linear time O(n) in linked lists?
    • How does your tests demonstrate that find takes time proportial to the height of the tree O(h) in a binary search tree?

Timing your code

We are interested in seing how the find operation changes with the input size. Therefore, we cannot (as I have previously said) use the unix time command. You will have to use C++ built-in functions for this.

I use the clock_t datatype. I have supplied a full working example of a linked list implementation using this method right here

Extra testcases (if needed)

Your system might be faster (or slower) than mine so you will get very low running times no matter what. I have created a program so you can make your own testcases. You can find the program here

There is three different ways of running the program:

java -jar assn2_generator.jar
java -jar assn2_generator.jar numNodes numFinds
java -jar assn2_generator.jar numNodes numFinds numTestcases

numNodes defaults to 10000, numFinds defaults to 50000 and numTestcases defaults to 4.

The number of testcases determines the number of input nodes in each testcase. The number of input nodes increments for each testcase but the number of finds does not. Remember, we are interested in measuring find running time as a function of input size increases.

Example: To create 5 testcases with 100000 finds in each and increments of 20000 input nodes, i would do:

java -jar assn2_generator.jar 20000 100000 5

Input/Output

You can find correct input/output examples for both the linked list and the binary search tree here

The input will be formatted in the following way:

  1. A non-negative integer on its own line representing the number of integers n to be inserted.
  2. The n integers to insert — each on their own line.
  3. A non-negative integer on its own line representing the number of integers m you will find.
  4. The m integers to find — each on their own line.
  5. You may assume that no duplicate integers will be inserted and a test case on the order of 50,000+ insertions (plus some significant number of finds) will be used to evaluate your program.

    Here is an example:

    Input:
    5
    30
    40
    20
    50
    10
    6
    20
    30
    30
    50
    100
    200

    The above input will produce, conceptually, the following data structures:

    Linked List:
    30 –> 40 –> 20 –> 50 –> 10
    
    Binary Search Tree:
         30
       /   \
      20   40
     /       \
    10       50
    
    The output is based on the find operation. For both linked lists and binary search trees we measure the number of nodes that was checked in order to find the correct node, including the found node itself. If a node is not found, a -1 should be printed.

    Output (linked list implementation):

    3
    1
    1
    4
    -1
    -1

    Output (binary search tree implementation):

    2
    1
    1
    3
    -1
    -1

    Notice that the linked list is slower in this example. This is not always the case, as we discussed in class.

    You should conform exactly to the input and output specification, as I will be using massive test cases and checking your output against the correct output. Just like last time.

    Submission

    No further submission possible.
Last updated: February 8, 2010 | David Lebech