[Home | Assignments | Links | Grades | CIS 313]
CIS 323 - Assignment 2
Binary Search Trees
Due Thursday, February 4, 2010
Description
You must write your code in C++ for this assignment.
The goal of the assignment is to get more familiar with implementing data structures and analyzing running time.
Task
- Implement a binary search tree data structure with insert and find operations.
- Expand your Linked List implementation from last week to include a find operation.
- Compare the running time of the find operation for the binary search tree and the linked list for selected inputs.
What to turn in
- Your linked list implementation, conforming to the input/output specification shown below.
- Your binary search tree implementation, conforming to the input/output specification shown below.
- A small discussion of the running time for your implementations, using the supplied testcases. This should include:
- Which data structure performs best in the worst case?
- What is the difference between the testcases that have "worst" in the name and the ones that have "better" and why might "better" be better? (hint: See Cormen section 12.4)
- How does your tests demonstrate that find takes linear time O(n) in linked lists?
- How does your tests demonstrate that find takes time proportial to the height of the tree O(h) in a binary search tree?
Timing your code
- Testcases can be found here: inputs.
- They all have 50 thousand find operations and 10, 20, 30 or 40 thousand insert operations, respectively.
- Notice that there is two versions, worst and better. I will leave it up to you to figure out why they are named so.
We are interested in seing how the find operation changes with the input size. Therefore, we cannot (as I have previously said) use the unix time
command. You will have to use C++ built-in functions for this.
I use the clock_t
datatype. I have supplied a full working example of a linked list implementation using this method right here
Extra testcases (if needed)
Your system might be faster (or slower) than mine so you will get very low running times no matter what. I have created a program so you can make your own testcases. You can find the program here
There is three different ways of running the program:
java -jar assn2_generator.jar
java -jar assn2_generator.jar numNodes numFinds
java -jar assn2_generator.jar numNodes numFinds numTestcases
numNodes
defaults to 10000, numFinds
defaults to 50000 and numTestcases
defaults to 4.
The number of testcases determines the number of input nodes in each testcase. The number of input nodes increments for each testcase but the number of finds does not. Remember, we are interested in measuring find running time as a function of input size increases.
Example: To create 5 testcases with 100000 finds in each and increments of 20000 input nodes, i would do:
java -jar assn2_generator.jar 20000 100000 5
Input/Output
You can find correct input/output examples for both the linked list and the binary search tree here
The input will be formatted in the following way:
- A non-negative integer on its own line representing the number of integers n to be inserted.
- The n integers to insert — each on their own line.
- A non-negative integer on its own line representing the number of integers m you will find.
- The m integers to find — each on their own line.
- Input:
-
5 30 40 20 50 10 6 20 30 30 50 100 200 -
The above input will produce, conceptually, the following data structures:
-
Linked List: 30 –> 40 –> 20 –> 50 –> 10 Binary Search Tree: 30 / \ 20 40 / \ 10 50
The output is based on the find operation. For both linked lists and binary search trees we measure the number of nodes that was checked in order to find the correct node, including the found node itself. If a node is not found, a -1 should be printed. Output (linked list implementation):
-
3 1 1 4 -1 -1 Output (binary search tree implementation):
-
2 1 1 3 -1 -1
You may assume that no duplicate integers will be inserted and a test case on the order of 50,000+ insertions (plus some significant number of finds) will be used to evaluate your program.
Here is an example:
Notice that the linked list is slower in this example. This is not always the case, as we discussed in class.
You should conform exactly to the input and output specification, as I will be using massive test cases and checking your output against the correct output. Just like last time.