
CS 122 - Day 12 Exercises

This page lists some problems for today's class. After today, you should be able to solve these problems on your own. Problems with this color are more challenging.

Stock Price Study

We have some data files that contain historical stock price data for several companies. The data is in a comma-delimited format with daily price and trading volume for the entire history of the stock. For this class, we'll just work with a couple of stocks, but you could download this information for any stock from yahoo.com and many other websites. I've got data for the following companies:

Download all three files and save them in the same directory as your program so that your program can load them on demand.

Our goal is to write a program that analyzes this data. We'll start simply and then add features to our program.

1) Maximum Price and Volume

Write a program that asks the user which stock they'd like to analyze and then finds the highest price the stock ever hit. Think about how to structure the program before you start writing code. You'll need to read in the data file and then split each line using the comma character as the delimiter. The string's split function takes the delimiter as its argument, so calling it would look something like this:

    parts = line.split(',')

Don't forget to convert the values to floats or ints, as appropriate, before doing comparisons on them.

Once you get that working, have it also print the day that the maximum price occurred and the trading volume on that day.
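As a rough sketch of the steps above (not the only way to structure it), the loop below splits each line, converts the fields, and remembers the best price seen so far along with its date and volume. The sample rows are made up for illustration, and the header row starting with "Date" is an assumption about the file layout; the column positions (date in column 0, volume in column 5, price in column 6) follow the indices used later on this page.

```python
# Made-up sample rows standing in for the lines of a data file.
# Assumed column order: Date,Open,High,Low,Close,Volume,Adj Close
sample_lines = [
    "Date,Open,High,Low,Close,Volume,Adj Close",
    "2000-01-03,10.0,11.0,9.5,10.5,1000,10.5",
    "2000-01-04,10.5,12.5,10.0,12.0,3000,12.0",
    "2000-01-05,12.0,12.2,11.0,11.5,2000,11.5",
]

max_price = None
max_date = None
max_volume = None

for line in sample_lines:
    parts = line.strip().split(',')
    if parts[0] == "Date":
        continue  # skip the header row (assumed to start with "Date")

    price = float(parts[6])  # convert before comparing
    if max_price is None or price > max_price:
        max_price = price
        max_date = parts[0]
        max_volume = int(parts[5])

print("The maximum (adjusted) price reached is", max_price)
print("this occurred on", max_date, "when", max_volume, "shares were traded")
```

In your own program, the list of sample rows would be replaced by the lines read from the file the user picked.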

Output from the program might look something like this:

    Welcome to the stock analysis tool
    We have data for the following symbols: goog, msft, appl
    Enter the symbol of the stock to analyze : goog
    The maximum (adjusted) price goog reached is 741.79
    this occurred on 2007-11-06 when 8436300 shares were traded
    The maximum trading volume for goog was 41116700 shares
    this occurred on 2006-01-20, the price closed at 399.46

My solution

1.A) Functions

Convert your existing program to use a function to find the maximum price. Your function doesn't need to return a value; it can just print out the results that it finds.

Once you get that working, add another function that finds the day with the maximum trading volume and prints out the date, volume, and price for that day.

My solution

2) Tuples

Look at your code for problem 1.A above. You should have at least two functions, and when you look at them you will probably see that they both repeat some steps. Exactly what is repeated will differ from person to person, but there's probably something.

In my solution, I read the file into a list of strings and then passed that list to both of the functions. This means that each function needs to know how to parse the line (split it up into chunks using the comma as the delimiter). This works, but it makes your functions less general: they will only work for comma-delimited files.

That is a silly restriction for your functions to have. They ought to work for any list of data that has the date, price, and volume. Is there some way that we could simplify things so that we just parse the data once, and then work with the data itself rather than strings?

For this part of the exercise, change your code so that it calls a function to read the file. The function that reads the file should split each line up into its parts and build a tuple from those parts. It should add the tuple to a list and, when it is finished, return the list of tuples. Each tuple should be (date, volume, price), where date is a string, volume is an int, and price is a float. To build the tuple, do something like this:

    parts = line.split(',')
    date = parts[0]
    volume = int(parts[5])
    price = float(parts[6])
    # named "record" because a variable called "tuple" would hide Python's built-in tuple type
    record = (date, volume, price)
    tupleList.append(record)

Now you can change your other functions to accept a list of tuples. Instead of needing to parse the strings and split them each time they run, they'll just be able to access the data elements directly.
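Here is one possible shape for this, offered as a sketch rather than the intended solution. The names parse_lines, find_max_price, and find_max_volume are hypothetical, the sample rows are made up, and the header row starting with "Date" is an assumption. parse_lines accepts any sequence of lines, so an open file object could be passed in place of the sample list.

```python
def parse_lines(lines):
    """Turn comma-delimited lines into a list of (date, volume, price) tuples."""
    records = []
    for line in lines:
        parts = line.strip().split(',')
        if parts[0] == "Date":
            continue  # skip the header row (assumed to start with "Date")
        records.append((parts[0], int(parts[5]), float(parts[6])))
    return records


def find_max_price(records):
    """Return the (date, volume, price) tuple with the highest price."""
    return max(records, key=lambda record: record[2])


def find_max_volume(records):
    """Return the (date, volume, price) tuple with the highest volume."""
    return max(records, key=lambda record: record[1])


# Made-up sample rows; assumed column order:
# Date,Open,High,Low,Close,Volume,Adj Close
sample_lines = [
    "Date,Open,High,Low,Close,Volume,Adj Close",
    "2000-01-03,10.0,11.0,9.5,10.5,1000,10.5",
    "2000-01-04,10.5,12.5,10.0,12.0,3000,12.0",
    "2000-01-05,12.0,12.2,11.0,13.0,2000,13.0",
]

records = parse_lines(sample_lines)
print(find_max_price(records))
print(find_max_volume(records))
```

Notice that neither find_max_price nor find_max_volume knows anything about commas; only parse_lines does.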

My solution

2.A) Another data format

IBM is an old company and likes to do things their own way. What if we wanted to add IBM to our analyzer, but its data came in a tab-delimited format instead of a comma-delimited format? Let's update our program so that the function that reads the data file is smart enough to parse lines from the IBM file using the tab character as a delimiter ('\t') and everything else using the comma as a delimiter.

Here's the ibm data file

Note that you should not need to change your functions for finding the max volume and max price, because they don't work with strings from the file; they take tuples of data. The function that reads the file is the only one that needs to change. This is why we changed those functions to use tuples: it makes them more general so that we can reuse them more easily.
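One way to handle the two formats, sketched under the assumption that the IBM file uses the same column order as the others and differs only in its delimiter. The names delimiter_for and parse_lines are hypothetical, and the sample rows are made up.

```python
def delimiter_for(symbol):
    """Pick the field delimiter based on the stock symbol."""
    if symbol.lower() == "ibm":
        return '\t'
    return ','


def parse_lines(lines, delimiter):
    """Turn delimited lines into a list of (date, volume, price) tuples."""
    records = []
    for line in lines:
        parts = line.rstrip('\r\n').split(delimiter)
        if parts[0] == "Date":
            continue  # skip the header row (assumed to start with "Date")
        records.append((parts[0], int(parts[5]), float(parts[6])))
    return records


# The same made-up row in both formats.
comma_lines = ["2000-01-03,10.0,11.0,9.5,10.5,1000,10.5"]
tab_lines = ["2000-01-03\t10.0\t11.0\t9.5\t10.5\t1000\t10.5"]

goog_records = parse_lines(comma_lines, delimiter_for("goog"))
ibm_records = parse_lines(tab_lines, delimiter_for("ibm"))

# Both formats produce identical tuples, so the analysis
# functions can stay exactly as they were.
print(goog_records == ibm_records)
```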

My solution

3) Biggest jump

Let's add a function to find the biggest jump in price between two consecutive days. To do this, you'll need to compare the price from one day with the price from the next day, but when we iterate through a list we only see one day's prices at a time. There's a trick to doing this: create an extra variable that holds yesterday's price.

    # assume data is a list of tuples
    yesterdaysPrice = None   # no previous day yet
    for (date, volume, price) in data:
        if yesterdaysPrice is not None:
            difference = price - yesterdaysPrice
            # ... do something with the difference here

        # at the end of the loop, set yesterday's
        # price to today's price, and when we start
        # the loop again we'll get a new price
        yesterdaysPrice = price

NOTE: for this to work, the data must be ordered by date. Our data is, but it is backwards (newest to oldest), so we should reverse the list of data after we read it.
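Putting the pieces together, a sketch might look like the following. It assumes "biggest jump" means the largest single-day increase (you could compare abs(difference) instead if drops should count too). The function name biggest_jump is hypothetical and the sample data is made up.

```python
def biggest_jump(records):
    """Return (date, jump) for the largest day-over-day price increase.

    records must be a list of (date, volume, price) tuples ordered
    oldest to newest. Returns (None, None) for fewer than two records.
    """
    best_date = None
    best_jump = None
    yesterdays_price = None
    for (date, volume, price) in records:
        if yesterdays_price is not None:
            difference = price - yesterdays_price
            if best_jump is None or difference > best_jump:
                best_jump = difference
                best_date = date
        yesterdays_price = price
    return best_date, best_jump


# Made-up data in the same order as the files: newest first.
records = [
    ("2000-01-06", 1500, 11.0),
    ("2000-01-05", 2000, 11.5),
    ("2000-01-04", 3000, 12.0),
    ("2000-01-03", 1000, 10.5),
]
records.reverse()  # now oldest to newest

jump_date, jump_size = biggest_jump(records)
print("Biggest jump was", jump_size, "on", jump_date)
```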

4) Longest increase

Let's keep adding to our program. Let's find the longest run of consecutive price increases. To check whether the price has increased from one day to the next, you need to be able to compare the prices on two days just like you did in step 3. Here's an example:

    # assume data is a list of tuples
    yesterdaysPrice = None   # no previous day yet
    for (date, volume, price) in data:
        if yesterdaysPrice is not None and price > yesterdaysPrice:
            # we know the price increased
            pass  # replace this with your own code

        # at the end of the loop, set yesterday's
        # price to today's price, and when we start
        # the loop again we'll get a new price
        yesterdaysPrice = price

Then you can keep another variable that tracks how many consecutive days the price has increased. Every time you see an increase, increment this variable. If there is a decrease, check whether the current total is a new maximum, and then reset it to zero. Make the same check once more after the loop ends, in case the longest run includes the last day of data.

NOTE: for this to work, the data must be ordered by date. Our data is, but it is backwards (newest to oldest), so we should reverse the list of data when we read it.
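The counting logic could be sketched as follows. This is one possible approach, not necessarily the linked solution. It assumes a day with an unchanged price ends the run just like a decrease does. The function name longest_increase is hypothetical and the sample data is made up.

```python
def longest_increase(records):
    """Return the length of the longest run of consecutive price increases.

    records must be a list of (date, volume, price) tuples ordered
    oldest to newest.
    """
    longest = 0
    current = 0
    yesterdays_price = None
    for (date, volume, price) in records:
        if yesterdays_price is not None and price > yesterdays_price:
            current += 1
        else:
            # The run ended (or this is the first day): keep the
            # best total so far and start counting again.
            longest = max(longest, current)
            current = 0
        yesterdays_price = price
    # Check once more in case the longest run reaches the last day.
    return max(longest, current)


# Made-up data, already ordered oldest to newest.
records = [
    ("2000-01-03", 1000, 10.0),
    ("2000-01-04", 1000, 11.0),
    ("2000-01-05", 1000, 12.0),
    ("2000-01-06", 1000, 11.0),
    ("2000-01-07", 1000, 12.0),
    ("2000-01-10", 1000, 13.0),
    ("2000-01-11", 1000, 14.0),
]

print("Longest run of increases:", longest_increase(records), "days")
```

Without the final check after the loop, the sample above would report 2 instead of 3, because its longest run is still in progress when the data ends.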

My solution