Let's say we could get one million humans playing per day, each answering 10 questions per day. Over a year, we would have about 3.65 billion facts checked (1,000,000 × 10 × 365). In 20 years, i.e., the singularity, CYC would have about 73 billion facts. Enough to pass the Turing Test, in your opinion? Why or why not?
You can stop on page 279. We will cover the remainder of the chapter later.
We will start working on our final project this week. Your goal will be to build a chatbot that accepts input from a human user (or another chatbot, I suppose!) and then responds. Similar in style to what we saw in class with the Turing Test. The twist is that your chatbot will do very little work itself in generating a response. Instead, it will go out to the web to get information.
More specifically, when your chatbot gets an input from a user (a question, a statement), it will see if it can find that input on the web. If so, it will check if there is a canned response to go with it. If so, it will give the canned response back to the user.
What if your chatbot does not find a canned response on the web? It will then see if it can elicit a response from the user:
user: What color is the sky?
chatbot: I don't know how to respond to that. How would you respond?
user: Blue, silly.
chatbot: Thanks, I'll remember that.
...

This is the interesting part. Your chatbot will share information with the other chatbots in class. They will become a chatbot collective (sounds mildly menacing). Assuming that the above conversation was spawned because the chatbot could not find the question "What color is the sky?" on the web, it will (a) add the question to the web, and (b) add the response it got from the user, i.e., "Blue, silly." The next time any chatbot in class gets a user input of "What color is the sky?", it will now find "Blue, silly." on the web and respond that way. Cool, huh.
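The look-up-then-learn behavior in this conversation can be sketched in a few lines of Ruby. This is only an illustration under assumptions: a plain Hash (named collective here) stands in for the shared web page, and respond_to is a made-up method name whose block plays the part of the human user supplying an answer.

```ruby
# A Hash stands in for the shared web page (an assumption for this sketch;
# the real chatbots will read and write the class XML file instead).
collective = {}

# Hypothetical helper: answer from the collective if possible,
# otherwise ask the user (via the block) and remember what they say.
def respond_to(input, collective)
  if collective.key?(input)
    collective[input]              # canned response already shared
  else
    taught = yield                 # elicit a response from the user
    collective[input] = taught     # share it with the other chatbots
    "Thanks, I'll remember that."
  end
end

first  = respond_to("What color is the sky?", collective) { "Blue, silly." }
second = respond_to("What color is the sky?", collective) { "unused" }
puts first
puts second
```

The first call has nothing to look up, so it learns the answer; the second call finds the stored answer and never runs its block.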
I'm going to set up the web page where the chatbots share information. You can find it at http://www.cs.uoregon.edu/classes/08W/cis170/cis170-cloud.html. It is an xml file, which we will talk about in class. You can also find info here (among other places): http://en.wikipedia.org/wiki/XML.
I'm going to stay in the playground again this week. I think it is a good place to try out ideas before committing to a real application. Remember the code below from last week. Go ahead and paste it in fxri (or whatever playground you are using).
require 'net/http'
require 'uri'

def get_page(site,path)
  the_page = site + path #builds larger string out of two pieces
  res = Net::HTTP.get URI.parse(the_page)
  return res #should be raw xml
end
Now let's pull in the chatbot collective wisdom. This is exactly the same as last week, other than p is now a different path and I've used the variable name xml in place of h.
s = 'http://www.cs.uoregon.edu/classes/08W/cis170'
p = '/cis170-cloud.html'
xml = get_page(s,p) #this line needs to be successful for code below to work
What you have in xml is a big jumble of xml code as we discussed in class. Within that jumble is the current snapshot of the user inputs that have been seen and the responses (if any) to each of those inputs. Several things to note:
(1) There may be more than one response for each user input. We will talk later about how this is possible.
(2) There are no duplicates. Each user input has to be unique. Each response for a given input must be unique. Duplicate responses are allowed if they are responses to different user input.
(3) User input is tagged with a unique id. This is used to tie a response to a user input. The setting tags at the top keep track of the id counters. We will see how they are used next week.
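To make notes (1) through (3) concrete, here is a small sketch that groups responses by the id of the user input they belong to. The sample string is hypothetical: apart from the user_input tag, every tag and attribute name in it (cloud, setting, response, ui_id) is an assumption made for illustration and may not match the real file.

```ruby
require 'rexml/document'

# Hypothetical sample; apart from the user_input tag, every tag and
# attribute name here is an assumption, not the real file layout.
sample = <<~XML
  <cloud>
    <setting name="next_ui_id" value="3"/>
    <user_input id="1">What color is the sky?</user_input>
    <user_input id="2">How are you?</user_input>
    <response ui_id="1">Blue, silly.</response>
    <response ui_id="1">Depends on the weather.</response>
  </cloud>
XML

root = REXML::Document.new(sample).root

# Build a table from user-input id to the list of responses tied to it.
responses_by_id = Hash.new { |hash, key| hash[key] = [] }
root.get_elements('response').each do |r|
  responses_by_id[r.attributes['ui_id']] << r.text
end

# Note (2): no response may appear twice for the same user input.
no_duplicates = responses_by_id.values.all? do |list|
  list.uniq.length == list.length
end

puts responses_by_id.inspect
puts no_duplicates
```

In this sample, input 1 has two different responses, which note (1) allows, and input 2 has none yet.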
This week, let's see if we can use what we have learned about our three strategies for pulling pieces out of strings to solve a specific problem. In particular, assume a user has typed the question "What color is the sky?", and we now want to see if we can find a response within our xml string. Go ahead and paste the following into your playground. I made up the variable name ui to remind me it is a user input.
ui = "What color is the sky?"
There is a two-step process our chatbot will eventually need to follow: (1) see if the user input (i.e., the string in ui) matches a user input we have already seen, and (2) if it does, can we find a response to go with it. We will only be looking at the first step this week. So our goal is to check if the string in ui matches the text in a user_input tag. You can go ahead and look at what is in xml. You should see that in fact there is the ui string buried within the xml string. So it is there. We just have to figure out how to make Ruby find it.
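Before doing any real work, a quick sanity check can confirm that the string is in there somewhere. The sketch below uses a made-up one-entry string (sample_xml, with an assumed id attribute) in place of the downloaded xml so that it runs without a network connection.

```ruby
# Hypothetical stand-in for the downloaded xml string, so the sketch
# runs without a network connection; the id attribute is an assumption.
sample_xml = '<user_input id="1">What color is the sky?</user_input>'
ui = "What color is the sky?"

# include? only says the text appears somewhere in the string.
found_anywhere = sample_xml.include?(ui)

# A slightly stricter check: the text must end right at a closing tag.
found_in_tag = sample_xml.include?(">" + ui + "</user_input>")

puts found_anywhere
puts found_in_tag
```

A plain include? cannot tell a user input from a response that happens to contain the same words, which is why the rest of this homework looks at one user_input entry at a time.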
Looking at the xml file, it appears that all user inputs that have been seen are between the tags <user_input> and </user_input>. I'm going to give you some help here. The code below will enumerate (i.e., hand you, one by one) all user_input entries. Each time it hands you a chunk, it will store it in the variable entry. You can paste it and see how it works. I've added the puts "*****..." just to help you see where one entry begins. At the time of writing this homework, there were two entries printed by the code below. When you try it, there may be more.
require 'rexml/document'

REXML::Document.new(xml).root.get_elements('user_input').each do |entry|
  entry = entry.to_s #convert to a string
  puts "************an entry:\n"
  puts entry #print out the string
end

You should see all the user_input entries printed. This is a *big* help. If we did not have the code above, you would be forced to use pattern matching to pull the /<user_input>(.*)<\/user_input>/ piece out of the xml string. And you would be forced to write some kind of loop that went through all the matches: there can be hundreds of matches in xml, each one representing something we have seen a human user type to our chatbot. The code above does this looping for you, thank goodness.
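For comparison, here is roughly what the pattern-matching route would look like. It is a sketch only: sample_xml is a made-up two-entry string, and the cloud root tag and id attributes in it are assumptions rather than the real file layout.

```ruby
# Hypothetical sample standing in for the real xml string.
sample_xml = '<cloud>' \
             '<user_input id="1">What color is the sky?</user_input>' \
             '<user_input id="2">How are you?</user_input>' \
             '</cloud>'

# Non-greedy .*? stops at the first closing tag; the slash in the
# closing tag must be escaped inside a /.../ regular expression.
pattern = /<user_input[^>]*>.*?<\/user_input>/m

entries = sample_xml.scan(pattern)   # array of the matched strings
entries.each do |entry|
  puts "************an entry:"
  puts entry
end
```

With a greedy .* in place of .*?, the pattern would swallow everything from the first opening tag to the last closing tag as a single match, which is one reason the REXML loop is the safer choice.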
Problem 1. We will get back to the loop above later. For now, we can do some testing without it. To help with this testing, I set the variable etest below to an example you would see in the loop above, i.e., the variable entry will eventually contain the string below; go ahead and copy etest into the playground.
Problem 2. Use the same general strategy to pull the id out of etest. However, this time you might want to try deleting rather than slicing.
Problem 3. Ok, let's start packaging up what we have learned. We will want to use it again in following assignments. The way to do that is to start a file called chatbot_helper.rb. We will define some methods in this file that we can use later, without having to go through all the bother we did in problems 1 and 2. Define the file and paste the following into it. It will give you an error message if you try to execute it at this point. We will fix that.
require 'net/http'
require 'uri'
require 'rexml/document'

def get_page(site,path)
  the_page = site + path #builds larger string out of two pieces
  res = Net::HTTP.get URI.parse(the_page)
  return res #should be raw xml
end

def getCollectiveWisdom()
  return get_page('http://www.cs.uoregon.edu/classes/08W/cis170', '/cis170-cloud.html')
end

def getText( entry ) #return the text within
  ... #your solution to problem 1
  return text1
end

def getID( entry ) #find the id within
  ... #your solution to problem 2
  return id1
end

#This code works as is - do not change it.
def findUIMatch( user_input, xml )
  REXML::Document.new(xml).root.get_elements('user_input').each do |entry|
    entry = entry.to_s #convert REXML element to a string
    text = getText(entry) #your method
    if( user_input == text ) #think about less strict matches in future
      return getID(entry) #your method
    end
  end
  return nil #if no match found
end

Now fill in the piece I have marked with "your solution to problem 1", i.e., add the code that sets text1 to the correct value. Next, fill in the piece I have marked with "your solution to problem 2", i.e., add the code that sets id1 to the correct value. Note that for both fill-ins, you will have to replace the variable name etest with entry. It should now execute without errors (but it won't do anything yet). Make sure to save this version.
Finally, I have written some test cases that put your code through its paces. Make sure it works with this test file: chatbot_test.rb. This is similar to your testing of the full adder programs. You run the tester and expect no failures. If you get a failure, you need to work on the code in chatbot_helper.rb. The grader will use chatbot_test.rb to test what you turn in.
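The comment in findUIMatch about less strict matches hints at a later refinement. One possible direction, sketched here with a made-up helper named normalize, is to compare cleaned-up copies of the two strings instead of the raw strings. This is an assumption about where the idea could go, and findUIMatch itself should stay as given.

```ruby
# Hypothetical helper: reduce a string to a canonical form so that
# small differences in case, spacing, or punctuation do not block a match.
def normalize(text)
  text.downcase.gsub(/[^a-z0-9\s]/, '').split.join(' ')
end

a = "What color is the sky?"
b = "  what COLOR is the   sky "

strict_match  = (a == b)                        # exact comparison
relaxed_match = (normalize(a) == normalize(b))  # comparison after cleanup

puts strict_match
puts relaxed_match
```

The strict comparison fails on these two strings while the relaxed one succeeds, at the cost of also treating inputs that differ only in punctuation as the same question.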
As a final note, we have made a pretty big step with this homework. The method findUIMatch above will tell us if what the user typed in matches anything we have seen. This is the first step of the two-step process! The next step will be to find a response if it exists. Next week :)