Saturday, February 19, 2011

The IBM Jeopardy! Challenge

I watched along with everyone else as Watson trounced Jeopardy! champions Ken Jennings and Brad Rutter. Here is the first of a chain of YouTube videos documenting the event. Start here and link through to the others.

Here's an even nicer link to an IBM website describing Watson and the Jeopardy! challenge.

This was a nice showcase for what is essentially a very sophisticated search engine by IBM. It was all quite theatrical, and the whole three-day affair was designed to make you ooh and ahh over IBM's new tech. In this it succeeded. A lot of people have geeked out over it.

Here are some facts about Watson:
  1. It is a huge affair as computers go. It's not so much "a computer" as a network of computers contained within 10 racks arranged in 2 banks... the equivalent of 2,800 computers, having 15 terabytes of memory.
  2. For all of that power and capacity, Watson is optimized for one task only: playing Jeopardy! Watson is incapable of general problem-solving or thought. It doesn't hear; it doesn't see.
  3. Watson has a distinct advantage over human competitors in two regards: reading speed and buzzer reflexes. Watson does not hear or read the questions; they are transmitted to the computer as a text file from the game board. In human terms this is nearly instantaneous. Watson also presses the buzzer button (identical to the ones the humans use) by means of an electronically controlled solenoid.
Here's what Watson doesn't do:
  1. It doesn't understand natural language questions. It parses the questions (a loose definition, since Jeopardy! questions aren't really questions), and within the domain defined by the Jeopardy! category it searches for the best result matching the key words and phrases in the clue. We got a glimpse of Watson's thinking as it ranked its top 3 candidate answers. For some questions it was clear that the candidates didn't make sense as answers. Sometimes, even when it got the right answer, it got wrong whether the answer called for a "who" or a "what".
  2. It doesn't run on a computer made of 3 pounds of meat. Meanwhile, in addition to playing Jeopardy!, computers composed of 3 pounds of meat can think, drive a car, and build computers like Watson.
  3. It's not environmentally friendly. It's cooled by a huge bank of air conditioners. Each of the servers that composes Watson burned more calories in a single match than Ken Jennings or Brad Rutter does in an entire day. And neither Ken nor Brad is tied to his power source.
Sadly, I don't think that Watson represents a major leap forward in artificial intelligence. It's not even as tremendously huge a leap forward in search technology as IBM implies in the videos.

Try it yourself: Make up a Jeopardy! puzzler and type it into Google. Chances are the correct answer is listed in the first result. Here's my sample "question": "This condiment is the number one seller in the US". The correct answer is "What is salsa?" You'd've gotten it correct even if you'd clicked "I'm Feeling Lucky" on the search page. You don't even have to drill into the page to get the answer... it's usually displayed in the search results.

What Watson does represent is a solid improvement in data analytics. Its major achievement is extracting the simple, direct answer from raw data that is readily returned by ANY competent search engine today. The size, speed, and capacity of Watson have less to do with that capability than they do with meeting the game requirement that a contestant's knowledge be self-contained. I'm not getting all googly-eyed over those specs because they're a distraction from the actual tech being displayed, which is the natural language parser and the data analytics.

If anything, what this challenge has proven is how incredibly easily we humans perform the incredibly difficult tasks of natural language processing, data analytics, and cognition. When you're using Google or Yahoo! or Bing, you use these facilities all the time, and the computer does what it does best... sift through the raw data. Presented with relevant data it only takes you a moment to recognize the correct answer and understand it. You do it extremely efficiently. Watson uses a tremendous amount of processing power to do the first two of these tasks, and does the third not at all.

Watson itself is a toy... a fantastically expensive game-playing computer. The technology behind it will underlie important tools in the future. But I'm not among those welcoming our computer overlords... they're not here yet.

(P.S. I was amused by one commentator's observation that Watson's avatar was a stylized "smiley face". I'm wondering if all his "observations" are equally myopic. It's a globe, folks. As in, this planet Earth. The continents are overlapping circles. It's based on IBM's "smarter planet" logo. Sheesh.)


  1. I missed the Watson showdown on Jeopardy. I heard a little about it. I heard that Watson's human opponents fared better after the first day. They adapted to the way Watson was playing the game and changed their strategy, their own game-play to combat it. That is something that Watson could not do.

    I also have no doubt that if Watson were to come back, the Jeopardy writers would write more "Answers" that would play to Watson's weaknesses. More musical "Answers," more visual clues. And I don't think that's unfair. Jeopardy isn't a static form. They are always trying out new categories and ways to make the game more exciting.

  2. Hi, Russ... actually, the writers would use neither visual nor aural clues. It's part of the rules of Jeopardy!

    Watson can neither see nor hear, so the producers of Jeopardy! must use the very same rules they apply when blind or deaf contestants appear on the show. It is a matter of fairness.

    They also didn't write any "answers" to play to Watson's strengths or weaknesses. IBM wanted this to be a meaningful challenge, so other than allowing for Watson's handicaps, no other provisions were made.

    The human players did better after the first day because of buzzer strategy. In Jeopardy!, no contestant can buzz in until Alex finishes reading the question aloud. At that time a light informs the contestants they may buzz in. Watson gets the whole of the question and begins "thinking" at the same time the question is revealed (while Alex is talking). Everyone else can read it then, too. Watson gets an electrical signal when the buzz light comes up, and responds as soon as it has an answer that passes its "buzz threshold". For easy questions, the indicator lights and Watson immediately buzzes. For hard questions Watson may not have finished its search and the humans have a chance.

    What the humans started doing was going for the buzzer as soon as the indicator lit, whether or not they had answers. They used the verbal response time to finish thinking about it, and sometimes just guessed.

    This is a strategy that Watson could use (the "buzz threshold" is variable), but would not, because the strategy it did use was better. Wrong answers subtract from your score. When Watson was far ahead, the buzz threshold was raised so as not to give away its lead. If the score were closer or if Watson were behind, the threshold would be much lower, because a riskier strategy would be called for.

    The programmers did a great job analyzing Jeopardy!... Watson played a very fair game with sound strategy. What I'm trying to do above isn't to denigrate the achievement; it's to throw the cold light of day on it so people aren't oohing and ahhing over the wrong things. What IBM has in Watson is a large database with fast search, an exceptional parser, and efficient data mining algorithms. It is neither thought nor intelligence.

  3. To be clear, the strategy the humans used is exactly the strategy that Watson would have used if it were behind... a lower buzz threshold and riskier behavior. They did nothing at all adaptability-wise that Watson could not or would not do in the same circumstances.
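
For the programmers in the audience, here is a rough Python sketch of the "buzz threshold" idea described in the comments above. This is not IBM's code, and I have no idea what Watson's real formula looks like. The names (`GameState`, `buzz_threshold`, `should_buzz`) and all of the numbers (`base`, `swing`, `scale`) are made up for illustration. The only thing taken from the discussion is the behavior: the threshold goes up when the player is far ahead and comes down when the player is behind.

```python
from dataclasses import dataclass


@dataclass
class GameState:
    """Hypothetical snapshot of the scores at the moment the buzz light comes on."""
    my_score: int
    best_opponent_score: int


def buzz_threshold(state, base=0.5, swing=0.3, scale=10_000):
    """Confidence required before buzzing in.

    base, swing and scale are invented values, not Watson's.
    A big lead pushes the threshold up (protect the lead);
    a big deficit pushes it down (take more risks).
    """
    lead = state.my_score - state.best_opponent_score
    # Map the lead onto [-1, 1], clamping anything beyond +/- scale.
    normalized = max(-1.0, min(1.0, lead / scale))
    return base + swing * normalized


def should_buzz(confidence, state):
    """Buzz only if the top candidate's confidence clears the current threshold."""
    return confidence >= buzz_threshold(state)


if __name__ == "__main__":
    answer_confidence = 0.6  # same answer, same confidence, three different games
    for label, state in [
        ("far ahead", GameState(30_000, 5_000)),
        ("tied", GameState(10_000, 10_000)),
        ("far behind", GameState(2_000, 20_000)),
    ]:
        print(label, should_buzz(answer_confidence, state))
```

With these made-up numbers, an answer Watson is 60% sure of gets passed over when it is far ahead, but gets a buzz when the game is tied or it is behind. That is the same "riskier when trailing" behavior the human players showed.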