15 September 2015

2015 PATA Technology Forum, Bangalore

On 06 September 2015, I had the pleasure of attending the first Pacific Asia Travel Association Technology Forum, at Bangalore International Exhibition Center, in partnership with phocuswright.com and connectingtravel.com. PATA is the travel and tourism industry association for the Asia Pacific region, now headed by artificial intelligence investor Mario Hardy. Phocuswright is the global nexus for technology in travel and tourism. Connecting Travel is a new professional social network initiative for the travel and tourism industry by Travel Weekly. (Both Phocuswright and Travel Weekly are now owned by Northstar Travel Media.)

The opening speaker was the prominent investor and philanthropist Mohandas Pai, in his role as chairman of the Karnataka Tourism Vision Group. Pai, who is heavily invested in tripfactory.com, provided a 360 degree overview of the skyrocketing digital economy in India, as well as its impacts on travel and tourism, both domestically and internationally. One of the most interesting things he mentioned was the Aadhaar, or Unique Identification Authority of India, basically the world's largest national identification number project, set to biometrically empower millions of people without conventional paper trail or fixed abode.

Tony D'Astolfo, managing director of Phocuswright, introduced this new "Phocuswright Fast Track", by calling it an event within an event. Phocuswright offers recent research on the Indian travel market, and not only maintains a dedicated team in India, but also is planning a full Phocuswright India travel technology conference 21-22 April 2016 near New Delhi, in Gurgaon.

Chetan Kapoor, Phocuswright research analyst for Asia Pacific, put the spotlight on Indian holidays and package travelers, highlighting the evolution of the Indian traveler, and how their shopping and booking habits are transforming traditional holidays and packages.

In the first executive roundtable, titled "Beyond Air - The Next Phase of India's Online Travel Story", Chetan Kapoor presented three of India's new travel and tourism heavyweights:
In terms of traffic, TripAdvisor is consistently within the top 3 travel sites in India, listing more than 30,000 Indian accommodations, with the largest number of reviews. HolidayIQ is a Bangalore-based travel information and review portal, with over 3 million members, listing 2,000 tourism destinations, and more than 50,000 accommodations, in India alone. Cleartrip is one of the top online travel agents in India, attracting more than $70 million in funding.

In the second executive roundtable, titled "Travel Innovation Summit Alumni Spotlight", Tony D'Astolfo introduced three of India's most innovative entrepreneurs to discuss how they are transforming the travel industry, at home and abroad:

Intuitive travel planner Mygola has recently been acquired by MakeMyTrip, one of India’s leading travel companies. TableGrabber, India's first real-time online restaurant reservation system, has recently launched RezGuru, a middle-layer software for restaurants. TripHobo, a travel itinerary-planning portal, recently announced a partnership with Zomato, a leading restaurant discovery platform made in India.

For the executive interview, Tony D’Astolfo did a one-on-one with Ritesh Agarwal, 21-year-old founder and CEO of OYO Rooms, India's largest branded network of hotels. Not only is he one of India's youngest CEOs, but also India's most successful college drop-out. Agarwal was the first Indian to receive a $100,000 fellowship grant from Peter Thiel, which he invested in developing OYO Rooms. And most recently, he has raised $100 million from Japan’s SoftBank for OYO Rooms. Legend has it that Agarwal started OYO Rooms, which stands for "On Your Own", because his relatives would not let him control the TV remote when he was a child in India. On a personal note, I can say for sure that I am staying in better places in India, and paying less, now than I was a year ago, due to the phenomenal concept that is OYO Rooms.

Following lunch, Connecting Travel organized the "Technology Trends Defining Business Strategy" session, moderated by Tony Tenicela, IBM executive and global leader managing business development. This session focused on how global market players are redefining business models to adapt to the accelerated pace of communication, marketing, and loyalty initiatives. Social media, and virtual networks, figure prominently in creating vertical platforms that are aggregating professionals, consumers, advisers and investors into communities.
Helena Egan, director of industry relations at TripAdvisor, is primarily concerned with building relationships with destination marketing organisations, as well as educating the industry on the benefits of user-generated content. Kenny Picken, CEO of Traveltek, a leading provider of travel technology solutions, shared valuable insights into how Traveltek empowers industry stakeholders, rather than by-passing them. Philip Napleton, VP at Open Destinations, providing software for tour operators and wholesalers, emerged as the voice of the younger generation, with his insight into social media and mobile applications. Rika Jean-Francois, head of corporate social responsibility for Internationale Tourismus-Boerse Messe Berlin, was the only person to emphasize the potential of travel technology in developing sustainable tourism. Mike Kistner, CEO of RezNext, a real-time hotel distribution technology company, provided the perspective of the seasoned travel technology professional. Daniela Wagner, Connecting Travel at Travel Weekly, spoke of how their new social network platform can benefit travel professionals.

Note, YourStory is the largest platform for news, reports and analysis on India's booming startup ecosystem.

07 December 2013

Taking It On The Road, Travel Technology 2013

I spent much of this year mountaineering in Europe, and re-visiting India after 34 years. Since my father's passing earlier this year, I've been free to travel again. My father was my chief technology influence. People often ask how I got into technology, since my education was in psychology and most of my career was in tourism. It was all due to my father, Lucian J. Endicott Jr., who worked nearly three decades for IBM, and then became a professor of computer science before retiring altogether.

On this journey I've been watching closely to see what technologies I find most useful. Unlike most of the young people traveling today, I'm not traveling with a phone. I did have an Apple iPod Touch for a while, which I enjoyed, but passed it on in favor of the new Google Nexus Android tablet. I find phones and tablets great for everything other than real work, like programming. I did buy the most economical, top rated Consumer Reports laptop for students, and have been very happy with it.

A man can only travel with so many devices though; so, the Apple iPod Touch and Android tablet both went to nieces, and I'm still happily traveling with my affordable laptop. In both Europe and Asia I have found locally available, prepaid "data cards" or "dongles", basically a phone chip on a USB stick, very helpful for freeing myself from dependence on wifi. However, some of the new, higher end phones come with built-in wifi "hotspot" capability, which I've seen quite a few young people using with their laptops. Without a phone per se, Skype has proven super convenient, especially premium Skype to local landlines and SMS, for literally calling from anywhere to anywhere.

I have to say that I use Skyscanner a lot, and feel it's saved me a tremendous amount of money. The only caveat is that some of the smallest new budget airlines are not included. For accommodation, I have tried both Couchsurfing and Airbnb for the first time this trip. I've actually found Couchsurfing more useful for meeting interesting, colorful people at my destinations than for easily organizing free overnights. I have also been surprised by the number of people running accommodation operations under the radar via Airbnb, rather than truly private persons renting spare rooms, though I've been satisfied with the service.

Problematic as it may be, I do find myself using Wikitravel precisely because it provides less information rather than more. I like Wikitravel because it gives me a really quick overview of the high points, what to see and do, even for out of the way destinations. I sometimes download the entire page for offline reading, when there is no wifi available. I find both its strongest and weakest points are accommodation. Strong because anyone can add to it, so often gets places under the radar, and weak because it's totally unorganized and un-rated. Because of this deficiency, I find myself often referring to TripAdvisor reviews to double check the lower end accommodations.

Another relatively new technology I'm using a lot is Google Maps, of course frequently for directions, but also particularly for the "Search nearby" capability. I find Google Maps Search nearby capability delightful for discovering new places of all kinds, many perhaps never visited by tourists before. Especially in India directions should be taken with a grain of salt, because places usually seem to be "pinned" imprecisely, so caveat emptor. I find screenshots great for capturing Google Maps directions, easily cut and paste with the "prt sc" key for convenient offline reading.

More than ever before, travel for me is more about people than places. These days I love to visit with friends, old and new, at home or abroad. The reality is that people are now using Facebook more for personal communications than email. Facebook even makes it easier for people traveling in the same regions to meet up along the way. Another reality is that places, particularly low end places, are as likely to have Facebook pages as websites, essentially turning Facebook into its own parallel universe – no other Internet required. In fact, I only use Facebook for people, places/pages and events – and no other bells or whistles, such as innumerable travel apps, etc.

It was Facebook (and my nieces) that finally made me break down and get a small camera (Nikon Coolpix) for posting travel snaps as I go along, often the same day. Though that could be the number one use for phones I'm seeing on the road, not only for taking pictures but also for uploading them in virtual realtime….

04 September 2013

Dissecting the Summarization Process

This is in effect a mid-2013 progress update. As with many of my blog posts, this is as much a status update for me to get a better handle on where I'm at as it is to broadcast my progress.

mendicott.com is a blog reflecting on my journey with the overall project. This blog started seven years ago, in 2006, with my inquiry into The difference between a web page and a blog.... I had then returned from something like five years of world travel to find the digerati fawning over the blogosphere. At first, I failed to see the difference between a blog and a content management system (CMS) for stock standard web pages. Upon closer examination, I began to realize that the real difference lay in the XML syndication of blog feeds into the real-time web.

meta-guide.com is an attempt to blueprint, or tutorialize, the process. My original Meta Guide 1.0 development in ASP attempted to create automated, or robotic, web pages based on XML feeds from the real-time web. Meta Guide 2.0 development was based on similar feed bots, or Twitter bots, in an attempt to automate, or at least semi-automate, the rapid development of large knowledgebases from social media via knowledge silos. Basically, I use knowledge templates to automatically create the knowledge silos, or large knowledgebases. The knowledge templates are based on my own, proprietary "taxonomies", or more precisely faceted classifications, painstakingly developed over many years.

gaiapassage.com aims to be an automated, or semi-automated, summarization of the knowledge aggregated from social media by feed bots via the proprietary faceted classifications, or knowledge templates. Right now, I'm doing a semi-automated summarization process with Gaia Passage, which consists of automated research in the form of knowledge silos being "massaged" in different ways, but ultimately manually writing the summarization in natural language. This is allowing me to analyze and attempt to dissect the processes involved in order to gradually prototype automation. Summarization technologies, and in particular summarization APIs, are still in their infancy. Examples of currently available summarization technologies include automatedinsights.com and narrativescience.com. The overall field is often referred to as automatic summarization.

In the future, the Gaia Passage human readable summarizations will need to be converted into machine readable dialog system knowledgebase format. The dialog system is basically a chatbot, or conversational user interface (CUI) into a specialized database, called a knowledgebase. Most common chatbot knowledgebases are based on, or compatible with, XML, such as AIML for example. Voice technologies, both output and input, are generally an additional layer on top of the text based dialog system.
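To make the XML knowledgebase idea concrete, here is a minimal sketch of how an AIML-style category could be loaded and queried. The sample category, the function names `load_categories` and `reply`, and the exact-match lookup are my own illustrative assumptions; real AIML interpreters also handle wildcards, recursion and context, none of which is attempted here.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-category knowledgebase in AIML-style XML.
AIML = """<aiml version="1.0.1">
  <category>
    <pattern>WHAT IS ECOTOURISM</pattern>
    <template>Responsible travel to natural areas.</template>
  </category>
</aiml>"""


def load_categories(aiml_text):
    """Map each <pattern> to its <template> text."""
    root = ET.fromstring(aiml_text)
    return {
        cat.findtext("pattern").strip(): cat.findtext("template").strip()
        for cat in root.iter("category")
    }


def reply(categories, user_input, default="I do not know."):
    """Normalize the input (upper case, no '?', single spaces), then exact-match."""
    key = " ".join(user_input.upper().replace("?", "").split())
    return categories.get(key, default)


categories = load_categories(AIML)
print(reply(categories, "What is ecotourism?"))
```

Converting a human readable summary into this format would then amount to generating one category per question-and-answer pair.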

The two main bottlenecks I've come up against are what I like to call artificial intelligence middleware, or frameworks, the "glue" to integrate the various processes, as well as adequate dialog system tools, in particular chatbot knowledgebase tools with both "frontend" and "backend" APIs (application programming interface), in other words a dialog system API on the frontend with a backend API into the knowledgebase for dynamic modification. My favorite cloud based "middleware" is Yahoo! Pipes, which is generally referred to as a mashup platform (aka mashup enabler) for feed based data; however, there are severe performance issues with Yahoo! Pipes -- so, I don't really consider it to be a production ready tool. Like Yahoo! Pipes, my ideal visual, cloud based AI middleware could or should be language agnostic -- eliminating the need to decide on a single programming language for a project. I have also looked into scientific computing packages, such as LabVIEW, Mathematica, and MATLAB, for use as potential AI middleware. Additionally, there are a variety of both natural language and intelligent agent frameworks available. Business oriented cloud based integration, including visual cloud based middleware, is often referred to as iPaaS (integration Platform as a Service), integration PaaS or "Integration as a Service".

The recent closure of the previously open Twitter API with OAuth has set my feed bot, or "smart feed", development back by years. Right now, I'm stuck trying to figure out the best way to use the new Twitter OAuth with Yahoo! Pipes, for instance via YQL, if at all. And if that were not enough, the affordable and user-friendly dialog system API, verbotsonline.com, that I was using went out of business. There are a number of dialog system API alternatives, even cloud based dialog systems, but they are neither free nor cheap, especially for significant throughput volumes. Still to do: 1) complete the Gaia Passage summarizations, 2) make Twitter OAuth work, use a commercial third party data source (such as datasift.com, gnip.com or topsy.com), or abandon Twitter as a primary source (for instance concentrate on other social media APIs instead, such as Facebook), 3) continue the search for a new and better dialog system API provider.

Most basically, the Gaia Passage project is a network of robots that will not only monitor social media buzz about both the environment and tourism but also interpret the inter-relations, cause and effects, between environment and tourism -- such as how climate change affects the tourism industry either negatively or positively, or even what effects the weather has on crime trends for a particular destination -- as well as querying these interpreted inter-relations, or "conclusions", via natural language. If this can be accomplished with any degree of satisfaction, either fully automated or semi-automated, then the system could just as easily be applied to any other vertical. Proposals from potential sponsors, investors, or technology partners are welcomed, and may be sent to mendicot [at] yahoo.com.

13 March 2013

A New Website For A New Age: GaiaPassage.com

GaiaPassage.com is subtitled "Marcus L Endicott's favorite tips for green travel around the world".  I'm calling it a deep green, eco-centric travel guide to the whole Earth.  My Gaia Passage project will be a handwritten ecotourism guide to the entire world, based on the circa 250x ccTLD.  The general idea is to write a "white paper" for every country in the world, on environmental and cultural conditions, issues, and who is doing what about them, as well as examining both how they affect tourism and how tourism affects those issues. Anyone could write a lot about something, but the idea here is to provide "snapshots", or "bite sized" summaries, of only the best information and contacts.  The name "Gaia Passage" originally came from my pre-Internet (mid-1980s) travel tips newsletter. The site is a work in progress; so far, I've completed the entire Western Hemisphere:
GaiaPassage.com is handwritten, but based on automated research and automated outline. Primary research is based on data mining 20 years of Green Travel archives. Secondary research is based on multiple years of Meta Guide Twitter bots archives. Significance is based on primary sources in the form of root website domains, and/or secondary sources in the form of Wikipedia entries. In other words, if there is not a root website domain name or a Wikipedia entry then it is unlikely to be included. (However, almost anything may be included in Wikipedia - if properly referenced.) 

I have noticed that many websites of smaller concerns are going down, offline, apparently due to the economic downturn. However, social media such as Twitter and Facebook do present affordable alternatives to owning a root domain website, and I will take these into consideration when appropriate. (In other words, when something is really cool.)  I have also noticed a lot of people using Weebly to make free websites. (Note, GaiaPassage.com currently uses the free Google Sites platform.)

In the early evolution of a website, especially large projects, it's important to first have the "containers" in place as "placeholders", which is no small task in itself. With circa 250x countries and territorial entities, that's a whole year's fulltime work for one man, revising one country per working day. This would mean initial completion by December 2013. Eventually, GaiaPassage.com entries may morph into socialbots, or conversational assistants, containing not only all the knowledge about sustainable tourism gleaned from past Green Travel archives, but also current knowledge resulting from the Meta Guide Twitter bots.

In my previous blog, 250 Conversational Twitter Bots for Travel & Tourism, I detailed my 250x Meta Guide Twitter bots, one for every country and territory in the Internet ccTLD. Basically, I've spent the past five years working on artificial intelligence and conversational agents - and tweeting about it all the while (links below). I had been using Twitter extensively as a framework; however, Twitter has become increasingly protectionist, most dramatically illustrated by the high profile 2012 Twitter-LinkedIn divorce. The Twitter API has become a moving target, and it is just too costly for me to keep playing catch up. In short, I find the "Facebook complex" of Twitter management immensely annoying, and decided to stop contributing original content; so, my New Year's resolution was to stop tweeting manually at least for all of 2013. Further, my excellent dialog system API, VerbotsOnline.com, went out of business in 2012. Any other good dialog system API I found to replace it turned out to be much too expensive. As a result, all my conversational agents are shut down, at least for 2013. My hope is that the sector will shake out and/or advance during the year, and better or at least more affordable conversational tools will become available next year.

19 June 2012

250 Conversational Twitter Bots for Travel & Tourism

The reason I haven't updated this blog in almost a year is that I have moved most online development to my Meta-Guide.com website. In the previous two postings, I began testing my content repatriation strategy, in other words aggregating my own content from around the web, which I've continued on the Meta Guide website, in fact concentrating on seeding new webpages from mining the past four years of my own tweets. I have also made a prototype summarizer, which I am now training on my Meta Guide website in order to extract content from it to add on top of the mined tweets when building out new webpages. At the moment, I have three immediate goals. I would like to reach 10,000 tweets, 1,000 Meta Guide webpages, and 100 theses in AI & NLP (from the past 10 years). I only have about another 3,000 tweets to go, so maybe another year, about 300 webpages left to make, and less than 30 more theses to discover.

This past weekend, within view of the spectacular Colorado Rocky Mountains, I succeeded after some struggle in making my 250x Meta Guide Twitter bots conversationally interactive on Twitter. These are 250x manually constructed Twitter bots, one for every country, based on country code top-level domain. That includes one for each of the 193 member states of the United Nations, plus an additional 57 various and sundry territories included in the ccTLD. All of these Meta Guide Twitter bots are powered by my @VagaBot, a single cloud-based Verbot engine from Verbotsonline, using the undocumented API and connected to Twitter via Yahoo! Pipes. Previously they have just been retweeters, aggregating country-specific travel and tourism tweets. The next phase of development will involve marrying the incoming retweets to the outgoing responses in some meaningful way, in other words datamining the incoming retweets and attempting to process them semantically into answers.

You should now be able to @sign tweet any of the Meta Guide Twitter bots with questions. Currently, message turnaround time is running up to 30 minutes, which is par for Twitter. Among other things, replies contain lines from my travel books, see Vagabond Globetrotting 3 & From the Balkans to the Baltics. If you are interested in learning more about me and what I do, I recommend watching both Part 1 & Part 2 of my recent videos on "Open Chatbot Standards for a Modular Chatbot Framework", presented in Philadelphia at Chatbots 3.2: Fifth Colloquium On Conversational Systems. If you need help with socialbots for your social CRM, I am available for consulting; just check my Contact page for details, follow me on Twitter, or connect on LinkedIn, and let's Skype!

06 July 2011

My Cleverbot Tweet-FAQ

This is an experimental "Tweet-FAQ", a cumulative listing of my tweets, microblog postings to Twitter, to date about the chatbot Cleverbot and its sister chatbot Jabberwacky, their creator Rollo Carpenter, and his companies Icogno and Existor.

  • According to Slashdot, Oct 2010, Cleverbot had 45 million lines of memorized user chat, at a rate of doubling every year http://t.co/CRfUpqG
  • http://existor.com .. "conversational AI for business, education and entertainment" .. @existor .. founded by Rollo Carpenter in 2008 ..
  • Not impressed w/ http://cleverbot.com/app "Cleverbot HD" ($2.99) interface.. "emotional avatar" is lame.. needs animated avatar w/ voice-io
  • Version 1.2 sees Cleverbot renamed Cleverbot HD http://tinyurl.com/32xxu74 .. Cleverbot iPhone / iPad app requires WiFi ($2.99) .. #Icogno
  • So I asked Cleverbot.com .. "Are you Bayesian?" .. and it replied "Yes" ..
  • http://liveenglish.ru .. George Jabberwacky teaches Russians English .. first simulator of spoken English .. whole day access only 39 rubles
  • "Learning, creating, phrasing" By Rollo Carpenter, 25th March 2010, Third colloquium on conversational systems http://tinyurl.com/ygbqyyf ..
  • Jabberwacky Cleverbot http://cleverbot.com "learns to be clever from real people, and its AI can 'say' things you may think inappropriate"

24 January 2011

How Many PlayStations Make A Watson?

"The words are just symbols to the computer. How does it know what they really mean?" - David Ferrucci

IBM Watson is IBM's project to create the first computer that can win the TV quiz show Jeopardy when pitted against human contestants, including the record holder for the longest championship streak, Ken Jennings, and the current biggest all-time Jeopardy money winner, Brad Rutter. The resulting computer will be a contestant on Jeopardy next month, February 2011. I will try to give an overview here of what is known to date about IBM Watson from open sources. I'm writing this as much for my own learning as any other reason; so, give me a break if it gets a little fuzzy in the complicated parts; besides, IBM is not showing all of its cards. Of course, I welcome any and all comments, corrections and clarifications. BTW, in the spirit of full disclosure, I am a so-called "IBM brat" having grown up in an IBM family; my father @ljendicott worked for IBM, 1960-1987.

According to the IBM DeepQA FAQ, the history of Watson includes both "Project Halo", the quest for a "digital Aristotle", and AQUAINT, the Advanced Question Answering for Intelligence program. In fact, David Ferrucci, principal investigator for the DeepQA/Watson project, has four publications listed in the AQUAINT Bibliography. The earliest version of Watson was a trial of IBM’s AQUAINT system called PIQUANT, Practical Intelligent Question Answering Technology, adapted for the Jeopardy challenge. Another question answering system, a contemporary of PIQUANT called Ephyra (now available as OpenEphyra), was used with PIQUANT in early trials of Watson, both by IBM and their partners at the Language Technologies Institute at Carnegie Mellon University (who are jointly developing the "Open Advancement of Question Answering Systems" initiative). One of the things OpenEphyra can do that Watson doesn't do at the moment is retrieve the answers to natural language questions from the Web.

IBM Watson is not a conversational intelligence per se, but rather a question answering system (QA system). It is fully self-contained and not connected with the Internet at all. Watson does have an active Twitter account at @IBMWatson, but it is operated by a group of Watson's handlers (using CoTweet). Watson has no speech recognition capability; questions are delivered textually. It does not have autonomous text-to-speech (TTS) capability: TTS must be triggered by an operator (ostensibly to avoid interruptions to the television performance). New York Times readers have tentatively identified the voice of Watson TTS as that of Jeff Woodman. Presumably, an IBM WebSphere Voice product is being used for Watson TTS.

The distinctive "face" or avatar of IBM Watson, about the size of a human head, expresses emotion and reacts to the environment. It was created by Joshua Davis with Adobe Flash Professional CS5 using the ActionScript HYPE visual framework deployed via Adobe Flash Player 10.1. The avatar is connected to Watson via an XML socket server, which sends information about the computer’s current mood or state, such as “I know the answer”, “I won the buzz”, etc. The avatar also receives audio input from Watson’s voice by analyzing audio from the microphone.

IBM Watson is built on a massively parallel supercomputer. The hardware configuration consists of a room-sized system, about the size of 10 refrigerators: 10 racks containing a cluster of 90 IBM Power 750 servers connected over 10 Gb Ethernet. Each Power 750 contains 4 chips and 32 cores, and its processor is supposedly the world's fastest. IBM Watson therefore has a total of 360 computer chips (90 x 4) and 2,880 processor cores (90 x 32). It has 15 terabytes of RAM, and a total data repository of 4 terabytes, consisting of two 2 terabyte (TB) I/O nodes. IBM Watson operates at some 80 teraFLOPS, or 80 trillion floating-point operations per second. (For comparison, both IBM Blue Gene and the AFRL Condor Cluster operate at some 500 teraFLOPS.)
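The totals above follow directly from the per-server figures, and a few lines of Python make the check explicit. This is only a sanity check of the numbers quoted in this post; the variable names are mine.

```python
# Figures quoted in the paragraph above.
SERVERS = 90             # IBM Power 750 servers
CHIPS_PER_SERVER = 4
CORES_PER_SERVER = 32
WATSON_TERAFLOPS = 80
COMPARISON_TERAFLOPS = 500   # quoted for both Blue Gene and the Condor Cluster

total_chips = SERVERS * CHIPS_PER_SERVER
total_cores = SERVERS * CORES_PER_SERVER
cores_per_chip = CORES_PER_SERVER // CHIPS_PER_SERVER
speed_ratio = COMPARISON_TERAFLOPS / WATSON_TERAFLOPS

print(total_chips, total_cores, cores_per_chip, speed_ratio)  # 360 2880 8 6.25
```

In other words, each chip carries 8 cores, and the two comparison systems are quoted at a little over six times Watson's raw throughput.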

Many sources, including the Wall Street Journal, are claiming Watson's 4 terabytes (TB) of storage contains some 200 million "pages" of content. Wired claimed only 2 million pages of data for Watson. 1TB (or 1,024GB) can hold roughly the contents of the books in a large library (or about 1,610 CDs). Large municipal libraries may contain an average of 10,000 volumes. So, if a book averaged say 200 pages, then Watson's 4TB should contain closer to something like 8 million pages of content (4 x 10,000 x 200). Content sources include unstructured text, semistructured text, and triplestores.
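My back-of-the-envelope estimate can be written out as below. The rule of thumb (1TB is about one large library of 10,000 volumes at 200 pages each) is my own rough assumption from the paragraph above, not an IBM figure, and the function names are made up for illustration.

```python
STORAGE_TB = 4
VOLUMES_PER_TB = 10_000   # assumption: 1 TB ~ one large library
PAGES_PER_VOLUME = 200    # assumption: average book length


def estimated_pages(storage_tb,
                    volumes_per_tb=VOLUMES_PER_TB,
                    pages_per_volume=PAGES_PER_VOLUME):
    """Pages of content implied by the library rule of thumb."""
    return storage_tb * volumes_per_tb * pages_per_volume


def implied_bytes_per_page(storage_tb, pages):
    """Average page size a claimed page count would imply (1 TB = 1024**4 bytes)."""
    return storage_tb * 1024**4 / pages


print(f"{estimated_pages(STORAGE_TB):,}")  # 8,000,000
print(round(implied_bytes_per_page(STORAGE_TB, 200_000_000)))
```

Seen the other way around, the 200 million page claim would imply an average of roughly 22 KB per page, which is plausible for plain text with markup, so the three estimates differ mainly in what they count as a "page".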

Watson's software configuration consists basically of SUSE Linux Enterprise Server 11, Apache Hadoop, and UIMA-AS. SUSE Linux Enterprise Server 11 is a Linux distribution supplied by Novell and targeted at the business market. Apache Hadoop is a software framework that supports data-intensive distributed applications, including an open source version of MapReduce, enabling applications to work with thousands of nodes and petabytes of data. UIMA-AS (Unstructured Information Management Architecture - Asynchronous Scaleout) is an add-on scaleout framework supporting flexible scaleout with Java Message Service. Hadoop facilitates Watson's massively parallel probabilistic evidence-based architecture by distributing it over the thousands of processor cores.

The DeepQA architecture has three layers: natural language processing (NLP), knowledge representation and reasoning (KRR), and machine learning (ML). The IBM Watson team used every trick in the book for DeepQA; apparently they couldn't decide which natural language processing techniques to use, so just used them all. Each one of Watson's 2,880 processor cores can be used like an individual computer, enabling Watson to run hundreds if not thousands of processes simultaneously. For instance, each processor thread could host a separate search. All the hundreds of components in DeepQA are implemented as UIMA annotators. The internal communication among processes is handled in UIMA by OpenJMS, an open source version of Java Message Service. The IBM Content Analytics product LanguageWare is used in Watson for natural language processing. According to David Ferrucci, Watson contains "about a million lines of new code".

Processing steps: (1) Question Analysis -> (2) Query decomposition -> (3) Hypothesis generation -> (4) Soft filtering -> (5) Evidence scoring -> (6) Synthesis -> (7) Merging and ranking -> (8) Answer and confidence

(1) Question Analysis:

In the UIMA architecture, the collection processing engine consists of the collection reader, analysis engine and common analysis structure. Collection level processing contains the entity registrar with event, entity and relation coreferencers, ultimately creating a semantic search index, the feature structure or common analysis structure store in XML and extracted knowledge database. The UIMA analysis engine consists of programs that analyze documents and infer information about them. The extracted knowledgebase resides in an IBM DB2 database.

Data in the common analysis structure can only be retrieved using indexes. Indexes are analogous to the indexes that are specified on tables of a database, and are used to retrieve instances of type and subtypes. In addition to a base common analysis structure index, there are additional indexes for annotated views, created by natural language processing techniques such as tokenization and named entity recognition.

In the Jeopardy game show, contestants are presented with clues in the form of answers, and must phrase their responses in question form. Watson receives questions or "clues" textually and then breaks them down into subclues. Question clues often consist of relations, such as syntactic subject-verb-object predicates and semantic relationships between subclues such as entities. A semantic search is where the intent of the query is specified using one or more entity or relation specifiers. Triplestore queries in the primary search are based on named entities in the clue. Watson can use detected relations to query a triplestore and directly generate candidate answers. Triplestore sources in Watson include dbpedia.org, wordnet.princeton.edu and YAGO (which itself is a combination of dbpedia, WordNet and geonames.org). Triplestore and reverse dictionary lookup can produce candidate answers directly as search results. Reverse dictionary lookup is where you look up a word by its meaning, rather than vice versa.
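To illustrate how a detected relation can be turned into a triplestore query that yields candidate answers directly, here is a toy sketch. The three facts, the predicate names and the functions `query` and `candidate_answers` are invented for the example; this is an in-memory list, not DBpedia, YAGO or whatever query layer Watson actually uses.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy facts in (subject, predicate, object) form.
TRIPLES: List[Triple] = [
    ("Paris", "capitalOf", "France"),
    ("Berlin", "capitalOf", "Germany"),
    ("Paris", "locatedOn", "Seine"),
]


def query(subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return every triple matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o) for (s, p, o) in TRIPLES
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def candidate_answers(predicate: str, obj: str) -> List[str]:
    """Subjects standing in the given relation to the named entity."""
    return [s for (s, _, _) in query(predicate=predicate, obj=obj)]


# Clue: "It is the capital of France" -> relation capitalOf, entity France.
print(candidate_answers("capitalOf", "France"))  # ['Paris']
```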

(2) Query decomposition:

DeepQA supports nested decomposition, or query decomposition, a kind of stochastic programming, where questions are broken down into more easily answered subclues. Nesting means that an inner subclue is nested in the outer clue, so the subclue can be replaced with an answer to form a new question that can be answered more easily.
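The nesting idea can be sketched in a few lines of Python. The lookup table below stands in for a complete question answering subsystem, and the function names and example clue are hypothetical.

```python
# Hypothetical lookup table standing in for a full QA subsystem.
FACTS = {
    "capital of India": "New Delhi",
    "river flowing past New Delhi": "Yamuna",
}


def answer(question):
    """Answer a simple question by table lookup (an assumption for illustration)."""
    return FACTS[question]


def answer_nested(outer_template, inner_question):
    # Step 1: answer the inner subclue on its own.
    inner_answer = answer(inner_question)
    # Step 2: substitute that answer into the outer clue to form a simpler question.
    rewritten = outer_template.format(inner=inner_answer)
    # Step 3: answer the rewritten question.
    return rewritten, answer(rewritten)


rewritten, final = answer_nested("river flowing past {inner}", "capital of India")
```

Here the outer clue "river flowing past the capital of India" becomes "river flowing past New Delhi" once the inner subclue has been answered.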

(3) Hypothesis generation:

In constructing hypotheses, Watson creates candidate answers and intermediate hypotheses, and then checks hypotheses against WordNet for "evidence", dealing with hundreds of thousands of evidence pairs. Watson uses the offline version of WordNet, a lexical database that groups English words into synsets, or sets of synonyms, that provide definitions and record semantic relationships. Chris Welty, David Gondek, JW (Bill) Murdock and Chang Wang are the IBM Watson Algorithms Team machine learning experts. Wang in particular is an expert in "Manifold Alignment". In engineering, manifolds typically bring one into many or many into one. According to Wang, "Manifold alignment builds connections between two or more disparate data sets by aligning their underlying manifolds and provides knowledge transfer across the data sets". Watson uses logical form alignment to score on grammatical relationships, deep semantic relationships or both. Inverse document frequency is used as a statistical measure of word importance, and the Smith-Waterman algorithm compares word sequences between questions and candidate answers for evidence.
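For readers who have not met these two measures, here is a small Python sketch of each. The smoothed IDF formula, the scoring parameters (match 2, mismatch -1, gap -1) and the sample sentences are my own choices for illustration; I don't know the exact variants or parameters Watson uses.

```python
import math


def idf(term, documents):
    """Smoothed inverse document frequency: rarer terms get a higher weight."""
    containing = sum(1 for doc in documents if term in doc)
    return math.log((1 + len(documents)) / (1 + containing)) + 1.0


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between token sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,                        # start a fresh local alignment
                score[i - 1][j - 1] + step,
                score[i - 1][j] + gap,    # gap in b
                score[i][j - 1] + gap,    # gap in a
            )
            best = max(best, score[i][j])
    return best


clue = "the first man on the moon".split()
passage = "armstrong was the first man on the moon".split()
alignment = smith_waterman(clue, passage)

# Documents are modeled as sets of tokens for the IDF calculation.
docs = [set(passage), {"the", "moon", "is", "bright"}, {"the", "cat"}]
rare, common = idf("armstrong", docs), idf("the", docs)
```

The six clue tokens line up exactly with the tail of the passage, so the alignment score is six matches at two points each, while "armstrong" outweighs "the" because it appears in only one document.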

(4) Soft filtering:

Soft filtering may consist of a lightweight scorer computing the likelihood that a candidate answer is simply an instance of the lexical answer type, or LAT. A LAT is a word in the clue that indicates the type of answer required, independent of assigned semantics. Watson uses the lexical answer type for deferred type evaluation. Interestingly, Ferrucci's name is on an IBM patent application (System And Method For Providing Question And Answers With Deferred Type Evaluation), which includes lexical answer type. In the patent method, processing of a query waits until a descriptor (type) has been determined and a candidate answer has been provided. Then a search looks for evidence that the candidate answer has the required lexical answer type, or the method may attempt to match the LAT to a known ontological type (OT). The evidence from these different ways of determining that the candidate answer has the expected LAT is combined, and one or more answers are delivered to the user. The IBM Watson team found 2500 distinct and explicit LATs in the 20,000 Jeopardy Challenge question sample; the most frequent 200 explicit LATs covered less than 50 percent of the questions.
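A lightweight LAT scorer with a threshold might look something like the following Python sketch. The type dictionary, the 1.0/0.1 scores and the 0.5 threshold are invented for illustration; the real scorer and its threshold are learned, as far as I understand it.

```python
# Hypothetical type dictionary: the lexical types each candidate is known by.
KNOWN_TYPES = {
    "Yamuna": {"river", "waterway"},
    "New Delhi": {"city", "capital"},
    "Ganges": {"river"},
}


def lat_score(candidate, lat):
    """Cheap score: 1.0 if the candidate is a known instance of the LAT, else 0.1."""
    return 1.0 if lat in KNOWN_TYPES.get(candidate, set()) else 0.1


def soft_filter(candidates, lat, threshold=0.5):
    # Candidates under the threshold skip the expensive evidence scoring,
    # but they are set aside rather than declared wrong outright.
    passed = [c for c in candidates if lat_score(c, lat) >= threshold]
    deferred = [c for c in candidates if lat_score(c, lat) < threshold]
    return passed, deferred


passed, deferred = soft_filter(["Yamuna", "New Delhi", "Ganges"], lat="river")
```

The "soft" part is that nothing is thrown away: the filter only decides which candidates deserve the heavier scoring that follows.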

(5) Evidence scoring:

There are two layers of machine learning on top of the many NLP processes. Learners located at the bottom layer are called base learners, and their predictions are combined by metalearners in the upper layer. On top of the first learning layer is a reasoning layer, which includes temporal reasoning, statistical paraphrasing, and geospatial reasoning, in order to gather and weigh evidence over both the unstructured and structured content to determine an answer with the most confidence. Watson uses about 100 algorithms for rating each of up to some 10,000 sets of possible answers for every question. Trained classifiers score each of the hundreds of NLP processes.

One type of scorer uses knowledge in triplestores for simple reasoning, such as subsumption and disjointness in type taxonomies, geospatial and temporal reasoning. Temporal reasoning is used in Watson to detect inconsistencies between dates in the clue and those associated with a candidate answer. Paraphrasing is the expression of the same message in different words. Statistical paraphrasing is the use of a statistical sentence generation technique that recombines words probabilistically to create new sentences. Geospatial reasoning is used in Watson to detect the presence or absence of spatial relations, such as directionality, borders and containment between geoentities.
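The temporal check is the easiest of these to picture in code. Below is a minimal Python sketch, assuming the clue has already been reduced to a single year and each candidate to a lifespan; the function name and that simplification are mine, not Watson's.

```python
def temporally_consistent(clue_year, birth_year, death_year=None):
    """True if the clue's year falls within the candidate's lifespan."""
    if clue_year < birth_year:
        return False
    if death_year is not None and clue_year > death_year:
        return False
    return True


# A clue dated 1905 is compatible with Einstein (1879-1955)
# but not with Newton (1643-1727).
einstein_ok = temporally_consistent(1905, 1879, 1955)
newton_ok = temporally_consistent(1905, 1643, 1727)
```

A scorer like this would not pick the answer by itself; it would only contribute a signal that a candidate's dates clash with the clue.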

(6) Synthesis:

Each subclue of every nested decomposable question is processed by a dedicated QA subsystem, in a parallel process. DeepQA then synthesizes final answers using a custom answer combination component. This custom synthesis component allows special synthesis algorithms to be easily plugged into the common framework.

Aditya Kalyanpur, Siddharth Patwardhan and James Fan are the IBM Watson Algorithms Team reasoning experts. In their 2010 paper, titled "PRISMATIC: Inducing Knowledge from a Large Scale Lexicalized Relation Resource", Fan, Ferrucci, Gondek and Kalyanpur present a system for the statistical aggregation of syntactic frames. A syntactic frame is the position in which a word occurs relative to other classes of words, such as subject, verb, and object. In contrast, a semantic frame can be thought of as a concept with a script used to describe an object, state or event.

(7) Merging and ranking:

Watson uses hierarchical machine learning, a learning methodology inspired by human intelligence, to combine and weigh evidence in order to compute the confidence score, and through training it learns to be predictive. Watson merges answer scores prior to ranking and probabilistic confidence estimation, using a variety of matching, normalization, and coreference resolution algorithms. In this second level of machine learning, metalearner classification systems take classifiers and turn them into more powerful learners, using multiple trained models. Final ranking and merging evaluates hundreds of hypotheses based on hundreds of thousands of scores to identify the best one based on the likelihood it is correct.

(8) Answer and confidence:

Once Watson has been trained on more or less the entire history of the Jeopardy game, the second level of machine learning kicks in to rank the merged scores, using one or more metalearners that have learned to evaluate the results of the first level classifiers. The metalearner combines the base learners' predictions by multiplying each probability by the weight assigned to its base learner and taking the average, having learned how to stack and combine the scores. The ultimate answer results from this statistical confidence.
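The weighted averaging described above can be sketched in Python as follows. The weights and probabilities are made-up numbers, and a real metalearner would learn its weights from training data rather than have them typed in.

```python
def combine(probabilities, weights):
    """Weighted average of base learner probabilities for one candidate."""
    if len(probabilities) != len(weights):
        raise ValueError("one weight per base learner is required")
    return sum(p * w for p, w in zip(probabilities, weights)) / sum(weights)


def rank(candidates, weights):
    # candidates maps each answer to its list of base learner probabilities.
    scored = {ans: combine(ps, weights) for ans, ps in candidates.items()}
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


weights = [0.5, 0.3, 0.2]  # hypothetical learned weights, one per base learner
candidates = {"Yamuna": [0.9, 0.8, 0.6], "Ganges": [0.4, 0.7, 0.5]}
ranking = rank(candidates, weights)
```

With these numbers "Yamuna" comes out at 0.81 and "Ganges" at 0.51, so the first would be the answer offered, with 0.81 as its confidence.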

So, how many PlayStations (PS3) would it take to make an IBM Watson? By my calculation, 320. The AFRL Condor Cluster took about 2,000 PS3s to build and does some 500 teraFLOPS. IBM Watson does 80 teraFLOPS. [500/80 = 6.25 and 2,000/6.25 = 320] The cost of 320 PlayStations would be about $128,000, or a little over a third of the retail price of one IBM Power 750 32 core cluster at around $350,000. (In comparison, as of 2007 PCWorld put IBM's Blue Gene/P system cost at $1.3M per rack, and the Blue Gene/L at $800K.) Deep Blue was a $100 million project. I'm estimating the cost of IBM Watson at up to $50 million, including at least $18 million in labor and potentially up to $31.5 million in material costs. It should be noted that "Jeopardy! And IBM Announce Charities To Benefit From Watson Competition".
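For anyone who wants to check the PlayStation arithmetic, here it is as a few lines of Python. The $400 price per console is an assumption; it is simply what $128,000 divided by 320 implies.

```python
condor_ps3_count = 2000    # PS3 consoles in the AFRL Condor Cluster (approximate)
condor_teraflops = 500
watson_teraflops = 80
ps3_price_usd = 400        # assumption implied by $128,000 / 320
power750_price_usd = 350_000

ratio = condor_teraflops / watson_teraflops             # 6.25
ps3_for_watson = condor_ps3_count / ratio               # 320.0
total_cost_usd = ps3_for_watson * ps3_price_usd         # 128000.0
share_of_power750 = total_cost_usd / power750_price_usd

print(f"{ps3_for_watson:.0f} consoles, ${total_cost_usd:,.0f}, "
      f"{share_of_power750:.0%} of one Power 750")
```

On these figures the consoles come to roughly 37 percent of the price of a single Power 750 cluster.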

The Quora IBM Watson topic is a good place for questions about Watson. To learn more about what makes IBM Watson tick, I suggest watching the IBM video "Building Watson - A Brief Overview of the DeepQA Project" and reading the paper "Building Watson: An Overview of the DeepQA Project". Look for the 'updateable' ebook, "Final Jeopardy", by Stephen Baker. IBM operates an informative web site about their project at ibmwatson.com.

Additional sources:

Ante, Spencer E. "IBM Computer Beats 'Jeopardy!' Champs." The Wall Street Journal. Web. 14 Jan. 2011.

Chambers, Mike. "Avatar for Watson Supercomputer on Jeopardy Created with Flash." Adobe Flash Platform Blog. Adobe Blogs. 13 Jan. 2011.

The Associated Press. "Computer Could Make 2 'Jeopardy!' Champs Deep Blue." Google. 13 Jan. 2011. Web.

Gondek, David. "How Watson “sees,” “hears,” and “speaks” to Play Jeopardy!" IBM Research. 10 Jan. 2011. Web.

Gustin, Sam. "IBM’s Watson Supercomputer Wins Practice Jeopardy Round." Epicenter. Wired.com. 13 Jan. 2011.

McNelly, Rob. "Watson Follows in Deep Blue's Steps." AIXchange. 21 Dec. 2010. Web.

Miller, Paul. "IBM Demonstrates Watson Supercomputer in Jeopardy Practice Match." Engadget. 13 Jan. 2011. Web.

Morgan, Timothy P. "Power 750: Big Bang for Fewer Bucks Compared to Predecessors." Welcome to IT Jungle. 16 Aug. 2010. Web.

NOVA. "Will Watson Win on Jeopardy!?" WGBH. 20 Jan. 2011.

Rhinehart, Craig. "10 Things You Need to Know About the Technology Behind Watson." Craig Rhinehart's ECM Insights. 17 Jan. 2011.

Thompson, Clive. "What Is I.B.M.’s Watson?" NYTimes.com. 16 June 2010.

Wallace, Stein W., and W. T. Ziemba. "Applications of Stochastic Programming." Philadelphia, PA: Society for Industrial and Applied Mathematics, 2005.

Will, Steve. "You and I: IBM Watson’s Storage Requirements." IDevelop. 11 Jan. 2011. Web.

= = =

Appendix 1: Chronological bibliography of David Angelo Ferrucci (David A. Ferrucci, David Ferrucci, D.A. Ferrucci, D. Ferrucci):


Fan J, Ferrucci D, Gondek D, Kalyanpur A. PRISMATIC: Inducing Knowledge from a Large Scale Lexicalized Relation Resource. In: First International Workshop on Formalisms and Methodology for Learning by Reading (FAM-LbR).; 2010:122.

Ferrucci D. Build Watson: an overview of DeepQA for the Jeopardy! challenge. In: Proceedings of the 19th international conference on Parallel architectures and compilation techniques.; 2010:1-2.

Ferrucci D, Brown E, Chu-Carroll J, et al. Building Watson: An Overview of the DeepQA Project. AI Magazine. 2010;31(3):59.

Ferrucci D, Lally A. Building an example application with the unstructured information management architecture. IBM Systems Journal. 2010;43(3):455-475.


Ferrucci D, Lally A, Verspoor K, Nyberg E. Unstructured Information Management Architecture (UIMA) Version 1.0. OASIS Standard. 2009.

Ferrucci D, Nyberg E, Allan J, et al. Towards the Open Advancement of Question Answering Systems. IBM Research Report. RC24789 (W0904-093), IBM Research, New York. 2009.


Drissi Y, Boguraev B, Ferrucci D, Keyser P, Levas A. A Development Environment for Configurable Meta-Annotators in a Pipelined NLP Architecture. In: LREC.; 2008.

Fodor P, Lally A, Ferrucci D. The prolog interface to the unstructured information management architecture. Arxiv preprint arXiv:0809.0680. 2008.


Bringsjord S, Ferrucci D. BRUTUS and the Narrational Case Against Church’s Thesis (Extended Abstract). 2007.


Chu-Carroll J, Prager J, Czuba K, Ferrucci D, Duboue P. Semantic search via XML fragments: a high-precision approach to IR. In: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval.; 2006:445-452.

Ferrucci D, Grossman RL, Levas A. PMML and UIMA based frameworks for deploying analytic applications and services. In: Proceedings of the 4th international workshop on Data mining standards, services and platforms.; 2006:14-26.

Ferrucci D, Lally A, Gruhl D, et al. Towards an interoperability standard for text and multi-modal analytics. IBM Res. Rep. 2006.

Ferrucci D, Murdock JW, Welty C. Overview of Component Services for Knowledge Integration in UIMA (aka SUKI). IBM Research Report RC24074. 2006.

Ferrucci DA. Putting the Semantics in the Semantic Web: An overview of UIMA and its role in Accelerating the Semantic Revolution. In: ; 2006.

Murdock J, McGuinness D, Silva P da, Welty C, Ferrucci D. Explaining conclusions from diverse knowledge sources. The Semantic Web-ISWC 2006. 2006:861-872.


Fikes R, Ferrucci D, Thurman D. Knowledge associates for novel intelligence (kani). In: 2005 International Conference on Intelligence Analysis.; 2005.

Levas A, Brown E, Murdock JW, Ferrucci D. The Semantic Analysis Workbench (SAW): Towards a framework for knowledge gathering and synthesis. In: Proc. Int’l Conf. in Intelligence Analysis.; 2005.

McGuinness DL, Pinheiro P, William SJ, Ferrucci MD. Exposing Extracted Knowledge Supporting Answers. Stanford Knowledge Systems Laboratory Technical 12. 2005.

Murdock JW, Silva PPD, Ferrucci D, Welty C, McGuinness D. Encoding Extraction as Inferences. In: Stanford University. AAAI Press; 2005:92-97.

Welty C, Murdock JW, Da Silva PP, et al. Tracking information extraction from intelligence documents. In: Proceedings of the 2005 International Conference on Intelligence Analysis (IA 2005).; 2005.


Ferrucci D, Lally A. UIMA: an architectural approach to unstructured information processing in the corporate research environment. Natural Language Engineering. 2004;10(3-4):327-348.

Nyberg E, Burger JD, Mardis S, Ferrucci DA. Software Architectures for Advanced QA. In: New Directions in Question Answering.; 2004:19-30.


Chu-Carroll J, Ferrucci D, Prager J, Welty C. Hybridization in question answering systems. In: Working Notes of the AAAI Spring Symposium on New Directions in Question Answering.; 2003:116-121.

Chu-Carroll J, Prager J, Welty C, et al. A multi-strategy and multi-source approach to question answering. NIST SPECIAL PUBLICATION SP. 2003:281-288.

Ferrucci D, Lally A. Accelerating corporate research in the development, application and deployment of human language technologies. In: Proceedings of the HLT-NAACL 2003 workshop on Software engineering and architecture of language technology systems-Volume 8.; 2003:67-74.


Welty CA, Ferrucci DA. A formal ontology for re-use of software architecture documents. In: Automated Software Engineering, 1999. 14th IEEE International Conference on.; 2002:259-262.


Bringsjord S, Ferrucci D. Artificial Intelligence and Literary Creativity: Inside the Mind of Brutus, A Storytelling Machine. Lawrence Erlbaum; 1999.

Welty CA, Ferrucci DA. Instances and classes in software engineering. intelligence. 1999;10(2):24-28.

= = =

[APPLICATION] Method For Processing Natural Language Questions And Apparatus Thereof

[APPLICATION] System And Method For Providing Question And Answers With Deferred Type Evaluation

[APPLICATION] System and method for providing answers to questions
US Pat. 12152411 - Filed May 14, 2008 - International Business Machines Corporation

[APPLICATION] Method and system for characterizing unknown annotator and its type system with respect to reference annotation types and associated reference taxonomy nodes
US Pat. 11620189 - Filed Jan 5, 2007

Method and system for characterizing unknown annotator and its type system with respect to reference annotation types and associated reference taxonomy nodes
US Pat. 7757163 - Filed Jan 5, 2007 - International Business Machines Corporation.

[APPLICATION] Method And Apparatus For Managing Instant Messaging
US Pat. 11459694 - Filed Jul 25, 2006

[APPLICATION] Autonomous system and method for creating readable scripts for concatenative text-to-speech synthesis (TTS) corpora
US Pat. 11332292 - Filed Jan 17, 2006 - International Business Machines Corporation

Question answering system, data search method, and computer program
US Pat. 7844598 - Filed Sep 22, 2005 - Fuji Xerox Co., Ltd.

[APPLICATION] System, method and computer program product for performing unstructured information management and automatic text analysis, and including a document common analysis system
US Pat. 10449264 - Filed May 30, 2003 - International Business Machines Corporation

[APPLICATION] System, method and computer program product for performing unstructured information management and automatic text analysis, including an annotation inverted file system facilitating indexing and searching
US Pat. 10449398 - Filed May 30, 2003 - International Business Machines Corporation

[APPLICATION] System, method and computer program product for performing unstructured information management and automatic text analysis, and providing multiple document views derived from different document tokenizations
US Pat. 10449409 - Filed May 30, 2003 - International Business Machines Corporation

System, method and computer program product for performing unstructured information management and automatic text analysis, and providing multiple document views derived from different document tokenizations
US Pat. 7139752 - Filed May 30, 2003 - International Business Machines Corporation

[APPLICATION] System, Method and Computer Program Product for Performing Unstructured Information Management and Automatic Text Analysis
US Pat. 10448859 - Filed May 30, 2003 - International Business Machines Corporation

Method and system for loose coupling of document and domain knowledge in interactive document configuration
US Pat. 7131057 - Filed Feb 4, 2000 - International Business Machines Corporation

Method and system for document component importation and reconciliation
US Pat. 7178105 - Filed Feb 4, 2000 - International Business Machines Corporation

Method and system for automatic computation creativity and specifically for story generation
US Pat. 7333967 - Filed Dec 23, 1999 - International Business Machines Corporation