
Mentor Men of Muraco

Link: Mentor Men of Muraco.

What a cool blog!

Gilbane Boston Conference Session Descriptions November 27 - 29, 2007

Link: Gilbane Boston Conference Session Descriptions November 27 - 29, 2007.

SWT-1: What Are the Current and Future Enablers for Natural Language Queries to Return Answers?

Semantic Web enabling technologies are gathering momentum, stimulated by a lot of Web 2.0 hype and rhetoric, but are there solutions that reveal the right content to truly match a query? Our speakers will describe features and operational functions within two product lines, IBM’s OmniFind and Schemalogic, that expose semantically better (more relevant) results through search.

Moderator: Lynda Moulton, Lead Analyst, Enterprise Search, Gilbane Group
Speakers:
Mike Moran, Distinguished Engineer and Product Manager, OmniFind Enterprise Edition, IBM Software Group
Lowell Anderson, Vice President of Marketing, Schemalogic




Applying question answering technology to locating malevolent online content

Link: Applying question answering technology to locating malevolent online content.

Decision Support Systems, Volume 43, Issue 4 (August 2007)
Authors:
Dmitri Roussinov, Department of Information Systems, W.P. Carey School of Business, Arizona State University, USA
José A. Robles-Flores, Department of Information Systems, W.P. Carey School of Business, Arizona State University, USA and ESAN University, Lima, Peru

ABSTRACT

We have empirically compared two classes of technologies capable of locating potentially malevolent online content: 1) popular keyword searching, currently widely used by law enforcement and the general public, and 2) emerging question answering (QA). The Google search engine exemplified the first approach. To exemplify the second, we further advanced the pattern based probabilistic QA approach and implemented a proof-of-concept prototype that was capable of finding web pages that provide the answers to the given questions, including non-factual ones (e.g. "How to build a pipe bomb?"). The answers to those questions typically indicate the presence of malevolent content. Our findings suggest that QA technology can be a good addition to the traditional keyword searching for the task of locating malevolent online content and, possibly, for a more general task of interactive online information exploration.


Development, implementation, and a cognitive evalu...[J Biomed Inform. 2007] - PubMed Result

Link: Development, implementation, and a cognitive evalu...[J Biomed Inform. 2007] - PubMed Result.

J Biomed Inform. 2007 Jun;40(3):236-51.

Epub 2007 Mar 12.

Development, implementation, and a cognitive evaluation of a definitional question answering system for physicians.

    Yu H, Lee M, Kaufman D, Ely J, Osheroff JA, Hripcsak G, Cimino J.

    Department of Health Sciences, University of Wisconsin-Milwaukee, Enderis Hall 939, 2400 E. Hartford Avenue, P.O. Box 413, Milwaukee, WI 53211, USA. HongYu@uwm.edu

The published medical literature and online medical resources are important sources to help physicians make patient treatment decisions. Traditional sources used for information retrieval (e.g., PubMed) often return a list of documents in response to a user's query. Frequently the number of returned documents from large knowledge repositories is large and makes information seeking practical only "after hours" and not in the clinical setting. This study developed novel algorithms, and designed, implemented, and evaluated a medical definitional question answering system (MedQA). MedQA automatically analyzed a large number of electronic documents to generate short and coherent answers in response to definitional questions (i.e., questions with the format of "What is X?"). Our preliminary cognitive evaluation shows that MedQA out-performed three other online information systems (Google, OneLook, and PubMed) in two important efficiency criteria; namely, time spent and number of actions taken for a physician to identify a definition. It is our contention that question answering systems that aggregate pertinent information scattered across different documents have the potential to address clinical information needs within a timeframe necessary to meet the demands of clinicians.

    PMID: 17462961 [PubMed - indexed for MEDLINE]
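
To make the "What is X?" format concrete, here is a minimal sketch of how a definitional question might be recognized and its target term pulled out. It is not MedQA's algorithm (the abstract does not describe one); the regular expression and the function name `definitional_target` are my own assumptions for illustration.

```python
import re
from typing import Optional

# Hypothetical pattern: "What is/are [a|an|the] X?" in any letter case.
DEFINITIONAL = re.compile(
    r"^\s*what\s+(?:is|are)\s+(?:an?\s+|the\s+)?(.+?)\s*\?*\s*$",
    re.IGNORECASE,
)

def definitional_target(question: str) -> Optional[str]:
    """Return the term X if the question looks like "What is X?", else None."""
    match = DEFINITIONAL.match(question)
    return match.group(1) if match else None

print(definitional_target("What is hypertension?"))
print(definitional_target("How do I treat a cough?"))
```

A real system would still have to find and assemble the definition itself; this only covers the first step of spotting the question type.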


» Barney Pell: Pathways to artificial intelligence | Between the Lines | ZDNet.com

Link: » Barney Pell: Pathways to artificial intelligence | Between the Lines | ZDNet.com.

In this podcast interview, [ZDNet] talked with [Barney] Pell [CEO of Powerset, the natural language search startup] about his views on AI and how the development of machines smarter than humans will play out in coming decades. We also discussed the underpinnings of Powerset as an example of technology and collective human intelligence applied to making a smarter search engine, and how natural language understanding is at an inflection point, moving out of the labs and into the world.

Pell said that AI entities will get smarter but also humans, via intelligence augmentation, will gain new capabilities. He suggested that two approaches will meet in the middle–bottom-up complete brain simulations, which develop like human children, and top-down engineered systems.

Lotfi Zadeh: From Search Engines to Question-Answering Systems

UC BERKELEY BISC SEMINAR
             on Tuesday, 4 September 2007, 4:00pm-5:30pm
                       380 Soda Hall (Berkeley)
                  http://www-bisc.eecs.berkeley.edu/

         "From Search Engines to Question-Answering Systems--
                  A Challenge that is Hard to Meet"
                             Lotfi Zadeh
                             UC Berkeley

Existing search engines, with Google at the top, have many truly
remarkable capabilities. Furthermore, constant progress is being made
in improving their performance. But what is not widely recognized is
that there is a basic capability which existing search engines do not
have: deduction, capability to synthesize an answer to a query by
drawing on bodies of information which reside in various parts of the
knowledge base.  By definition, a question-answering system is a
system which has deduction capability. Can a search engine be upgraded
to a question-answering system through the use of existing tools which
are based on bivalent logic and probability theory? A view which is
articulated in the following is that the answer is: No.

There are three major obstacles: (a) world knowledge; (b) relevance;
and (c) deduction. The problem with world knowledge is that in large
measure it is perception-based and hence is intrinsically
imprecise. Example: Usually it does not rain in San Francisco in
midsummer. Perception-based information is not available to
manipulation through the use of bivalent logic and probability theory.

The problem with relevance is that existing approaches to assessment
of relevance attempt to deal with relevance in a statistical
framework, with no consideration of semantics. The results leave much
to be desired.  The problem with deduction is that in realistic
settings the premises are generally imprecise, uncertain and partially
true. In such settings, conventional methods of deduction do not work.

To deal with the problems of world knowledge, assessment of relevance
and deduction, new tools are needed. The new tools which are outlined
in my lecture are Precisiated Natural Language (PNL), Protoform Theory
(PFT) and Generalized Theory of Uncertainty (GTU). The centerpiece of
these tools is the concept of a generalized constraint. The concept of
a generalized constraint is what makes it possible to deal effectively
with information which is permanently imprecise, uncertain and
partially true.

Michael Wood, The Road to Delphi (The Oracle of Delphi and Ancient Oracles)

Link: Michael Wood, The Road to Delphi (The Oracle of Delphi and Ancient Oracles).

Page linking to several reviews of this book, plus an Amazon page.  Oracles are the original question-answering system, aren't they?  Why do (did) people believe in oracles, and what makes non-human question-answerers credible and authoritative?

Kolmogorov complexity and QA

Link: CWI Lectures 7 september 2007: Abstracts.

Information Distance From a Question to an Answer

Ming Li

Canada Research Chair in Bioinformatics

University of Waterloo

We know how to measure the distance from Toronto to Amsterdam. However, do you know how to measure the distance between two information carrying entities? For example: two genomes, two music scores, two programs, two articles, two emails, or from a question to an answer? Furthermore, such a distance measure must be application-independent, must be universal in the sense it is provably better than all other distances, and must be applicable.

From a simple and accepted assumption in thermodynamics, we have developed such a theory and many applications, together with Paul Vitányi and other colleagues. I will present this theory and its improvements, accompanied by a new application of the theory: a question answering system.

There's something about Kolmogorov. :)  Can't find a version of this research online.  Would be interested.


Update: Found it.  Haven't read it yet.  Here's the abstract:

ABSTRACT
We provide three key missing pieces of a general theory of
information distance [3,23,24]. We take bold steps in formulating
a revised theory to avoid some pitfalls in practical
applications. The new theory is then used to construct a
question answering system. Extensive experiments are conducted
to justify the new theory.
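
For a feel of what an information distance can look like in practice, here is a rough sketch of the normalized compression distance (NCD), a well-known computable stand-in in which a real compressor replaces the uncomputable Kolmogorov complexity. This is not the revised theory or the QA system from the paper (I haven't read it yet); using zlib as the compressor and the helper name `rank_answers` are my own assumptions.

```python
import zlib
from typing import List

def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a crude proxy for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))

def ncd(x: str, y: str) -> float:
    """Normalized compression distance:
    (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)

def rank_answers(question: str, candidates: List[str]) -> List[str]:
    """Hypothetical helper: order candidate texts by distance to the question."""
    return sorted(candidates, key=lambda c: ncd(question, c))

text = "the quick brown fox jumps over the lazy dog " * 20
other = " ".join(str(i * i) for i in range(200))
print(ncd(text, text) < ncd(text, other))
```

Values near 0 mean the two texts share most of their information; values near 1 mean they share little. On short strings the compressor's fixed overhead dominates, so the numbers only become meaningful for longer inputs.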

Receptional: A Deeper Look at Google Universal Search (GUS) at SES San Jose

Link: Receptional: A Deeper Look at Google Universal Search (GUS) at SES San Jose.

[Sherwood Stranieri of Catalyst] found that there were a lot of factors that you could manipulate or change in terms of how Google Universal might interpret ... videos. These included tagging, video format, whether the video has inbound links, download popularity and social commentary about the video made by users and viewers. ... [They] found that in SEO terms, ... PageRank was not important, but page views and comment activity correlated well to the search results. PageRank - in some cases - was zero. He also suggested that Google had the ability to work out whether a video was "hot" or "not" down to a 15 minute time segment! Based around Google's older technology of "Zeitgeist", Google can then influence its results in real time based on what people are interested in. What he also revealed was that Google currently has to tailor its process for indexing video sites and, as such, only a few video sites were in the GUS index. This suggests that you should put your video on a biggish video portal instead of (or at least as well as) your own website to get exposure on Google. This was confirmed in Q&A by Google. One of the reasons for this was that they needed to be able to rely on the video servers being relatively robust for the user experience.

[William Slawski from Commerce 360] noted that question answering on the engines was triggered by the user query and, as such, to get returned you would have to frame your objects appropriately. For example, using "born dd/mm/yy" on a Marilyn Monroe site might help your chances when a person types in "how old is Marilyn Monroe?". Similar examples followed, with business address details being preceded by the phrase "address" or the telephone number being preceded by the phrase "Tel:".
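
Why would a "born dd/mm/yy" phrase help? Here is a small sketch of the kind of pattern matching an engine could apply to answer a "how old is ...?" query. It says nothing about how Google actually does it; the regular expression, the four-digit year, the function names and the sample date are all my own assumptions.

```python
import re
from datetime import date
from typing import Optional

# Assumed cue: the word "born" followed by a day/month/year date.
# A four-digit year is assumed here so the century is unambiguous.
BORN_PATTERN = re.compile(r"\bborn\s+(\d{1,2})/(\d{1,2})/(\d{4})\b", re.IGNORECASE)

def extract_birth_date(page_text: str) -> Optional[date]:
    """Return the first "born dd/mm/yyyy" date found in the text, if any."""
    match = BORN_PATTERN.search(page_text)
    if match is None:
        return None
    day, month, year = (int(group) for group in match.groups())
    try:
        return date(year, month, day)
    except ValueError:  # e.g. "born 31/02/1970"
        return None

def age_on(birth: date, today: date) -> int:
    """Whole years elapsed between birth and today."""
    years = today.year - birth.year
    if (today.month, today.day) < (birth.month, birth.day):
        years -= 1  # birthday has not happened yet this year
    return years

# Made-up page text and dates, purely for illustration.
born = extract_birth_date("Jane Example, actress, born 15/03/1970 in Springfield.")
print(age_on(born, date(2007, 8, 20)))  # prints 37
```

The same idea would carry over to the "address" and "Tel:" cues: a fixed label right before the value gives a simple extractor something reliable to anchor on.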