Sunday, September 24, 2006


Hugo Chavez, Please Read My Book

During his recent anti-Bush speech to the UN General Assembly, Venezuelan President Hugo Chavez held up a copy of Noam Chomsky’s book Hegemony or Survival: America’s Quest for Global Dominance and highly recommended it. Since then sales of Chomsky’s book have increased spectacularly, and it even hit #1. Not bad for a book that "hasn’t sold particularly well" and whose author "does not write page turners, he writes page stoppers," according to the NY Times article I cited in the first sentence.

Dear President Chavez,

About a year ago my co-author Tim McGrath and I published a book titled Document Engineering: Analyzing and Designing Documents for Business Informatics and Web Services that we think you would enjoy reading. Like you, we disagree with many of the traditional views held by capitalists about how businesses are designed and operated, and we criticize them for their reactionary reluctance to adopt XML and service-oriented architectures. And like you, we disagree with the Bush administration’s approach to getting people to adopt its ideas – trust us, we will never invade any country or drop bombs on it to impose the Document Engineering approach.

Unlike your approach, the solutions we propose are not political. The suggestions in our book are primarily conceptual and methodological. But as we propose new ways to think about how firms can work together to automate processes and services to create new value and efficiencies, we would certainly hope that many of these benefits would flow to ordinary people and not just to the greedy capitalists whose firms adopt our methods.

You surely admire Karl Marx for his insights and visions about economic behavior and no doubt recommend his Manifesto. We think you should know that another famous economic thinker, Hal Varian, describes our book as a "MANIFESTO for the document engineering REVOLUTION" in a "blurb" that appears on the book jacket.

So we are sending you a copy of the book. Let us know the next time you come to the United States (or to Australia) so we can show you around and tell you more about our book.


Bob Glushko

Sunday, September 10, 2006


XML Still Isn’t "Self-Describing"

A few weeks ago I wrote a post titled "XML isn't self-describing" and I thought I was done with that topic. But a comment on my post (thanks, it is nice to know that some people who are not my students are reading what I'm writing) has made me want to resume talking about it. And for my students taking "Information Organization and Retrieval" from me this semester at the UC Berkeley School of Information, I'll refer back to an article we talked about to reinforce some lessons with this response.

The commenter said:

Don't you think that it is possible within "known" problems to create self-describing information with XML?

For example, a known problem such as ordering dinner at a restaurant or a personal address book might be very capably handled by XML. There might be variations, but I think we could agree on a dozen tags that were obvious within the context. I might not be familiar with the <holdpickles/> tag, but intuitively I would understand given my familiarity with the problem.

This is a very typical rebuttal attempt that acknowledges that XML isn't self-describing in most cases, but argues that in "familiar" domains there is sufficient agreement about what words mean for it to be so. It seems "obvious" that people agree on things like restaurants and addresses but it just isn't true.
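To make the point concrete, here is a minimal sketch in Python, using the standard library's xml.etree.ElementTree. The two order documents and all tag names other than holdpickles are made up for illustration; they stand in for two authors who each picked names that seemed "obvious" to them. A reader that only knows one vocabulary cannot recover the same meaning from the other.

```python
import xml.etree.ElementTree as ET

# Two hypothetical encodings of the same request, each "obvious" to its author.
ORDER_A = "<order><burger><holdpickles/></burger></order>"
ORDER_B = "<order><hamburger><nopickle/></hamburger></order>"


def wants_no_pickles(xml_text):
    """Return True only if the document uses the tag names this reader expects."""
    root = ET.fromstring(xml_text)
    # The reader hard-codes one vocabulary: <burger> containing <holdpickles/>.
    return root.find("./burger/holdpickles") is not None


print(wants_no_pickles(ORDER_A))  # True
print(wants_no_pickles(ORDER_B))  # False: same intent, different names
```

Both documents are well-formed XML and both mean the same thing to their authors, but nothing in the markup itself tells the program that nopickle and holdpickles are the same instruction. That mapping has to come from an agreement made outside the documents.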

There is hard scientific evidence from experimental studies of "statistical semantics" that there is little agreement on the words that people assign as names to common objects or processes. The classic paper, reporting a number of compelling experiments, is "The Vocabulary Problem in Human-System Communication" by George Furnas, Tom Landauer, Louis Gomez, and Sue Dumais in the Communications of the ACM way back in 1987.

In my IO & IR class last Thursday we replicated the basic results with a few things that I grabbed from my office on the way to my lecture (a dollar bill, a coffee cup, a pocket knife, and a photo of me and my wife). I asked students to think of the best one- or two-word descriptions for these familiar things. For some of them there were four or five different ones (e.g., dollar, bill, buck, greenback; cup, mug, coffee cup, drink; and so on).

This result is pretty surprising to people. When you ask a few dozen people, you get even more different terms suggested as the "best" or "most natural" name for a common object. But because any given person can only think of a few names as "intuitive" fits, they can't imagine that there could be such diversity, and so they greatly overestimate the amount of agreement about names -- and they conclude that the name of something is a reliable description of it.
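One rough way to put a number on "agreement" is the chance that two people picked at random offer the same name for an object. The sketch below is one possible measure, not necessarily the exact statistic used in the Furnas et al. paper, and the response counts are invented for illustration; they are not the actual answers from my class.

```python
from collections import Counter


def pairwise_agreement(names):
    """Probability that two distinct respondents, drawn at random, gave the same name."""
    n = len(names)
    if n < 2:
        raise ValueError("need at least two responses")
    counts = Counter(names)
    # Ordered pairs of distinct respondents who gave the same name.
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return same_pairs / (n * (n - 1))


# Made-up responses for the dollar bill: ten people, four different names.
responses = ["dollar"] * 4 + ["bill"] * 3 + ["buck"] * 2 + ["greenback"]
print(round(pairwise_agreement(responses), 3))  # 0.222
```

Even when one name is clearly the most popular, as "dollar" is in this invented sample, two people agree only about one time in five.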

So the only sense in which names are "self-describing" is that they describe the meaning of something to oneself, just not necessarily to anyone else.

-Bob Glushko
