Wednesday, July 12, 2006
Needed: Terrorist Target Markup Language
Many news stories and commentaries, such as the page 1 story in the 11 July New York Times titled "U.S. Terror Targets: Petting Zoo and Flea Market?" or the "Homeland Stupidity" blog, have focused on the contents of the Department of Homeland Security's National Asset Database (NADB):
For example, the inventory includes 4,055 malls, shopping centers, and retail outlets; 224 racetracks; 539 theme or amusement parks and 163 water parks; 514 religious meeting places; 4,164 educational facilities; 1,305 casinos; 234 retail stores; 127 gas stations; 130 libraries; 335 petroleum pipelines; 217 railroad bridges; 140 defense industrial base assets; 224 national monuments and icons; and 8 wind power plants.
(The NADB also includes 159 cruise ships and 34 Coca Cola bottlers/distributors).
But joking about the contents misses the far more important concern, emphasized by CNN, that the NADB is too flawed to guide the allocation of federal security funds. That concern supports complaints by New York City, Washington DC, and other cities that they are being shortchanged.
I want to look at this news from a Document Engineering and Information Architecture perspective. Why did it happen, and how could we prevent it from happening again?
We wouldn't be surprised that the NADB contained bad information if the DHS hadn't provided state and local governments with any criteria or specifications. But that's not the explanation. Two years ago the DHS Office of Infrastructure Protection issued "Guidelines for Identifying National Level Critical Infrastructure and Key Resource" (included as an appendix to the recent IG report), which contain detailed definitions, classification criteria, and requirements for how to describe each asset. The Guidelines include a taxonomy with 17 first-level categories and scores of subcategories. They also specify the information components needed to describe each asset, such as state, address, sector, owner, owner type, phone, local law enforcement POC, and latitude and longitude coordinates.
Here, for example, is some of the guidance about Chemical assets:
1. Sites that could cause death or serious injury in the event of a chemical release and have greater than 300,000 persons within a 25-mile radius of the facility.
2. Economic impact of more than one billion dollars per day (e.g., an event impacting multiple sectors and cumulatively cause this amount of economic damage).
NOTE: The term "sites" includes manufacturing plants; rail, maritime, or other transport systems; pipeline and other distribution networks; and storage, stockpile, and supply areas.
Despite this guidance, state and local governments submitted assets that didn't follow the specified formats, were incomplete, were duplicates, and, in the case of Puerto Rico, were in Spanish! All of this reflects some mixture of incompetence, negligence, and political calculation to get more than a fair share of Homeland Security funds.
But suppose that the DHS had encoded these narrative specifications in an XML vocabulary called "Terrorist Target Markup Language" and required all asset submissions to conform to it. TTML would have made it possible to detect most of these problems at the moment of submission, and the standard organization and format of the data would have enabled additional data mining to detect anomalous information.
This isn't a far-fetched suggestion. There are numerous XML standards activities underway in the homeland security domain, including biometric data exchange, common alerting protocols, and emergency response.
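To make the idea of submission-time checking concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: no TTML vocabulary actually exists, so the element names (`submission`, `asset`, `ownerType`, `lawEnforcementPOC`, and so on) are hypothetical, modeled on the fields the Guidelines list, and the sector list is a placeholder rather than the real 17-category taxonomy. A real deployment would more likely express these rules in an XML schema language than in hand-written code.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names, modeled on the fields named in the Guidelines.
REQUIRED_FIELDS = [
    "state", "address", "sector", "owner", "ownerType",
    "phone", "lawEnforcementPOC", "latitude", "longitude",
]

# Placeholder subset; the real taxonomy has 17 first-level categories.
KNOWN_SECTORS = {"Chemical", "Energy", "Transportation"}


def check_asset(asset):
    """Return a list of problems found in one <asset> element."""
    problems = []
    values = {}
    for name in REQUIRED_FIELDS:
        text = (asset.findtext(name) or "").strip()
        if not text:
            problems.append(f"missing {name}")
        values[name] = text

    if values["sector"] and values["sector"] not in KNOWN_SECTORS:
        problems.append(f"unknown sector: {values['sector']}")

    # Coordinates must be numeric and within the valid geographic range.
    for name, limit in (("latitude", 90.0), ("longitude", 180.0)):
        if values[name]:
            try:
                number = float(values[name])
            except ValueError:
                problems.append(f"{name} is not a number")
                continue
            if not -limit <= number <= limit:
                problems.append(f"{name} out of range")
    return problems


def check_submission(xml_text):
    """Check every asset in a submission; return {asset index: [problems]}."""
    root = ET.fromstring(xml_text)
    report = {}
    seen = set()
    for index, asset in enumerate(root.findall("asset")):
        problems = check_asset(asset)
        # Treat the same state and address as a duplicate (a crude assumption).
        key = (
            (asset.findtext("state") or "").strip().lower(),
            (asset.findtext("address") or "").strip().lower(),
        )
        if all(key) and key in seen:
            problems.append("duplicate of an earlier asset")
        seen.add(key)
        if problems:
            report[index] = problems
    return report


# Invented sample data: the second asset is incomplete, misclassified,
# has an impossible latitude, and repeats the first asset's address.
SAMPLE = """
<submission>
  <asset>
    <state>IN</state><address>1 Main St</address><sector>Chemical</sector>
    <owner>Acme</owner><ownerType>private</ownerType><phone>555-0100</phone>
    <lawEnforcementPOC>Sheriff</lawEnforcementPOC>
    <latitude>39.8</latitude><longitude>-86.2</longitude>
  </asset>
  <asset>
    <state>IN</state><address>1 Main St</address><sector>Petting Zoo</sector>
    <owner>Acme</owner><ownerType>private</ownerType><phone></phone>
    <lawEnforcementPOC>Sheriff</lawEnforcementPOC>
    <latitude>398</latitude><longitude>-86.2</longitude>
  </asset>
</submission>
"""

if __name__ == "__main__":
    for index, problems in check_submission(SAMPLE).items():
        print(f"asset {index}: " + "; ".join(problems))
```

The point of the sketch is that each failure mode in the NADB (missing fields, off-taxonomy categories, malformed coordinates, duplicates) becomes a mechanical check once the submission has a defined structure. Judging whether a petting zoo is truly critical still requires the classification criteria themselves.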
I can't think of any XML document that will fix that problem.
Instead of providing a taxonomy, the DHS should have distributed an expert system that queried the users, navigated the taxonomies, checked requirements and gathered all relevant information. Once the expert system indicated all necessary information was provided then it would upload the data to DHS's website.
That might also reduce the likelihood of "threat inflation".