Recently in Government Category

We've published a new paper on addressing large-scale integration, storage, and access of complex information. As Dale mentions in his entry over on our main blog, the paper frames the discussion in terms of challenges to Open Government initiatives. We note, though, that the exploration of obstacles to effective, efficient processing of high volumes of data and content is relevant across many industries.

We're cross-posting here on the XML blog because the paper deals with XML content and the XML family of standards, including XQuery and XPath.

The Gilbane Beacon is available as a free download from Gilbane and from Mark Logic, sponsor of the paper.

I have a new post over at EMC's Community site, "Preserving Electronic Public Records: Lessons from the Washington State Digital Archives." This is part of our ongoing series for EMC on the use of ECM and XML in the public sector.

Technology is exploding: that's a good thing, isn't it? PDAs, Twitter, iPods that do everything but cook, social networking and constant connectedness: all of it making our lives more in-touch, immediate, visual and interactive. There is, however, another side to this amazing progress. I like to call it the "technology imperative," and it grows from the fact that as technology and its use grow, they usually follow paths driven by consumers' desires and willingness to spend money--whims, if you will. Once unleashed, these technology-triggered, consumer-driven appetites tend to return the favor, pointing the way to where and how their technology providers will go next. Sometimes the process becomes circular, taking the technology and its uses into a spiral no one would ever have predicted and for which no one is fully prepared.

If you're designing chips, selling gadgets or trolling Best Buy for the next version of the iPhone, this looks like the best of all possible worlds. The problem comes when non-consumer sectors of the culture begin to feel the impact of this race to connect.

Technology is Neutral but its Uses are Often a Poor Guide

In effect, consumer technology becomes the de facto guide for areas of our culture far from the environments for which it was designed and the modes in which consumers use it. For example, as we saw the rise of the Blackberry, instant email and messaging, we eventually saw workers, even in meetings, with their eyes and attention spans glued to their devices, scarcely even aware that they were supposed to be a contributing part of the meeting and its decision making.

The situation became so widespread and vexing that many firms have banned PDAs from company meetings, and in 2006 a new condition known as Continuous Partial Attention Syndrome was identified in which the individual becomes so distracted by the overload of available information that any attempt to focus on a thought or subject is seriously degraded if not lost. In its extreme form, this syndrome sees the individual succumbing to a virtual addiction to instant information gratification, leading to a mind wandering in a sea of tidbits with no logical relationship to the subject at hand, even if that subject involves controlling a 4,000-pound automobile.

Should Government Use Technology or Technology Drive Government?

Today, technology has progressed far beyond those days, rudimentary by comparison, into a world of constant connectedness that can deliver not only the linkage but an intense and seductive visual, auditory and activity experience. With it, we are seeing an entirely new impact, especially pronounced in government sectors. Should government agencies, for example, put their important decisions out on Twitter and other social media to inform and elicit feedback from citizens? It sounds like a good way to improve the governing process, but in practice it has all manner of problems, not the least of which are mass responses that can overwhelm the agency's ability to make sense of them, egalitarian leveling that makes everyone's opinion on every subject of equal weight if not value, group-influenced or group-generated responses that masquerade as individual opinions, and so on. In the intersection of government and technology, the technology is likely to come out on top, driving the governing process in directions it should not take but becomes powerless to avoid.

So what are we to do? Like Ulysses stuffing his crew's ears with wax to avoid the clarion call of the Sirens, we must ignore how technology is taken up by the consumer world, no matter how enticing the outcome, concentrating instead on how the governing process may be improved by increased transparency and responsiveness. This concentration should be based on a healthy respect for the unintended consequences of any fundamental changes in the governing process, coupled with an even healthier skepticism for any of the brave new world claims of the technological community. As we better understand what is broken in our governing process and what can be accomplished more effectively, we will have a foundation to consider, evaluate and adopt technology in a way that improves government as it was envisioned by our founders, always remaining mindful that government as we conceive it is not supposed to be slick or interactive but solid, fair and resistant to both individual whim and mob rule.

With the rise of Web 2.0 and 3.0, growing Internet traffic, social networking and a host of other technologically driven applications and appetites, government at all levels is confronting the burgeoning changes in its role and participation in the society around it.

An important part of this process is the separation of the paths down which technology is taking society at large from the paths government should and should not follow in performing its essential functions. Experience has shown that not every tool, functionality and resource available to and used by citizens should become part of the governance process. The quandary is deciding up front which is which. This quandary can be seen in the very definition of government being used to describe the future: "connected government", "open government", "participatory democracy", "transparent government" are just some of the terms being used to describe what their users think government should be.

The core challenge, it would seem, is to develop an approach that makes government at once more effective in discharging its myriad day-to-day duties, more open and responsive to the honestly held beliefs and concerns of its citizens, yet still fully capable of discharging its constitutional responsibilities without infringing on or abrogating the rights of its citizens. History shows that this:

  • Will not be an easy process
  • Will not lend itself to a solution based solely on available technology
  • Is likely to be tried unsuccessfully (or disastrously) more than once before we get it right. 

This would seem to dictate that, whatever the technological imperatives, government should be changed carefully, in small steps and with well-considered fallbacks from the paths that turn out to be ineffective or dangerous to our liberties. One way to do this, for instance, would be to focus on those government functions we know are broken and understand how to fix (yes, there are such things). Then we could focus on applying new technology in areas where the target is familiar, the outcome is more easily measured and the impact is less likely to spin out of control.

Yesterday the big stimulus bill cleared the conference committee that resolves the Senate and House versions. If you remember your civics, that means it is likely to pass in both chambers and then be signed into law by the president.

Included in the bill are billions of dollars for digitizing important information such as medical records or government information. Wow! That is a lot of investment! The thinking is that inaccessible information locked in paper or proprietary formats costs us billions each year in productivity. Wow! That's a lot of waste! Also, access to the information could spawn billions of dollars of new products and services, and therefore income and tax revenue. Wow! That's a lot of growth!

Many agencies and offices have striven to expose useful official information and reports at the federal and state level. Even so, there is a lot of data still locked away, incomplete, or in difficult-to-use forms. A while ago a Senate official told me that they do not maintain a single, complete, accurate, official copy of the US Statutes internally. Even if this is no longer true, the public often relies on the "trusted" versions that are available only through paid online services. Many other data types, like many medical records, exist only on paper.

There are a lot of challenges, such as security and privacy issues, even intellectual property rights issues. But there are a lot of opportunities too. There are thousands of data sources that could be tapped into that are currently locked in paper or proprietary formats.

I don't think the benefits will come at the expense of commercial services already selling this publicly owned information as some may fear. These online sites provide a service, often emphasizing timeliness or value adds like integrating useful data from different sources, in exchange for their fees. I think a combination of free government open data resources and delivery tools, plus innovative commercial products will emerge. Maybe some easily obtained data may become commoditized, but new ways of accessing and integrating information will emerge. The big information services probably have more to fear from startups than from free government applications and data.

As it happens, I saw a demo yesterday of a tool that took all the activity of a state legislature and unified it under one portal. This allows people to track a bill and all related activity in a single place. For free! The bill working its way through both chambers is connected to related hearing agendas and minutes, which are connected to schedules, with status and other information captured in a concise dashboard-like screen format (there are other services you can pay for which fund the site). Each information component came from a different office and was originally in its own specialized format. What we were really looking at was a custom data integration application done with AJAX technology, integrating heterogeneous data in a unified view. Very powerful, and yet scalable. The key to its success was strong integration of data: the connections that were used to tie the information together. The vendor collected and filtered the data, converted it to a common format, and added the linkage and relationship information to provide an integrated view into the data. All source data is stored separately and maintained by different offices. Five years ago it would have been a lot more difficult to create the service. Technology has advanced, and the data are increasingly available in manageable forms.
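
To make the integration idea concrete, here is a minimal sketch in Python of the collect, convert, and link steps. It is not the vendor's implementation: the two feed layouts, the `Item` record, and every function name are assumptions invented for illustration. The point is only that once each office's records are converted to one common shape and share a normalized bill number, a unified view is a simple grouping step.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Common format every source record is converted to (hypothetical)."""
    bill_id: str
    kind: str
    summary: str


def normalize_id(raw: str) -> str:
    """Reduce differing spellings of a bill number to one linkage key."""
    return "".join(raw.split()).upper()


def from_clerk(row: dict) -> Item:
    # Assumed clerk's-office feed: {"bill": "HB 1001", "status": "..."}
    return Item(normalize_id(row["bill"]), "status", row["status"])


def from_committee(line: str) -> Item:
    # Assumed committee agenda feed: "hb1001|2009-02-12|Public hearing"
    bill, date, title = line.split("|")
    return Item(normalize_id(bill), "hearing", f"{date}: {title}")


def build_dashboard(items):
    """Group converted items by bill so one lookup shows all related activity."""
    view = defaultdict(list)
    for item in items:
        view[item.bill_id].append((item.kind, item.summary))
    return dict(view)


items = [
    from_clerk({"bill": "HB 1001", "status": "Passed House"}),
    from_committee("hb1001|2009-02-12|Public hearing"),
    from_clerk({"bill": "SB 20", "status": "In committee"}),
]
dashboard = build_dashboard(items)
```

Looking up `dashboard["HB1001"]` returns both the status entry and the hearing entry, even though they arrived from different offices in different formats. The source data itself stays where it is; only the converted, linked view is built on top.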

The government produces a lot of information that affects us daily and that we, as taxpayers and citizens, actually own, but have limited or no access to. This includes statutes and regulations, court cases, census data, scientific data and research, agricultural reports, SEC filings, FDA drug information, taxpayer publications, forms, patent information, health guidelines, etc., etc., etc. The list is really long. I am not even scratching the surface! It also includes more interactive and real-time data, such as geological and water data, weather information, and the status of regulation and legislation changes (like reporting on the progress of the stimulus bill as it worked its way through both chambers). All of these can be made more current, expanded for more coverage, integrated with related materials, and validated for accuracy. There are also new opportunities to open up the process by using forums and social media tools for collecting feedback from constituents and experts (like the demo mentioned above). Social media tools may both give people an avenue to express their ideas to their elected officials and serve as a collection tool to gather raw data that can be analyzed for trends and statistics, which in turn becomes new government data that we can use.

IMHO, this investment in open government data is a powerful catalyst that could actually create or change many jobs or business models. If done well, it could provide significant positive returns, streamline government, open access to more information, and enable new and interesting products and applications.

Following on Dale's inauguration day post, Will XML Help this President?, we have today's invigorating news that President Obama is committed to more Internet-based openness. The CNET article highlights some of the most compelling items from the two memos, but I am especially heartened by this statement from the memo on the Freedom of Information Act (FOIA):

I also direct the Director of the Office of Management and Budget to update guidance to the agencies to increase and improve information dissemination to the public, including through the use of new technologies, and to publish such guidance in the Federal Register.

The key phrases are "increase and improve information dissemination" and "the use of new technologies." This is in keeping with the spirit of the FOIA: the presumption is that information (and content) created by or on behalf of the government is public property and should be accessible to the public. This means that the average person should be able to easily find government content and be able to readily consume it--two challenges that the content technology industry grapples with every day.

The issue of public access is in fact closely related to the issue of long-term archiving of content and information. One of the reasons I have always been comfortable recommending XML and other standards-based technology for content storage is that the content and data would outlast any particular software system or application. As the government looks to become more open, it should and likely will look at standards-based approaches to information and content access.

Such efforts will include core infrastructure, including servers and storage, but also a wide array of supporting hardware and software falling into three general categories:

  • Hardware and software to support the collection of digital material. This includes hardware and software for digitizing and converting analog materials, software for cataloging digital materials with the inclusion of metadata, hardware and software to support data repositories, and software for indexing the digital text and metadata.
  • Hardware and software to support the access to digital material. This includes access tools such as search engines, portals, catalogs, and finding aids, as well as delivery tools allowing users to download and view textual, image-based, multimedia, and cartographic data.
  • Core software for functions such as authentication and authorization, name administration, and name resolution.

Standards such as PDF/A have emerged to give governments a ready format for long-term archiving of routine government documents. But a collection of PDF/A documents does not in and of itself equal a useful government portal. There are many other issues of navigation, search, metadata, and context left unaddressed. This is true even before you consider the wide range of content produced by the government: pictorial, audio, video, and cartographic data are obvious examples, but there is also the wide range of primary source material that comes out of areas such as medical research, energy development, public transportation, and natural resource planning.

President Obama's directives should lead to interesting and exciting work for content technology professionals in the government. We look forward to hearing more.

Will XML Help this President?


I’m watching the inauguration activity all day today (not getting much work done) and getting caught up in the optimism and history of it all. And what does this have to do with XML, you ask? It’s a stretch, but I am giddy from the festivities, so bear with me please. I think there is a big role for XML and structured technologies in this paradigm shift, albeit XML will be quietly doing its thing in the background as always.

In 1986, when SGML, XML's precursor, was being developed, I worked for the IRS in Washington. I was green, right out of college. My boss, Bill Davis, said I should look into this SGML stuff. I did. I was hooked. It made sense. We could streamline the text applications we were developing. I helped write the first DTD in the executive branch (the first real government one was the ATOS DTD from the US Air Force, but that was developed slightly before the SGML standard was confirmed, so we always felt we were pretty close to creating the actual first official DTD in the federal government). Back then we were sending tax publications and instructions to services like CompuServe and BRS, each with its own data format. We decided to try to adopt structured text technology and single-source publishing to make data available in SGML to multiple distribution channels. And this was before the Web. That specific system has surely been replaced, but it saved time and enabled us to improve our service to taxpayers. We thought the approach was right for many government applications and should be repeated by other agencies.

So, back to my original point. XML has replaced SGML and is now being used for many government systems, including electronic submission of SEC filings, FDA applications, and the management of many government records. XML has been mentioned as a key technology in the overhaul that is needed in the way the government operates. Obama also plans to create a cabinet-level position of CTO, part of the mission of which will be to promote inter-agency cooperation through interchange of content and data between applications, formatted in a common taxonomy. He also intends to preserve the open nature of the internet and its content, facilitate publishing important government information and activities on the Web in open formats, and enhance the national information system infrastructure. Important records are being considered for standardization, such as health and medical records, as well as many other ways we interact with the government. More info on this administration’s technology plan can be found at http://origin.barackobama.com/issues/technology/. Sounds like a job, at least in part, for XML!

I think it is great and essential that our leaders understand the importance of smartly structured data. There is already a lot of XML expertise throughout the various government offices, as well as a strong spirit of cooperation on which we can build. Anyone who has participated in industry schema application development, or other common vocabulary design efforts, knows how hard it is to create a “one-size-fits-all” data model. I was fortunate enough to participate briefly in the development and implementation of SPL, the Standard Product Label schema (see http://www.fda.gov/oc/datacouncil/spl.html), for FDA drug labels, which are submitted to the FDA for approval before the drug product can be sold. This is a very well defined document type that has been in use for years. It still took many months and masterful consensus building to finalize this one schema. And it is just one small piece in the much larger information architecture. It was a lot of effort from many people within and outside the government. But now it is in place, working and being used.

So, I am bullish on XML in the government these days. It is a mature, well understood, powerful technology with wide adoption, and there are many established civilian and defense examples across the government. I think there is a very big role for XML and related technology in the aggressive, sweeping change promised by this administration. Even so, these things take time.
