Author: dwaldt

Tell Us About Your Favorite Web 2.0 Tool

There sure is a lot of news about Web 2.0 these days. It can be hard to take it all in, and there seem to be new tools every day! So how do you make sense of it all?

One way to learn more about these tools is to attend the session I will be hosting at the Gilbane San Francisco Conference (http://gilbanesf.com) in June called “My Favorite Web 2.0 Tool”. It will be organized in the fast-paced “Lightning Round” style, with 10 speakers covering 10 topics in 60 minutes (yes, that is about 5 minutes each). This format allows many ideas to be presented at once, encourages audience participation, and tends to be fairly hilarious.

Got something to say about Web 2.0 tools? I would love to hear from people interested in participating in this lightning round. Send me a one-paragraph description of why your favorite Web 2.0 tool should be included in this session (send to dale@gilbane.com). We’re open to a broad definition of Web 2.0 tools, too. We are looking for innovative ideas, game changers, or even just entertaining or fun apps!

We would love to hear from you! The slots will fill up fast, so don’t wait if you hope to participate.

See you in San Francisco!

XML and Belly Buttons: How to “Sell” XML

Anyone who works with XML has probably had to “sell” the idea of using the standard instead of alternative approaches, whether as an internal evangelist of XML or in a formal sales role. We have developed some pretty convincing arguments, such as automating redundant processes, quality checking and validation of content, reuse of content using a single source publishing approach, and so on. These types of benefits are easily understood by the technical documentation department or developers and administrators in the IT group. And they are easy arguments to make.

Even so, that leaves a lot of people who can benefit from the technology but may never need to know that XML is part of the solution. The rest of the enterprise may not be in tune with the challenges faced by the documentation department, focusing instead on other aspects of running a business, like customer support, manufacturing, fulfillment, or finance. If you tell them the software solution you want to buy has “XML Inside”, they may stare off into space, let their eyes glaze over, even fall asleep. But if you tell them you have a way to reduce expensive customer support phone calls by improving their public-facing Web content and capabilities, you might get more of their attention.

I have been around the XML community for a very long time, and we tend to look into our belly buttons for the meaning of XML. This is often done at the expense of looking around us and seeing what problems are out there before we start talking about solutions to apply to them. Everything looks like a nail because we have this really nifty hammer called XML. But when CD-ROMs were introduced, people didn’t run around talking about the benefits of ISO 9660 (the standard that dictates how data is written to a CD). Okay, they did at first, to other technologists and to executives in big companies adopting the standard, but the end consumer rarely heard about it. Instead, we talked about the massive increase in data storage and the flexibility of a consistent storage format across operating systems. So we need to remember that XML is not what we want to accomplish, but how we may get things done to meet our goals. Therefore, we need to understand and describe our requirements in terms of these business drivers, not the tools we use to address them.

Part of the problem is that there are several potential audiences for the XML evangelism message, each with their own set of concerns and domain-specific challenges. End users want to get the work out the door in a timely manner, at the right quality level, with tools that are easy to use. Line managers may add sensitivity to pricing, performance, maintenance and deployment costs, etc. These I would classify as tactical departmental concerns focused on operational efficiency (the bottom line).

Meanwhile, product managers, sales, customer service, fulfillment, finance, etc. are geared more toward enterprise goals and strategies, such as reducing product support costs and increasing revenue, in addition to operational efficiency. Even stated goals like synchronizing releases of software and documentation, or making data more flexible and robust to enable new Web and mobile delivery options, really only support the first two objectives of better customer service and increased sales, which I would classify as strategic enterprise concerns.

To succeed in the enterprise discussion, the deft XML evangelist needs to know about a lot more than the technology and processes in the documentation department, or he or she will be limited to tactical, incremental improvements. The boss may want, instead, to focus on how the data can be improved to make robust Web content that can be dynamically assembled according to the viewer’s profile. Or how critical updates can be delivered electronically and as fast as possible, while the complete collection of information is prepared for more time-consuming, but equally valuable, printed delivery in a multi-volume set of books. Or how content can be queried, rearranged, reformatted, and delivered in a completely new way to increase revenue. Or how a business system can automatically generate financial reporting information accurate and suitable enough for submission to the government, but without the army of documentation labor used previously.

At Gilbane we often talk about the maturity of XML approaches, not unlike the maturity model for software. We haven’t finalized a spectrum of maturity levels yet, but I think of XML applications as ad hoc, departmental, and enterprise in nature. Ad hoc is where someone decides to use an XML format for a simple process, maybe configuration files driving printers or other applications. Often XML is adopted with no formal training and little knowledge outside of the domain in which it is being applied.

Departmental applications tend to focus on operational efficiency, especially as it relates to creating and distributing textual content. They are governed by a single department head; they may interact with other groups and delivery feeds, but they can stand alone in their own environment. An enterprise application of XML needs governance from several departments or information partners, and focuses on customer-facing or compliance issues and possibly growth of the business. Enterprise applications tend to have to work within a broader framework of applications and standards.

Each of these three application types requires different planning and justification. For ad hoc use of XML, it is usually up to the individual developer to decide whether XML is the right format, whether a schema will be needed, what the markup and data model are, etc. Very little “selling” is needed here, except friendly debate among developers, architects, and line managers. Usually these applications can be tweaked and changed easily, with little impact beyond local considerations.

Departmental application of XML usually requires a team representing all stakeholders in the process, from users to consumers of the information. There may be some departmental architectural standards, but exceptions to these are easier to accommodate than with enterprise applications. A careful leader of a departmental application will look upstream and downstream in the information flow and take some of those needs into account. Also, be aware that the editing process in the department may become more complex and require additional skills and resources, but these drawbacks are more than offset by savings in other areas, such as page layout or conversion to Web formats, which can be highly automated. Don’t forget to explain these benefits to the users whose work just got a little more complicated!

An enterprise solution is by definition tied to the business drivers of the enterprise, even if some decisions seem to come at the expense of one department over another. This is where an evangelist can be useful, but not if they focus only on XML instead of the benefits it provides. Executives need to know how much revenue can be increased, how many problem reports can be avoided in customer service, whether regulatory compliance guidelines can be met, and so on. This is a much more complicated set of issues, with dependencies on, and needed agreement from, other departments to be successful. If you can’t provide these types of answers, you may be stuck in departmental thinking.

XML may be the center of my universe (my belly button, so to speak), but it is usually not the center of my project sponsor’s universe. I have to have the right message to convince them to make a significant investment in the way their enterprise operates.

Structured Editing & Wikis

If you know me, you will realize that I tend to revisit XML authoring tools and processes frequently. It is one of my favorite topics. The intersection of structured tools and messy human thinking and behavior is an area fraught with usability issues and development challenges, and one that demands careful business-case thinking. It is, therefore, a topic ripe for discussion.

I had an interesting conversation with a friend about word processors and XML editors the other day. His argument was that the word processing product model may not be the best, and certainly isn’t the only, way to prepare and manage structured content.

A word processor is software that evolved to support the creation of documents: the model started with basic text entry when people simply needed to create documents, and later added formatting and other features. It is more than 25 years old (I remember using a word processor for the first time in college in 1980).

Of course it was logical to emulate how typewriters worked, since the vast majority of information at the time was destined for paper documents. Now word processors include features for writing, editing, reviewing, and formatting, plus limited structural elements like links, indexes, etc. Again, all very document oriented. The content produced may be reused for other purposes if transformed in a post process (e.g., output to HTML and PDF for the Web, or broken into chunks for a repository or secondary use), but there are limits and other constraints, especially if your information is primarily designed to be consumed in print or document form.

It is easy to think of XML structured editors, and the word processor software model they are based upon, as the most likely way to create structured content. But in my opinion, structured editors pay too much homage to word processing features and processes. I also think too many project teams assume that the only way to edit XML content is in an XML document editor. Don’t get me wrong: many people, myself included, have successfully deployed XML editors and achieved targeted business goals, but I can point out many instances where an alternative approach to editing content might be more efficient.

Database tools that organize information logically and efficiently are not likely to store that data as documents. For instance, you may have a financial system with a lot of info in relational fields that is extracted to produce printable documents like monthly statements, invoices, etc.

Or software manuals that are customized for specific configurations using reusable data objects and related document maps instead of maintaining the information as static, hierarchically organized documents.

Or aircraft information that needs to match the configuration of a specific plane or tail number, selected from a complete library of data objects stored centrally.

Or statutes that start formatted as bills, then later appear as enacted laws, then later yet again as published, codified statutes, each with their own formatting and structural peccadilloes.

Or consider a travel guide publisher that collects information on thousands of hotels, restaurants, attractions, and services in dozens of countries and cities. Sure, the content is prepared with the intent of publishing it in a book, but it is easy to see how it can serve other purposes, including providing hotel data to travel-related Web sites, or building specialized, custom booklets for special needs (e.g., a local guide for a conference, guides to historical neighborhoods, etc.).

In these examples of what some might call database publishing, system designers need to ask themselves what would be the best tool for creating and maintaining the information. These are great candidates for a database, some application dialogs and wizards, and some extraction and transformation applications to feed the Web and other platforms for consumption by users. They may not even involve an editor per se, but might rely entirely on a Wiki or other dialog for content creation and editing.
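To make that concrete, here is a minimal sketch of the database publishing pattern in Python, using the standard library and an in-memory SQLite table loosely modeled on the travel-guide example; the table and element names are my own invention, not any real publisher’s schema. The records live in the database, and XML is generated only as an export feed:

    import sqlite3
    import xml.etree.ElementTree as ET

    # Structured records live in a relational store; XML is produced only
    # when a publication or downstream feed needs it.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hotel (name TEXT, city TEXT, rating INTEGER)")
    conn.executemany("INSERT INTO hotel VALUES (?, ?, ?)",
                     [("Hotel Rex", "San Francisco", 4),
                      ("The Standard", "New York", 3)])

    root = ET.Element("hotels")
    for name, city, rating in conn.execute("SELECT name, city, rating FROM hotel"):
        hotel = ET.SubElement(root, "hotel", rating=str(rating))
        ET.SubElement(hotel, "name").text = name
        ET.SubElement(hotel, "city").text = city

    # The database remains the authoritative source for editing; this XML
    # is just one export among many possible ones.
    print(ET.tostring(root, encoding="unicode"))

The same extraction could feed a print composition system or a Web site without changing how the records are created and maintained.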

Word processors require a mix of skills, including domain expertise in the subject being written about, grammar and editing, some formatting and design, use of the software itself, etc. While I personally believe everyone, not just teachers and writers, should be skilled in writing well and making documents look legible and appealing, I realize many folks are best suited for other roles. That is why we divide labor into roles. Domain experts (e.g., lawyers, aircraft engineers, scientists, and doctors) are usually responsible for the accuracy and quality of the ideas and information, while editorial and product support people clean up the writing and formatting and make it presentable. So, for domain experts, it may be more efficient to provide a tool that manages only the content creation, structuring, linking, and organization, with limited word processing capabilities, and leave the formatting to the system, another department, or automated style sheets.
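As a small illustration of handing formatting off to an automated style sheet, here is a hedged sketch in Python using the third-party lxml library; the note schema and the trivial XSLT are invented for the example. The domain expert writes only the structured content, and a transform maintained elsewhere decides how it looks:

    from lxml import etree  # third-party: pip install lxml

    # Content as the domain expert writes it: structure, no formatting.
    content = etree.fromstring(
        "<note><title>Valve torque</title>"
        "<body>Torque the valve to 12 Nm before sealing.</body></note>")

    # Formatting lives in a style sheet owned by someone else entirely.
    stylesheet = etree.XML(
        '<xsl:stylesheet version="1.0" '
        'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
        '<xsl:template match="/note">'
        '<html><body>'
        '<h1><xsl:value-of select="title"/></h1>'
        '<p><xsl:value-of select="body"/></p>'
        '</body></html>'
        '</xsl:template>'
        '</xsl:stylesheet>')

    transform = etree.XSLT(stylesheet)
    print(str(transform(content)))  # HTML the author never had to touch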

In my mind, a Wiki is a combination of text functionality and database management features that allows content to be created and managed in a broader Web content platform (which may also include static pages, search interfaces, pictures, PDFs, etc.). In this model, the Web is the primary delivery target and printing is secondary. Domain experts are not bothered with concepts like page layout, running heads, table-of-contents generation, or justification and hyphenation, much to their delight!

I am bullish on Wikis as content creation and management tools, even when the content is destined for print. I have seen some that hide much of the structure and technical “connective tissue” from the author, but produce well formatted, integrated information. The blogging tool I am using to create this article is one example of a Wiki-like interface with a few bells and whistles for adding structure (e.g., keywords) dedicated to a specific content creation purpose. It only slightly emulates word processing, with limited formatting tools, but is loaded with other features designed to improve my blog entries. For instance, I can pick a keyword from a controlled taxonomy in a pull-down list. And all within a Web browser, not a fat client editor package. This tool is optimized for making blog content, but not for, let’s say, scientific papers or repair manuals. It is targeted at a specific class of users: bloggers. Similarly, XML editors as we have come to know them are more adept at creating documents and document chunks than other interfaces.

Honestly, on more than one occasion I have pounded a nail with a wrench, or tightened a bolt with the wrong kind of pliers. Usually I get the same results, but sometimes it takes longer or has a less desirable result than if I had used a more appropriate tool. The same is true for editing tools.

On a final note, forgive me if I make a gratuitous plug, but authoring approaches and tools will be the subject of a panel I am chairing at the Gilbane San Francisco conference in early June, if you want to hear more.

XML Communities on LinkedIn

Just spent an excessive number of hours perusing the XML and related community groups on various Web community sites. There are several social community tools and sites, but LinkedIn (http://www.linkedin.com/) alone seems to have dozens of community groups that at least mention XML in their descriptions, and several have it prominently in their names and logos. I joined several. Let’s see what happens. Stay tuned…

On Stimulating Open Data Initiatives

Yesterday the big stimulus bill cleared the conference committee that resolves the Senate and House versions. If you remember your civics, that means it is likely to pass in both chambers and then be signed into law by the president.

Included in the bill are billions of dollars for digitizing important information such as medical records and government information. Wow! That is a lot of investment! The thinking is that inaccessible information locked in paper or proprietary formats costs us billions each year in productivity. Wow! That’s a lot of waste! Also, access to the information could spawn billions of dollars of new products and services, and therefore income and tax revenue. Wow! That’s a lot of growth!

Many agencies and offices have striven to expose useful official information and reports at the federal and state level. Even so, a lot of data is still locked away, incomplete, or in difficult-to-use forms. A while ago, a Senate official told me that they do not maintain a single, complete, accurate, official copy of the US Statutes internally. Even if this is no longer true, the public often relies on the “trusted” versions that are available only through paid online services. Many other data types, like many medical records, exist only on paper.

There are a lot of challenges, such as security and privacy issues, and even intellectual property rights issues. But there are a lot of opportunities too. Thousands of data sources that could be tapped are currently locked in paper or proprietary formats.

I don’t think the benefits will come at the expense of commercial services already selling this publicly owned information as some may fear. These online sites provide a service, often emphasizing timeliness or value adds like integrating useful data from different sources, in exchange for their fees. I think a combination of free government open data resources and delivery tools, plus innovative commercial products will emerge. Maybe some easily obtained data may become commoditized, but new ways of accessing and integrating information will emerge. The big information services probably have more to fear from startups than from free government applications and data.

As it happens, I saw a demo yesterday of a tool that took all the activity of a state legislature and unified it under one portal. This allows people to track a bill and all related activity in a single place. For free! The bill working its way through both chambers is connected to related hearing agendas and minutes, which are connected to schedules, with status and other information captured in a concise, dashboard-like screen format (there are other services you can pay for, which fund the site). Each information component came from a different office and was originally in its own specialized format. What we were really looking at was a custom data integration application, done with AJAX technology, integrating heterogeneous data in a unified view. Very powerful, and yet scalable. The key to its success was strong integration of data: the connections used to tie the information together. The vendor collected and filtered the data, converted it to a common format, and added the linkage and relationship information to provide an integrated view into the data. All source data is stored separately and maintained by different offices. Five years ago it would have been a lot more difficult to create this service. Technology has advanced, and the data are increasingly available in manageable forms.
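The integration pattern behind such a portal can be sketched in a few lines of Python; the feeds, field names, and bill numbers below are invented for illustration. Each office publishes its data in its own shape, and the aggregator joins everything on a shared key to build the unified view:

    # Three feeds from three different offices, each in its own shape.
    bills = [{"bill": "SB 101", "title": "Open Records Act"}]
    hearings = [{"bill_id": "SB 101", "room": "214", "date": "2009-03-02"}]
    status = {"SB 101": "In committee"}

    # Join everything on the shared key (the bill number) for a dashboard view.
    unified = {}
    for b in bills:
        key = b["bill"]
        unified[key] = {
            "title": b["title"],
            "status": status.get(key, "unknown"),
            "hearings": [h for h in hearings if h["bill_id"] == key],
        }

    print(unified["SB 101"])  # one view, many separately maintained sources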

The government produces a lot of information that affects us daily and that we, as taxpayers and citizens, actually own, but have limited or no access to. This includes statutes and regulations, court cases, census data, scientific data and research, agricultural reports, SEC filings, FDA drug information, taxpayer publications, forms, patent information, health guidelines, and on and on. The list is really long; I am not even scratching the surface! It also includes more interactive and real-time data, such as geological and water data, weather information, and the status of regulation and legislation changes (like reporting on the progress of the stimulus bill as it worked its way through both chambers). All of these can be made more current, expanded for more coverage, integrated with related materials, and validated for accuracy. There are also new opportunities to open up the process by using forums and social media tools to collect feedback from constituents and experts (like the demo mentioned above). Social media tools may both give people an avenue to express their ideas to their elected officials and serve as a collection tool to gather raw data that can be analyzed for trends and statistics, which in turn becomes new government data that we can use.

IMHO, this investment in open government data is a powerful catalyst that could actually create or change many jobs and business models. If done well, it could provide significant positive returns, streamline government, open access to more information, and enable new and interesting products and applications.

Should you Migrate from SGML to XML?

An old colleague of mine from more than a dozen years ago found me on LinkedIn today. And within five minutes we got caught up after a gap of several years. I know, reestablishing lost connections happens all the time on social media sites. I just get a kick out of it every time it happens. But this is the XML blog, not the social media one, so…

My colleague works at a company that has been using SGML and XML technology for more than 15 years. Their data is still in SGML. They feel they can always export to XML, and they do not plan to migrate their content and applications to XML any time soon. The funny thing was that he was slightly embarrassed about still being in SGML.

Wait a minute! There is no reason to think SGML is dead and has to be replaced. Not in general. Maybe for specific applications a business case supports the upgrade, but it doesn’t have to happen every time. Not yet.

I know of several organizations that still manage data in the SGML applications they developed years ago. Early adopters, like several big publishers, some state and federal government applications, and financial systems, were developed when there was only one choice. SGML, like XML, is a structured format. They are very, very similar, and one format can be used to create the other very easily. These organizations have already sunk their investment into developing the SGML system and data, as well as training their users in its use. The incremental benefits of moving to XML do not support the costs of the migration. Not yet.

This brings up my main point: structured data can be managed in many forms, including XML, SGML, XHTML, databases, and probably others. The data may be structured, following rules for hierarchy, occurrence, data typing, etc., yet not be managed as XML, only exported as XML when needed. My personal opinion is that XML stored in databases provides one of the best combinations of structured content management features, but different business needs suggest a variety of approaches may be suitable. Flat files stored in folders and formatted in old school SGML might still be enough and not warrant migration. Then again, it depends on the environment and the business objectives.

When XML first came out, someone quipped that SGML stood for “Sounds Good, Maybe Later” because it was more expensive and difficult to implement. XML is more Web aware and somewhat more clearly defined, so tools operate more consistently. Many organizations that felt SGML could not be justified were later able to justify migrating to XML. Others migrated right away to take advantage of the new tools or related standards. XML also eliminates some features of SGML that never seemed to work right, and it demands well-formed data, which reduces ambiguity and simplifies a few things. And tools have come a long way and are much more numerous, as expected.
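The well-formedness difference is easy to demonstrate with a short Python sketch using the standard library parser. The SGML-style fragment relies on tag omission that a DTD-aware SGML parser could resolve; an XML parser simply refuses to guess:

    import xml.etree.ElementTree as ET

    # SGML markup minimization: with a suitable DTD, end tags may be omitted.
    sgml_style = "<chapter><title>Overview<para>End tags omitted."
    # XML well-formedness: every element must be closed explicitly.
    well_formed = ("<chapter><title>Overview</title>"
                   "<para>Every element closed.</para></chapter>")

    ET.fromstring(well_formed)       # parses cleanly
    try:
        ET.fromstring(sgml_style)    # raises: the structure is ambiguous
    except ET.ParseError as err:
        print("rejected:", err)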

XML is definitely more successful in terms of number and range of applications and XML adoption is an easier case to make today than SGML was back in the day. But many existing SGML applications still have legs. I would not suggest that a new application start off with SGML today, but I might modify the old saying to “Sounds Good, Migrate Later”.

So, when is it a good idea to migrate from SGML to XML? There are many tools available that do things with XML data better than they do with other structured forms. Many XML tools support SGML as well, but DBMS systems can now manage content as an XML data type and use XPath nodes in processing. Wikis and other tools can produce XML content and utilize other standards based on XML, but not SGML that I am aware of. If you want to take advantage of the features of Web or XML tools, you might want to start planning your migration. But if your system is operational and stable, the benefits might not yet justify the investment and disruption of migrating. Not yet!
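As a taste of that XPath-style processing, here is a minimal Python sketch using the limited XPath subset in the standard library; the manual document is invented for the example. A DBMS with an XML data type could evaluate a similar path expression on the server side:

    import xml.etree.ElementTree as ET

    doc = ET.fromstring(
        "<manual>"
        "<chapter id='c1'><title>Setup</title></chapter>"
        "<chapter id='c2'><title>Repair</title></chapter>"
        "</manual>")

    for title in doc.findall(".//chapter/title"):
        print(title.text)                         # Setup, then Repair

    hit = doc.find(".//chapter[@id='c2']/title")  # predicate on an attribute
    print(hit.text)                               # Repair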

XML in Everyday Things

If you didn’t follow the link below to Bob DuCharme’s response to my January 13 posting on Why it is Difficult to Include Semantics in Web Content, you should read it. Bob does a great job describing the tools in use for including semantics in Web content. Bob is a very smart guy. I like to think the complexity of his answer is a good illustration of my point that adding semantics is not easy. In any case, his response is clearly worth reading and can be found at http://www.snee.com/bobdc.blog/2009/01/publishers-and-semantic-web-te.html.

Also, I have known Bob for some time. I am reminded that a while back he wrote an interesting article about XML data produced by his TiVo device (see http://www.xml.com/pub/a/2006/02/15/hacking-the-xml-in-your-tivo.html). I was intrigued how XML had begun to pop up in everyday things.

Ever since that TiVo article, I think of Bob every time XML pops up in unexpected everyday places (it’s better than associating him with a trauma). Once in a while I get a glimpse of XML data in a printer control file, in Web page source code, or as an export format for some software, but that sort of thing is to be expected. We all have seen examples at work or in commercial settings, but to find XML data at home in everyday devices and applications has always warmed my biased heart.

Recently I was playing a game of Sid Meier’s Civilization IV (all work and no play and so on….) and I noticed, while it was booting up a game, a message that said “Reading XML Files”. My first thought was “Bob would like to see this!” Then I was curious to see how XML was being used in game software. A quick Google search turned up, as the first entry, the Wikipedia article (http://en.wikipedia.org/wiki/Civilization_IV#cite_note-10), which says “More game attributes are stored in XML files, which must be edited with an external text editor or application.” Apparently you can “tweak simple game rules and change or add content. For instance, they can add new unit or building types, change the cost of wonders, or add new civilizations. Players can also change the sounds played at certain times or edit the play list for your soundtrack.”

I poked around in the directories and found schemas describing game units, events, etc., and configuration data instances describing artifacts and activities used in the game. A user could, for instance, make buying a specific building very cheap, or have the game play their favorite music instead of what comes with it. That is, if they know how to edit XML data. I think I just found a way to add many hours of enjoyment to an already great game.
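In that spirit, here is a hypothetical sketch of such a tweak in Python; the element names are invented stand-ins for the game’s actual schema, so treat it as an illustration of the technique rather than a working mod:

    import xml.etree.ElementTree as ET

    # Invented stand-ins for the game's real configuration files.
    config = ET.fromstring(
        "<BuildingInfos>"
        "<BuildingInfo><Type>GRANARY</Type><iCost>60</iCost></BuildingInfo>"
        "</BuildingInfos>")

    for building in config.iter("BuildingInfo"):
        building.find("iCost").text = "1"  # make the building nearly free

    print(ET.tostring(config, encoding="unicode"))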

I wonder how much everyday XML is out there just waiting for someone to tweak it and optimize it to make something work better. A thermostat, a refrigerator, or a television perhaps.

Will XML Help this President?

I’m watching the inauguration activity today, all day (not getting much work done), and getting caught up in the optimism and history of it all. And what does this have to do with XML, you ask? It’s a stretch, but I am giddy from the festivities, so bear with me please. I think there is a big role for XML and structured technologies in this paradigm shift, albeit with XML quietly doing its thing in the background as always.

In 1986, when SGML, XML’s precursor, was being developed, I worked for the IRS in Washington. I was green, right out of college. My boss, Bill Davis, said I should look into this SGML stuff. I did. I was hooked. It made sense. We could streamline the text applications we were developing. I helped write the first DTD in the executive branch (the first real government one was the ATOS DTD from the US Air Force, but that was developed slightly before the SGML standard was confirmed, so we always felt we were pretty close to creating the actual first official DTD in the federal government). Back then we were sending tax publications and instructions to services like CompuServe and BRS, each with their own data formats. We decided to adopt structured text technology and single source publishing to make data available in SGML to multiple distribution channels. And this was before the Web. That specific system has surely been replaced, but it saved time and enabled us to improve our service to taxpayers. We thought the approach was right for many government applications and should be repeated by other agencies.

So, back to my original point. XML has replaced SGML and is now being used for many government systems, including electronic submission of SEC filings, FDA applications, and the management of many government records. XML has been mentioned as a key technology in the overhaul that is needed in the way the government operates. Obama also plans to create a cabinet-level CTO position, part of whose mission will be to promote inter-agency cooperation through interchange of content and data between applications formatted in a common taxonomy. He also intends to preserve the open nature of the internet and its content, facilitate publishing important government information and activities on the Web in open formats, and enhance the national information system infrastructure. Important records are being considered for standardization, such as health and medical records, as well as many other ways we interact with the government. More info can be found in the administration’s published technology plan. Sounds like a job, at least in part, for XML!

I think it is great and essential that our leaders understand the importance of smartly structured data. There is already a lot of XML expertise throughout the various government offices, as well as a strong spirit of cooperation on which we can build. Anyone who has participated in industry schema application development, or other common vocabulary design efforts, knows how hard it is to create a “one-size-fits-all” data model. I was fortunate enough to participate briefly in the development and implementation of SPL, the Structured Product Labeling schema (see http://www.fda.gov/oc/datacouncil/spl.html) for FDA drug labels, which are submitted to the FDA for approval before a drug product can be sold. This is a very well defined document type that has been in use for years. It still took many months and masterful consensus building to finalize this one schema. And it is just one small piece in the much larger information architecture. It was a lot of effort from many people within and outside the government. But now it is in place, working and being used.

So, I am bullish on XML in the government these days. It is a mature, well understood, powerful technology with wide adoption, and there are many established civilian and defense examples across the government. I think there is a very big role for XML and related technology in the aggressive, sweeping change promised by this administration. Even so, these things take time.
