Curated for content, computing, and digital experience professionals

Month: January 2009

XML in Everyday Things

If you didn’t follow the link below to Bob DuCharme’s response to my January 13 posting on Why it is Difficult to Include Semantics in Web Content, you should read it. Bob does a great job describing tools in use to include semantics in Web content. Bob is a very smart guy. I like to think the complexity of his answer is a good illustration of my point that adding semantics is not easy. Anyway, his response is clearly worth reading and can be found at http://www.snee.com/bobdc.blog/2009/01/publishers-and-semantic-web-te.html.

Also, I have known Bob for some time. I am reminded that a while back he wrote an interesting article about XML data produced by his TiVo device (see http://www.xml.com/pub/a/2006/02/15/hacking-the-xml-in-your-tivo.html). I was intrigued by how XML had begun to pop up in everyday things.

Ever since that TiVo article, I think of Bob every time XML pops up in unexpected everyday places (it’s better than associating him with a trauma). Once in a while I get a glimpse of XML data in a printer control file, in Web page source code, or as an export format for some software, but that sort of thing is to be expected. We all have seen examples at work or in commercial settings, but to find XML data at home in everyday devices and applications has always warmed my biased heart.

Recently I was playing a game of Sid Meier’s Civilization IV (all work and no play and so on….) and I noticed while it was booting up a game that one of the messages said “Reading XML FIles”. My first thought was “Bob would like to see this!” Then I was curious to see how XML was being used in game software. A quick Google search and the first entry, from Wikipedia (http://en.wikipedia.org/wiki/Civilization_IV#cite_note-10), says “More game attributes are stored in XML files, which must be edited with an external text editor or application.” Apparently you can “tweak simple game rules and change or add content. For instance, they can add new unit or building types, change the cost of wonders, or add new civilizations. Players can also change the sounds played at certain times or edit the play list for your soundtrack.”

I poked around in the directories and found schemas describing game units, events, etc. and configuration data instances describing artifacts and activities used in the game. A user could, if they wanted to, make buying a specific building very cheap, for instance, or have the game play their favorite music instead of what comes with the game. That is, if they know how to edit XML data. I think I just found a way to add many hours of enjoyment to an already great game.
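To give a feel for what such a tweak might look like, here is a minimal Python sketch that lowers the cost of one building in a small XML fragment. The element names (Buildings, Building, Name, Cost) and the values are invented for illustration and are not taken from the actual Civilization IV files, which may nest things differently or declare namespaces that would require adjusting the lookups.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element names and values are made up for this
# example and do not reflect the real game files.
SAMPLE = """\
<Buildings>
  <Building>
    <Name>Granary</Name>
    <Cost>60</Cost>
  </Building>
  <Building>
    <Name>Library</Name>
    <Cost>90</Cost>
  </Building>
</Buildings>
"""


def set_building_cost(xml_text, name, new_cost):
    """Return the XML text with the cost of the named building replaced."""
    root = ET.fromstring(xml_text)
    for building in root.findall("Building"):
        # Match on the building's name, then overwrite its Cost element.
        if building.findtext("Name") == name:
            building.find("Cost").text = str(new_cost)
    return ET.tostring(root, encoding="unicode")


updated = set_building_cost(SAMPLE, "Library", 10)
print(updated)
```

Anyone trying this on real game data would want to work on a copy of the file, since a malformed edit could keep the game from loading.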

I wonder how much everyday XML is out there just waiting for someone to tweak it and optimize it to make something work better. A thermostat, a refrigerator, or a television perhaps.

Podcast on Structured Content in the Enterprise

Traditionally, the idea of structured content has always been associated with product documentation, but this is beginning to change. Featuring Bill Trippe, Lead Analyst at The Gilbane Group, and Bruce Sharpe, XMetaL Founding Technologist at JustSystems, a brand new podcast on The Business Value of Structured Content takes a look into why many companies are beginning to realize that structured content is more than just a technology for product documentation – it’s a means to add business value to information across the whole enterprise. 

From departmental assets such as marketing website content, sales training materials, or technical support documents, structured content can be used to grow revenue, reduce costs, and mitigate risks, ultimately leading to an improved customer experience.  

Listen to the podcast and gain important insight on how structured content can

  • break through the boundaries of product documentation
  • help organizations meet high user expectations for when and where they can access content
  • prove to be especially valuable in our rough economic times
  • …and more!

Gilbane San Francisco pre-conference workshops posted

The main conference program for Gilbane San Francisco 2009 will be published in a week or two, but the 1/2 day pre-conference workshop descriptions for June 2nd have been posted:

  • How to Select a Web Content Management System
    Instructor: Seth Gottlieb, Principal, Content Here
  • Making SharePoint Work in the Enterprise
    Instructor: Shawn Shell, Principal, Consejo, Inc.
  • Managing the Web: The Fundamentals of Web Operations Management
    Instructor: Lisa Welchman, Founding Partner, Welchman Pierpoint
  • Getting Started with Business Taxonomy Design
    Instructors: Joseph A. Busch, Founder and Principal, & Ron Daniel, Principal, Taxonomy Strategies LLC
  • Sailing the Open Seas of New Media
    Instructor: Chris Brogan, President, New Marketing Labs, LLC

Day Software Unveils Cloud-Ready CRX Content Infrastructure for Web 2.0 Applications

Day Software (SWX:DAYN)(OTCQX:DYIHY) announced the availability of three new licensed editions of CRX, Day’s JSR-170-compliant Java Content Repository (JCR). These new editions make it simpler for companies to standardize on an enterprise-ready content infrastructure based on Day’s commercial implementation of Apache Jackrabbit and Apache Sling. This release is enhanced by a licensing model that promotes adoption by individual developers, departments and global enterprises. Day now offers three targeted editions of its content infrastructure platform:

  • CRX One: CRX One is a new version of CRX licensed for use to power a single Web application. CRX One is Day’s entry-level CRX offering, available directly from Day’s Web site for an annual subscription license fee of US$18,500 per server instance, regardless of the number of CPUs.
  • CRX Developer: CRX Developer is a limited license version of CRX available free of charge for Apache Jackrabbit and Apache Sling developers directly from Day’s Web site. Web developers can use CRX Developer at no cost under an annual renewal license for building and testing new CRX-based content applications.
  • CRX Enterprise: CRX Enterprise is Day’s premier CRX offering for use in powering multiple Web applications. CRX Enterprise is targeted for IT departments looking to consolidate disparate enterprise content repositories under a single, shared cluster of CRX. Organizations can seamlessly update their CRX One licenses to CRX Enterprise to host multiple applications without installing or managing new software. CRX Enterprise is offered under a perpetual license model that starts at US$50,000 per server instance.

Day CRX Developer is available free of charge immediately from Day’s Web site at http://www.day.com

Churning in the Search Sector – Two BIG Events in One Week

Analysts have been projecting major consolidation in the enterprise search marketplace for a couple of years. What is interesting to me is how slowly this is evolving. For every merger or acquisition, whether small or large (acquisition of Mondosoft by SurfRay or FAST by Microsoft), other companies emerge or evolve with diverse and potentially competitive technologies (e.g. Attivio, Connotate, Expert System, EyeAlike, Truevert, Temis).

We have seen companies like Exalead, ISYS, and Vivisimo gain on former leaders. Microsoft is often listed as an industry leader because it acquired former leader FAST while companies with solid products for verticals, like Recommind in law and financial services, are often overlooked because they lack the total company revenues of a Microsoft that sells a lot more software than enterprise search.

This past week two industry news items caused me to reflect on the potential impact of announcements that, while not surprising, can upset the plans of buyers of search technology. The first was the announcement that Autonomy is planning to acquire Interwoven. That Interwoven is being acquired is no surprise, since the company was being groomed for acquisition. However, this appears to be the first instance of a “search” company acquiring a “content management/document management” company. The norm has been that search companies get bought to fill a need by ECM or CMS vendors. For anyone planning to procure Interwoven because of its embedded Vivisimo Velocity for Universal search in its Worksite product, this does put a wrinkle in the fabric. That is a shame, because it will be a while before the actual impact is really known, and the uncertainty could slow sales. The cost to buyers having to accept Autonomy’s IDOL instead of Velocity could be significant. The effect could be on both licensing and deployment because Velocity has been an efficient install for most enterprises. Autonomy faces a big ramp-up to shift from being a search company to becoming an ECM supplier, and some will take a wait-and-see attitude, regardless of the IDOL reputation.

The second big announcement, of course, is the departure from Microsoft of John Marcus Lervik, a co-founder of FAST and recently named Executive VP in a newly created position for Enterprise Search at Microsoft. I’m sure you’ll be seeing plenty about the reasons elsewhere. However, the situation for those buyers who are depending on FAST’s search technology being integrated sooner rather than later into Microsoft’s offerings has just become more complicated, as one of the original leaders of FAST is leaving the team.

Two years ago I commented to FAST executives about the need for vendors on a rapid growth path to make the buying, business and support experience for customers a priority, beyond technology enhancements; so, I take little consolation in seeing this turmoil. If you are a buyer, take a good hard look behind the technology to see what else you will be dealing with as you make plans to acquire software.

Webinar: Making the Business Case for SaaS WCM

Updated April 9, 2009: View the recorded webinar.
January 27, 2009, 2:00 pm ET

When customer experience becomes increasingly important even as budgets are tightening, the SaaS value proposition–faster time to results, reduced dependency on IT resources, predictable costs–can be especially compelling. If your organization wants or needs to move ahead with web business initiatives in today’s uncertain economic climate, you’re probably investigating software-as-a-service solutions for web content management.

But SaaS WCM is fundamentally different from licensing software (open source or proprietary) and installing it on your own servers, which means the process of evaluating solutions is different. It’s not all apples when SaaS is on the short list, but rather apples and oranges. This webinar explores the implications for technology acquisition. How do you make a business case that enables your organization to fairly evaluate all options and make the best decision for the business?

Join us in a lively discussion with Robert Carroll from Clickability. Register today. Presented by Gilbane. Sponsored by Clickability. Based on a new Gilbane Beacon entitled Communicating SaaS WCM Value.

Open Government Initiatives will Boost Standards

Following on Dale’s inauguration day post, Will XML Help this President?, we have today’s invigorating news that President Obama is committed to more Internet-based openness. The CNET article highlights some of the most compelling items from the two memos, but I am especially heartened by this statement from the memo on the Freedom of Information Act (FOIA):

I also direct the Director of the Office of Management and Budget to update guidance to the agencies to increase and improve information dissemination to the public, including through the use of new technologies, and to publish such guidance in the Federal Register.

The key phrases are "increase and improve information dissemination" and "the use of new technologies." This is in keeping with the spirit of the FOIA–the presumption is that information (and content) created by or on behalf of the government is public property and should be accessible to the public. This means that the average person should be able to easily find government content and be able to readily consume it–two challenges that the content technology industry grapples with every day.

The issue of public access is in fact closely related to the issue of long-term archiving of content and information. One of the reasons I have always been comfortable recommending XML and other standards-based technology for content storage is that the content and data would outlast any particular software system or application. As the government looks to make government more open, they should and likely will look at standards-based approaches to information and content access.

Such efforts will include core infrastructure, including servers and storage, but also a wide array of supporting hardware and software falling into three general categories:

  • Hardware and software to support the collection of digital material. This includes hardware and software for digitizing and converting analog materials, software for cataloging digital materials with the inclusion of metadata, hardware and software to support data repositories, and software for indexing the digital text and metadata.
  • Hardware and software to support the access to digital material. This includes access tools such as search engines, portals, catalogs, and finding aids, as well as delivery tools allowing users to download and view textual, image-based, multimedia, and cartographic data.
  • Core software for functions such as authentication and authorization, name administration, and name resolution.

Standards such as PDF/A have emerged to give governments a ready format for long-term archiving of routine government documents. But a collection of PDF/A documents does not in and of itself equal a useful government portal. There are many other issues of navigation, search, metadata, and context left unaddressed. This is true even before you consider the wide range of content produced by the government–pictorial, audio, video, and cartographic data are obvious–but also the wide range of primary source material that comes out of areas such as medical research, energy development, public transportation, and natural resource planning.

President Obama’s directives should lead to interesting and exciting work for content technology professionals in the government. We look forward to hearing more.

Taxonomy and Glossaries for Enterprise Search Terminology

Two years ago when I began blogging for the Gilbane Group on enterprise search, the extent of my vision was reflected in the blog categories I defined and expected to populate with content over time. They represented my personal “top terms” that were expected to each have meaningful entries to educate and illuminate what readers might want to know about search behind the firewall of enterprises.

A recent examination of those early decisions showed me where there are gaps in content, perhaps reflecting that some of those topics were:

  • Not so important
  • Not currently in my thinking about the industry
  • OR Not well defined

I also know that on several occasions I couldn’t find a good category in my list for a blog I had just written. Being a former indexer and heavy user of controlled vocabularies, on most occasions I resisted the urge to create a new category and found instead the “best fit” for my entry. I know that when the corpus of content or domain is small, too many categories are useless for the reader. But now, as I approach 100 entries, it is time to reconsider where I want to go with blogging about enterprise search.

In the short term, I am going to try to provide entries for scantily covered topics because I still think they are all relevant. I’ll probably add a few more along the way or perhaps make some topics a little more granular.

Taxonomies are never static, and require periodic review, even when the amount of content is small. Taxonomists need to keep pace with current use of terminology and target audience interests. New jargon creeps in, although I prefer to use generic terms that are broadly understood in the technology and business world.

That gives you an idea of some of my own taxonomy process. To add to the entries on terminology (definitions) and taxonomies, I am posting a glossary I wrote for last year’s report on the enterprise search market and recently updated for the Gilbane Workshop on taxonomies. While the definitions were all crafted by me, they are validated through the heavy use of the Google “define” feature. If you aren’t already a user, you will find it highly useful when trying to pin down a definition. At the Google search box, simply type define: xxx xxx (where xxx represents a word or phrase for which you seek a definition). Google returns all the public definition entries it finds on the Internet. My definitions are then refined based on what I learn from a variety of sources I discover using this technique. It’s a great way to build your knowledge-base and discover new meanings.
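For anyone who wants to run these lookups in bulk, a query like the one described above can also be expressed as a URL. The short Python sketch below only builds the search URL for a term. The function name is my own invention for illustration, and it assumes the standard Google search address with a q parameter. It does not fetch or parse any results.

```python
from urllib.parse import urlencode


def define_query_url(term):
    """Build a Google search URL for a `define:` query on the given term."""
    # urlencode escapes the colon and spaces so the query survives in a URL.
    return "https://www.google.com/search?" + urlencode({"q": f"define: {term}"})


url = define_query_url("controlled vocabulary")
print(url)
```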

Glossary Taxonomy and Search-012009


© 2020 The Gilbane Advisor
