Category: Web technologies & information standards

Here we include topics related to information exchange standards, markup languages, supporting technologies, and industry applications.

Winds of Change at Tools of Change

O’Reilly’s Tools of Change conference in New York City this week was highly successful, both inside and outside the walls of the Marriott Marquis. The sessions were energetic, well attended, and, on the whole, full of excellent insight and ideas about the digital trends taking firm hold of nearly all sectors of the publishing business. Outside the walls, especially on Twitter, online communities were humming with news and commentary on the conference. (You almost could have followed the entire conference just by following the #toc hashtag on Twitter and accessing the online copies of the presentations.)

But if you had done that, you would have missed the fun of being there. There were some superb keynotes and some excellent general sessions. Notable among the keynotes were Tim O’Reilly himself, Neelan Choksi from Lexcycle (Stanza), and Cory Doctorow. The general sessions covered a fairly broad spectrum of topics but were heavy on eBooks and community. Because of my own and my clients’ interests, I spent most of my time in the eBook sessions. The session eBooks I: Business Models and Strategy was content-rich. To begin with, you heard straight from senior people at major publishers with significant eBook efforts (Kenneth Brooks from Cengage Learning, Leslie Hulse from HarperCollins Publishers, and Cynthia Cleto from Springer Science+Business Media). Along with their insight, the speakers, with moderator Michael Smith from IDPF, assembled an incredibly valuable wiki of eBook business and technical material to back up their talk. I also really enjoyed a talk from Gavin Bell of Nature, The Long Tail Needs Community, in which he made a number of thoughtful points about how publishers need to think longer and harder about how reading engages and changes people, and specifically how a publisher can build community around those changes and activities.

There were a few soft spots in the schedule. Jeff Jarvis’ keynote, What Would Google Do with Publishing?, was more about plugging his new book (What Would Google Do?) than anything else, and it was also weirdly out of date, even though the book is hot off the presses, with 20th-century points like “The link changes everything” and “If you’re not searchable, you won’t be found.” (Publishers are often, somewhat unfairly, accused of being Luddites, but they are not that Luddite.) There were also a couple of technical speakers who didn’t make the necessary business connections to the technical points they were presenting, connections that would have been helpful to those members of the audience who were less technical and more oriented to publishing products and processes. But these small weaknesses were easily outshone by the many high points, the terrific overall energy, and the clear enthusiasm of the attendees.

One question I have for the O’Reilly folks is how they will keep the energy going. They have a nascent Tools of Change community site. Perhaps they could enlist some paid community managers to seed and moderate conversations, and tie community activities to other O’Reilly products such as the books and other live and online events.

O’Reilly has quickly established a strong conference and an equally strong brand around it. With the publishing industry so engulfed in digital change, I have to think this kind of conference and community can only continue to grow.

On Stimulating Open Data Initiatives

Yesterday the big stimulus bill cleared the conference committee that resolves the Senate and House versions. If you remember your civics, that means it is likely to pass in both chambers and then be signed into law by the president.

Included in the bill are billions of dollars for digitizing important information such as medical records and government information. Wow! That is a lot of investment! The thinking is that inaccessible information locked in paper or proprietary formats costs us billions each year in lost productivity. Wow! That’s a lot of waste! Also, access to the information could spawn billions of dollars of new products and services, and therefore income and tax revenue. Wow! That’s a lot of growth!

Many agencies and offices have striven to expose useful official information and reports at the federal and state levels. Even so, a lot of data is still locked away, incomplete, or in difficult-to-use forms. A Senate official once told me that they do not maintain a single, complete, accurate, official copy of the US Statutes internally. Even if this is no longer true, the public often relies on the “trusted” versions that are available only through paid online services. Many other data types, like many medical records, exist only on paper.

There are a lot of challenges, such as security and privacy issues, and even intellectual property rights issues. But there are a lot of opportunities too. Thousands of data sources that are currently locked in paper or proprietary formats could be tapped.

I don’t think the benefits will come at the expense of the commercial services already selling this publicly owned information, as some fear. These online sites provide a service in exchange for their fees, often emphasizing timeliness or value-adds like integrating useful data from different sources. I think a combination of free government open data resources and delivery tools, plus innovative commercial products, will emerge. Some easily obtained data may become commoditized, but new ways of accessing and integrating information will appear alongside it. The big information services probably have more to fear from startups than from free government applications and data.

As it happens, I saw a demo yesterday of a tool that takes all the activity of a state legislature and unifies it under one portal, allowing people to track a bill and all related activity in a single place. For free! (There are other services you can pay for, which fund the site.) The bill working its way through both chambers is connected to related hearing agendas and minutes, which are connected to schedules, with status and other information captured in a concise, dashboard-like screen format. Each information component came from a different office and was originally in its own specialized format. What we were really looking at was a custom data integration application, built with AJAX technology, that integrates heterogeneous data in a unified view. Very powerful, and yet scalable. The key to its success was strong integration of the data, the connections used to tie the information together. The vendor collected and filtered the data, converted it to a common format, and added the linkage and relationship information to provide an integrated view; all source data is stored separately and maintained by the different offices. Five years ago it would have been a lot more difficult to create this service. Technology has advanced, and the data are increasingly available in manageable forms.
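The pattern is worth sketching in miniature. Below is a minimal sketch of the integration idea; the feeds and field names are invented stand-ins for the separate offices’ data, and the point is simply that each source keeps its own shape while an integration layer normalizes the records and joins them on a shared key, the bill number.

    # Minimal sketch of the integration pattern described above. Each "office"
    # publishes records in its own shape; the integration layer normalizes them
    # and joins on the shared key (the bill number). All names are hypothetical.
    from collections import defaultdict

    bills = [{"bill_no": "HB 101", "title": "Water Rights Act", "status": "In committee"}]
    hearings = [{"bill": "HB 101", "date": "2009-02-20", "room": "Senate 3A"}]
    schedules = [{"bill_id": "HB 101", "chamber": "House", "reading": 2}]

    def unified_view():
        # One record per bill, regardless of which office supplied the pieces.
        view = defaultdict(lambda: {"hearings": [], "schedule": []})
        for b in bills:
            view[b["bill_no"]].update(title=b["title"], status=b["status"])
        for h in hearings:
            view[h["bill"]]["hearings"].append({"date": h["date"], "room": h["room"]})
        for s in schedules:
            view[s["bill_id"]]["schedule"].append(
                {"chamber": s["chamber"], "reading": s["reading"]})
        return dict(view)

    for bill_no, record in sorted(unified_view().items()):
        print(bill_no, record)

The real application of course adds the collection, filtering, and conversion steps, and serves the result through an AJAX front end, but the join on shared identifiers is the heart of it.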

The government produces a lot of information that affects us daily, information that we, as taxpayers and citizens, actually own but have limited or no access to. It includes statutes and regulations, court cases, census data, scientific data and research, agricultural reports, SEC filings, FDA drug information, taxpayer publications, forms, patent information, health guidelines, and on and on. The list is really long; I am not even scratching the surface! It also includes more interactive and real-time data, such as geological and water data, weather information, and the status of regulation and legislation changes (like reporting on the progress of the stimulus bill as it worked its way through both chambers). All of these can be made more current, expanded for more coverage, integrated with related materials, and validated for accuracy. There are also new opportunities to open up the process by using forums and social media tools to collect feedback from constituents and experts (like the demo mentioned above). Social media tools may both give people an avenue to express their ideas to their elected officials and serve as a collection tool for gathering raw data that can be analyzed for trends and statistics, which in turn becomes new government data that we can use.

IMHO, this investment in open government data is a powerful catalyst that could create or change many jobs and business models. If done well, it could provide significant positive returns, streamline government, open access to more information, and enable new and interesting products and applications.

DPCI Announces Partnership with Mark Logic to Deliver XML-Based Content Publishing Solutions

DPCI, a provider of integrated technology solutions for organizations that need to publish content to Web, print, and mobile channels, announced that it has partnered with Mark Logic Corporation to deliver XML-based content publishing solutions. Mark Logic’s product, MarkLogic Server, allows customers to store, manage, search, and dynamically deliver content. Addressing the growing need for XML-based content management systems, DPCI and Mark Logic have been collaborating on several projects, including one that required integration with Amazon’s Kindle reading device. Built specifically for content, MarkLogic Server provides a single solution for search and content delivery that allows customers to build digital content products: from task-sensitive online content delivery applications that place content in users’ workflows to digital asset distribution systems that automate content delivery; from custom publishing applications that maximize content re-use and repurposing to content assembly solutions that integrate content. http://www.marklogic.com, http://www.databasepublish.com

Will Downward eBook Prices Lead to New Sales Models?

UK-based publishing consultant Paul Coyne asked a good question on LinkedIn: Can e-books ever support a secondary (second-hand) market?

I love books. And eBooks. However, many of my books are second hand from booksellers, car-boot sales and friends. How important is this secondary market to books and can ebooks ever really go mainstream without a secondary market? BTW I have no clue how this would work!

I offered the following thoughts…

Great question. The secondary market is incredibly important to the buyer of course, and perhaps a blessing and a curse to the publisher–a blessing because it creates more value in the buyer’s mind and a curse because it slows and eliminates some sales in markets like college and school book publishing.

One of the great ongoing questions about eBooks is price point. There is a growing feeling that they should be very inexpensive compared to their print counterparts, both because of the perception that they are less costly to produce and the reality that there is currently no secondary market. Thus you see Amazon trying to get all Kindle books under $10 (US).

I still like the idea of superdistribution for digital products. By my crude definition (some authoritative links in a moment), a buyer of an eBook would be able to pass along the eBook and gain something from its eventual use by another user. Think of it as me getting a small commission when someone I pass it along to ends up buying it. I guess you could also think of it as a kind of viral sales model.
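To make the mechanics concrete, here is a toy sketch of how a pass-along purchase might settle. The price, the 10% referral rate, and the bookkeeping are all invented for illustration; real superdistribution schemes layer rights management and payment clearing on top of something like this.

    # Toy model of the pass-along commission idea. The price, referral rate,
    # and record format are invented for illustration only.
    PRICE = 9.99
    REFERRAL_RATE = 0.10

    def settle_purchase(buyer, referrer=None):
        """Split revenue when a passed-along eBook is bought by its recipient."""
        commission = round(PRICE * REFERRAL_RATE, 2) if referrer else 0.0
        return {"buyer": buyer, "referrer": referrer,
                "commission": commission,
                "publisher_share": round(PRICE - commission, 2)}

    # Alice passes her copy to Bob; Bob likes it and buys his own.
    print(settle_purchase("bob", referrer="alice"))
    # {'buyer': 'bob', 'referrer': 'alice', 'commission': 1.0, 'publisher_share': 8.99}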

See also:

  • A decent Wikipedia entry on superdistribution.
  • An old but well-written Wired magazine article on superdistribution.

We covered this in a DRM book I cowrote with Bill Rosenblatt and Steve Mooney.

XML in Everyday Things

If you didn’t follow the link below to Bob DuCharme’s response to my January 13 posting, Why it is Difficult to Include Semantics in Web Content, you should read it. Bob does a great job describing tools in use to include semantics in Web content. Bob is a very smart guy, and I like to think the complexity of his answer is a good illustration of my point that adding semantics is not easy. In any case, his response is clearly worth reading and can be found at http://www.snee.com/bobdc.blog/2009/01/publishers-and-semantic-web-te.html.

Also, I have known Bob for some time, and I am reminded that a while back he wrote an interesting article about the XML data produced by his TiVo device (see http://www.xml.com/pub/a/2006/02/15/hacking-the-xml-in-your-tivo.html). I was intrigued by how XML had begun to pop up in everyday things.

Ever since that TiVo article, I think of Bob every time XML pops up in unexpected everyday places (it’s better than associating him with a trauma). Once in a while I get a glimpse of XML data in a printer control file, in Web page source code, or as an export format for some software, but that sort of thing is to be expected. We have all seen examples at work or in commercial settings, but finding XML data at home, in everyday devices and applications, has always warmed my biased heart.

Recently I was playing a game of Sid Meier’s Civilization IV (all work and no play and so on….) and I noticed, while it was booting up, a message that said “Reading XML Files”. My first thought was “Bob would like to see this!” Then I was curious to see how XML was being used in the game software. A quick Google search turned up, as the first entry, the Wikipedia article (http://en.wikipedia.org/wiki/Civilization_IV#cite_note-10), which says “More game attributes are stored in XML files, which must be edited with an external text editor or application.” Apparently players can “tweak simple game rules and change or add content. For instance, they can add new unit or building types, change the cost of wonders, or add new civilizations. Players can also change the sounds played at certain times or edit the play list for your soundtrack.”

I poked around in the directories and found schemas describing game units, events, and so on, along with configuration data instances describing artifacts and activities used in the game. Users could, if they wanted to, make buying a specific building very cheap, for instance, or have the game play their favorite music instead of what comes with it. That is, if they know how to edit XML data. I think I just found a way to add many hours of enjoyment to an already great game.
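To give a flavor of the kind of tweak I mean, a small script along these lines would do it with nothing but Python’s standard library. The file and element names (CIV4BuildingInfos.xml, BuildingInfo, Type, iCost) follow the pattern of the game’s data files but are from memory and may vary by version, so treat them as illustrative, and back up any file before editing it.

    # Illustrative only: the file and element names follow the pattern of the
    # game's XML data but may differ in your install. Back up the file first!
    import xml.etree.ElementTree as ET

    tree = ET.parse("CIV4BuildingInfos.xml")
    for building in tree.iter("BuildingInfo"):
        if building.findtext("Type") == "BUILDING_GRANARY":
            cost = building.find("iCost")
            if cost is not None:
                cost.text = "1"   # make granaries nearly free to build
    tree.write("CIV4BuildingInfos.xml", encoding="utf-8")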

I wonder how much everyday XML is out there just waiting for someone to tweak it and optimize it to make something work better. A thermostat, a refrigerator, or a television perhaps.

Podcast on Structured Content in the Enterprise

The idea of structured content has traditionally been associated with product documentation, but this is beginning to change. Featuring Bill Trippe, Lead Analyst at The Gilbane Group, and Bruce Sharpe, XMetaL Founding Technologist at JustSystems, a brand-new podcast on The Business Value of Structured Content looks at why many companies are beginning to realize that structured content is more than just a technology for product documentation: it is a means of adding business value to information across the whole enterprise.

Across departmental assets such as marketing website content, sales training materials, and technical support documents, structured content can be used to grow revenue, reduce costs, and mitigate risks, ultimately leading to an improved customer experience.

Listen to the podcast and gain important insight into how structured content can

  • break through the boundaries of product documentation
  • help organizations meet high user expectations for when and where they can access content
  • prove to be especially valuable in our rough economic times
  • …and more!

Open Government Initiatives will Boost Standards

Following on Dale’s inauguration day post, Will XML Help this President?, we have today’s invigorating news that President Obama is committed to more Internet-based openness. The CNET article highlights some of the most compelling items from the two memos, but I am especially heartened by this statement from the memo on the Freedom of Information Act (FOIA):

I also direct the Director of the Office of Management and Budget to update guidance to the agencies to increase and improve information dissemination to the public, including through the use of new technologies, and to publish such guidance in the Federal Register.

The key phrases are "increase and improve information dissemination" and "the use of new technologies." This is in keeping with the spirit of the FOIA: the presumption is that information (and content) created by or on behalf of the government is public property and should be accessible to the public. This means that the average person should be able to easily find government content and readily consume it, two challenges that the content technology industry grapples with every day.

The issue of public access is in fact closely related to the issue of long-term archiving of content and information. One of the reasons I have always been comfortable recommending XML and other standards-based technologies for content storage is that the content and data will outlast any particular software system or application. As the administration looks to make government more open, it should, and likely will, look at standards-based approaches to information and content access.

Such efforts will include core infrastructure, including servers and storage, but also a wide array of supporting hardware and software falling into three general categories:

  • Hardware and software to support the collection of digital material. This includes hardware and software for digitizing and converting analog materials, software for cataloging digital materials and adding metadata, hardware and software to support data repositories, and software for indexing the digital text and metadata.
  • Hardware and software to support the access to digital material. This includes access tools such as search engines, portals, catalogs, and finding aids, as well as delivery tools allowing users to download and view textual, image-based, multimedia, and cartographic data.
  • Core software for functions such as authentication and authorization, name administration, and name resolution.

Standards such as PDF/A have emerged to give governments a ready format for long-term archiving of routine government documents. But a collection of PDF/A documents does not in and of itself equal a useful government portal; many other issues of navigation, search, metadata, and context are left unaddressed. This is true even before you consider the wide range of content produced by the government. Pictorial, audio, video, and cartographic data are obvious examples, but there is also the wide range of primary source material that comes out of areas such as medical research, energy development, public transportation, and natural resource planning.
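Metadata is the most tractable of those gaps, and the standards already exist. As one small illustration, here is a sketch that emits a minimal Dublin Core catalog record for an archived PDF/A document; Dublin Core is a real, widely used metadata vocabulary, but the document, values, and identifier here are invented.

    # Sketch: a minimal Dublin Core record for one archived PDF/A document.
    # The element set is the real DC standard; the values are invented.
    import xml.etree.ElementTree as ET

    DC = "http://purl.org/dc/elements/1.1/"
    ET.register_namespace("dc", DC)

    record = ET.Element("record")
    for name, value in [("title", "FY2009 Water Quality Report"),
                        ("creator", "Environmental Protection Agency"),
                        ("date", "2009-02-01"),
                        ("format", "application/pdf"),
                        ("identifier", "epa-2009-wq-0001.pdf")]:
        ET.SubElement(record, "{%s}%s" % (DC, name)).text = value

    print(ET.tostring(record, encoding="unicode"))

A catalog of such records is part of what turns a pile of archived files into something findable and searchable.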

President Obama’s directives should lead to interesting and exciting work for content technology professionals in the government. We look forward to hearing more.

Mark Logic Corporation Releases MarkLogic Toolkit for Word

Mark Logic Corporation announced the MarkLogic Toolkit for Word. Distributed under the open-source Apache 2.0 license, the MarkLogic Toolkit for Word delivers a free, simple way for developers to combine native XML-based functionality in MarkLogic Server with the most common content authoring environment, Microsoft Office Word 2007. Developers can build applications for finding and reusing enterprise content, enriching documents for search and analytics, and enhancing documents with custom metadata. The toolkit includes a pre-built plug-in framework for Word 2007, a sample application, and an extensive library for managing and manipulating Word 2007 documents.

  • Intelligent authoring: the toolkit provides the ability to build role- and task-aware applications within Word 2007 to improve the content authoring process, allowing users to easily locate and preview content at any level of granularity, insert it into an active document, and manage custom document metadata.
  • Office Open XML support: the toolkit allows developers to build content applications that leverage Office Open XML, the native XML-based format of Word 2007.
  • Web-based deployment: the toolkit includes an add-in application for deploying web-based content applications into Word 2007, enabling developers to use web development techniques, such as HTML, JavaScript, and .NET, to build applications that work in concert with the Word 2007 authoring environment.
  • XQuery libraries: the toolkit provides XQuery libraries that simplify working with Office Open XML for granular search, dynamic assembly, transformation, and delivery with MarkLogic Server.

By leveraging the underlying XML markup, content applications built with MarkLogic and Microsoft Office Word 2007 can “round-trip” documents between various formats. Because the toolkit is open source, developers can inspect, modify, and even redistribute the source code to meet specific needs. You can download the latest release of the MarkLogic Toolkit for Word at the Mark Logic Developer Workshop. http://www.marklogic.com
