The Gilbane Advisor

Curated for content, computing, and digital experience professionals


When is a Book Not a Book?

I recently wrote a short Gilbane Spotlight article for the EMC XML community site about the state of Iowa going paperless (article can be found here) with regard to its Administrative Code publication. It got me thinking, “When is a book no longer a book?”

Originally the admin code was produced as a 10,000-page loose-leaf publication service containing all the regulations of the state. For the last 10 years it has also appeared on the Web as PDFs of pages and, more recently, as independent data chunks in HTML. Now the state has discontinued the commercial printing of the loose-leaf version and relies only on the electronic versions to inform the public. It still produces PDF pages that resemble the printed volumes, intended for local printing of select sections by public users of the information. But the electronic HTML version is being enhanced to improve reusability of the content, present it in alternative forms, integrate it with related materials, and so on. Think mashups and improved search capabilities. The content is managed in an XML-based Single Source Publishing system that produces all output forms.

I have migrated many, many printed publications to XML SSP platforms. Most follow the same evolutionary path regarding how the information is delivered to consumers. First they are printed. Then a second electronic copy is produced simultaneously with the print, using separate production processes. Then the data is organized in a single database and reformatted to allow editing that can produce both print and electronic versions. Eventually the data gets enhanced, and possibly broken into chunks, to better enable reusing the content, but print is still a viable output format. Later, print is discontinued as the subscription list falls and the print product is no longer feasible, or the electronic version becomes so much better that people stop buying the print version.
So back to the original question: when is it no longer a book? Is it when you stop printing pages? Or when you stop producing the content in page-oriented PDFs? Or does it have to do with how you manage and store the information?

Other changes take place in how the information is edited, formatted, and stored that might influence the answer to the question. For instance, if the content is still managed as a series of flat files, like chapters, and assembled for print, it seems to me that it is still a book, especially if it still contains content that is very book oriented, like tables of contents and other front matter, indexes, and even page numbers. Eventually, the content may be reorganized as logical chunks stored in a database, extracted for one or more output formats and organized appropriately for each delivery version, as in SSP systems. Print artifacts like TOCs may be completely generated and not stored as persistent objects, or they can be created and managed as build lists or maps (like with DITA). As long as one version is still book-like, IMHO it is still a book.
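
To make the “generated, not stored” idea concrete, here is a minimal sketch of a build list driving both assembly and a generated table of contents, in the spirit of a DITA map. The chunk IDs, titles, and helper functions are hypothetical illustrations, not taken from any particular SSP product.

```python
# A build list (akin to a DITA map) names the chunks in order;
# the TOC is generated at publish time rather than stored as content.
# Chunk IDs, titles, and functions here are purely illustrative.

build_map = [
    {"id": "ch-001", "title": "Scope and Authority"},
    {"id": "ch-002", "title": "Definitions"},
    {"id": "ch-003", "title": "Filing Requirements"},
]

def generate_toc(build_map):
    """Derive the table of contents from the build map on every publish."""
    return [f"{n}. {entry['title']}" for n, entry in enumerate(build_map, 1)]

def assemble(build_map, chunk_store, output="html"):
    """Pull the same chunks for any output format; only rendering differs."""
    body = [chunk_store[entry["id"]] for entry in build_map]
    return {"toc": generate_toc(build_map), "body": body, "format": output}

# Example: the TOC exists only as a function of the map, never as stored text.
chunks = {"ch-001": "<p>...</p>", "ch-002": "<p>...</p>", "ch-003": "<p>...</p>"}
print(assemble(build_map, chunks)["toc"])
```

The point is that the TOC exists only as a function of the map; if the map and its print artifacts go away, what remains is, as argued above, simply content.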

I would posit that once the printed versions are discontinued, and all electronic versions no longer contain print-specific artifacts, then maybe this is no longer a book, but simply content.

Random House: Creating a 21st Century Publishing Framework

As part of our new report, Digital Platforms and Technologies for Publishers: Implementations Beyond “eBook,” we researched and wrote a number of case studies about how major publishing companies are moving to digital publishing. The following is a case study of Random House and its use of Digital Asset Management (DAM) technology from OpenText to create a much more dynamic and agile publishing process.

Background

Random House, Inc. is the world’s largest English-language general trade book publisher. It is a division of Bertelsmann AG, one of the foremost media companies in the world.

Random House, Inc. assumed its current form with its acquisition by Bertelsmann in 1998, which brought together the imprints of the former Random House, Inc. with those of the former Bantam Doubleday Dell. Random House, Inc.’s publishing groups include the Bantam Dell Publishing Group, the Crown Publishing Group, the Doubleday Broadway Publishing Group, the Knopf Publishing Group, the Random House Audio Publishing Group, the Random House Publishing Group, and Random House Ventures.

Together, these groups and their imprints publish fiction and nonfiction, both original and reprints, by some of the foremost and most popular writers of our time. They appear in a full range of formats—including hardcover, trade paperback, mass market paperback, audio, electronic, and digital, for the widest possible readership from adults to young adults and children.

The reach of Random House, Inc. is global, with subsidiaries and affiliated companies in Canada, the United Kingdom, Australia, New Zealand, and South Africa. Through Random House International, the books published by the imprints of Random House, Inc. are sold in virtually every country in the world.

Random House has long been committed to publishing the best literature by writers both in the United States and abroad. In addition to the company’s commercial success, books published by Random House, Inc. have won more major awards than those published by any other company—including the Nobel Prize, the Pulitzer Prize, the National Book Award, and the National Book Critics Circle Award.

Bennett Cerf and Donald Klopfer founded the company in 1925, after purchasing The Modern Library—reprints of classic works of literature—from publisher Horace Liveright. Two years later, in 1927, they decided to broaden the company’s publishing activities, and the Random House colophon made its debut.

Random House first made international news by successfully defending in court the U.S. publication of James Joyce’s masterpiece, Ulysses, setting a major legal precedent for freedom of speech. Beginning in the 1930s, the company moved into publishing for children, and over the years has become a leader in the field. Random House entered reference publishing in 1947 with the highly successful American College Dictionary, which was followed in 1966 by the equally successful unabridged Random House Dictionary of the English Language. It continues to publish numerous reference works, including the Random House Webster’s College Dictionary.

In 1960, Random House acquired the distinguished American publishing house of Alfred A. Knopf, Inc., and, a year later, Pantheon Books, which had been established in New York by European editors to publish works from abroad. Both were assured complete editorial independence—a policy which continues in all parts of the company to this day.

The Open Text Digital Media Group, formerly Artesia, is a leader in enterprise and hosted digital asset management (DAM) solutions, bringing a depth of experience around rich media workflows and capabilities. Open Text media management is the choice of leading companies such as Time, General Motors, Discovery Communications, Paramount, HBO and many more.

When clients work with the Open Text Digital Media Group, they tap into a wealth of experience and the immeasurable value of:

  • A decade of designing, delivering, and implementing award-winning rich media solutions
  • A global client base of marquee customer installations
  • An experienced professional services staff with hundreds of successful implementations
  • A proven DAM implementation methodology
  • Endorsements by leading technology and implementation partners
  • Domain expertise and knowledge across a variety of industries and sectors
  • The global presence and financial strength of Open Text, a leading provider of Enterprise Content Management solutions with a track record of financial growth and stability


Semantic Search has Its Best Chance for Successes in the Enterprise

I am expecting significant growth in the semantic search market over the next five years with most of it focused on enterprise search. The reasons are pretty straightforward:

  • Semantic search is very hard, and scaling it to the Web compounds the complexity.
  • Because the semantic Web remains elusive and results have been spotty with little traction, it will be some time before it can be easily monetized.
  • As with many highly complex problems, a good model is to break the challenge of semantic search into smaller, targeted business problems, each focused on a particular audience seeking content from a narrower domain.

I base this prediction on my observation of the ongoing struggle for organizations to get a strong framework in place to manage content effectively. By effectively I mean establishing solid metadata, governance, and publishing protocols that ensure that the best information knowledge workers produce is placed in range for indexing and retrieval. Sustained discipline and the people to exercise it just aren’t being employed in many enterprises to make this happen in a cohesive and comprehensive fashion. I have been discouraged by the number of well-intentioned projects I have seen flounder because organizations just can’t commit long-term or permanent human resources to the activity of content governance. Sometimes it is just on-again, off-again. What enterprises need are people with deep knowledge about the organization and how its content fits together in a logical framework for all types of knowledge workers. Instead, organizations tend to assign this job to external consultants or low-level staffers who are not well-grounded in the work of the particular enterprise. The results are predictably disappointing.

Enter semantic search technologies, which offer multiple algorithmic tools to index and retrieve content for complex, multi-faceted queries. Specialized semantic technologies are often well suited to shorter-term projects for which domain-specific vocabularies can be built quickly with good results. Maintaining targeted vocabulary ontologies for a focused topic can be done with fewer human resources, and a carefully bounded ontology can become an intelligent feed to a semantic search engine, helping it index with better precision and relevance.
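
As a rough illustration of that last point, the sketch below shows a small, bounded domain vocabulary being used to enrich an index with preferred and broader terms. The ontology entries, document text, and indexing functions are invented for illustration; they are not a description of any particular vendor’s engine.

```python
# Sketch: a small, bounded domain ontology used to enrich a search index.
# The terms, synonyms, and documents below are invented for illustration.

domain_ontology = {
    "myocardial infarction": {"synonyms": ["heart attack"],
                              "broader": ["cardiovascular event"]},
    "hypertension": {"synonyms": ["high blood pressure"],
                     "broader": ["cardiovascular risk factor"]},
}

def expand_terms(text, ontology):
    """Add the preferred term and broader concepts when a known phrase appears."""
    lowered = text.lower()
    expansions = set()
    for preferred, entry in ontology.items():
        surface_forms = [preferred] + entry["synonyms"]
        if any(form in lowered for form in surface_forms):
            expansions.add(preferred)
            expansions.update(entry["broader"])
    return expansions

def index_document(doc_id, text, ontology, index):
    """Index raw tokens plus ontology-derived concepts for better retrieval."""
    terms = set(text.lower().split()) | expand_terms(text, ontology)
    for term in terms:
        index.setdefault(term, set()).add(doc_id)

index = {}
index_document("doc-1", "Guidance on treating heart attack patients",
               domain_ontology, index)
# A query for "myocardial infarction" now also retrieves doc-1.
```

The benefit comes from the vocabulary being bounded and expert-vetted: only terms the subject matter expert has approved trigger expansion, which is exactly the kind of maintenance a small, focused team can sustain.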

This scenario is proposed with one caveat: enterprises must commit to having very smart people with enterprise expertise build the ontology. Having a consultant coach the subject matter expert in method, process, and maintenance guidelines is not a bad idea, but the consultant has to prepare the enterprise for sustainability after exiting the scene.

The wager here is that enterprises can ramp up semantic search with a series of short, targeted projects, each of which sets out to solve one business problem at a time and commits to efficient and accurate content retrieval as part of the solution. By learning what works well in each situation, enterprises will improve intranet retrieval systematically and thoughtfully. The ramp to a better semantic Web will be paved with these interlocking pieces.

Keep an eye on these companies to provide technologies for point solutions in business critical applications: Basis Technology, Cognition Technology, Connotate, Expert Systems, Lexalytics, Linguamatics, Metatomix, Semantra, Sinequa and Temis.

Digital Publishing Visionary Profile: Lulu’s Bob Young

As part of our new report, Digital Platforms and Technologies for Publishers: Implementations Beyond “eBook,” we interviewed a number of industry visionaries. The following is a summary of a discussion between Lulu’s Bob Young and Gilbane’s Steve Paxhia.

Bob Young: Lulu—Next Steps

Bob Young is the founder and CEO of Lulu.com, a premier international marketplace for new digital content on the Internet, with more than 1.1 million recently published titles and more than 15,000 new creators from 80 different countries joining each week. Founded in 2002, Lulu.com is Young’s most recent endeavor. The success of this company has earned Young notable recognition; he was named one of the “Top 50 Agenda-Setters in the Technology Industry in 2006” and was ranked as the fourth “Top Entrepreneur for 2006,” both by Silicon.com. In 1993, Young co-founded Red Hat, the open source software company that gives hardware and software vendors a standard platform on which to certify their technology. Red Hat has evolved into a Fortune 500 company and chief rival to Microsoft and Sun. His success at Red Hat won him industry accolades, including nomination as one of Business Week’s “Top Entrepreneurs” in 1999. Before founding Red Hat, Young spent 20 years at the helm of two computer leasing companies that he founded. His experiences as a high-tech entrepreneur combined with his innate marketing savvy led to Red Hat’s success. His book, “Under the Radar,” chronicles how Red Hat’s open-source strategy successfully won industry-wide acceptance in a market previously dominated by proprietary binary-only systems. Young has also imparted the lessons learned from his entrepreneurial experiences through his contributions to the books “You’ve GOT to Read This Book!” and “Chicken Soup for the Entrepreneur’s Soul.”

For many years, authors who were unsuccessful in getting their books published by a commercial publishing company could underwrite the costs of publishing their books and sell them through “vanity presses.” It was rare that books published in this manner ever recouped the author’s investment and earned a profit.

Bob Young admits that when he was in college he never fully appreciated the writings of the philosopher Jean-Paul Sartre. However, one of Sartre’s teachings—“We see the world the way that we expect to see it”—stuck with him. This passage helps explain how established practices and entities become so entrenched. Yet in 2002, Bob Young had an idea that would attack the established policies and practices of the book publishing industry. The industry had consolidated tremendously in the previous decade, and the distribution and retail networks had changed dramatically. These changes have had a profound impact on potential authors. The reduction in the number of publishing entities has made it more difficult for authors to get their works published. The publishing company may already have a similar title or be unwilling to take a chance on an unpublished author. Sometimes a book is written by a prominent author but the market niche is too small for traditional publishers to serve. These phenomena leave a significant number of high-quality books without a publisher.

The publishing industry and its distribution network were becoming more digital. Another of Young’s favorite philosophers points out that when new media take prominence, leaders in the previous medium often fail to succeed. Digital technologies are now used to create all types of content, and the move toward digital distribution networks, as demonstrated by the popularity of Amazon.com and its peers, is opening up markets for these books.

With the goal of giving every author access to a professional publishing platform and an extensive sales and distribution platform, Young’s idea became a company named “Lulu,” which has evolved and thrived during the past six years. The company now has three main product lines:

  • Print—books, brochures, manuals and materials for business solutions
  • Photo Creations—calendars, photo books, art and images
  • Social Networking—marketing, commerce, and exposure via weRead, the most popular social book discovery application, allowing readers to catalogue, rate, and review books.

The value proposition is very simple and appealing to authors. Lulu.com gives authors total editorial and copyright control, with additional protection provided by the Lulu.com back end. Authors make money on their projects through an 80/20 revenue split (80% to the author), and Lulu.com provides a unique online sales and distribution system, a viable business model for the current economy and beyond.

Powerful search engines and social community applications help match willing readers with niche titles. Content on Lulu.com is easily accessible—perfect for niche communities searching for specific topics. Lulu.com is home to a new economy, a digital marketplace of buyers and sellers, where sellers are selling “intellectual property” and buyers buy the intellectual property in either a physical or digital format. Lulu.com allows for personalization and customization for individual or business needs.

During the 2009 O’Reilly TOC Conference, Jason Fried of 37signals described the book that he and his colleagues had written based on lessons learned from creating and servicing their successful project management and collaboration product named Basecamp. They published their book with Lulu.com and report sales of almost $500,000 in the last several years. This enabled them to reach number three on the Lulu best seller list at one point. Ideally, this story would have a happy ending and they would publish their next book with Lulu.com. Alas, the success of their previous book motivated a traditional publisher to offer them a significant advance for their second book. The offer was too tempting to refuse. They now have to hope that the traditional economic model with 10-20% royalties will generate more than Lulu.com’s 80-20 split. In essence, they are wagering that the traditional publisher will be able to sell at least four times the number of books that Lulu.com would have sold.

When asked about this, Young was unfazed. He simply stated that it was his goal to publish their third book and to make them loyal authors in the future. It is his number one goal to help his authors become successful. He believes that discoverability is the key to helping his authors sell more books. Hence, he acquired weRead, the most popular community of readers. This technology helps readers find, read, and rate new books on topics of interest to them, no matter what the genre or how small the niche. The connection between weRead and e-tailers such as Amazon forms a powerful combination of capabilities that erode the advantages once monopolized by traditional publishers and bookstores. For many books, Lulu.com’s print-on-demand publishing and distribution model is faster, cheaper, and more efficient than the traditional publishing model, and is much less risky.

Lulu also offers authors publishing templates and a set of tools to create Websites, storefronts, widgets and blogs for their books. While these are self-service offerings, they further erode the service advantages provided by traditional publishers. The service has been so successful that new small publishers are using Lulu.com as a platform for their own publishing companies. Other publishing companies are using Lulu to keep books in print once the current print run has been exhausted.

Young believes that there are many books available at Lulu.com that are superior to those published by traditional publishers. The key is to help each book become discovered. He concludes, “We’re not in the business of choosing the best books to be published; we give authors the technologies and services to be successful and let the market decide which books are the best.” The type of content that Lulu supports continues to expand. Lulu just announced the acquisition of Poetry.com and has rebranded it as Lulu Poetry.

Gilbane Conclusions

This is a very disruptive approach to publishing. The shift to digital development and the increasing popularity of eBooks, combined with the growing market share enjoyed by e-tailers, make Lulu.com’s strategy very powerful. We expect to see this model gain greater acceptance as the economies offered by print on demand raise the break-even threshold versus long-run printing. Lulu is the established leader in this segment, which, according to Sartre, bodes well for the company’s future.

Ontopia 5.0.0 Released

The first open source version of Ontopia has been released, and you can download it from Google Code. This is the same product as the old commercial Ontopia Knowledge Suite, but with an open source license and with the old license key restrictions removed. The new version has been created not just by the Bouvet employees who have always worked on the product, but also by open source volunteers. In addition to bug fixes and minor changes, the main new features in this version are:

  • Support for TMAPI 2.0
  • The new tolog optimizer
  • The new TologSpy tolog query profiler
  • The net.ontopia.topicmaps.utils.QNameRegistry and QNameLookup classes, which provide lookup of topics using qnames
  • Use of the Simple Logging Facade for Java (SLF4J), which makes it easier to switch logging engines, if desired

http://www.ontopia.net/

Digital Platforms and Technologies for Publishers: Implementations Beyond “eBook”

We are very happy to announce that we’ve published our new report, Digital Platforms and Technologies for Publishers: Implementations Beyond “eBook.” The 142-page report is available at no charge for download here.

From the Introduction:

Much has changed since we decided to write a comprehensive study on the digital book publishing industry. The landscape has changed rapidly during the past months and we have tried to reflect as many of these changes as possible in the final version of our report. For example:

  • Sales of eBooks finally reached their inflection point in late 2008.
  • Customer acceptance of digital reading platforms, including dedicated reading devices like the Kindle and the Sony Reader and mobile devices like the iPhone and the BlackBerry, has helped accelerate the market for digital products.
  • The Google settlement, once finally approved by the courts, will substantially increase the supply of titles available in digital formats.
  • New publishing technologies and planning processes are enabling publishers and authors to create digital products that have their own set of features that take full advantage of the digital media and platforms. Embedded context-sensitive search and the incorporation of rich media are two important examples.
  • Readers are self-organizing into reading communities and sharing their critiques and suggestions about which books their fellow readers should consider. This is creating a major new channel for authors and publishers to exploit.
  • Print-on-demand and short-run printing continue to make significant advances in quality and their costs per unit are dropping. These developments are changing the economics of publishing and are enabling publishers to publish books that would have been too risky in the previous economic model.
  • Lower publishing and channel costs are making it possible for publishers to offer their digital titles at lower prices. This represents greater value for readers and fair compensation for all stakeholders in the publishing enterprise.

We are privileged to report such a fine collection of best practices. And we are thankful that so many smart people were willing to share their perspectives and vision with us and our readers. We thank our sponsors for their ardent and patient support and hope that the final product will prove worth the many hours that went into its preparation.

We encourage readers of this report to contact us with their feedback and questions. We will be pleased to respond and try to help you find solutions to your own digital publishing challenges!

Go With the (Enterprise 2.0 Adoption) Flow

People may be generally characterized as one of the following: optimists, realists, or pessimists. We all know the standard scenario used to illustrate these stereotypes.

Optimists look at the glass and say that it is partially full. Pessimists remark that the glass is mostly empty. Realists note that there is liquid in the glass and make no value judgment about the level.

The global Enterprise 2.0 community features the same types of individuals. I hear them speak and read their prose daily, noticing the differences in the way that they characterize the current state of the E2.0 movement. E2.0 evangelists (optimists) trumpet that the movement is revolutionary. Doubters proclaim that E2.0 will ultimately fail for many of the same reasons that earlier attempts to improve organizational collaboration did. Realists observe events within the E2.0 movement, but don’t predict its success or demise.

All opinions should be heard and considered, to be sure. In some ways, the position of the realist is ideal, but it lacks the spark needed to create forward, positive momentum for E2.0 adoption or to kill it. A different perspective is what is missing in the current debate regarding the health of the E2.0 movement.

Consider again the picture of the glass of liquid and the stereotypical reactions people have to it. Note that none of those reactions considers flow. Is the level of liquid in the glass rising or falling?

Now apply the flow question to the E2.0 movement. Is it gaining believers or is it losing followers? Isn’t that net adoption metric the one that really matters, as opposed to individual opinions, based on static views of the market, about the success or failure of the E2.0 movement to-date?

The E2.0 community needs to gather more quantitative data regarding E2.0 adoption in order to properly assess the health of the movement. Until that happens, the current, meaningless debate over the state of E2.0 will continue. The effect of that wrangling will be neither positive nor negative — net adoption will show little gain — as more conservative adopters continue to sit on the sideline, waiting for the debate to end.

Anecdotal evidence suggests that E2.0 adoption is increasing, albeit slowly. The surest way to accelerate E2.0 adoption is to go with the flow — to measure and publicize increases in the number of organizations using social software to address tangible business problems. Published E2.0 case studies are great, but until more of those are available, simply citing the increase in the number of organizations deploying E2.0 software should suffice to move laggards off the sideline and on to the playing field.

It Takes Work to Get Good-to-Great Enterprise Search

It takes patience, knowledge, and analysis to tell when search is really working. For the past few years I have seen a trend away from doing any “dog work” to get search solutions tweaked and tuned to ensure compliance with genuine business needs. People get cut, budgets get sliced, and projects get dumped because (fill in the excuse), and the message gets promoted that “enterprise search doesn’t work.” Here’s the secret: when enterprise search doesn’t work, chances are it’s because people aren’t working on what needs to be done. Everyone is looking for a quick fix, a short cut, a “no thinking required” solution.

This plays out in countless variations but the bottom line is that impatience with human processing time and the assumption that a search engine “ought to be able to” solve this problem without human intervention cripple possibilities for success faster than anything else.

It is time for search implementation teams to get realistic about the tasks that must be executed and the milestones to be reached. Teams must know how they are going to measure success and reliability, then stick with it, demanding that everyone agree on the requirements before throwing in the towel at the first executive anecdote that the “dang thing doesn’t work.”

There are a lot of steps to getting even an out-of-the-box solution working well. But none is more important than paying attention to these:

  • Know your content
  • Know your search audience
  • Know what needs to be found and how it will be looked for
  • Know what is not being found that should be

The operative verb here is to know, and really knowing anything takes work: brain work; iterative, analytical, and thoughtful work. When I see these reactions from IT upon setting off a search query that returns any results: “we’re done” OR “no error messages, good” OR “all these returns satisfy the query,” my reaction is:

  • How do you know the search engine was really looking in all the places it should?
  • What would your search audience be likely to look for and how would they look?
  • Who is checking to make sure these questions are being answered correctly?
  • How do you know if the results are complete and comprehensive?

It is the last question that takes digging and perseverance. It is pretty simple to look at search results and see content that should not have been retrieved and figure out why it was. Then you can tune to make sure it does not happen again.

To make sure you didn’t miss something takes systematic “dog work,” and you have to know the content. This means starting with a small body of content that you can know thoroughly. Begin with content representative of what your most valued search audience would want to find. Presumably, you have identified these people through establishing a clear business case for enterprise search. (This is not something for the IT department to do but for the business team that is vested in having search work for their goals.) Get these “alpha worker” searchers to show you how they would go about trying to find the stuff they need to get their work done every day, and to share some of what they consider the most valuable documents they have worked with over the past few years. (Yes, years – you need to work with veterans of the organization whose value is well established, as well as with legacy content that is still valuable.)

Confirm that these seminal documents are in the path of the search engine for the index build; see what is retrieved when they are searched for by the seekers. Keep verifying by looking at both content and results to be sure that nothing is coming back that shouldn’t and that nothing is being missed. Then double the content with documents on similar topics that were not given to you by the searchers, even material that they likely would never have seen that might be formatted very differently, written by different authors, and more variable in type and size but still relevant. Re-run the exact searches that were done originally and see what is retrieved. Repeat in scaling increments and validate at every point. When you reach points where content is missing from results that should have been found using the searcher’s method, analyze, adjust, and repeat.
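
One way to keep this verification honest as the corpus doubles is to script the known-item checks and re-run them at every increment. The sketch below assumes a search_fn callable and a hand-built map of benchmark queries to the seminal documents that must come back; both are placeholders for your own engine call and the test set built with your alpha workers.

```python
# Sketch of a known-item recall check to repeat at each scaling increment.
# `search_fn` and the expected-results map are placeholders for your own
# engine call and the seminal documents identified with your alpha workers.

expected_hits = {
    "widget assembly tolerances": {"spec-2006-014.pdf", "eng-memo-0042.doc"},
    "vendor qualification checklist": {"qa-checklist-v3.doc"},
}

def verify_recall(search_fn, expected_hits, top_n=50):
    """Re-run each benchmark query and report any seminal document missed."""
    misses = {}
    for query, must_find in expected_hits.items():
        returned = {hit["id"] for hit in search_fn(query)[:top_n]}
        missing = must_find - returned
        if missing:
            misses[query] = missing
    return misses  # an empty dict means nothing known was lost this round

# Example run against a stub engine; replace with the real search call.
def stub_search(query):
    return [{"id": "spec-2006-014.pdf"}, {"id": "eng-memo-0042.doc"}]

print(verify_recall(stub_search, expected_hits))
```

Run against the real engine after each content doubling, an empty result means nothing known was lost; anything else points directly at the query formulation, metadata, or indexing adjustment that needs investigating.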

A recent project showed me how willing testers are to accept mediocre results once it became apparent how closely content must be scrutinized and peeled back to determine its relevance. They had no time for that and did not care how bad the results were because they had a pre-defined deadline. Adjustments may call for refinements in the query formulation that might require an API to make it more explicit, or the addition of better category metadata with rich cross-references to cover vocabulary variations. Too often this type of implementation discovery signals a reason to shut down the project, because all options require human resources and more time. Before you begin, know that this level of scrutiny will be necessary to deliver good-to-great results; set that expectation for your team and management, so that when adjustments are needed, the additional work to get it right will be acceptable to them. Just don’t blame it on the search engine – get to work, analyze, and fix the problem. Only then can you let search loose on your top target audience.

