We’re back after our annual December break and looking forward to a year of consequential, if not yet tectonic, shifts in enterprise and consumer content strategies and applications. We’ll be closely watching how, and how fast, three major technology areas will drive these changes: 1) The tension between the Open Web and proprietary platforms; 2) Machine learning, in particular for unstructured data and mixed data; 3) New content types and uses — AR is here, but when will it grow beyond cute apps to serious industry breakthroughs? Each of these has the potential to dramatically re-arrange industry landscapes. Stay tuned!
A plan to rescue the Web from the Internet
André Staltz published The Web began dying in 2014, here's how in the late Fall. It was a depressing post, but there wasn't much to argue with except an apparent ready acceptance of defeat. It is a good read; for those familiar with the history, skim to the second half. Fortunately, he followed up with a post on a plan that already has some pieces in place. The plan "in short is: Build the mobile mesh Web that works with or without Internet access, to reach 4 billion people currently offline". This is not a quick fix, and its future is not certain, but it is just the kind of bold thinking we need. Crucially, it recognizes the need for both open and closed systems. Highly recommended. Read More
A letter about Google AMP
More than other major platforms, Google has a stake in the Open Web and is largely supportive of it; Progressive Web Apps (PWAs) are one example. And while they have been somewhat responsive to publisher concerns, there is reason to worry that AMP could end up as a wall for Google's garden. There is a lot to like about AMP, but ensuring it evolves in ways compatible with the Open Web is critical for both Google and the health of the Open Web. This succinct letter, signed by a growing list of (mostly) developers, has a couple of reasonable recommendations for Google to consider. Read More
What does the publishing industry bring to the Web?
The short answer is that the Open Web should not be limited to pointers to either walled gardens or proprietary applications. Complex collections of content and metadata that require or benefit from unique presentation or organization, in other words, documents, are too valuable not to be included as full Web citizens. Ivan Herman goes into more detail on the W3C blog…
Web Publications should put the paradigm of a document on the Web back in the spotlight. Not in opposition to Web Applications but to complement them. (Web) Publications should become first class entities on the Web. This should lead to a right balance between a Web of Applications and a Web of Documents as two, complementary faces of the World Wide Web. Read More
Does long-form content work in today’s small attention span world?
"Social media moves fast and rewards scrolling quickly past one message and onto the next. And mobile devices aren't usually associated with spending long periods of time sitting and reading. It's natural for people to assume these trends point toward a preference for shorter, 'snackable' content that can be consumed quickly… And yet, actual research looking into the issue of how content of different lengths performs doesn't back up that assumption." Read More
HTML5 Proposed Recommendation published on schedule.
The HTML Working Group has published a Proposed Recommendation of “HTML5.” This specification defines the 5th major revision of the core language of the World Wide Web: the Hypertext Markup Language (HTML). In this version, new features are introduced to help Web application authors, new elements are introduced based on research into prevailing authoring practices, and special attention has been given to defining clear conformance criteria for user agents in an effort to improve interoperability. Comments are welcome through 14 October. Learn more about the HTML Activity.
W3C announced Web Platform Docs, which promises to be a valuable new resource for web developers of all levels. Imagine a single site that you can depend on for up-to-date, accurate, and browser- and device-neutral answers and advice for both simple and complex questions. It is brand new and "alpha" but already useful. Below is info from their announcement and a short video. For those of us who prefer textual info, see this blog post from Doug Schepers: http://blog.webplatform.org/2012/10/one-small-step/
W3C, in collaboration with Adobe, Facebook, Google, HP, Microsoft, Mozilla, Nokia, Opera, and others, announced today the alpha release of Web Platform Docs (docs.webplatform.org). This is a new community-driven site that aims to become a comprehensive and authoritative source for web developer documentation. With Web Platform Docs, web professionals will save time and resources by consulting with confidence a single site for current, cross-browser and cross-device coding best practices.
Adobe Systems Incorporated announced that it has acquired EchoSign, a leading Web-based provider of electronic signatures and signature automation. EchoSign's electronic signature solution will be a key component of Adobe's document exchange services platform. The EchoSign solution will be integrated with other Adobe document services including SendNow for managed file transfer, FormsCentral for form creation and CreatePDF for online PDF creation. The EchoSign electronic signature solution automates the entire signature process from the request for signature to the distribution and execution of the form or agreement. The EchoSign service includes a rich set of APIs for incorporation with company-specific solutions to improve the process of sending, tracking and signing digital documents. EchoSign is based in Palo Alto, Calif. with a sales presence in the U.K. and Germany. The founders of EchoSign and all full-time employees will join Adobe. http://www.adobe.com http://www.echosign.com/
Traditionally, publishing is a pushy process. When I have something to say, I write it down. Perhaps I revise it, check with colleagues, and verify my facts with appropriate authorities. Then I publish it, and move on to the next thing – without directly interacting with my audience and stakeholders. Whether I distribute the content electronically or in a hard copy format, I leave it to my readers to determine the value of whatever I publish.
However, as we describe in our recently completed report Smart Content in the Enterprise, XML applications can transform this conventional publishing paradigm. By smart content, we mean content that is granular at the appropriate level, semantically rich, useful across applications, and meaningful for collaborative interaction.
NetApp
As a provider of storage and data management solutions, NetApp has invested a lot of time and effort embracing DITA and restructuring its technical documentation. By systematically tagging and managing content components, and by focusing on the underlying content development processes, writers and editors can keep up with the pace of product releases.
Warrior Gateway
As a content aggregator, Warrior Gateway collects, organizes, enriches, and redistributes content about a wide range of health, welfare, and veteran-related services to soldiers, veterans, and their families. Rather than simply compiling an online catalog of service providers' listings, Warrior Gateway restructures the content that government, military, and local organizations produce, and enriches it by adding veteran-related categories and other information. Furthermore, Warrior Gateway adds a social dimension by encouraging contributions from veterans and family members.
Once stored within the XML application powering Warrior Gateway, the content is easily reorganized and reclassified to provide the veterans' perspective on areas of interest and importance. Volunteers working with Warrior Gateway can add new categories when necessary. Service providers can claim their profiles and improve their own data details. Even public users can contribute content to the gateway, a crowdsourcing strategy to efficiently collect feedback from users. With contributions from multiple stakeholders, the published listings can be enriched over time without requiring a large internal staff to add the extra information.
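The enrichment step is easy to picture with a small sketch. The element names below are hypothetical, not Warrior Gateway's actual schema; the point is that a provider listing stored as XML can gain contributed categories without touching the original source data.

```python
import xml.etree.ElementTree as ET

# A provider listing as it might arrive from a government source
# (hypothetical schema, for illustration only).
listing = ET.fromstring(
    "<provider>"
    "<name>County Health Clinic</name>"
    "<service>primary care</service>"
    "</provider>"
)

def enrich(listing, categories):
    """Append veteran-related category tags contributed by volunteers."""
    cats = ET.SubElement(listing, "categories")
    for c in categories:
        ET.SubElement(cats, "category").text = c
    return listing

enrich(listing, ["PTSD counseling referrals", "family support"])
print(ET.tostring(listing, encoding="unicode"))
```

Because the enrichment lives in its own element, the original listing can still be redistributed or re-synchronized with the upstream source independently of the crowdsourced additions.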
Capturing New Business Value
There's a lot more detail about how the XML applications work in our case studies; I recommend that you check them out.
What I find intriguing is the range of promising and potentially profitable business models engendered by smart content. Enterprise publishers have new options and can go beyond simply pushing content through a publishing process. Now they can build on their investments, and capture the pull of content value.
The use of microblogging and activity streams is maturing in the enterprise. This was demonstrated by recent announcements of enhancements to those components in two well-regarded enterprise social software suites.
On February 18th, NewsGator announced a point release to its flagship Enterprise 2.0 offering, Social Sites 3.1. According to NewsGator, this release introduces the ability for individuals using Social Sites to direct specific microblogging posts and status updates to individuals, groups, and communities. Previously, all such messages were distributed to all followers of the individual poster and to the general activity stream of the organization. Social Sites 3.1 also introduced the ability for individuals to filter their activity streams using "standard and custom filters".
Yesterday (March 3rd), Socialtext announced a major new version of its enterprise social software suite, Socialtext 4.0. Both the microblogging component of Socialtext's suite and its stand-alone microblogging appliance now allow individuals to broadcast short messages to one or more groups (as well as to the entire organization and self-selected followers). Socialtext 4.0 also lets individuals filter their incoming activity stream to see posts from groups to which they belong (in addition to filtering the flow with the people and event filters that were present in earlier versions of the offering).
The incorporation of these filters for outbound and incoming micro-messages is an important addition to the offerings of NewsGator and Socialtext, but it is long overdue. Socialcast has offered similar functionality for nearly two years, and Yammer has included these capabilities for some time as well (and extended them to community members outside of an organization's firewall, as announced on February 25th). Of course, both Socialcast and Yammer will need to rapidly add additional filters and features to stay one step ahead of NewsGator and Socialtext, but that represents normal market dynamics and is not the real issue. The important question is this:
What other filters do individuals within organizations need to better direct microblogging posts and status updates to others, and to mine their activity streams?
I can easily imagine use cases for location, time/date, and job title/role filters. What other filters would be useful to you in either targeting the dissemination of a micro-message or winnowing a rushing activity stream?
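To make the idea concrete, here is a minimal sketch of an activity stream supporting both outbound targeting (groups) and inbound winnowing (groups, role, location). This is an illustrative in-memory model, not any vendor's actual API; all names and messages are invented.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    author: str
    text: str
    groups: set = field(default_factory=set)  # outbound targeting
    role: str = ""                            # poster's job role
    location: str = ""                        # poster's location

stream = [
    Message("ana", "Server maintenance at 6pm", {"it"}, role="sysadmin", location="NYC"),
    Message("bo", "Q3 numbers posted", {"sales"}, role="manager", location="SF"),
    Message("cy", "Lunch meetup?", {"it", "sales"}, role="engineer", location="NYC"),
]

def inbound(stream, my_groups=None, role=None, location=None):
    """Winnow an incoming stream with optional group/role/location filters."""
    out = []
    for m in stream:
        if my_groups is not None and not (m.groups & my_groups):
            continue  # not targeted at any group I belong to
        if role is not None and m.role != role:
            continue
        if location is not None and m.location != location:
            continue
        out.append(m)
    return out

# An IT employee in NYC sees only messages aimed at their groups and site.
for m in inbound(stream, my_groups={"it"}, location="NYC"):
    print(m.author, "-", m.text)
```

The same filter parameters could serve as an organization's defaults, which connects directly to the question of what the pre-set lens on an activity stream should be.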
One other important question arises as the number of potential micro-messaging filters increases: what should the default setting be for views of outgoing and incoming messages? Should short bits of information be sent to everyone, and activity streams show all organizational activity by default, so as to increase ambient awareness? Or should a job title/role filter be the default, in order to maximize the focus and productivity of individuals?
There is no single answer other than “it depends”, because each organization is different. What matters is that the decision is taken (and not overlooked) with specific corporate objectives in mind and that individuals are given the means to easily and intuitively change the default target of their social communications and the pre-set lens through which they view those of others.
Microsoft has a lot to lose if they are unable to coax customers to continue to use and invest in Office. Google is trying to woo people away by providing a complete online experience with Google Docs, Email, and Wave. Microsoft is taking a different tack. They are easing Office users into a Web 2.0-like experience by creating a hybrid environment, in which people can continue to use the rich Office tools they know and love, and mix this with a browser experience. I use the term Web 2.0 here to mean that users can contribute important content to the site.
SharePoint leverages Office to allow users to create, modify, and display "deep" content, while leveraging the browser to navigate, view, discover, and modify "shallow" content. SharePoint is not limited to this narrow hybrid feature set, but in this post I examine and illustrate how Microsoft is focusing its attention on Office users. The feature set that I concentrate on in this post is referred to as the "Collaboration" portion of SharePoint. This is depicted in Microsoft's canonical six-segmented wheel shown in Figure 1. This is the most mature part of SharePoint and works quite well, as long as the client machine requirements outlined below are met.
Figure 1: The canonical SharePoint Marketing Tool – Today’s post focuses on the Collaboration Segment
Preliminaries: Client Machine Requirements
SharePoint out-of-the-box works well if all client machines adhere to the following constraints:
The client machines must be running a Windows OS (XP, Vista, or Windows 7). NOTE: The experience for users on Mac OS, Linux, iPhones, and Google phones is poor.
The only truly supported browser is Internet Explorer (7 and 8). NOTE: Firefox, Safari, and Opera can be used, but the experience is poor.
The client machines need to have Office installed, and as implied by bullet 1 above, the Mac version of Office doesn't work well with SharePoint 2007.
All the clients should have the same version of Office. Office 2007 is optimal, but Office 2003 can be used. Mixed Office versions can cause issues.
A number of tweaks need to be made to the security settings of the browser so that the client machine works seamlessly with SharePoint.
I refer to this as a “Microsoft Friendly Client Environment.”
Some consequences of these constraints are:
SharePoint is not a good choice for a publicly facing Web 2.0 environment (More on this below)
SharePoint can be okay for a publicly facing brochureware site, but it wouldn’t be my first choice.
SharePoint works well as an extranet environment, if all the users are in a Microsoft Friendly Client Environment, and significant attention has been paid to securing the site.
A key take-away of these constraints is that a polished end user experience relies on:
A carefully managed computing environment for end users (Microsoft Friendly Client Environment) and / or
A great deal of customization to SharePoint.
This is not to say that one cannot deploy a publicly facing site with SharePoint. In fact, Microsoft has made a point of showcasing numerous publicly facing SharePoint sites.
What you should know about these SharePoint sites is:
A nice looking publicly facing SharePoint site that works well on multiple Operating Systems and browsers has been carefully tuned with custom CSS files and master pages. This type of work tends to be expensive, because it is difficult to find people who have a good eye for aesthetics, understand CSS, and understand SharePoint master pages and publishing.
A publicly facing SharePoint site that provides rich Web 2.0 functionality requires a good deal of custom .NET code and probably some third party vendor software. This can add up to considerably more costs than originally budgeted.
An important consideration before investing in custom UI (CSS & master pages), third party tools, and custom .NET code is that they will most likely be painful to migrate when the underlying SharePoint platform is upgraded to the next version, SharePoint 2010.
By the sound of these introductory paragraphs, you might get the wrong idea that I am opposed to using SharePoint. I actually think SharePoint can be a very useful tool, assuming that one applies it to the appropriate business problems. In this post I will describe how Microsoft is transitioning people from a pure Office environment to an integrated Office and browser (SharePoint) environment.
So, What is SharePoint Good at?
When SharePoint is coupled closely with a Microsoft Friendly Client Environment, non-technical users can increase their productivity significantly by leveraging the Web 2.0 capabilities SharePoint layers on top of their Office documents.
Two big problems exist with the deep content stored inside Office documents (Word, Excel, PowerPoint, and Access):
Hidden Content: Office documents can pack a great deal of complex content into them. Accessing that content means either opening each file individually or executing a well-formulated search. This is an issue: the former is labor-intensive, and the latter is not guaranteed to return consistent results.
Many Versions of the Truth: There are many versions of the same files floating around. It is difficult, if not impossible, to know which file represents the "truth."
SharePoint 2007 can make a significant impact on these issues.
Go into any organization with more than 5 people, and chances are there will be a shared drive with thousands of files in Microsoft and non-Microsoft formats (Word, Excel, Acrobat, PowerPoint, Illustrator, JPEG, InfoPath, etc.) that hold important content. Yet the content is difficult to discover, let alone extract in aggregate. For example, a folder that contains sales documents may hold a number of key pieces of information that would be nice to have in a report:
Date of sale
Total Sale in $’s
Categorizing documents by these attributes is often referred to as defining a taxonomy. SharePoint provides a spectrum of ways to associate taxonomies with documents. I say spectrum because non-Microsoft file formats can only have this information loosely coupled, while some Office 2007 file formats can have it bound tightly to the contents of the document. This is a deep subject, and it is not my goal to provide a tutorial, but I will shine some light on the topic.
SharePoint uses the term "Document Library" as a metaphor for a folder on a shared drive. Microsoft's intent was that a business user should be able to create a document library and add a taxonomy for important contents. In the vernacular of SharePoint, the taxonomy is stored in "columns," which allow users to surface important information from files that reside inside the library: for example, "Customer," "Date of Sale," or "Total Sale in $'s" in the previous example. The document library can then be sorted or filtered on the values in these columns. One can even run aggregate computations over the column values; for example, total sales can be summed for a specific date or customer.
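The mechanics can be sketched in a few lines. The column names and rows below are invented for illustration; the point is that once taxonomy values live in columns rather than buried inside documents, a library can be filtered and aggregated without opening a single file:

```python
# Each row models a file in a document library plus its taxonomy columns
# (hypothetical column names and values, for illustration only).
library = [
    {"file": "acme_contract.docx", "customer": "Acme",    "date": "2009-06-01", "total": 12000.0},
    {"file": "acme_renewal.docx",  "customer": "Acme",    "date": "2009-09-15", "total": 8000.0},
    {"file": "initech_po.docx",    "customer": "Initech", "date": "2009-07-20", "total": 5500.0},
]

def filter_by(library, **criteria):
    """Filter library rows on column values, like a filtered library view."""
    return [row for row in library
            if all(row.get(k) == v for k, v in criteria.items())]

def total_sales(rows):
    """An aggregate computation over the 'Total Sale' column."""
    return sum(row["total"] for row in rows)

acme = filter_by(library, customer="Acme")
print(len(acme), total_sales(acme))  # 2 20000.0
```

Sorting, filtering, and totaling here are exactly the operations a document library view exposes to business users through the browser.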
The reason I carefully worded this as a "spectrum" is because the quality of the solution that Microsoft offers depends on the document file format and its associated application. The solution is most elegant for Word 2007 and InfoPath 2007, less so for the Excel and PowerPoint 2007 formats, and less still for the remaining formats from non-Microsoft products. The degree to which the taxonomy can be bound to actual file contents is not SharePoint-dependent; rather, it depends on how well the application has implemented the SharePoint conventions around "file properties."
I believe that Microsoft had intended for the solution to be deployed equally well for all the Office applications, but time ran out for the Office team. I expect to see a much better implementation when Office 2010 arrives. As mentioned above, the implementation is best for Word 2007. It is possible to tag any content inside a Word document or template as one that should “bleed” through to the SharePoint taxonomy. Thus key pieces of content in Word 2007 documents can actually be viewed in aggregate by users without having to open individual Word documents.
It seems clear that Microsoft had the same intention for the other Office products, because the product documentation states that you can do the same for most of them. However, my own testing shows that only Word 2007 works. A surprising workaround for Excel: if one sticks to the Excel 2003 file format, the same functionality works.
The next level of the spectrum operates as designed for all Office 2007 applications. In this case, all columns that are added as part of the SharePoint taxonomy surface in a panel of the Office application, so users can be required to fill in information about the document before saving it. Figure 2 illustrates this. Microsoft refers to this as the "Document Information Panel" (DIP). Figure 3 shows how a mixture of document formats (Word, Excel, and PowerPoint) have all the columns populated with information. The disadvantage of this type of content binding is that a user must explicitly fill out the information in the DIP, instead of the information being bound to, and automatically populated from, the content inside the document.
Figure 2: Illustrates the “Document Information Panel” that is visible in PowerPoint. This panel shows up because there are three columns that have been setup in the Document library: Title, testText, and testNum. testText and testNum have been populated by the user and can be seen in the Document Library, see figure 3.
Figure 3: Illustrates the SharePoint Document Library showing the data from the Document Information Panel (DIP) "bleeding through." For example, the PowerPoint document has testText = fifty eight, testNum = 58.
Finally, the last level on the taxonomy feature spectrum covers non-Microsoft documents. SharePoint allows one to associate column values with any kind of document. For example, a JPEG file can carry SharePoint metadata that indicates who owns its copyright. However, this metadata is not embedded in the document itself, so if the file is moved to another document library or downloaded from SharePoint, the metadata is lost.
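The fragility of loosely coupled metadata is easy to demonstrate. In the sketch below (a toy Python model, not SharePoint's internal storage), the metadata lives in a mapping outside the file, keyed by the file's location, so moving the file silently severs the association:

```python
# Sidecar metadata: column values kept *outside* the file itself, keyed by
# the file's location (an illustrative model, not SharePoint's internals).
sidecar = {"library_a/photo.jpg": {"copyright": "Jane Doe"}}

def move_file(old_path, new_path):
    """Simulate moving the file's bytes; nothing migrates the sidecar entry."""
    return new_path  # the bytes now live at new_path

new_path = move_file("library_a/photo.jpg", "library_b/photo.jpg")
print(sidecar.get(new_path))               # None: the copyright info is lost
print("library_a/photo.jpg" in sidecar)    # True: a stale entry remains
```

Embedded metadata, by contrast, travels inside the document's own bytes, which is why the Word 2007 binding described above survives downloads and moves.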
A Single Version of the Truth
This is the feature set that SharePoint implements the best. A key issue in organizations is that files are often emailed around, and no one knows where the truly current version is or what a file's history has been. SharePoint document libraries allow organizations to improve this process significantly by making it easy for a user to email a link to a document rather than the document itself. (See figure 4.)
Figure 4: Illustrates how easy it is to send someone a link to the document, instead of the document itself.
In addition to supporting good practices that reduce content proliferation, SharePoint also promotes good versioning practices. As figure 5 illustrates, any document library can easily be set up to handle file versions and file locking. Thus it is easy to ensure that only one person modifies a file at a time and that there is only one true version of the file.
Figure 5: Illustrates how one can look at the version history of a document in a SharePoint Document Library.
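The combination of check-out locking and version history can be modeled in a few lines. This is a toy model, not SharePoint's implementation; it shows why the two features together yield a single true copy with a full audit trail:

```python
class VersionedDocument:
    """Toy check-out/check-in model: one editor at a time, full history kept."""

    def __init__(self, content):
        self.versions = [content]     # version 1 is the initial content
        self.checked_out_by = None    # no lock held initially

    def check_out(self, user):
        if self.checked_out_by is not None:
            raise RuntimeError(f"locked by {self.checked_out_by}")
        self.checked_out_by = user

    def check_in(self, user, new_content):
        if self.checked_out_by != user:
            raise RuntimeError("check out the file before editing")
        self.versions.append(new_content)  # new version recorded in history
        self.checked_out_by = None         # lock released

    @property
    def current(self):
        return self.versions[-1]

doc = VersionedDocument("draft 1")
doc.check_out("alice")
try:
    doc.check_out("bob")          # bob cannot edit while alice holds the lock
except RuntimeError as e:
    print(e)                      # locked by alice
doc.check_in("alice", "draft 2")
print(doc.current, len(doc.versions))   # draft 2 2
```

Everyone links to the one `VersionedDocument` rather than emailing copies, which is exactly the practice the document-library features encourage.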
In this post I focused on the feature set of SharePoint that Microsoft uses to motivate Office users to migrate to SharePoint. These features are often termed the "Collaboration" features in the six-segmented MOSS wheel (see figure 1), and they are the most mature part of SharePoint. Another key take-away is the "Microsoft Friendly Client Environment": I have worked with numerous clients who were taken by surprise when they realized the tight restrictions on client machines.
Finally, on a positive note, the features that I have discussed in this post are all available in the free version of SharePoint (WSS); there is no need to buy MOSS. In future posts, I will elaborate on MOSS-only features.
The terms "deep" and "shallow" are my creation, not a standard. By "deep" content I am referring to complex content such as Word documents (contracts, manuscripts) or Excel documents (complex mathematical models, actuarial models, etc.)
 Microsoft has addressed this by stating that SharePoint 2010 would support some of these environments. I am somewhat skeptical.
Public Facing Internet Sites on MOSS, http://blogs.microsoft.nl/blogs/bartwe/archive/2007/12/12/public-facing-internet-sites-on-moss.aspx
 Microsoft has stated frequently that as long as one adheres to best practices, the migration to SharePoint 2010 will not be bad. However, Microsoft does not have a good track record on this account for the SharePoint 2003 to 2007 upgrade, as well as many other products.
So Microsoft was asleep at the wheel and didn't use good procedures to back up and restore Sidekick data. It was just a matter of time until we saw a breakdown in cloud computing. Is this the end of cloud computing? Not at all! I think it is just the beginning. Are we going to see other failures? Absolutely! These failures are good, because they help sensitize potential consumers of cloud computing to what can go wrong and what contractual obligations service providers must adhere to.
There is so much impetus for having centralized computing, that I think all the risk and downside will be outweighed by the positives. On the positive side, security, operational excellence, and lower costs will eventually become mainstream in centralized services. Consumers and corporations will become tired of the inconvenience and high cost of maintaining their own computing facilities in the last mile.
Willie Sutton, a notorious bank robber, is often misquoted as saying that he robbed banks "because that's where the money is." Yet all of us still keep our money with banks of one sort or another. Even though online fraud statistics are sharply increasing, the trend to use online and mobile banking as well as credit/debit transactions is on a steep ascent. Many banking experts suggest that this trend is due to convenience.
Whether a corporation is maintaining its own application servers and desktops, or consumers are caring for and feeding their Macs and PCs, the cost of doing this, measured in time and money, is steadily growing. The expertise required is ever increasing. Furthermore, the likelihood of a security breach when individuals manage their own security is high.
The pundits of cloud computing say that the likelihood of breakdowns in highly concentrated environments such as cloud computing servers is high. The main factors they point to are:
Lack of Redundancy
Vulnerability to Network Outages
I believe that in spite of these seemingly large obstacles, we will see a huge increase in the number of cloud services and the number of people using these services in the next 5 years. When we keep data on our local hard drives, the security risks are huge. We are already pretty much dysfunctional when the network goes down, and I have had plenty of occasions where my system administrator had to reinstall a server or I had to reinstall my desktop applications. After all, we all trust the phone company to give us a dial tone.
The savings that can be attained are huge: A Cloud Computing provider can realize large savings by using specialized resources that are amortized across millions of users.
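A back-of-the-envelope calculation shows the shape of the argument. All the numbers below are invented for illustration, not measured costs; the point is simply that fixed infrastructure costs divided across millions of users shrink dramatically per user:

```python
# Back-of-the-envelope amortization; every figure here is hypothetical.
def per_user_cost(fixed_annual_cost, users):
    """Annual infrastructure cost spread evenly across a user base."""
    return fixed_annual_cost / users

# One household maintaining its own machine and backups.
diy = per_user_cost(2_000.0, 1)
# A provider's data center amortized across millions of subscribers.
cloud = per_user_cost(50_000_000.0, 5_000_000)

print(f"DIY: ${diy:.2f}/yr  Cloud: ${cloud:.2f}/yr")
```

Even if the provider's absolute spend on security and operations dwarfs any individual's, the per-user share can be orders of magnitude lower.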
There is little doubt in my mind that cloud computing will become ubiquitous. The jury is still out as to what companies will become the service providers. However, I don’t think Microsoft will be one of them, because their culture just doesn’t allow for solid commitments to the end user.
 The Beauty in Redundancy, http://gadgetwise.blogs.nytimes.com/2009/10/12/the-beauty-in-redundancy/?scp=2&sq=sidekick&st=cse
 Microsoft Project Pink – The reason for sidekick data loss, http://dkgadget.com/microsoft-project-pink-the-reason-for-sidekick-data-loss/