Tribal Knowledge - The History of Content Management (Part II) Demystifying Rich Media
by Sebastian Holst February 26, 2003
New technology renders existing products, services and training obsolete on a
daily basis. Some will tell you that this wipes the slate clean - well, think
again. I'm here to tell you that all that you know - you need to remember. This
is a monthly column dedicated to dispelling high-tech myths and sharing life's
lessons.
In the last column, we looked at how the evolution of persistent storage from basic
file systems to today's most sophisticated content management systems can be
mapped as a tug-of-war between content modeling capabilities and content management
functionality.
One of the milestones in this two-dimensional mapping of content management
progress was the extension of data types to include rich media and highly
structured content such as XML. The limitation of this 2D view is that rich
media and XML are much more than just data types; most use cases for these
classes of content require additional functionality that goes far beyond
traditional data type support. The special behaviors implied (but often left
unimplemented) by rich media support set the stage for significant confusion
and missed expectations.
I am here to say that there is no reason or excuse for this kind of murkiness
(other than for those who see opportunity in your confusion). Let's simplify.
As Glinda, the Good Witch of the North, says, the best place to start is at the
beginning. What is data type support? Wait! Before you scan to the next paragraph
(or the next web page), the answer is not as simple or as obvious as one might
think. Data type support has little or nothing to do with content; rather,
it has everything to do with operations on content. For example, perhaps the
biggest scam ever perpetrated on the content management buying public was the
debut and subsequent demonization of the BLOB (Binary Large OBject). BLOBs (catchy
name, don't you think?) can store any data type but offer little or no support
for it. Operations that one might expect against every other data type, such as
intelligent indexing, aggregation, and presentation, are nowhere to be found.
When you add two numbers together, you get a result; it does not matter if one
number is an integer and the other a floating-point decimal of arbitrary precision.
While your brain does not think twice about this mental calculation, the code
required to support this operation against two different data type formats is
actually surprisingly complex. Put simply, every data type has a set of operations
that should cover all of its expected behaviors in the real world. Sometimes
there are unique behaviors associated with particular classes of data type (mathematical
operations on numbers) and sometimes the same operation requires different logic
(sorting on multinational character sets).
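To make the point concrete, here is a minimal Python sketch (my illustration,
not taken from any product). The helper name add_numbers and the sample values
are assumptions. It shows that adding an integer to an arbitrary-precision
decimal needs type-aware logic, while the same values stored as opaque bytes,
BLOB-style, support nothing better than concatenation.

```python
from decimal import Decimal


def add_numbers(a, b):
    """Add an int and/or Decimal by promoting both operands to Decimal.

    Hypothetical helper for illustration only: real systems need far more
    logic (floats, overflow, rounding rules, and so on).
    """
    return Decimal(a) + Decimal(b)


# Typed values: the operation knows what the data means.
typed_sum = add_numbers(2, Decimal("0.125"))

# The same values stored as opaque bytes: "+" merely concatenates them.
blob_sum = b"2" + b"0.125"

print(typed_sum)  # 2.125
print(blob_sum)   # b'20.125'
```

The storage is identical in spirit; only the typed version carries the
operation that makes the result meaningful.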
Western languages flow from left to right, then down. Eastern languages often
flow from top to bottom, then right to left, and each character is twice as long.
Some Middle Eastern languages move right to left with the exception of numbers,
which flow from left to right. The identical sort operation on the identical
bits and bytes will return entirely different results depending on which logic
is applied.
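A small Python sketch of that last claim follows. The two key functions are
deliberately simplified assumptions of mine: one treats "ä" like "a" (roughly
German dictionary practice), the other places "å", "ä" and "ö" after "z"
(roughly Swedish practice). Production systems would rely on a full collation
library rather than anything this naive.

```python
words = ["zoo", "ärta", "bok", "arm"]


def german_style_key(word):
    # Simplified assumption: sort "ä" as if it were "a".
    return word.replace("ä", "a")


# Simplified assumption: a lowercase alphabet with å, ä, ö after z.
SWEDISH_STYLE_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"


def swedish_style_key(word):
    # Rank each character by its position in the alphabet above.
    return [SWEDISH_STYLE_ALPHABET.index(ch) for ch in word]


german_order = sorted(words, key=german_style_key)
swedish_order = sorted(words, key=swedish_style_key)

print(german_order)   # ['arm', 'ärta', 'bok', 'zoo']
print(swedish_order)  # ['arm', 'bok', 'zoo', 'ärta']
```

Same strings, same bytes, same sort call; only the logic behind the key
differs, and so does the answer.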
However, one common element of traditional (non-rich media) data type support is
that all of these problems can be solved with software. Sorting, aggregating,
indexing, presentation, and the like are all software problems that are mostly
manageable by the content management vendor (fonts and a few other special cases
are software-based, but require additional logic).
Rich media, on the other hand, often has very real dependencies on hardware, in
ways that the content management architects never planned for. Managing rich media
requires a very thorough understanding of the limitations of the display device,
the process by which the media was produced, and the bandwidth of the real-time
environment. High-quality video, audio, and image content can easily devour
even the beefiest system, so numerous strategies must be developed to downsample
content, to navigate through logically contiguous content, to store content,
and to transport content from one place to another. Log files, caches, workspaces,
and similar internals that needed little or no updating to support international
character sets have to be completely rewritten (or alternates developed) to
support these large and semantically complex data types.
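One of those strategies, serving a downsampled version that fits the available
bandwidth, can be sketched in a few lines of Python. Everything here is
hypothetical: the Rendition class, the pick_rendition function, and the
bitrates are illustrative assumptions, not any vendor's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    """One stored version of a piece of media (hypothetical model)."""
    name: str
    bitrate_kbps: int


def pick_rendition(renditions, available_kbps):
    """Return the best rendition that fits, else the smallest one stored."""
    fitting = [r for r in renditions if r.bitrate_kbps <= available_kbps]
    if fitting:
        return max(fitting, key=lambda r: r.bitrate_kbps)
    # Nothing fits: fall back to the lowest-bitrate version.
    return min(renditions, key=lambda r: r.bitrate_kbps)


# Illustrative values only.
renditions = [
    Rendition("preview", 300),
    Rendition("web", 1500),
    Rendition("broadcast", 8000),
]

print(pick_rendition(renditions, 2000).name)  # web
print(pick_rendition(renditions, 100).name)   # preview
```

Note that nothing comparable is needed for a column of integers; the decision
depends on the delivery environment, not just on the stored bits.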
In summary, rich media requires significant extensions to the set of operations
supported, entirely new visualization and interaction metaphors, and significant
extensions to the internals that handle storage, indexing, and transport of the
raw content.
The following chart gives a sampling of how similar operations require entirely
different logic (and code) by data type class. Caution: this is an
oversimplification. (I have never seen an exhaustive table in one place.)
| Operation | Video | XML | Quark |
|---|---|---|---|
| Ingest | Encode and log | Parse and validate | Decompose |
| Index | Metadata and closed captions | Structure, attributes and content | Extracted content |
| Store | Media-server and HSM support | Fragments or tags | De-binarize |
| Metadata | Format and codec dependent | Semantic web, RDF and industry-specific DTD support | Extract |
| Model | Clips, tracks, key frames and storyboard | DTD or schema | Decompose |
| Search | Visual search, frame accurate, offsets, metadata | Contextual within DTD structure, and metadata | Content and metadata |
| Navigate | Storyboard, low-resolution versions | URL, XLink, XPointer | Component |
| Preview | Clip sequencing | XSLT, CSS styles/gist generation | Page preview |
| Export | SMIL, encode | Transform via XSLT, DOM, etc. | Re-assemble |
| Distribute | Transcode, stream | Metadata wrappers | Insert into production workflow |
Before you give up on your current content management provider, you must answer
some questions first. Do you need every rich media format supported? Do you need
every conceivable operation against those data types? The answers are probably
"No" and "No," but what are your requirements? My advice is to develop a simple
matrix that captures the data types your application or business needs to manage,
and the operations your use cases demand, before making any momentous decision
about your content management system(s).
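Such a matrix does not need special tooling. As a rough Python sketch, with
made-up requirements and a made-up vendor capability list (both are
assumptions for illustration), the gap analysis is a set difference per data
type:

```python
# Operations your use cases demand, per data type (illustrative values).
required = {
    "Video": {"Ingest", "Search", "Preview"},
    "XML": {"Ingest", "Index", "Export"},
}

# Operations a candidate system claims to support (illustrative values).
vendor_supports = {
    "Video": {"Ingest", "Store"},
    "XML": {"Ingest", "Index", "Export", "Search"},
}


def find_gaps(required, supported):
    """Return the required operations that are missing, per data type."""
    gaps = {}
    for data_type, operations in required.items():
        missing = operations - supported.get(data_type, set())
        if missing:
            gaps[data_type] = sorted(missing)
    return gaps


gaps = find_gaps(required, vendor_supports)
print(gaps)  # {'Video': ['Preview', 'Search']}
```

An empty result means the candidate covers what you actually need, whatever
else it may lack.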
If there is an important set of rich media types included in this matrix,
additional due diligence on the underlying architecture is also warranted,
because it is very likely that you will require additional data types over time
(evaluating architectures is beyond the scope of this column's format, but email
me if you have questions in this arena).
For those of you who actually read the above table, you will note that there are
operations such as "insert into production workflow" and "clip sequencing".
These are not simply computational operations; they are most often applications
that depend on domain experts (people) to complete. Also, the dreaded
and overused term metadata has crept into the discussion (as you all knew
it must). I assert that these have nothing to do with data type support, and
that your decision-making process will be greatly simplified if you choose to
agree with me. We will cover these two issues next time. Until then, ciao
(for you DBMS hacks, that's Italian for Commit).
Next Column: Content and Context
Care to share some of your tribal knowledge? We'd love to hear it. Send comments
and insights to sebastian@gilbane.com