Curated for content, computing, and digital experience professionals

Category: Computing & data

Computing and data is a broad category. Our coverage of computing is largely limited to software, and we are mostly focused on unstructured data, semi-structured data, or mixed data that includes structured data.

Topics include computing platforms, analytics, data science, data modeling, database technologies, machine learning / AI, Internet of Things (IoT), blockchain, augmented reality, bots, programming languages, natural language processing applications such as machine translation, and knowledge graphs.

Related categories: Semantic technologies, Web technologies & information standards, and Internet and platforms.

Recent reports by Frank on mobile development and big data

While I was still at Outsell Inc, I started writing some reports on information technologies for our publishing and information provider CEO clients. I will most likely be writing a few more similar reports for Outsell this year. While special attention is paid to the interests of publishing and information industry CEOs, the topics are all (so far) about technologies that are important to all industries. These reports are available from Outsell:

Five Technologies to Watch 2012-2013, January 25, 2012

Mobile Development Strategies: What Information Industry Executives Need to Know, November 29, 2011

Big-Data: Big Deal or Just Big Buzz?, August 2, 2011

Informatica Delivers Data Parser for Hadoop

Informatica Corporation, the provider of data integration software, announced the immediate availability of Informatica HParser, a data parsing transformation solution for Hadoop environments. Informatica HParser runs on distributions of Apache Hadoop, exploiting the parallelism of the MapReduce framework to efficiently turn unstructured complex data, such as web logs, social media data, call detail records and other data formats, into a structured or semi-structured format in Hadoop. Once transformed into a more structured format, the data can be used and validated to drive business insights and improve operations. Available in a free community edition and commercial editions, Informatica HParser provides organizations with the solution they require to extract the value of complex, unstructured data. http://www.informatica.com
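To make the idea of turning raw web logs into structured records concrete, here is a minimal Python sketch of a map-style parse step followed by a simple reduce-style count. It is not HParser and does not use any Informatica or Hadoop API; the Apache Common Log Format pattern, the function names `parse_line` and `count_by_status`, and the sample lines are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed input format: Apache Common Log Format (not taken from the announcement).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Map step: turn one raw log line into a structured record, or None."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # unparseable lines are skipped rather than guessed at
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def count_by_status(lines):
    """Reduce step: count successfully parsed records per HTTP status code."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[record["status"]] += 1
    return dict(counts)


# Hypothetical sample data for illustration only.
sample = [
    '10.0.0.1 - - [02/Nov/2011:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1043',
    '10.0.0.2 - - [02/Nov/2011:10:00:01 +0000] "GET /missing HTTP/1.1" 404 -',
    'not a log line',
]
```

In a real MapReduce job the parse step would run in parallel across many input splits; the sketch only shows the per-line transformation that makes the data usable downstream.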

MarkLogic 5 Announced for Big Data in the Enterprise

MarkLogic Corporation announced the availability of MarkLogic 5, the latest version of its product designed for Big Data applications across the enterprise. MarkLogic 5 defines Big Data by empowering organizations to build Big Data applications that make information actionable. With MarkLogic 5, organizations analyze structured, unstructured, and semi-structured data in the same application. A key feature is the MarkLogic Connector for Hadoop. www.marklogic.com

IBM Unveils Big Data Software

IBM unveiled new software that brings big data management and analysis to the workplace. The new offerings span a wide variety of big data and business analytics technologies across multiple platforms from mobile devices to the data center to IBM’s SmartCloud. Now employees from any department inside an organization can explore unstructured data such as Twitter feeds, Facebook posts, weather data, log files, genomic data and video, and make sense of it as part of their everyday work experience. IBM is also placing the power of mobile analytics into the hands of iPad users with a free download in Apple’s iTunes Store. The new software is designed to help employees in industries such as financial services, healthcare, government, communications, retail, and travel and transportation use and benefit from business analytics on the go. IBM is delivering three new analytics and information management offerings: new Hadoop-based analytics software on the cloud, which helps employees tap into massive amounts of unstructured data from a variety of sources including social networks, mobile devices and sensors; new mobile analytics software for iPad users; and new predictive analytics software with a mapping feature that can be used across industries for marketing campaigns, retail store allocation, crime prevention, and academic assessment. http://www.ibm.com/

Adobe Announces Availability of AudienceResearch

Adobe Systems Incorporated announced the immediate availability of Adobe AudienceResearch, a new audience measurement tool that provides publishers and digital marketers with certified metrics on the size and engagement of digital audiences for websites, mobile applications and digital magazine editions. These key metrics are captured by Adobe SiteCatalyst, an online analytics application, and provide publishers with the information critical to attracting advertising dollars. AudienceResearch is available at no additional cost to SiteCatalyst customers. In conjunction with AudienceResearch, the company also announced the general availability of the Adobe Audience Certification Program. Under this program, publishers become Adobe Certified Publishers, meaning Adobe has certified that their digital audience data meets certain criteria regarding the accuracy of data collection and reporting. Adobe Certified Publishers can contribute their data to the AudienceResearch tool. AudienceResearch provides census-based measurement of metrics, meaning that the metrics are generated by counting all relevant traffic, a method considered more accurate and representative of actual traffic and behavior than panel-based methods. Panel-based methods monitor the behavior of a small group of volunteer consumers (i.e., the panel) and then use statistics to generate estimated metrics. The statistically generated results from panel-based estimates often differ significantly from census-based results and have been a point of controversy in the advertising industry. Additionally, publishers using the Adobe Digital Publishing Suite to create digital magazine editions for tablet devices may elect to have their metrics automatically certified because analytics is built natively into the Digital Publishing Suite. This native integration ensures the integrity of data collection. http://www.adobe.com/
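The difference between census-based and panel-based measurement can be sketched in a few lines of Python. This is only an illustration of the two counting approaches under simple assumptions (a random panel, a single "did they visit" metric); it is not how AudienceResearch or any panel vendor actually computes its numbers, and the population size, panel size, and function names are hypothetical.

```python
import random


def census_count(visits):
    """Census-based: count every unique visitor seen in the full traffic log."""
    return len(set(visits))


def panel_estimate(visits, panel, population_size):
    """Panel-based: scale the share of panel members who visited to the population."""
    seen = set(visits)
    panel_visitors = sum(1 for member in panel if member in seen)
    return round(panel_visitors / len(panel) * population_size)


# Hypothetical data: 10,000 people, 2,500 of whom visited, and a 200-member panel.
random.seed(0)
population = list(range(10_000))
visits = random.sample(population, 2_500)
panel = random.sample(population, 200)

print("census:", census_count(visits))
print("panel estimate:", panel_estimate(visits, panel, len(population)))
```

The census figure is exact by construction, while the panel figure depends on which 200 people happen to be on the panel, which is the kind of gap the controversy described above is about.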

Cloud-Based Marketing Analytics Suite Launched By IBM

IBM has announced a cloud-based Web analytics and digital marketing suite aimed at helping its business customers automate online marketing campaigns across digital channels, such as Websites, social media networks and mobile phones. The new IBM offering combines software from the acquisitions of Coremetrics and Unica and provides analytics that help companies fine-tune marketing campaigns and create personalised offers in real-time across online channels. For example, businesses would be able to evaluate customers' Facebook or Twitter activity and deliver tailored promotions to those customers' mobile devices on the fly. IBM’s suite also enables businesses to deliver and fine-tune digital marketing programmes based on what customers are doing offline. For instance, a consumer who purchased a new tablet in a brick-and-mortar store would receive special offers via email to purchase tablet accessories. http://www.ibm.com

Adobe Announces Adobe Tag Manager for the Online Marketing Suite

Adobe Systems Incorporated announced Adobe Tag Manager for the Adobe Online Marketing Suite, powered by Omniture. This new solution provides a tag management framework for the entire Adobe Online Marketing Suite as well as for other digital marketing technologies. Capturing anonymous audience data is typically done using a tag (a small piece of JavaScript or HTML image call) placed in a piece of content or on a Web page. Various applications for capturing and taking action on analytics data usually require their own tags. The process of implementing and maintaining separate tags on a Web page and across partners such as analytics providers, site and content optimization vendors, ad servers, ad networks, affiliate networks and audience measurement firms often requires significant technical skills to implement or change and can become costly, time consuming and error prone. Adobe Tag Manager solves these industry problems with a tag management framework that serves as a tag container, housing the tags that an Adobe customer may require, including all Online Marketing Suite tags and third-party tags. While some customers may deploy a new standalone tag container, Adobe SiteCatalyst customers can deploy the tag container without having to change their existing SiteCatalyst page tag. The tags within the container are managed through an administrative user interface where the customer can insert or remove tags (from Adobe or other partners) without making changes to a website. http://www.adobe.com/
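The tag container idea described above can be modeled very roughly as follows. This Python sketch is a conceptual illustration only: the `TagContainer` class, its methods, and the example.com snippets are hypothetical and are not Adobe Tag Manager's interface. The point it shows is that the page references one container, and tags are added or removed inside the container without touching the page.

```python
class TagContainer:
    """Hypothetical container that holds named tag snippets for a site."""

    def __init__(self):
        self._tags = {}

    def insert(self, name, snippet):
        """Add or replace a tag; the page itself does not change."""
        self._tags[name] = snippet

    def remove(self, name):
        """Remove a tag if present; unknown names are ignored."""
        self._tags.pop(name, None)

    def render(self):
        """Return the markup the single on-page container would emit."""
        return "\n".join(self._tags[name] for name in sorted(self._tags))


# Hypothetical tags for illustration; URLs are placeholders.
container = TagContainer()
container.insert("analytics", '<script src="https://example.com/analytics.js"></script>')
container.insert("ads", '<img src="https://example.com/pixel.gif" alt="">')
container.remove("ads")  # a partner tag is dropped without editing any page

print(container.render())
```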

ETL and Building Intelligence Behind Semantic Search

A recent inquiry about a position requiring ETL (Extraction/Transformation/Loading) experience prompted me to survey the job market in this area. It was quite a surprise to see how many technical positions seek this expertise, along with experience in SQL databases and XML, mostly in healthcare, finance or data warehousing. I am also observing an uptick in contract positions for metadata and taxonomy development.

My research on Semantic Software Technologies led reporters and bloggers to seek my thoughts on the Watson-Jeopardy story. Much has been written on the story but I wanted to try a fresh take on the meaning of it all. There is a connection to be made between the ETL field and building a knowledgebase with the smarts of Watson. Inspiration for innovation can be drawn from the Watson technology but there is a caveat: it involves the expenditure of serious mental and computing perspiration.

Besides baked-in intelligence for answering human questions using natural language processing (NLP) to search, an answer-platform like Watson requires tons of data. Also, data must be assembled in conceptually and contextually relevant databases for good answers to occur. When documents and other forms of electronic content are fed to a knowledgebase for semantic retrieval, finely crafted metadata (data describing the content) and excellent vocabulary control add enormous value. These two content enhancers, metadata and controlled vocabularies, can transform good search into excellent search.

The irony of current enterprise search is that information is in such abundance that it overwhelms rather than helps findability. Content and knowledge managers can’t possibly contribute the human resources needed to generate high quality metadata for everything in sight. But there are numerous techniques and technologies to supplement their work by explicitly exploiting the mountain of information.

Good content and knowledge managers know where to find top quality content but may not know that, for all common content formats, there are tools to extract key metadata embedded (but hidden) in it. Some of these tools can also text mine and analyze the content for additional intelligent descriptive data. When content collections are very large but still too small (under a million documents) to justify the most sophisticated and complex semantic search engines, ETL tools can relieve pressure on metadata managers by automating much of the mining and extraction of the entities and concepts needed for good categorization.
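As a small illustration of metadata that is embedded but hidden in a common format, the following Python sketch pulls the title and the name/content pairs from `<meta>` tags in an HTML document using only the standard library. It stands in for the commercial extraction tools discussed here rather than reproducing any of them; the class name, function name, and sample document are assumptions.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = attrs.get("name")
            content = attrs.get("content")
            if name and content is not None:
                self.metadata[name.lower()] = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.metadata["title"] = self.metadata.get("title", "") + data


def extract_metadata(html):
    """Return a dict of embedded metadata found in an HTML string."""
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()
    return parser.metadata


# Hypothetical document for illustration.
sample_html = (
    "<html><head><title>Q3 Report</title>"
    '<meta name="Author" content="J. Doe">'
    '<meta name="keywords" content="finance, audit">'
    "</head><body>Body text</body></html>"
)
print(extract_metadata(sample_html))
```

Office documents and PDFs carry comparable embedded properties, but reading them requires format-specific libraries that are outside the scope of this sketch.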

The ETL tool array is large and varied. Platform tools from Microsoft (SSIS) and IBM (DataStage) may be employed to extract, transform and load existing metadata. Other independent products such as those from Pervasive and SEAL may contribute value across a variety of platforms or functional areas from which content can be dramatically enhanced for better tagging and indexing. The call for ETL experts is usually expressed in terms of engineering roles: people who would select, install and implement these products. However, it has to be stressed that subject and content experts are required to work with the engineers. The role of these experts is to help tune and validate the extraction and transformation outcomes, making sure terminology fits function.

Entity extraction is one major outcome of text mining to support business analytics, but tools can do a lot more to put intelligence into play for semantic applications. Tools that act as filters and statistical analyzers of text data warehouses will help reveal terminology for building the specialized controlled vocabularies used in auto-categorization. A few vendors that are currently on my radar to help enterprises understand and leverage their content landscape include EntropySoft Content ETL, Information Extraction Systems, Intelligenx, ISYS Document Filters, RAMP, and XBS; there is something here for everyone.
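A very simple form of the statistical analysis mentioned above is counting how many documents each term appears in, then offering the widely shared terms to a subject expert as candidates for a controlled vocabulary. The Python sketch below shows that idea under stated assumptions: the stopword list, the `min_docs` threshold, the function name, and the sample documents are all hypothetical, and real tools use far richer linguistics than single-word counts.

```python
import re
from collections import Counter

# A tiny, assumed stopword list; real vocabularies need a much fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "on", "with"}


def candidate_terms(documents, min_docs=2):
    """Return (term, document_count) pairs for terms found in at least min_docs documents."""
    doc_freq = Counter()
    for text in documents:
        # Count each term once per document so long documents do not dominate.
        words = set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS
        doc_freq.update(words)
    kept = [(term, n) for term, n in doc_freq.items() if n >= min_docs]
    # Most widely used first, then alphabetical for a stable order.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical documents for illustration.
docs = [
    "Metadata for the invoice archive",
    "Invoice metadata and taxonomy",
    "Taxonomy of metadata",
]
print(candidate_terms(docs))
```

The output is only a starting list; as the post argues, subject and content experts still need to decide which candidates belong in the vocabulary.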

The diversity of emerging applications is a leading indicator that there is a lot of innovation to come in all aspects of ETL. While RAMP is making headway with video, another firm with a local connection is Inforbix. I spoke with a co-founder, Oleg Shilovitsky, for my semantic technology research last year before they launched. As he asserted then, it is critical to preserve, mine and leverage the data associated with design and manufacturing operations. This area has huge growth potential and Inforbix is now ready to address that market.

Readers who seek to leverage ETL and text mining will gain know-how from the cases presented at the 2011 Text Analytics Summit, May 18-19 in Boston. The exhibits will also feature products to consider for turning piles of data into a valuable knowledge asset. I’ll be interviewing experts who are speaking and exhibiting at that conference for a future piece. I hope readers will attend and seek me out to talk about their metadata management and text mining challenges. This will feed ideas for future posts.

Finally, I’m not the only one thinking along these lines. You will find other ideas and a nudge to action in these articles.

Boeri, Bob. Improving Findability Behind the Firewall. Enterprise Search Summit 2010, NY, 05/2010, 28 slides.
Farrell, Vickie. The Need for Active Metadata Integration: The Hard Boiled Truth. DM Direct Newsletter, 09/09/2005, 3p.
McCreary, Dan. Entity Extraction and the Semantic Web. Semantic Universe, 01/01/2009.
White, David. BI or bust? KMWorld, 10/28/2009, 3p.


© 2025 The Gilbane Advisor
