Unstructured data

Unstructured data (or unstructured information) is information that has no predefined data model or does not fit well into relational tables, such as narrative text, audio, or visual data.
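To make the distinction concrete, the short Python sketch below stores one fact twice: once as a row in a relational table, and once inside a free-text note. The table name, column names, sample note, and regular expression are all hypothetical and chosen only for illustration. The regular expression is a deliberately naive stand-in for real text processing.

```python
import re
import sqlite3

# Structured: the record fits a predefined schema (hypothetical table and columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, item TEXT, quantity INTEGER)")
conn.execute("INSERT INTO orders VALUES (?, ?, ?)", ("Ada", "keyboard", 2))

# The quantity can be read directly because the schema says where it lives.
structured_qty = conn.execute(
    "SELECT quantity FROM orders WHERE customer = ?", ("Ada",)
).fetchone()[0]

# Unstructured: the same fact embedded in narrative text with no data model.
note = "Ada called this morning and asked for 2 keyboards, to be shipped next week."

# With no schema, the value has to be inferred from the text.
# This pattern is a naive heuristic and would miss phrasings such as "two keyboards".
match = re.search(r"(\d+)\s+keyboards?", note)
unstructured_qty = int(match.group(1)) if match else None

print(structured_qty, unstructured_qty)
```

The structured query depends only on the schema, while the extraction from the note depends on how the sentence happens to be worded, which is why unstructured data usually needs techniques such as natural language processing rather than simple lookups.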

In the early days of information technology (1950s to 1970s), information systems focused on structured data, and until the late 1970s there was little interest in managing unstructured data. In the 1980s, computerized publishing systems were built to process unstructured information for creating, formatting, editing, and printing documents, and SGML (Standard Generalized Markup Language) was created to add structure to document information for computer processing. Electronic publishing and document management systems grew steadily until the early 1990s, when the Web produced an explosion of unstructured data.

Unstructured data is also the main ingredient in most of today’s machine learning applications, which involve natural language processing as well as image and streaming pattern recognition.

Modern data management strategies need to accommodate a variety of structured and unstructured data types. PostgreSQL, MongoDB, Cassandra, Neo4j, Snowflake, and DataStax are some examples of modern database products. Many current versions of traditional SQL-based database products can also support NoSQL (non-SQL or not-only-SQL) data.
