To Get Full AI Payoff, Invest First in Data Management

Garbage in, garbage out. That’s long been the issue for successful data analytics, and it’s an even bigger issue today, given the rapid pace of investment in artificial intelligence (AI).

About 80 percent of enterprises report they are investing in AI technologies in some capacity, and 30 percent plan to expand those investments, research shows. Businesses expect AI to help keep them ahead of the competition.

But risk abounds. How can companies break the cycle of garbage in, garbage out and ensure that the insights derived from AI are sound?

After all, any information system, AI or otherwise, is only as good as the quality of its data. Information systems rely heavily on user input, and entries need to be correct and credible. AI uses computer algorithms to replicate the human ability to learn and make predictions, and AI software finds patterns and makes inferences from large quantities of data. That's why the success of any AI effort starts with the data: good data in means good intelligence out. This requires that enterprises:

  • Ingest data quickly and seamlessly. Stale or corrupted data will result in misinformed intelligence. Enterprises need to ingest data while it is fresh, and that means as it is. This is not an easy task. First, the volume of available data is exploding: IDC forecasts that by 2025, the global datasphere will grow to 163ZB (one zettabyte is a trillion gigabytes). That's 10 times the 16.1ZB of data generated in 2016. Second, much of that growth will be in "unstructured" data, such as video and social media. That's a problem for many enterprises because "unstructured data doesn't easily lend itself to older models of data storage and analysis," IDC reports. Nor does such data easily lend itself to AI engines. Enterprises need to deploy technologies that enable the rapid ingestion of all data, including unstructured data, so that AI engines analyze the whole picture and make full use of it to better serve customers and anticipate their needs. For data to be a winning competitive edge, it needs to be processed in real time, not after never-ending cycles of data normalization.
  • Require data-level security. Security at the data level ensures the safety and integrity of the data feeding AI algorithms. Data quality and security policies should be crafted at the data level and based on the metadata, the data about the data. This ensures that no matter where the data comes from or where it travels, its security policies travel with it. It also means that, if data is changed, those changes are accounted for downstream and the AI can adjust. The explosion of data sources, driven by forces such as the internet of things (IoT) and the growth of mobile technologies, makes data-level security ever more critical.
  • Drive teamwork and collaboration. With a consistent view of data across organizations and departments, AI engines will be more productive because only the best, most relevant data will be selected for analysis. Without collaboration, different departments may not know the full scope of available data. Teamwork breaks down data and department silos, so that the right data reaches the AI engine at the right time and can be found more easily. It is also essential to maintain a consistent view of data across the larger organization, even when the slice of data a particular department needs is unique.
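As a loose illustration of the second point, the sketch below derives a security policy from a record's metadata at ingest time and attaches it to the record, so the policy travels with the data wherever it flows downstream. The record shape, metadata fields, and policy rules are hypothetical examples, not any product's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    """Hypothetical record envelope: payload, metadata, and the
    security policy derived from that metadata at ingest time."""
    payload: dict
    metadata: dict
    policy: dict = field(default_factory=dict)

def derive_policy(metadata: dict) -> dict:
    """Map metadata (the data about the data) to a security policy.
    The rules here are purely illustrative."""
    sensitive = metadata.get("source") == "hr" or metadata.get("pii", False)
    return {
        "read_roles": ["hr-admin"] if sensitive else ["analyst"],
        "redact_fields": ["ssn"] if sensitive else [],
    }

def ingest(payload: dict, metadata: dict) -> Record:
    # The policy is attached at ingest, so wherever the record
    # moves downstream, its security rules move with it.
    return Record(payload=payload, metadata=metadata,
                  policy=derive_policy(metadata))

rec = ingest({"name": "Ada", "ssn": "000-00-0000"},
             {"source": "hr", "pii": True})
print(rec.policy["read_roles"])   # ['hr-admin']
```

Because the policy is computed from metadata rather than from where the data happens to be stored, the same rules apply no matter which downstream system consumes the record.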

AI Empowers

No doubt, AI has the capacity to disrupt all areas of a business and improve business performance. Market researcher Vanson Bourne, after conducting online surveys of 260 senior IT and business leaders last year, found that companies expect a $2.87 return over 10 years for every dollar invested in AI.

To get there, enterprises will need the full value of the data going into the AI engine. They need to ingest data quickly, use that data, review which data proves useful and then acquire more of it. AI will enable enterprises to do this more quickly, as long as the data and the data management infrastructure are in good shape.

Tom Thomassen


Tom Thomassen is a senior staff engineer of security at MarkLogic. He is responsible for helping identify and embed secure development practices into the company's engineering process, educating the team on security best practices, monitoring and responding to changes in the software security landscape, and driving MarkLogic's security incident response process. Tom has been a part of the MarkLogic team for over two years. Prior to working for MarkLogic, Tom worked at Sun Microsystems, iPlanet, Veritas, and Symantec.

