Enterprise Executive - 2017: Issue 2

Big Data

Jared Decker 2017-04-12 02:27:48

Are We at the Beginning of an External Data Gold Rush?

The acquisition and processing of external data is a business strategy that many organizations have not taken advantage of, largely because of the challenges inherent in procuring the data. These challenges may have discouraged any organization that has attempted to download data from an obvious (and free) source, such as a government agency or NGO. These entities typically make their data sets available as many partitioned files, which require FTP connections and extensive mapping logic to assemble into comprehensive structured data sets. Once assembled, such data sets are useful for building statistical models, for analytics as a service or for incorporation into business intelligence (BI) platforms, where they add insight when combined with internal organizational data.

External data refers to any data that is generated outside the physical and logical boundaries of an organization. Though many organizations know external data to be theoretically useful, many have left it on the table due to the perceived acquisition challenges and costs. However, in recent years a variety of companies have stepped in to create data market offerings that can be subscribed to and easily accessed for incorporation into an analytics platform. The result is an opportunity: external data can now be easily acquired and leveraged for internal organizational insights, or even repackaged as analytical offerings that are often based on proprietary methods of combining external data or mixing external data with privately procured data.

External data processes have long been used by hedge funds, which search out data of any form or fashion that might be used to detect patterns before the broader market does.
In one hedge fund example, an automated process analyzes the shadows of buildings in China from satellite images to infer the growth rate of real estate and consequently drive buy/sell decisions on Chinese developer companies’ stock. But external data need not be solely the purview of hedge funds and other businesses that operate principally on competitive information (sometimes called competitive intelligence); businesses of all types are learning to procure external data and build it into their analytics. From government and NGO sources to internet companies and commercial data products, there are vast amounts of external data hiding in plain sight, which can be procured and used for competitive advantage.

From the perspective of an organization, external data falls into two categories:

• Unknown Data: The organization is not aware of certain external data that would otherwise be of benefit.
• Known Potential Data: The organization is aware of useful external data but has not established a process for procuring and integrating it.

With unknown data, the organization is unaware of the potential data sets that are readily available and potentially very useful. To avoid this predicament, we recommend putting some emphasis on surveying the data market opportunities provided by BI software companies and cloud services vendors. Though the ROI is not laid out in advance, the potential returns of finding useful data may well be worth the investment of budgeting an external data investigation project.

In the case of known potential data, the organization has an interest in procuring external data but has not done so, typically due to perceived barriers that may not be realistic. Depending on the source, the task of procuring external data can range from trivial to nontrivial on a scale of procurement effort, and from free to expensive on a scale of cost.
An example of a trivial procurement effort would be a direct data feed provided by a BI software platform that comes with simple tools and data markets for integrating structured external data at desired refresh intervals. In more complicated procurement cases, an FTP server must be reached and specific instructions provided (presumably in an automated fashion) to retrieve the requisite files for the desired time frames, after which the data in these files may or may not need to be restructured to be useful.

An additional perceived impediment to acquiring external data is the restrictions placed on its use. There are different rules for different data, but in general, most of the data from government and NGO sources is not restricted. For example, the U.S. Census Bureau provides many data points, such as population counts by gender, race and age down to the ZIP code, and company characteristics (such as employment and age) by sector, age and size. This data is refreshed annually, is public domain and may be freely used for non-commercial and commercial purposes (in fact, it may be used in derivative works and sublicensed with no source attribution requirement).

Given the facilities that are now available for acquiring external data at low cost, opportunities abound for enterprises large and small. In fact, new business opportunities are likely to emerge. Combined with cheaper and more efficient storage (or cloud-based infrastructure) and vastly improved data software platforms with which to process and analyze the information, these facilities may set off something of a gold rush as enterprises increasingly use external data for competitive advantage. Don’t get left behind!

Jared Decker is VP of Business Intelligence at Cyber Group and brings more than 15 years of career experience and 13 years of consulting experience exclusively in the areas of business intelligence and analytics.
He has a Bachelor’s Degree in Decision Support Systems from the University of Tampa and an MBA from the University of Houston. Email: jared.decker@cygrp.com

Published by Enterprise Systems Media.

This page can be found at http://ourdigitalmags.com/article/Big+Data/2761241/399971/article.html.
