What is Big Data, and why should we concern ourselves with it?
Big data affects organizations across practically every industry, including banking, government, manufacturing, health care, retail and education. It is exactly what the term describes: the large volume of data that is created and made available to businesses on a daily basis. This data can be either structured or unstructured. Structured data can easily be stored and processed using simple algorithms and queries, as with traditional relational databases and spreadsheets. Unstructured data usually does not conform to these rules, and it can be so large and so varied that querying it with traditional methods is very difficult or virtually impossible. Good use and analysis of this data helps businesses gain deeper insight into behaviors and trends, both external and internal, which can lead to better decision making and strategic business moves by managers. So how much data are we actually producing and processing today?
According to Cisco, a worldwide leader in IT and networking, annual global data traffic was estimated to reach an unprecedented 6.6 zettabytes by the end of 2016. Yes, that’s right, I did say zettabytes. What on earth is a zettabyte, you say? Well, it’s very simple:
“A zettabyte is roughly 1000 exabytes. To place that amount of volume in more practical terms, an exabyte alone has the capacity to hold over 36,000 years worth of HD quality video…or stream the entire Netflix catalog more than 3,000 times. A zettabyte is equivalent to about 250 billion DVDs.”
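The arithmetic behind that quote can be checked with a quick sketch. It assumes decimal (SI) units and a single-layer DVD capacity of about 4.7 GB; that capacity and the function names are assumptions for illustration and do not come from the quoted source, so the DVD count lands near the quoted figure rather than exactly on it.

```python
# Decimal (SI) storage units, expressed in bytes.
GIGABYTE = 10**9
EXABYTE = 10**18
ZETTABYTE = 10**21

# Assumption: a single-layer DVD holds about 4.7 GB.
DVD_BYTES = 4.7 * GIGABYTE


def exabytes_per_zettabyte() -> int:
    """Return how many exabytes fit in one zettabyte."""
    return ZETTABYTE // EXABYTE


def dvds_needed(zettabytes: float) -> float:
    """Return the number of single-layer DVDs needed to hold the given zettabytes."""
    return zettabytes * ZETTABYTE / DVD_BYTES


if __name__ == "__main__":
    print(f"1 ZB = {exabytes_per_zettabyte()} EB")
    print(f"1 ZB is about {dvds_needed(1) / 1e9:.0f} billion DVDs")
    print(f"6.6 ZB is about {dvds_needed(6.6) / 1e12:.1f} trillion DVDs")
```

With a 4.7 GB disc, one zettabyte works out to a little over 200 billion DVDs, the same order of magnitude as the "about 250 billion" in the quote.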
Ah, you get it now. Good. But there is more.
Of course, nobody in the field of analytics will talk to you about Big Data without mentioning at least one of its four elements or characteristics. These are often referred to as the 4 V’s of Big Data: Volume, Variety, Velocity and Veracity.
Start with volume. That’s right, when it comes to big data, size definitely matters.
The amount of data being collected has been growing exponentially. The question is, where does all this data come from? In the past, data was produced mainly by companies and a few individuals. Nowadays it is produced by companies and by individuals at large. For example, hundreds of millions of smartphone users around the world now send a variety of information to the network infrastructure. This large amount of new data did not exist a few years ago, and it is increasing every minute. We have more sources of data, and we keep creating more, with greater processing capabilities, which together increase the volume of data that has to be analyzed. This includes data collected by organizations from a variety of sources, including business transactions, points of sale, social media, and sensor or machine-to-machine data. This volume presents major challenges for those looking to store, process and put that data to good use instead of letting it just disappear.
As mentioned earlier in this post, data comes in all types of formats, from structured numeric data stored in documents such as Excel files and traditional databases to unstructured text documents, web browsers, logs, click-streams, email, video, social media, audio, stock ticker data and financial transactions. These varieties are increasing as technology advances. As input devices and platforms multiply rapidly, data miners and analysts, unlike in the past, are losing control over the structure of the data being created and stored.
Variety is most definitely the spice when it comes to big data.
Data velocity refers to the rate at which data is generated and the speed at which it should be processed, analyzed and acted upon to improve decisions. The wide availability of smartphones and other digital devices such as sensors leads to endless, fast data creation, which calls for real-time processing and analytics to support fast, evidence-based planning. Fast delivery of information that can be analyzed, such as customers’ geographical location, demographics and past buying patterns, is essential for businesses that want to create real customer value and tailored purchasing campaigns.
Veracity refers to the accuracy and quality of the data. Of course, if we take in junk data, our output is likely to be junk. When managers make decisions based on incorrect information, those decisions are likely to fail. When considering the veracity of data, we must consider the reliability of its source, its accuracy, its time validity and its context.
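As a rough illustration of those checks, here is a minimal sketch that flags a record whose source, accuracy or time validity looks doubtful. Everything specific in it is an assumption made for the example: the field names (`source`, `amount`, `timestamp`), the list of trusted sources and the 30-day freshness window are all hypothetical. Context is left out because it usually needs human judgment rather than a simple rule.

```python
from datetime import datetime, timedelta

# Hypothetical settings, chosen only for this example.
TRUSTED_SOURCES = {"pos_terminal", "crm"}
MAX_AGE = timedelta(days=30)


def veracity_issues(record: dict, now: datetime) -> list:
    """Return a list of reasons why a record may not be trustworthy."""
    issues = []

    # Reliability of the source.
    if record.get("source") not in TRUSTED_SOURCES:
        issues.append("untrusted source")

    # Accuracy: a sale amount should be a non-negative number.
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        issues.append("implausible amount")

    # Time validity: the record should be recent enough to act on.
    timestamp = record.get("timestamp")
    if timestamp is None or now - timestamp > MAX_AGE:
        issues.append("stale or missing timestamp")

    return issues


if __name__ == "__main__":
    now = datetime(2024, 1, 31)
    record = {"source": "unknown_feed", "amount": -5,
              "timestamp": datetime(2023, 1, 1)}
    print(veracity_issues(record, now))
```

A record that comes back with an empty list passes these basic checks. Anything else can be set aside for review before it reaches a manager's dashboard.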
Benefits of Big Data to You and Me
The benefits of analyzing big data are wide-ranging, both for companies and for individuals. As mentioned earlier, the insight gained from Big Data can be used across many industries. Imagine having foresight that leads to:
- Detecting and preparing for flu epidemics before they start.
- Detecting and preventing diseases before patients contract them.
- Determining root causes of failures, issues and defects in near-real time, e.g. in airline parts.
- Generating coupons at the point of sale based on the customer’s buying habits.
- Recalculating entire risk portfolios in minutes and missing nothing.
- Detecting patterned behaviors involving fraud before it affects your company.
The importance of big data doesn’t revolve around how much data you have, but around what you do with it. When we collect data with the four qualities mentioned above and analyze it, we end up with answers that enable us to 1) reduce costs, 2) reduce decision time, 3) develop new products that optimize our offerings, and 4) make smart decisions. When you combine big data with high-powered analytics, you gain Business Intelligence. Check out our next blog for a deep view of Business Intelligence.