Tuesday, December 9, 2014

The Big Deal About Big Data

First of all, what is Big Data? Wikipedia defines Big Data as “… any collection of data sets so large and complex that it becomes difficult to process them using traditional data processing applications.”

OK, when we read this definition, most often we don’t proceed beyond the “data sets so large” part and end up thinking that Big Data is “really huge and vast amounts of data”. Consequently, it might lead one to believe that all you need is monster processing power to mine, scrutinize and bring to heel this big daddy.

In fact, when I asked a senior executive at a multinational telecom company what they thought of Big Data and Analytics, they said that it was “… an exciting field of business analysis where you got to examine gazillion bytes of complex data and figured the usage and other characteristics of their subscribers so that they could serve them better.” Not a bad response, but it is only partly correct.

So, in order to get the most out of Big Data, let us open our Big Data 101 primer and quickly go through some definitions and descriptions.

Definition of Big Data

Big Data refers to a collection of data that is vast in its size, coming from varied sources such as traditional customer information systems & other enterprise level transactions, machine-generated data such as smart billing systems, call registers, online and automated store transactions, social media streams, web and e-commerce stores and now, you could even add data from the Internet of Things (IoT).

The 3 V’s of Big Data

There are three basic V’s – Volume, Velocity and Variety – that can be used to define Big Data. (In fact, a few more V’s have since been proposed, but more about that later.) For now, the 3 V’s that characterize Big Data are:

Volume:      This is the first part of the Wikipedia definition quoted at the top of this post and is basically the size of the data, measured perhaps not in terabytes (TB, 10^12 bytes) but more likely in petabytes (PB, 10^15 bytes), exabytes (EB, 10^18 bytes), zettabytes (ZB, 10^21 bytes), and so on. To get an idea of the scale, imagine all the data that comes in to a call data register at a telecom company, considering the millions of subscribers calling each other. The volume of data that was generated in the entire year 2000 is perhaps now being generated every minute! Traditional databases, software programs and analytics therefore find it too difficult to handle data at this scale. Newer technologies and data tools have to be deployed to mine, manage and analyze it.

Velocity:      This is simply the speed and frequency at which the data gets generated. We are talking about data that is generated from enterprise systems, web & e-commerce transactions, machines and subscriber engagements, all coming in at real-time, near real-time, batch and periodic rates. The question is how fast one can sample this fast-flowing data to examine it and derive useful business information and intelligence.

Variety:        This represents the different types of data that come from a multiplicity of sources. They could be structured data, as in financial databases, or unstructured data, as in text, emails, images, audio, video, etc., coming in from different types of sources such as ERP, web, social media, e-commerce transactions, etc. And then there is the Internet of Things (IoT), which tends to combine traditional and non-traditional data that stream in and change at a rapid pace.
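To make the byte units mentioned under Volume a little more concrete, here is a minimal Python sketch that expresses a raw byte count in the largest of those units. The function name `humanize` and the subscriber figures are purely hypothetical, chosen for illustration; they do not come from any real telecom data.

```python
# Decimal (SI) byte units named in the Volume description.
UNITS = [
    ("TB", 10**12),
    ("PB", 10**15),
    ("EB", 10**18),
    ("ZB", 10**21),
]

def humanize(num_bytes):
    """Express num_bytes in the largest listed unit that does not exceed it."""
    label, size = "bytes", 1
    for unit_label, unit_size in UNITS:
        if num_bytes >= unit_size:
            label, size = unit_label, unit_size
    return f"{num_bytes / size:.1f} {label}"

# Hypothetical figures: 2.5 million subscribers, each generating
# 4 GB (4 * 10^9 bytes) of records.
total_bytes = 2_500_000 * 4 * 10**9
print(humanize(total_bytes))  # 10.0 PB
```

Even with these modest made-up numbers the total lands in petabyte territory, which is why terabytes are no longer the natural unit for this kind of data.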

4 More V’s

Here are four more V’s beyond the above that are being used to define Big Data:

Veracity / Validity:   This signifies the dependability, or conversely the uncertainty, of data. Considering the multiple sources from which data gets generated, it is important to consider how much accuracy and trustworthiness can be attached to the different sets of data coming in. Newer technology helps us better handle the quality and accuracy of data, including abbreviated forms such as emoticons and hashtags.

Value:           Many consider this the most important V, the holy grail of Big Data, because it all boils down to how the vast volume of varied data that gets blasted into our servers can be converted to economic value. How the intrinsic information can be extracted from it and transformed into revenue makes this a key characteristic. For example, a telecom company can use the extracted information to analyze subscriber behavior and do better churn management.

Viability:     This means practicability and is indicative of what the analysis and data infrastructure can provide in real terms, over and above just the ability to handle and store large-scale, complex data. It covers questions such as what business rules can be generated, and what attributes of data relating to purchase cycles or purchasing history can be used to predict buying behavior.

Volatility:    This is about the window of opportunity the company has for particular pieces or sets of data. It is important to know the relevance and lifespan of the data so that you can have efficient and accurate business analytics.
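As a rough illustration of the Volatility idea, the Python sketch below keeps only those records that still fall inside a chosen relevance window. The record layout, the function name `still_relevant` and the seven-day window are all assumptions made up for this example; a real window would depend on the business question being asked.

```python
from datetime import datetime, timedelta

def still_relevant(records, now, max_age):
    """Keep only records whose timestamp is within max_age of now."""
    return [r for r in records if now - r["timestamp"] <= max_age]

# Hypothetical records with made-up timestamps.
now = datetime(2014, 12, 9, 12, 0)
records = [
    {"id": 1, "timestamp": datetime(2014, 12, 9, 11, 30)},  # 30 minutes old
    {"id": 2, "timestamp": datetime(2014, 11, 1, 9, 0)},    # over a month old
]

fresh = still_relevant(records, now, timedelta(days=7))
print([r["id"] for r in fresh])  # [1]
```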

Big Data Myths

Big Data is not just confined to the field of technology, nor is it the responsibility of the IT department alone. It belongs to the entire company, specifically the business and marketing functions, and helps in product development, customer engagement, service & retention, revenue enhancement, positioning and a myriad of other critical aspects of business.

Also, Big Data is not hype or a fad generated by some Silicon Valley data dweebs. Big Data technologies and tools help examine and analyze the various structured and unstructured data to segment customers, understand their behavior, get direct and indirect feedback, and make customer engagement more fulfilling in order to generate value.

While it is tempting to see Big Data as the solution to all problems and as an answer to the question of what comes next in the growth cycle, the most critical thing is not technology but a clear strategy for how you would harness it to generate insights that help build business rules and fine-tune processes to move your business forward.


2.       Wikipedia – Big Data
4.       WIRED – Missing V’s
5.       inside BIGDATA

Kall Ramanathan

ValueStrat Consulting @ValueStrat helps businesses understand where they are currently and what they need to do to get where they want to go. For this, we provide essential strategic plans and approaches, called “Keys”, to enable businesses to open up competencies and clear inefficiencies.

ValueStrat gets to the DNA of business - Desire, Need and Ability - to help you ask critical questions such as those discussed above. Check out http://www.valuestrat.in for more.
