Big Data continues to generate buzz and to feed the wildest expectations in organizations. Unless they are examined objectively, some of these expectations will remain fantasies and will inevitably lead to frustration and disappointment. However, the concrete Big Data projects we work on are moving from fantasy to reality.
To prolong the pleasure, this article is published in two parts. This first post presents the first 3 temptations of Big Data.
Temptation 1: Leverage gargantuan volumes of data
It is almost a tautology: Big Data makes it possible to exploit huge volumes of data. Volume is also one of the components of its most commonly accepted definition, based on the 3 Vs: Volume, Variety, Velocity. The volumes managed in Big Data stores are theoretically unlimited (thanks to the extreme scalability these architectures allow). However, most real projects range from a few hundred GB to tens of terabytes of stored data.
We are also regularly asked whether it is appropriate to carry out a Big Data project for volumes below one TB. The answer is clearly yes! The volume of data is not the first criterion that justifies the use of Big Data … There are 6 other temptations you can yield to.
Moreover, even if the volume of data you want to use is not yet very large, nothing says it will stay that way. Indeed, once the first initiatives are in place, it will be tempting to exploit new information, which may itself be voluminous. In addition, with the spread of sensors and other connected objects, it is likely that your organization will very quickly have to deal with volumes in the TB range.
Temptation 2: Make use of new sources of information
According to Gartner, only 30% of corporate data is actually used. The remaining 70% of information stays “in the shadows”, which means real opportunities can be missed. This data is found in contracts, reports, exchanges between customers and customer service … but above all in Excel spreadsheets scattered across the company's workstations.
In addition, exploiting external data is a source of value:
- Open data, or data resold by other organizations (INSEE, weather, telecom operators …), to contextualize and enrich the understanding of the environment
- Data from sensors or connected objects, to acquire day-to-day information at the place where the subject is actually used (a factory machine, an information system log, use of a product by a customer…)
All this data, still little exploited (or not at all in most cases), opens new fields of investigation and is certainly the key to new business opportunities.
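To make the idea of enrichment concrete, here is a minimal Python sketch that joins internal sales records with external weather observations by date and city. Everything in it is an assumption made for illustration: the field names (`units_sold`, `max_temp_c`), the values, and the `enrich_with_weather` helper are invented, and a real project would read open data from its actual provider rather than from a hard-coded list.

```python
# Hypothetical internal data: daily sales per store (invented values).
internal_sales = [
    {"date": "2024-07-01", "store": "Lyon", "units_sold": 120},
    {"date": "2024-07-02", "store": "Lyon", "units_sold": 95},
    {"date": "2024-07-03", "store": "Lyon", "units_sold": 180},
]

# Hypothetical external open data: daily weather per city (invented values).
external_weather = [
    {"date": "2024-07-01", "city": "Lyon", "max_temp_c": 24.0},
    {"date": "2024-07-03", "city": "Lyon", "max_temp_c": 33.5},
]


def enrich_with_weather(sales, weather):
    """Attach the matching weather observation (if any) to each sales record."""
    # Index the external data by (date, city) for direct lookup.
    weather_index = {(w["date"], w["city"]): w for w in weather}
    enriched = []
    for sale in sales:
        match = weather_index.get((sale["date"], sale["store"]))
        enriched.append({
            **sale,
            # None signals that no external observation exists for that day.
            "max_temp_c": match["max_temp_c"] if match else None,
        })
    return enriched


enriched_sales = enrich_with_weather(internal_sales, external_weather)
for row in enriched_sales:
    print(row)
```

Keeping unmatched days with a `None` value, rather than dropping them, is a deliberate choice in this sketch: the internal data stays complete even when the external source has gaps.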
Temptation 3: Play with unstructured data
Until now, organizations exploited only so-called “structured” data. Structured means that the information can be organized by storing it in boxes of a predetermined format (imposed by SQL database models) and that the data is manipulated directly (i.e. in 90% of digital information). This has gradually led our organizations to put blinders on and ignore the wealth of knowledge that other types of data can reveal.
With Big Data, everything is “a given” (you will notice the deliberate double meaning of this assertion: everything is data)! Thus, one can use information whose organization is not known in advance or whose shape changes over time. One can also use images, sound and video. One example is the analysis of SNCF track pictures (simply taken by trains in service) to identify where preventive maintenance should be performed.
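The contrast with the predetermined boxes of a SQL model can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the three events and their fields are invented, and the helpers `read_events` and `observed_fields` are hypothetical names. It shows records of different shapes being stored as-is, with their structure discovered when they are read rather than imposed when they are written.

```python
import json

# Hypothetical raw events, one JSON document per line; fields vary by record.
raw_events = [
    '{"id": 1, "type": "sensor", "temperature": 71.5}',
    '{"id": 2, "type": "email", "subject": "Contract renewal", "body": "Hello"}',
    '{"id": 3, "type": "sensor", "temperature": 69.0, "vibration": 0.02}',
]


def read_events(lines):
    """Parse each line without enforcing any predetermined schema."""
    return [json.loads(line) for line in lines]


def observed_fields(events):
    """Count how often each field appears, to discover structure after the fact."""
    counts = {}
    for event in events:
        for field in event:
            counts[field] = counts.get(field, 0) + 1
    return counts


events = read_events(raw_events)
field_counts = observed_fields(events)

# Fields are interpreted at read time; records without the field are skipped.
temperatures = [e["temperature"] for e in events if "temperature" in e]
print(field_counts)
print(temperatures)
```

A new field such as `vibration` appears in the third record without any change to a schema, which is the situation described above as data "whose shape changes over time".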