We constantly create and share things like emails, photos,
tweets, and Facebook posts. Then there are the log files from everything we do
on our devices, and data from smart devices such as Fitbit trackers and smart
speakers like Alexa. All of this data is generated rapidly and must be
collected, processed, analysed, and stored. For data to grow to such large
volumes in short periods, it must be generated at extreme velocity. "Velocity"
is the speed at which data is generated. Traditional supermarkets already produce
high-frequency data: Walmart, for example, handles more than one million
purchases per hour. Retail is becoming an increasingly data-driven sector as more
companies transform digitally, creating even more opportunities for data
collection. In the retail industry, it is far better to know that a product has
expired or is out of stock within seconds or minutes rather than two or three
days. The faster a retailer can restock or exchange its products, the sooner
it can return to generating sales.
Data Processing
Data processing refers to the collection and manipulation of
data to produce meaningful information. There are two main approaches to data
processing: batch processing and stream processing.
Batch processing collects a large volume of data and processes it all at once,
which can be fast and efficient. For example, a retailer keeps track of overall revenue
across all of its stores every day. Rather than processing every purchase in
real time, the retailer processes each store's daily revenue totals in a
batch at the end of the day.
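As a rough sketch of the batch approach described above (the store names and sale amounts are made up for illustration, and amounts are kept in whole cents to avoid floating-point rounding), the day's transactions can be accumulated and then summed in a single pass at close of business:

```python
from collections import defaultdict

# Hypothetical day's transactions: (store, sale amount in cents) pairs,
# collected throughout the day and processed together at the end of it.
transactions = [
    ("store_a", 1999),
    ("store_b", 549),
    ("store_a", 4200),
    ("store_b", 1275),
]

def batch_daily_revenue(transactions):
    """Aggregate one day's sales per store in a single batch run."""
    totals = defaultdict(int)
    for store, amount in transactions:
        totals[store] += amount
    return dict(totals)

print(batch_daily_revenue(transactions))
# → {'store_a': 6199, 'store_b': 1824}
```

The trade-off is latency: the totals are only as fresh as the last batch run, which is exactly why a stream-processing pipeline is preferred when answers are needed within seconds rather than at end of day.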
Apache Hadoop
The number of devices that capture information in real time
is increasing exponentially. This fuels the need for technologies that can
keep pace with this rate of data generation. Apache Hadoop is an open-source
framework used to effectively store and process massive datasets ranging in
size from gigabytes to petabytes. Its distributed computation model, patterned
after Google's MapReduce, lets many computers work together in parallel to
explore large datasets more easily. Hadoop offers a full range of big data
analysis capabilities, though workloads vary considerably. (Understanding Big
Data Processing and Analytics - Developer.com, 2021)
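Hadoop jobs are typically written in Java, but the MapReduce idea behind it can be sketched in a few lines of Python. This is a toy, single-machine illustration of the pattern, not the Hadoop API: records are mapped to key-value pairs, the pairs are grouped ("shuffled") by key, and a reduce function combines each group.

```python
from collections import defaultdict
from itertools import chain

def map_reduce(records, mapper, reducer):
    """Toy single-machine MapReduce: map, shuffle by key, then reduce."""
    # Map phase: each record yields zero or more (key, value) pairs.
    pairs = chain.from_iterable(mapper(r) for r in records)
    # Shuffle phase: group intermediate values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # Reduce phase: combine each key's values into one result.
    return {key: reducer(key, values) for key, values in groups.items()}

# The classic word-count example used to introduce MapReduce.
lines = ["big data moves fast", "fast data needs fast processing"]
mapper = lambda line: [(word, 1) for word in line.split()]
reducer = lambda word, counts: sum(counts)

print(map_reduce(lines, mapper, reducer))
```

In real Hadoop, the map and reduce phases run on many machines at once and the shuffle moves data across the cluster; this sketch only shows the shape of the computation.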
A velocity problem arises when an organisation's data
storage or data management is slower than its data production. And if
the information is outdated, it does not matter how reliable it is;
the data is of little use. Do you think processing huge amounts of
data at high speed will remain a challenge in the coming years? What is your opinion?
Author: Akshata Dalvi
velocity in big data analytics
data processing
velocity of big data
Rivery, 2021. Batch vs. Stream Processing:
Pros and Cons. [online] Rivery. Available at:
<https://rivery.io/blog/batch-vs-stream-processing-pros-and-cons-2/>
[Accessed 2 March 2021].
Developer.com. 2021. Understanding Big Data Processing and
Analytics - Developer.com. [online] Available at:
<https://www.developer.com/db/understanding-big-data-processing-and-analytics.html>
[Accessed 2 March 2021].
Great article!
Really helpful blog to understand the details in velocity of big data. In retail it’s better to know which products are out-of-stock in terms of seconds or minutes rather than days or weeks. The more quickly a retailer can restock its products, the faster it can return to generating product sales.
Beautiful piece of writing! You have well sounded that big data velocity, in addition to volume and variety, has a significant effect on companies. Not only does data need to be obtained quickly, but it also needs to be processed and used rapidly. “Many types of data have a finite shelf life, and their value can deteriorate over time—in some cases, rapidly.” Analyzing data in a timely manner will alert companies to stocking problems, allowing them to be addressed before the problem worsens. To keep up with market trends, data velocity should also speed up the decision-making process.
It's easy to get caught up in the increased speed with which data is flowing into most companies today, particularly from "firehose" data sources such as social media. Velocity, on the other hand, emphasizes the need to process data rapidly and, more importantly, to use it at a quicker rate than ever before. These velocity-related issues are sometimes misunderstood as technological issues, but there's also more to it than that. People, process, and cultural barriers can stifle your company's speed and agility, regardless of how quickly you gather and process data.
-- Thomas Devasia
Wonderful piece of writing! The velocity of big data refers to the speed at which data is generated. Like volume, velocity has a large impact on businesses. Data needs to be acquired quickly, and it also needs to be processed and used at a faster rate; that is why velocity is important in big data. Analysing data faster helps alert a business to a stocking issue so it can be fixed before getting worse. Data velocity also helps keep up with market changes by speeding up the decision-making process. This article is very helpful for understanding the importance of velocity in big data and its roles.
Very interesting piece of content!
Interesting blog! Keep going :)