Understanding Data Processing: Velocity of Big Data

We constantly create and share data: emails, photos, tweets, and Facebook posts. Add to that the log files generated by everything we do on our devices, plus data from smart devices such as Fitbit trackers and voice assistants like Alexa. All of this data is generated rapidly and must be collected, processed, analysed, and stored. For data to grow to such large volumes in such short periods, it must be generated at extreme velocity: the "velocity" of data is the speed at which it is generated. Traditional retailers already produce high-frequency data; Walmart, for example, handles more than one million purchases per hour. Retail is becoming ever more data-driven as companies transform digitally, creating still more opportunities for data collection. In the retail industry, it is far better to know that a product is expired or out of stock within seconds or minutes rather than after two or three days: the faster a retailer can restock or exchange its products, the sooner it can return to generating sales.

Data Processing

Data processing refers to the collection and manipulation of data to produce meaningful information. There are two main approaches to data processing: batch processing and stream processing.

Batch processing collects and processes a large volume of data at once, which is efficient when results are not needed immediately. For example, a retailer that tracks overall revenue across all of its stores does not need to process every purchase in real time; instead, it processes each store's daily revenue totals in a single batch at the end of the day.
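As a minimal sketch of the batch idea (store IDs and amounts here are hypothetical), the end-of-day run can be modelled as one aggregation over the whole day's accumulated transactions, rather than per-purchase updates:

```python
from collections import defaultdict

# Hypothetical transaction records accumulated over one day: (store_id, amount)
transactions = [
    ("store_1", 20.0),
    ("store_2", 5.5),
    ("store_1", 42.0),
    ("store_2", 12.5),
]

def batch_daily_totals(records):
    """Aggregate a full day's transactions in a single batch run."""
    totals = defaultdict(float)
    for store_id, amount in records:
        totals[store_id] += amount
    return dict(totals)

print(batch_daily_totals(transactions))
```

The trade-off is latency: the totals only exist after the batch runs, which is acceptable here because nobody needs per-second revenue figures.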

Stream processing is one of the fastest-growing areas of data processing: data flows continuously and is processed as it is generated. Support for stream processing is essential when selecting a big data analysis tool, because real-time workloads are time-sensitive and need instant analytic results. For example, after airing a commercial during a sporting event, a soda company wants to gauge brand awareness. It pushes social media data directly into an analytics system to track audience response and refine its brand messaging in real time (Blog and Cons, 2021).
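The contrast with batch processing can be sketched as follows: instead of waiting for all the data, the counts are updated as each event arrives. The hashtags and the generator standing in for a live social media feed are hypothetical:

```python
def mention_stream():
    # Stand-in for a live social media feed; in practice this would be
    # a message queue or streaming API, not a hard-coded list.
    for tag in ["#BigGameSoda", "#ad", "#BigGameSoda", "#halftime", "#BigGameSoda"]:
        yield tag

def track_mentions(stream):
    """Update running counts as each event arrives, event by event."""
    counts = {}
    for tag in stream:
        counts[tag] = counts.get(tag, 0) + 1
        # In a real pipeline, a dashboard or alert would be refreshed here,
        # so the counts are usable before the stream ends.
    return counts

print(track_mentions(mention_stream()))
```

The key point is that `counts` is always current: the soda company could read it mid-broadcast, whereas a batch job would only report after the event.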

Apache Hadoop


The number of devices that capture information in real time is increasing exponentially, fuelling the need for technologies that can keep pace with such rapid data generation. Apache Hadoop is an open-source platform used to effectively store and process massive datasets, ranging in size from gigabytes to petabytes. Its distributed computation framework, modelled after Google's MapReduce, lets many computers work together to explore large datasets in parallel, and it provides a full set of features for big data analysis (Understanding Big Data Processing and Analytics - Developer.com, 2021).
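The MapReduce model that Hadoop implements can be illustrated with its canonical example, word count. This single-process Python sketch only mimics the map, shuffle/sort, and reduce phases; real Hadoop distributes each phase across a cluster of machines:

```python
from itertools import groupby
from operator import itemgetter

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def reduce_phase(pairs):
    # Shuffle/sort groups the pairs by key; reduce then sums each group
    counts = {}
    for word, group in groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0)):
        counts[word] = sum(count for _, count in group)
    return counts

docs = ["big data moves fast", "fast data needs fast processing"]
print(reduce_phase(map_phase(docs)))
```

Because the map step is independent per document and the reduce step is independent per key, both phases parallelise naturally, which is what lets Hadoop scale the same logic to petabytes.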

A velocity problem arises when an organization's data storage or data management cannot keep up with the rate at which its data is produced. And if information is outdated, it does not matter how reliable it is; it is of little use. Will it remain challenging to process huge amounts of data at high speed in the years to come? What is your opinion on that?

Author: Akshata Dalvi

Keywords:

velocity in big data analytics

data processing

velocity of big data

Reference:

Blog, R. and Cons, B., 2021. Batch vs. Stream Processing: Pros and Cons | Rivery. [online] Rivery. Available at: <https://rivery.io/blog/batch-vs-stream-processing-pros-and-cons-2/> [Accessed 2 March 2021].

Developer.com. 2021. Understanding Big Data Processing and Analytics - Developer.com. [online] Available at: <https://www.developer.com/db/understanding-big-data-processing-and-analytics.html> [Accessed 2 March 2021].

Comments

  1. Really helpful blog for understanding the details of velocity in big data. In retail it’s better to know which products are out of stock in seconds or minutes rather than days or weeks. The more quickly a retailer can restock its products, the faster it can return to generating product sales.

  2. Beautiful piece of writing! You have made it clear that big data velocity, in addition to volume and variety, has a significant effect on companies. Not only does data need to be obtained quickly, but it also needs to be processed and used rapidly. “Many types of data have a finite shelf life, and their value can deteriorate over time—in some cases, rapidly.” Analyzing data in a timely manner will alert companies to stocking problems, allowing them to be addressed before they worsen. To keep up with market trends, data velocity should also speed up the decision-making process.
    It's easy to get caught up in the increased speed with which data flows into most companies today, particularly from "firehose" data sources such as social media. Velocity, however, emphasizes the need to process data rapidly and, more importantly, to use it at a quicker rate than ever before. These velocity-related issues are sometimes mistaken for purely technological issues, but there is more to it than that. People, process, and cultural barriers can stifle your company's speed and agility, regardless of how quickly you gather and process data.

    -- Thomas Devasia

  3. Wonderful piece of writing! The velocity of big data refers to the speed at which data is generated. Velocity has as large an impact on businesses as volume does. Data needs to be acquired quickly, and it also needs to be processed and used at a faster rate; that is why velocity is important in big data. Analysing data faster helps alert a business to stocking issues and fix them before they get worse. Data velocity also helps businesses keep up with market changes by speeding up the decision-making process. This article is very helpful for understanding the importance of velocity in big data and its role.

  4. Very interesting piece of content!

