What Is Data Proliferation?
"Data proliferation" is an umbrella term for the ever-growing number of files and volume of data stored by entities such as governments and businesses. The massive amount of data coming in daily means these entities need more storage space and hardware, but as of 2011, data proliferation is moving faster than advances in computer hardware. It does not matter whether the stored information is structured or unstructured; all that matters is that storage capacity is being consumed. Storing all this data can be difficult and costly, and the network on which the data is stored, along with its associated programs, tends to slow down.
The problem of data proliferation does not readily concern consumers and average computer users. While average users have required more storage over time, computer hardware has advanced quickly enough to satisfy those needs. For businesses, governments and other entities collecting massive amounts of data daily, however, the problem of data proliferation can become acute.
If an average computer user needs more storage, he or she typically just buys a larger hard drive. When a large entity needs more storage, it typically must add more servers. At a normal rate of growth, this presents no problems, but many large entities in 2011 are accumulating data at rates that outpace technology, so a massive number of servers may be needed to hold everything the entity must store. No single device is yet capable of holding all that information, which means a large entity must keep buying and deploying more and more hardware.
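As a rough, purely illustrative sketch (the growth rates and capacities below are hypothetical numbers, not figures from the article), compound data growth that outpaces per-server capacity growth forces an entity to keep adding servers year after year:

```python
import math

def servers_needed(initial_tb, data_growth, capacity_tb, capacity_growth, years):
    """Servers required each year when data grows faster than per-server capacity.

    All inputs are hypothetical illustration values, not real-world figures.
    """
    needed = []
    data, capacity = float(initial_tb), float(capacity_tb)
    for _ in range(years + 1):
        needed.append(math.ceil(data / capacity))  # whole servers only
        data *= 1 + data_growth          # data grows, e.g., 60% per year
        capacity *= 1 + capacity_growth  # hardware improves more slowly, e.g., 25%
    return needed

# 100 TB today, 60% yearly data growth vs. 25% yearly per-server capacity growth:
print(servers_needed(100, 0.60, 10, 0.25, 5))  # the server count climbs every year
```

Even though each server holds more every year, the gap between the two growth rates means the fleet keeps expanding, which is the core of the proliferation problem described above.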
Some data terms or problems concern only one type of information. With data proliferation, however, the type of data involved does not matter. As long as storage capacity is being consumed at a rapid rate, data proliferation is a problem.
One of the many problems with data proliferation is cost. Aside from the cost of extra storage hardware, there are also physical storage and human resources costs: the servers must be housed somewhere, and people must be employed to run them. These costs could theoretically become too much for an entity to sustain, severely cutting into profits. Another problem concerns network speed, because data congestion can cause programs to run much more slowly, meaning employees can do less work during a workday.
@Logicfest -- this problem also pops up if you are streaming a lot of files. A lot of wireless router manufacturers brag about having USB ports so someone can attach hard drives and stream data. What they don't mention is that there is a limit to the number of files the router can index.
Let's say your DVD ripping consumer decided to do the same thing with his massive compact disc collection. If the router can only index (i.e., find) 5,000 files, then that person could be out of luck if the CD collection is huge. Sure, you can dump maybe 100,000 songs on that hard drive, but you can't access them if the router can only find 5,000 of them.
That is a problem for consumers, too. Let's say you've got someone with a large DVD collection who gets the idea of converting the discs to digital files and storing them on a networked drive to stream to a smart television.
That person will find out something in a hurry -- video files are huge, and a huge hard drive is needed to hold them all. It is very easy to end up with two or more hard drives just to store all that information. The problem is twofold -- finding a device that can stream all that information, and keeping backups in case a hard drive fails.
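As a back-of-the-envelope sketch (the per-disc size and the two-copy backup count below are assumptions for illustration, not figures from the comment), the space a ripped collection demands adds up quickly:

```python
def storage_needed_tb(num_discs, gb_per_disc=6.0, copies=2):
    """Rough storage total in TB, counting the primary copy plus backups.

    The 6 GB-per-ripped-disc figure and the two-copy assumption are
    illustrative defaults, not measurements.
    """
    return num_discs * gb_per_disc * copies / 1000  # treating 1 TB as 1,000 GB

print(storage_needed_tb(500))  # a 500-disc collection already spans several drives
```

Doubling everything for a backup copy is what pushes the total past a single consumer drive so quickly.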
The more data you get, the more difficult it is to deal with. That is a problem consumers face as they get more sophisticated in how they store and stream data.