What is the Difference Between Data Mining and Data Warehousing?
The terms data mining and data warehousing are often confused by both business and technical staff. The entire field of data management has experienced phenomenal growth with the implementation of data collection software programs and the decreased cost of computer memory. The primary purpose behind both of these functions is to provide the tools and methodologies to explore the patterns and meaning in large amounts of data.
The primary differences between data mining and data warehousing are the system design, the methodology used, and the purpose. Data mining is the use of pattern recognition logic to identify trends within a sample data set and extrapolate this information against the larger data pool. Data warehousing is the process of extracting and storing data to allow easier reporting.
Data mining is a general term used to describe a range of business processes that derive patterns from data. Typically, a statistical analysis software package is used to identify specific patterns, based on the data set and queries generated by the end user. Typical uses of data mining include creating targeted marketing programs, identifying financial fraud, and flagging unusual patterns in behavior as part of a security review.
An excellent example of data mining is the process used by telephone companies to market products to existing customers. The telephone company uses data mining software to access its database of customer information. A query is written to identify customers who have subscribed to the basic phone package and the Internet service over a specific time frame. Once this data set is selected, another query is written to determine how many of these customers took advantage of free additional phone features during a trial promotion. The results of this data mining exercise reveal patterns of behavior that can drive or help refine a marketing plan to increase the use of additional telephone services.
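The two queries in the telephone company example can be sketched in a few lines of Python using the standard sqlite3 module. This is only an illustration: the customers table, its column names, the sample rows, and the trial_uptake function are all hypothetical and do not come from any real telephone company system.

```python
import sqlite3

def trial_uptake(rows, start, end):
    """Count customers with both basic phone and Internet service who signed
    up between start and end, and how many of them used the trial features.

    The table layout here is a made-up example, not a real schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        " id INTEGER PRIMARY KEY,"
        " basic_phone INTEGER,"
        " internet INTEGER,"
        " signup_date TEXT,"
        " used_trial_features INTEGER)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?, ?, ?)", rows)

    # First query: select the customer group of interest.
    selected = conn.execute(
        "SELECT COUNT(*) FROM customers"
        " WHERE basic_phone = 1 AND internet = 1"
        " AND signup_date BETWEEN ? AND ?",
        (start, end),
    ).fetchone()[0]

    # Second query: of that group, count who used the free trial features.
    took_trial = conn.execute(
        "SELECT COUNT(*) FROM customers"
        " WHERE basic_phone = 1 AND internet = 1"
        " AND signup_date BETWEEN ? AND ?"
        " AND used_trial_features = 1",
        (start, end),
    ).fetchone()[0]

    conn.close()
    return selected, took_trial

# Invented sample data: (id, basic_phone, internet, signup_date, used_trial)
SAMPLE_ROWS = [
    (1, 1, 1, "2023-01-15", 1),
    (2, 1, 1, "2023-02-20", 0),
    (3, 1, 0, "2023-02-21", 1),
    (4, 1, 1, "2022-11-02", 1),
    (5, 1, 1, "2023-03-05", 1),
]

selected, took_trial = trial_uptake(SAMPLE_ROWS, "2023-01-01", "2023-03-31")
print(selected, took_trial)  # prints "3 2"
```

With this invented data, three customers match the first query and two of them used the trial features. A ratio like this is the kind of pattern a marketing team might then examine further.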
It is important to note that the primary purpose of data mining is to spot patterns in the data. The specifications used to define the sample set have a huge impact on the relevance of the output and the accuracy of the analysis. Returning to the example above, if the data set is limited to customers within a specific geographical area, the results and patterns will differ from those of a broader data set. Although both data mining and data warehousing work with large volumes of information, the processes used are quite different.
A data warehouse is a software product that is used to store large volumes of data and run specifically designed queries and reports. Business intelligence is a growing field of study that focuses on data warehousing and related functionality. These tools are designed to extract data and store it in a structure designed to provide enhanced system performance. Much of the terminology in data mining and data warehousing is the same, which leads to more confusion.
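The extract-and-store idea behind a data warehouse can also be sketched in Python with the standard sqlite3 module. Everything here is an assumption made for illustration: the calls and monthly_usage tables, their columns, the sample rows, and the build_monthly_usage function are hypothetical. For brevity, a single in-memory database stands in for both the operational system and the warehouse, which would normally be separate systems.

```python
import sqlite3

def build_monthly_usage(call_rows):
    """Load raw call records, store a monthly summary, and return the report.

    Table and column names are invented for this sketch.
    """
    conn = sqlite3.connect(":memory:")

    # Operational data: one row per call.
    conn.execute(
        "CREATE TABLE calls (customer_id INTEGER, call_date TEXT, minutes INTEGER)"
    )
    conn.executemany("INSERT INTO calls VALUES (?, ?, ?)", call_rows)

    # Warehouse-style table: pre-aggregated so reports do not rescan raw calls.
    conn.execute(
        "CREATE TABLE monthly_usage"
        " (customer_id INTEGER, month TEXT, total_minutes INTEGER)"
    )
    conn.execute(
        "INSERT INTO monthly_usage"
        " SELECT customer_id, substr(call_date, 1, 7), SUM(minutes)"
        " FROM calls"
        " GROUP BY customer_id, substr(call_date, 1, 7)"
    )

    # A report now reads only the small summary table.
    report = conn.execute(
        "SELECT customer_id, month, total_minutes FROM monthly_usage"
        " ORDER BY customer_id, month"
    ).fetchall()
    conn.close()
    return report

# Invented sample data: (customer_id, call_date, minutes)
SAMPLE_CALLS = [
    (1, "2023-01-03", 10),
    (1, "2023-01-19", 5),
    (1, "2023-02-02", 7),
    (2, "2023-01-11", 20),
]

for row in build_monthly_usage(SAMPLE_CALLS):
    print(row)
```

The point of the sketch is the design choice: the summary table is shaped for the report, so the report query stays simple and fast, while the raw call records remain available for other uses such as data mining.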
Someone in the next town to me got caught for fraud by an agency using data mining ideas. I heard the guy was claiming a very low income and paying hardly any tax. But he'd written all sorts of contradictory information on his MySpace and personal website.
@yumdelish - That is pretty disturbing, and I don't actually know if Facebook can really monitor what you do on the web in general. Maybe you clicked a 'like' button on some of the sites?
On the other hand, there is plenty of evidence that the company use data mining services to make the most of your activity on the site. So if you join groups, then the ads will probably be related to the content of those.
In my mind this is a modern interpretation of data mining concepts, but it makes sense as social networking sites are such a part of everyday life these days. I'd like to read more about how data warehousing tools are being used in a similar way.
The other day I logged onto Facebook and noticed that some of the ads on my page were for websites I had looked at recently! I find this really scary. Does it mean that Facebook are using data mining tools to target these to me?
I can't really imagine that this is possible, as that would mean they have some way of knowing what I do away from the site as well as on it!