We are independent & ad-supported. We may earn a commission for purchases made through our links.

Advertiser Disclosure

Our website is an independent, advertising-supported platform. We provide our content free of charge to our readers, and to keep it that way, we rely on revenue generated through advertisements and affiliate partnerships. This means that when you click on certain links on our site and make a purchase, we may earn a commission. Learn more.

How We Make Money

We sustain our operations through affiliate commissions and advertising. If you click on an affiliate link and make a purchase, we may receive a commission from the merchant at no additional cost to you. We also display advertisements on our website, which help generate revenue to support our work and keep our content free for readers. Our editorial team operates independently from our advertising and affiliate partnerships to ensure that our content remains unbiased and focused on providing you with the best information and recommendations based on thorough research and honest evaluations. To remain transparent, we’ve provided a list of our current affiliate partners here.

What is Data Mining?

Michael Anissimov
By
Updated May 16, 2024
Our promise to you
EasyTechJunkie is dedicated to creating trustworthy, high-quality content that always prioritizes transparency, integrity, and inclusivity above all else. Our ensure that our content creation and review process includes rigorous fact-checking, evidence-based, and continual updates to ensure accuracy and reliability.

Our Promise to you

Founded in 2002, our company has been a trusted resource for readers seeking informative and engaging content. Our dedication to quality remains unwavering—and will never change. We follow a strict editorial policy, ensuring that our content is authored by highly qualified professionals and edited by subject matter experts. This guarantees that everything we publish is objective, accurate, and trustworthy.

Over the years, we've refined our approach to cover a wide range of topics, providing readers with reliable and practical advice to enhance their knowledge and skills. That's why millions of readers turn to us each year. Join us in celebrating the joy of learning, guided by standards you can trust.

Editorial Standards

At EasyTechJunkie, we are committed to creating content that you can trust. Our editorial process is designed to ensure that every piece of content we publish is accurate, reliable, and informative.

Our team of experienced writers and editors follows a strict set of guidelines to ensure the highest quality content. We conduct thorough research, fact-check all information, and rely on credible sources to back up our claims. Our content is reviewed by subject matter experts to ensure accuracy and clarity.

We believe in transparency and maintain editorial independence from our advertisers. Our team does not receive direct compensation from advertisers, allowing us to create unbiased content that prioritizes your interests.

Data mining uses a relatively large amount of computing power operating on a large set of data to determine regularities and connections between data points. Algorithms that employ techniques from statistics, machine learning and pattern recognition are used to search large databases automatically. Data mining is also known as Knowledge-Discovery in Databases (KDD).

Like the term artificial intelligence, data mining is an umbrella term that can be applied to a number of varying activities. In the corporate world, data mining is used most frequently to determine the direction of trends and predict the future. It is employed to build models and decision support systems that give people information they can use. Data mining takes a front-line role in the battle against terrorism. It was supposedly used to determine the leader of the 9/11 attacks.

Data miners are statisticians who use techniques with names like near-neighbor models, k-means clustering, holdout method, k-fold cross validation, the leave-one-out method, and so on. Regression techniques are used to subtract irrelevant patterns, leaving only useful information. The term Bayesian is seen frequently in the field, referring to a class of inference techniques that predict the likelihood of future events by combining prior probabilities and probabilities based on conditional events. Spam filtering is arguably a form of data mining, which automatically brings relevant messages to the surface from a chaotic sea of phishing attempts and Viagra pitches.

Decision trees are used to filter mountains of data. In a decision tree, all data passes through an entrance node, where it faces a filter that separates the data into streams depending on its characteristics. For example, data about consumer behavior is likely to be filtered based on demographic factors. Data mining is not primarily about fancy graphs and visualization techniques, but it does employ them to show what it has found. It is known that we can absorb more statistical information visually than verbally and this format for presentation can be very persuasive and powerful if used in the right context.

As our civilization becomes increasingly data-saturated and sensors are distributed en masse into our local environments, we will inadvertently discover things that might be missed on the first pass over. Data mining will let us correct these mistakes and discover new insights based on past data, giving us more bang for our data storage buck.

EasyTechJunkie is dedicated to providing accurate and trustworthy information. We carefully select reputable sources and employ a rigorous fact-checking process to maintain the highest standards. To learn more about our commitment to accuracy, read our editorial process.
Michael Anissimov
By Michael Anissimov

Michael is a longtime EasyTechJunkie contributor who specializes in topics relating to paleontology, physics, biology, astronomy, chemistry, and futurism. In addition to being an avid blogger, Michael is particularly passionate about stem cell research, regenerative medicine, and life extension therapies. He has also worked for the Methuselah Foundation, the Singularity Institute for Artificial Intelligence, and the Lifeboat Foundation.

Discussion Comments

By anon163097 — On Mar 26, 2011

Data mining helped find the leader of the 9/11 attacks? They found George Bush? 9/11 was an inside job.

By recapitulate — On Jan 30, 2011

While data mining can refer to the act of simply trying to organize information, it more likely is a word for the other use mentioned here, when groups like social networks, companies, and other advertisers manage to acquire information about possible customers so that they can tailor ads to them and otherwise inundate consumers with more means of consumption. While perhaps more helpful than getting whatever ads they want to throw at people, it still is an invasion of privacy.

By massoud — On Oct 23, 2008

an exigent question :

what is the difference between 'K' and 'k' or 'B' and 'b' and 'disk' and 'disc' ?

Michael Anissimov

Michael Anissimov

Michael is a longtime EasyTechJunkie contributor who specializes in topics relating to paleontology, physics, biology...

Read more
EasyTechJunkie, in your inbox

Our latest articles, guides, and more, delivered daily.

EasyTechJunkie, in your inbox

Our latest articles, guides, and more, delivered daily.