Data mining is usually defined as searching, analyzing and sifting through large amounts of data to find relationships, patterns or significant statistical correlations. With the advent of computers, large databases and the internet, it is easier than ever to collect millions, billions and even trillions of pieces of data that can then be systematically analyzed to look for relationships and to seek solutions to difficult problems. Besides governments, many marketers use data mining to find strong consumer patterns and relationships. Large organizations and educational institutions also mine data to find significant correlations that can benefit society.
Data mining is amoral in that it only looks for strong statistical correlations or relationships, so it can be used for good or bad purposes. For instance, many government organizations depend on data mining to help them create solutions for societal problems. Marketers use data mining to pinpoint and focus their attention on the segments of the market they want to sell to. In some cases, black hat hackers use data mining to steal from and scam thousands of people.
How does data mining work? Well, the quick answer is that large amounts of data are collected. Most entities that perform data mining are large corporations and government agencies. They have been collecting data for decades and they have lots of data to sift through. If you are a fairly new business or an individual, you can purchase certain types of data to mine for your own purposes. In addition, data can be stolen from large repositories by hackers who break into a large database or simply steal laptops that are poorly protected.
If you are interested in a small case study on how data is collected, mined and turned into profit, you can look at your local supermarket. Your supermarket is usually an extremely lean and organized entity that relies on data mining to make sure that it is profitable. Usually your supermarket employs a POS (Point Of Sale) system that collects data on each item that is purchased. The POS system records the item's brand name, category and size, the time and date of the purchase, and the price at which the item was purchased. In addition, the supermarket usually has a customer rewards program, which also feeds into the POS system. This information can directly link the products purchased with an individual. All this data, for every purchase made over years and years, is stored by the supermarket in a database.
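To make this concrete, here is a minimal sketch in Python of what one stored purchase might look like and how a rewards number ties purchases to a person. The field names, the sample values and the helper group_by_customer are all made up for illustration; a real POS system will have its own schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class PurchaseRecord:
    """One purchased item, with the fields the article lists (names are hypothetical)."""
    brand: str
    category: str
    size: str
    purchased_at: datetime             # time and date of the purchase
    price_cents: int                   # price paid, kept in cents to avoid rounding issues
    rewards_id: Optional[str] = None   # None when no rewards card was used


def group_by_customer(records):
    """Link purchases to individuals by rewards number; anonymous purchases are skipped."""
    grouped = defaultdict(list)
    for record in records:
        if record.rewards_id is not None:
            grouped[record.rewards_id].append(record)
    return dict(grouped)


# Invented sample data: two items on one rewards card, one anonymous purchase.
records = [
    PurchaseRecord("Acme", "bread", "20 oz", datetime(2024, 1, 6, 17, 30), 249, "C-1001"),
    PurchaseRecord("Acme", "peanut butter", "16 oz", datetime(2024, 1, 6, 17, 30), 399, "C-1001"),
    PurchaseRecord("Brio", "milk", "1 gal", datetime(2024, 1, 7, 9, 5), 419),
]

by_customer = group_by_customer(records)
```

In this sketch the rewards number is the only thing that connects a basket to a person, which is why the third purchase, made without a card, does not show up under any customer.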
Now that you have a database with millions upon millions of fields and records, what are you going to do with it? Well, you data mine it. Knowledge is power, and with so much data you can uncover trends, statistical correlations, relationships and patterns that can help your business become more efficient, effective and streamlined.
The supermarket can now figure out which brands sell the most, what time of the day, week, month or year is the busiest, and which products consumers buy together with certain items. For instance, if a person buys white bread, what other items would they be inclined to buy? Typically we find it's peanut butter and jelly. There is so much good information that a supermarket can use just by mining the data it has already collected.
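The white bread question can be sketched as a simple co-occurrence count in Python. This is only one basic way to look for items bought together, not necessarily what any supermarket actually runs, and the baskets below and the function name bought_with are invented for illustration.

```python
from collections import Counter


def bought_with(baskets, item):
    """Count how often each other product appears in a basket that contains `item`."""
    counts = Counter()
    for basket in baskets:
        products = set(basket)  # ignore duplicates within a single basket
        if item in products:
            counts.update(products - {item})
    return counts


# Invented sample baskets, one list per checkout.
baskets = [
    ["white bread", "peanut butter", "jelly"],
    ["white bread", "peanut butter", "milk"],
    ["milk", "eggs"],
    ["white bread", "jelly", "peanut butter"],
]

companions = bought_with(baskets, "white bread")
top_two = companions.most_common(2)
print(top_two)  # [('peanut butter', 3), ('jelly', 2)]
```

With real data you would also want to compare these counts against how often each item sells overall, since a product that is in nearly every basket will look like a companion to everything.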