A data warehouse is a place where data is stored for archival, analysis, and security purposes. Usually a data warehouse is either a single computer or many computers (servers) tied together to create one giant computer system.
Data consists of raw data or formatted data. It can be on various types of topics including the organization’s sales, salaries, operational data, summaries of data including reports, copies of data, human resource data, inventory data, external data to provide simulations and analysis, etc.
Besides being a storehouse for a large amount of data, they must possess systems in place that make it easy to access the data and use it in day to day operations. A data warehouse is sometimes said to be a major role player in a decision support system (DSS). DSS is a technique that organizations use to come up with facts, trends, or relationships that can help them make effective decisions or create effective strategies to accomplish their organizational goals.
Data Warehousing Models
There are many different data warehouse models. Online Transaction Processing, which is a data warehouse model, is built for speed and easy use. Another type of data warehouse model is called Online Analytical processing, which is more difficult to use and adds an extra step of analysis within the data. Usually it requires more steps that slows the process down and requires much more data in order to analyze certain queries.
In addition to this model, one of the more common data warehouse models include a data warehouse that is subject oriented, time variant, not volatile, and integrated. Subject oriented means that data is linked together and is organized by relationships. Time variant means that any data that is changed in the data warehouse can be tracked. Usually all changes to data are stamped with a time-date and with a before and after value, so that the changes throughout a period of time can be shown. Non volatile means that the data is never deleted or erased. This is a great way to protect the most crucial data. Because this data is retained, users can continue to use it in a later analysis. Finally, the data is integrated, which means that a data warehouse uses data that is organizational wide instead of from just one department.
Besides the term data warehouse, a frequently used term is data mart. Data marts are smaller and less integrated than data housings. They might be just a database on human resources records or sales data on just one division.
Types of Data Warehouses
With improvements in technology, as well as innovations in using data warehousing techniques, data warehouses have changed from Offline Operational Databases to include an Online Integrated data warehouse.
Offline Operational Data Warehouses are data warehouses where data is usually copied and pasted from real time data networks into an offline system where it can be used. It is usually the simplest and less technical type of data warehouse.
Offline Data Warehouses are data warehouses that are updated frequently (daily, weekly, or monthly). Data is stored in an integrated structure where others can access it and perform reporting.
Real Time Data Warehouses are data warehouses where it is updated each moment with the influx of new data. For instance, a Real Time Data Warehouse might incorporate data from a Point of Sales system and is updated with each sale that is made.
Integrated Data Warehouses are data warehouses that can be used for other systems to access them for operational systems. Other data warehouses use some Integrated Data Warehouses, allowing them to access them to process reports, as well as look up current data.
Advantages & Disadvantages
The number one reason for implementing a data warehouse is so that employees or end users can access the data warehouse and use the data for reports, analysis, and decision making. Using the data in a warehouse can help locate trends, focus on relationships, and help users understand more about the environment that a business operates in.
Data warehouses also increase the data’s consistency and allow it to be checked repeatedly to determine how relevant it is. Because most data warehouses are integrated, users can pull data from many different areas of their business, for instance human resources, finance, IT, accounting, etc.
While there are plenty of reasons to have a data warehouse, it should be noted that there are a few negatives associated with this. This includes the fact that it is time consuming to create and to keep operating.
Users might also have a problem with current systems being incompatible with the data. It is also important to consider future equipment and software upgrades. These may also need to be compatible with the data.
Finally, security might be a huge concern, especially if the data is accessible over an open network such as the Internet. Users will not want their competitor to view, hack, or destroy their data.