1. Database vs Datawarehouse
The primary difference between you application database and a data warehouse is that while the former is designed (and optimized) to record, the latter has to be designed (and optimized) to respond to analysis questions that are critical for your business. Application databases are OLTP (On-Line Transaction Processing) systems where every transaction has to be recorded, and super-fast at that. Consider the scenario where a bank ATM has disbursed cash to a customer but was unable to record this event in the bank records. If this started happening frequently, the bank wouldn't stay in business for too long. So the banking system is designed to make sure that every transaction gets recorded within the time you stand before the ATM machine. This system is write-optimized, and you shouldn't crib if your analysis query (read operation) takes a lot of time on such a system.
A Data Warehouse (DW) on the other end, is a database (yes, you are right, it's a database) that is designed for facilitating querying and analysis. Often designed as OLAP (On-Line Analytical Processing) systems, these databases contain read-only data that can be queried and analyzed far more efficiently as compared to your regular OLTP application databases. In this sense an OLAP system is designed to be read-optimized.
ref:
http://opensourceanalytics.com/2005/11/02/database-vs-data-warehouse/
ref:
http://opensourceanalytics.com/2005/11/02/database-vs-data-warehouse/
2. Database, Datawarehousing, Generic Data Flow
Database:
1. Used for Online Transactional Processing (OLTP). This records the data from the user for history.
2. The tables and joins are complex since they are normalized. This is done to reduce redundant data and to save storage space.
3. Entity – Relational modeling techniques are used for database design.
4. Optimized for write operation.
5. Performance is low for analysis queries.
Data Warehouse:
1. Used for Online Analytical Processing (OLAP). This reads the historical data for the Users for business decisions.
2. The Tables and joins are simple since they are de-normalized. This is done to reduce the response time for analytical queries.
3. Data – Modeling techniques are used for the Data Warehouse design.
4. Optimized for read operations.
5. High performance for analytical queries.
General Data Flow – (Ex: Online Insurance Registration)
1. Customer enters the details in the Online Registration form.
2. The details are saved into the Database when the customer presses the Submit button in the form.
3. Business Intelligence Team of the Insurance Company uses an ETL tool to pull the data from the Database tables to the Data Warehouse tables.
4. Business Management uses Business Reporting Tools to pull Data from Data Warehouse tables for generating business reports.
ref:
http://www.shvoong.com/internet-and-technologies/blog/1827096-database-vs-data-warehouse/
5. High performance for analytical queries.
General Data Flow – (Ex: Online Insurance Registration)
1. Customer enters the details in the Online Registration form.
2. The details are saved into the Database when the customer presses the Submit button in the form.
3. Business Intelligence Team of the Insurance Company uses an ETL tool to pull the data from the Database tables to the Data Warehouse tables.
4. Business Management uses Business Reporting Tools to pull Data from Data Warehouse tables for generating business reports.
ref:
http://www.shvoong.com/internet-and-technologies/blog/1827096-database-vs-data-warehouse/
3. Operational / informational data:
Operational data is the data you use to run your business and is realtime. This data is what is typically stored, retrieved, and updated by your Online Transactional Processing (OLTP) system. An OLTP system may be, for example, a reservations system, an accounting application, or an order entry application.
Informational data(Historical Data) is created from the wealth of operational data that exists in your business and some external data useful to analyze your business. Informational data is what makes up a data warehouse. Once data has been extracted from the operational systems into the new database it is then referred to as historical data.
ref:
Datawarehouse - http://www.pharmacyinformatics.org/datawarehouse.htm
4. What is Business Intelligence:
Business intelligence is a broad set of applications, technologies and knowledge for gathering and analyzing data for the purpose of helping users make better business decisions.
The main challenge of Business Intelligence is to gather and serve organized information regarding all relevant factors that drive the business and enable end-users to access that knowledge easily and efficiently and in effect maximize the success of an organization.
Business intelligence produces analysis and provides in depth knowledge about performance indicators such as company's customers, competitors, business counterparts, economic environment and internal operations to help making effective and good quality business decisions.
From a technical standpoint, the most important areas that Business Intelligence (BI) covers are:
• DW - Data warehousing - architecture, modeling, managing, processing
• ETL process and data integration
• Reporting, Information visualization and Dashboards
• OLAP - Online Analytical Processing and multidimensional analysis
• Data cleansing and data quality management
• Performance management
• Data mining, statistical analysis, forecasting
• MIS - Management Information Systems
Popular ETL Tools
• IBM WebSphere Information Integration (Ascential DataStage)
• Ab Initio
• Informatica
ref:
http://etl-tools.info/
5. Datawarehousing Reporting Tools:
Popular Tools
• Cognos
• Hyperion/Brio
• Business Objects (Crystal Reports)
• Argos
• Actuate