Friday, March 20, 2009
The Conclusion
Success of a data warehouse implementation can be measured when the consulting firm and the client have executed the project according to the agreed requirements and project scope. The implementation of the technology along with “Subject Matter Experts” (SME) who can provide and support the organisation with decision support tools are not the only elements for a successful data warehouse implementation.
As discussed in the earlier articles, data for decision support is backed up by factual evidence from transactional systems. The use of a Data warehouse is used to address problems relating to limited decision support tools (operational reports) in a transactional system and will prevent the 're-inventing the wheel' syndrome for each new decision support tool (Meredith, O'Donnell et al. 2008).
Good project management by itself will not determine the success of a data warehouse implementation. There are other pitfalls that the project manager should be aware of and therefore need to take steps to avoid them. As discussed in the earlier articles, the main issues in a Data Warehouse implementation are not merely technology or system architecture, the majority of the issues are related to:
• Stakeholder involvement and sponsorship
• Sufficient Funding
• Cultural awareness
• Agreed understanding of project scope and Scope management
• Organisational Politics
• Stable workforce
• Good project management
The Extraction, Transformation and Load (ETL) process is perhaps the most critical component of a data warehouse, however getting this right does not conclude that the implementation is a success. ETL is a continuous process as the source database will continue to change and when this occurs, the business rules or transformation logic will also need to be changed and thus, code changes may be required. Data Warehouse teams cannot “set and forget” ETL routines after a successful iteration of the ETL routine. This routine needs to be fine-tuned or refined regularly to protect the integrity of the data warehouse.
Even though reports are being produced by the BI application, care is required to maintain the integrity and validity of the data in the data warehouse due to changes that could occur in the transactional systems. Transactional and data warehouse teams need to standardise a process of communicating changes to the source system (Westerman 2001).
In a data warehouse environment, it is unlikely that stakeholders are able to specify all their reporting and analysis needs upfront. Therefore, getting out some sample data from the data warehouse early on in the project will give the users a taste of the sort of information they can get out of the data warehouse. This iterative manner of “show and tell” will generate far more accurate understanding of the requirements thereby ensuring that the data warehouse implementation is tracking accordingly.
Some projects will fail regardless and others will succeed but the key is to understand the issues of the organisation and their stakeholders so as to better the success outcomes of the project. Failed projects will make news whereas successful projects tend not to be heard. An example is the NASA’s Mars Climate Orbiter loss (Sauser, Reilly et al. 2009) millions of dollars were spent where the return of investment was negative. They believed that the project failed due to technical issues but the real cause of the failure was due to bad management.
REFERENCES
Meredith, R., P. A. a. O'Donnell, et al. (2008). Databases and Data Warehouses for Decision Support. Handbook on Decision Support Systems. F. V. a. Burstein and C. W. Holsapple. Berlin, Germany, Springer-Verlag. 1: 207-230.
Sauser, B. J., R. R. Reilly, &, et al. (2009). "Why projects fail? How contingency theory can provide new insights – A comparative analysis of NASA’s Mars Climate Orbiter loss." International Journal of Project Management Available online 14 February 2009: 15.
Westerman, P. (2001). Data Warehousing: using the Wal-Mart model, Academic Press.
Critical Factors for a Successful Data Warehouse Implementation
Ten Critical Factors for Successful Data Warehouse Implementation(Frolick and Lindsey), (Adelman and Moss 2000):
1. Stakeholders Involvement & Strong Sponsorship
The Data Warehouse project needs to align itself with the goals of the stakeholders and the organisation. Stakeholders needs to know that their decision in embarking on this project will pay dividends (Yeo and Qiu 2003). The project manager will need to strike a rapport with stakeholders and the project team. This will foster open and honest communication between the team members.
Implementing the Data Warehouse systems will require delivery of front-end and back-end systems. It is important to dialogue with the stakeholders in how they perceive the delivery of their decision support is going to be. This will play an important role in the “journey” to a successful implementation.
2. Sufficient Funding
A project usually starts off with an allocated agreed budget. Project cost run off may still occur but this can be minimised due to the budget agreement which forms the basis of what is included in the project and what is not. Ambiguity in any aspect of the project can and will most probably increase the cost of the project.
Anything that potentially impacts the original project scope in terms of timeline or cost, such as a change in business requirements or expectations is to be treated as a change in scope. The project manager will need to ensure that if the cost budget were to be extended that new sign offs and agreement must be in place and ensure that the funds are there to complete the new work.
The project manager needs to monitor not only the client’s budget but also that of his/her own team. This involves ensuring that the project is running on schedule or that the number of staff required on the project team meets the budget constraints at all times. Any variation to schedule or resourcing may cause a loss in revenue.
3. Cultural Awareness
A paper written by Zwikael, Shimizu et al. (2005) discusses the difference in culture between the project managers from Israeli and Japanese background. It describes the different project management skills applied by project manager from the different countries.
Project Managers | Focused on |
Israel | · Scope · Time |
Japan | · Communications · Cost |
The Japanese organisations use clear and measurable success measures for each project, while their Israeli counterparts were often quite foggy (Zwikael, Shimizu et al. 2005). In other words, cultural behaviours can play a part in a project.
Projects developing a common set of terminologies at the start will minimise confusion and foster better communication.
4. Agreed Understanding of Project Scope & Scope Management
The project should have a detailed scoping document which has been agreed to by both the consulting firm and the organisation requesting the work to be done. This document forms an agreed understanding between the consulting firm and the client of what the goals and targets of the project is. Any deviation to this project scope is considered a change in scope.
The implementation of the Data Warehouse needs to follow the requirement specification(s). After the scope has been agreed, there will inevitably be changes to the business requirements or legislative compliance. This change in scope needs to be managed otherwise a scope creep situation happens.
5. Organisational Politics
Political issues will always rise up in any organisation and in all projects. It is the degree of politics that may adversely influence the outcome of the project. It is therefore important to understand the mechanics of the organisation and culture of the organisation early on. Engaging with stakeholders will highlight personal and corporate agendas which may or may not affect the project. Any negative issues needs to be highlighted and addressed quickly so as not to allow the project to fail.
6. Stable Workforce
If turnover of organisation personnel is high, knowledge gained about the data warehouse leaves the organisation and thus the project, creating instability within the project itself.
Project Managers will need to understand that large scale projects are not dictated by technologies but rather by the expertise of the team and their ability to manage the project.
7. Understand Data Problems and Complexities in the Extract, Transform, Load (ETL) phase
In many organisations, there are systems set up to assist with the daily routine of data capture and processing. One such system could be an Enterprise Resource Planning (ERP) system which is a transactional based system. This system performs very well in capturing data and storing them in a Relational Database Management System (RDBMS). This data is time-variant and volatile. The concept of time-variant and volatility will be discussed later on. A Data Warehouse on the other hand extracts the data from the transactional RDBMS for analysis and decision support. Organisations trying to improve their “bottom line” require data warehouse tools that can assist them make these decisions in a timely manner.
Data in the Data Warehouse needs to be consistent. It also needs to be "subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management's decisions” (Inmon and Hackathorn 1994).
· Subject-Oriented - Data Warehouses are designed around the organisation structure and the applications and functions around the organisation functions (Inmon 2002).
· Integration and Transformation - Data in the Data Warehouse needs to be integrated (Inmon 2002). For example, the Australian state of Victoria should be standardised to “VIC” in the Data Warehouse to enforce consistency regardless of how this is stored in the source data. During the ETL process, the data will be transformed to enforce consistency.
· Time Variance - Inmon (2002) calls the time variance the “time horizon”. The time horizon in a transactional database is stored in days or months whereas the Data Warehouse stores a greater detail of historical data for the given time (Inmon 2002).
· Non-Volatility - The data in the Data Warehouse is static / non-volatile (Inmon 2002) as opposed to that of the transactional-based system where the data can be updated continuously.
ETL phase
Poor or inaccurate data quality could lead to bad decision making. Data must be cleansed to ensure the data for decision support is valid.
Data for most systems is keyed into the front-end systems manually and therefore, if the front-end application does not have proper data validation in place, then bad or dirty data will be allowed into the system. The ETL phase is required so that during the mapping of data from the source to the Data Warehouse, any data that does not meet an agreed set of transformation rules can be cleansed. Issues found during an ETL process include:
· data inconsistency – for example state names can come in as VIC or Vic or Victoria;
· data redundancy in the transactional systems – for example, inactive inventory accounts for the past 12 months that have not been used or transacted or information that is not relevant for decision support;
· “dirty data” in the transactional system(s) – for example “1E+6” in the data field should be transformed to 1,000,000
Care needs to be taken when data is being cleaned as data may be transposed incorrectly, deleted or placed in the wrong field.
ETL is a continuous process as the source database will continue to change and when this occurs, the business rules or transformation logic will also need to be changed and thus, code changes may be required. Data Warehouse teams cannot “set and forget” ETL routines after a successful iteration of the ETL routine. This routine needs to be fine-tuned or refined regularly to protect the integrity of the data warehouse.
Transactional and data warehouse teams need to standardise a process of communicating changes to the source system (Westerman 2001). If the transactional system requires data from the data warehouse that was extracted from another source system, then any changes in the Data Warehouse will need to also flow back to the transactional team.
8. Appropriate Choice of Technology
Another issue that might complicate matters in a Data Warehouse project is the incompatibility between the skill-set of the team and the applications used. Consulting houses for Data Warehouse and Business Intelligence will generally be more aligned or familiar with a particular vendor brand, even though they may also use other vendors from time to time. Thus in a sales presentation, their vendor alliance may influence the client to go with a particular suite of “proven technology”. This may not always be in the best interest of the client.
Clients have their own IT infrastructure and the introduction of a new product and architecture might not complement the existing technical environment. Clients might not have the equipment to host the proposed application or might not have the right team size to maintain the data warehouse applications during and after implementation. This will form a risk factor in determining the ongoing viability of the application as new staff will have to be trained. Also, cultural issues (Zwikael, Shimizu et al. 2005) may arise between the existing transactional processing system team and the new Data Warehouse team.
9. Minimisation of the Unexpected or Unpredictable Elements of a Data Warehouse
Robert Hillard and Peter Blecher co-wrote a paper on “Why is Data Warehousing so hard?” for the Nolan Norton Institute (which was the research division of KPMG Consulting now known as BearingPoint Inc.), (Hillard and Blecher 1999). Hillard and Blecher argued that implementing a Data Warehouse is like predicting the weather. They outlined techniques to minimise the unpredictable elements in a Data Warehouse. The techniques were:
· Iterative development –an approach where the functionality of the Data Warehouse is delivered in stages or progressively;
· Setting the user’s expectations – this is basically about good stakeholder management whereby the expectations of the stakeholders should be matched to what the Data Warehouse can deliver;
· Understand user behaviour – Anticipating how much tolerance a user has towards the performance of the Data Warehouse;
· Predict the unpredictable – the data warehouse manager needs to include allowance for unpredicted outages in service level agreements by setting benchmarks based on similar systems in the past.
10. Good Project Management
It might sound obvious but regardless of whichever project methodology is used by the data warehouse project, sound and solid project management skills are required so that the data warehouse is delivered according to plan.
The project manager will need to watch out for any schedule slippage or budget blow-out symptoms arising from the following:
· Long time delays in delivery of the front end;
· Not understanding the front end requirements by the stakeholder;
· Undefined or poorly gathered requirements;
· Inadequate testing;
· Technology constraints.
Testing of the Data Warehouse will need to be carried out to ensure that the performances of the system along with its functionality are as per the specification document in the discussions with the stakeholders (Weisensee, Matthews et al.). Testing should involve end-users because they will need to be comfortable with their system.
In terms of the most suitable data warehouse project methodology, most consulting firms and some vendor companies have their own in-house project methodology. Here are just some of them:
· Teradata - http://www.teradata.com/t/article.aspx?id=1466
· Chicago Intelligence Business Group - http://www.chicagobigroup.com/trucourse_business_intelligence_methodology.htm
· Business Intelligence Consulting Group - http://www.oracle.com/appserver/business-intelligence/docs/bicg-buy-versus-build-fy08.pdf / http://www.biconsultinggroup.com/knowledgebase.asp?CategoryID=197
· WhiteCloud Analytics- http://www.whitecloudanalytics.com/consulting.html
· SSW Consulting Services - http://www.ssw.com.au/ssw/Standards/Rules/RulestoBetterProjectManagement.aspx
For large scale data warehouse projects, it is important that the engaged project manager has the experience of working in a data warehouse environment because there are unique challenges faced in data warehouse implementations.
Project Managers will need to monitor closely all aspect of the project from staffing, budgeting and planning of the project. They should regularly conduct checkpoint review meetings to ensure that the requirements for decision support data analysis through ad hoc reporting or dashboard are on track.
All aspects of the project plan such as the development of requirements, creation of deliverable artefacts, prototyping of the application and testing require monitoring for potential slippages in schedule.
CONCLUSION
Data warehouse implementation is by nature, a complex undertaking and the degree of complexity increases depending on the size of the organisation, its politics, the stakeholder degree of buy-in and the number of data sources.
It is interesting to note that out of ten critical factors influencing the success of a data warehouse implementation, at least 60% of the factors are not related to technology or solution architecture. These are:
· Stakeholder involvement and sponsorship
· Sufficient Funding
· Cultural awareness
· Agreed understanding of project scope and Scope management
· Organisational Politics
· Stable workforce
· Good project management
The remaining factors are equally as important:
· Understand data problems and complexities in the ETL phase
· Appropriate choice of technology
· Minimisation of the unexpected or unpredictable elements of a data warehouse
REFERENCES
Adelman, S. and L. Moss (2000). "Data Warehouse Failures." The Data Administration Newsletter. from http://www.tdan.com/view-articles/4876.
Elmasri, R. a. and S. B. Navathe (2000). Fundamentals of database systems. Massachusetts, Addison-Wesley.
Frolick, M. N. and K. Lindsey. "Critical Factors for Data Warehouse Failure." Business Intelligence Journal. from http://www.tdwi.org/research/display.aspx?ID=6592.
Hillard, R. a. and P. Blecher (1999). Why is data warehousing so hard? NNI:Opinion, KPMG. 4: 1-4.
Inmon, W. a. and R. Hackathorn (1994). Using the Data Warehouse. New York, John Wiley and Sons.
Inmon, W. H. (2002). Building the data warehouse. New York, J. Wiley.
Kimball, R., M. Ross, et al. (2008). The data warehouse lifecycle toolkit. Indianapolis, IN, Wiley Pub.
Weisensee, D., E. a. Matthews, et al. Implementing Data Warehousing and Business Intelligence at McMaster University Using the SAS® Intelligence Value Chain Hamilton, ON, McMaster University.
Westerman, P. (2001). Data Warehousing: using the Wal-Mart model, Academic Press.
Yeo, K. T. a. and F. Qiu (2003). "The value of management flexibility—a real option approach to investment evaluation." International Journal of Project Management 21: 243–250.
Zwikael, O., K. Shimizu, and, et al. (2005). "Cultural differences in project management capabilities: A field study." International Journal of Project Management 23: 454–462.