EII: Information on Demand
Will enterprise information integration really displace traditional data warehousing, or just complement it?
If the term "enterprise information integration" isn't immediately clear, there's a reason: EII is a broad notion that raises more questions than it answers. How do you know if EII is right for your organization? What are the challenges of implementing EII? Above all, what does EII offer that isn't already covered by data warehousing and data extract, transform and load (ETL) software procedures? How is it different from customer data integration and other recent approaches to information integration?
All the acronyms are enough to make your head spin. In this article, we'll clarify EII and its role, particularly in business intelligence and data warehousing scenarios.
Break with Tradition
TAKE ACTION
• Review your data integration needs. Perhaps you want to add a stock price and other company information to your portal, and this information is widely distributed. Or your key supplier has started providing new information in data files not currently handled by your ETL process, and you are loath to open this Pandora's box. Rather than have three new applications under development incorporating spaghetti code routines, consider a virtual, integrated data store that all applications could use. The single, virtual view benefits multiple reporting and data analysis needs, rather than just one need.
• Understand the EII tool's limitations. James Markarian, chief technology officer at Informatica Corp., suggests that the scalability of EII solutions may be determined by factors such as query performance and caching. Ensure that your scalability requirements are addressed by the EII solution.
• Explore the full range of vendors. Avaki, Composite Software, Ipedo and MetaMatrix are prominent EII tool providers. However, major software vendors such as BEA, IBM, Oracle and SAP are filling out their portfolios with some form of EII technology, perhaps as CDI, RDM or MDM. Organizations with predominantly IBM technology, for example, should take a good look at DB2 Information Integrator, while Oracle shops should consider Customer Data Hub. Ascential Software and Informatica, the two heavyweights of traditional ETL integration, don't field EII tools, but this may change in 2005.
• Begin with a prototype, and stick with it. EII has come a long way from the early attempts at query federation, which has been an "emerging technology" in database research and development circles for some time. Still, as new technology, it will require some fortitude to implement. Vendors will need your faithful partnership to realize the potential. Identify the types of information you'd like to integrate and create a prototype that not only demonstrates integration but also allows you to do performance measurement and diagnostics.
Rob Cardwell, CTO of EII software vendor MetaMatrix, says EII is about making distributed data "accessible and manageable, breaking through traditional barriers of location, structure, semantics and context." A survey of the offerings from MetaMatrix and competing vendors makes it clear that a federated query system is fundamental to making distributed data accessible: it queries multiple, heterogeneous data sources and brings back a single data set.
EII differs most from conventional ETL-oriented data warehousing in that it accesses, rather than moves, information. Keep in mind that ETL really isn't one standard procedure but multiple processes that vary according to what an organization needs. However, ETL generally involves data movement to a central repository or other files and subsystems, such as data marts, that support BI reporting. EII uses virtualization to present clients with a view of one consolidated information resource, hiding the federated query system that's actually drawing from multiple data resources. EII "plays the data where it lays," as some put it.
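The contrast between moving data and accessing it can be sketched in a few lines of Python. This is an illustration only, not any vendor's implementation: the two sources (an in-memory SQLite table standing in for an order system, and a CSV extract standing in for a departmental file) and every table, column and function name are assumptions made up for the example. Nothing is copied into a central repository; each source is read only when the view is requested.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: an operational database (in-memory SQLite stand-in).
orders_db = sqlite3.connect(":memory:")
orders_db.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
orders_db.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 120.0), (1, 80.0), (2, 45.5)],
)

# Hypothetical source 2: a flat file maintained by another department.
CUSTOMER_CSV = "id,name\n1,Acme Corp\n2,Globex\n3,Initech\n"


def fetch_order_totals():
    """Query the database in place and return {customer_id: total}."""
    rows = orders_db.execute(
        "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
    )
    return dict(rows.fetchall())


def fetch_customers():
    """Read the flat file in place and return {customer_id: name}."""
    reader = csv.DictReader(io.StringIO(CUSTOMER_CSV))
    return {int(row["id"]): row["name"] for row in reader}


def customer_view():
    """Virtual view: one consolidated result set, assembled on demand."""
    totals = fetch_order_totals()
    customers = fetch_customers()
    return [
        {"customer_id": cid, "name": name, "order_total": totals.get(cid, 0.0)}
        for cid, name in sorted(customers.items())
    ]
```

A client calling `customer_view()` sees a single list of customer records and never learns that two different kinds of source sit behind it. A real EII engine would add query planning and pushdown, which this sketch leaves out.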
The number and complexity of data silos — disparate, disconnected resources beholden to a single department or user — continue to grow, outpacing IT's attempts to standardize ETL and data integration tools as well as efforts to update and maintain what still dominates most integration efforts: custom code. Regulatory compliance, real-time BI and new challenges involving convergence of structured and unstructured information are putting even more pressure on conventional approaches.
EII could be a solution to some of these woes. Along with less movement, EII involves less extensive data transformation, focusing on combining diverse definitions of data elements and presenting the result as a single information element. Strong global query optimization is critical to EII and will always be a challenge; however, optimization plays to the strengths of established database management vendors as well as newer vendors applying the latest algorithms. Automated intelligence in query optimization, as well as full support for universal data access standards (such as ODBC, JDBC and XML), can take the burden of knowing the intricacies of each data source off the application or user.
Performance tuning moves out of the programmer/administrator domain and into the EII tool. Most EII tools employ data caching or staging to improve query performance. While EII is most often used just to read data, the federated approach could work for bidirectional transactions, that is, updating and manipulating data across multiple sources. Emerging service-oriented architectures (SOAs) and enterprise service bus (ESB) technology will work with EII to let clients consume data and expose information as part of Web services.
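To make the caching idea concrete, here is a minimal sketch of a time-limited cache placed in front of a slow source. It assumes a simple time-to-live policy; the class name `CachedSource` and its parameters are hypothetical, and actual EII products may cache in quite different ways.

```python
import time


class CachedSource:
    """Wrap a slow source query and reuse its result for ttl_seconds."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # callable that queries the real source
        self._ttl = ttl_seconds      # how long a cached result stays valid
        self._clock = clock          # injectable clock, handy for testing
        self._value = None
        self._expires_at = None      # None means nothing is cached yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache miss or stale entry: go back to the underlying source.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value
```

The trade-off is visible in the single `ttl_seconds` parameter: a longer lifetime means fewer trips to the source but results that can be out of date by up to that many seconds.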
The Metadata Layer
A key aspect of EII — and one supported by most of the tools in the marketplace — is robust data modeling and metadata management. While the dream of a single data or information model within an enterprise remains elusive and in most cases impractical, EII helps establish an integration architecture that melds modern universal access standards with data about the sources, and about the information requirements, such as a single view of a product or customer. In other words, the focus is on how data is used rather than on generic relational data modeling for ETL.
Metadata is important for another goal: reusability. EII can help business analysts and developers maintain virtual views, including logic and interfaces required to create and maintain such perspectives on customers, products and other objects of interest. EII tools work with the metadata layer to ensure security of the metadata and interact with security at disparate data sources.
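One way to picture the metadata layer is as a reusable mapping from each source's native field names to a shared model. The sketch below assumes two hypothetical sources, `crm` and `billing`, with invented column names; it shows only the mapping idea, not how any particular tool stores or secures its metadata.

```python
# Hypothetical metadata: how each source names the canonical "customer" fields.
CUSTOMER_METADATA = {
    "crm": {"customer_id": "cust_no", "name": "full_name", "email": "email_addr"},
    "billing": {"customer_id": "acct_id", "name": "acct_name", "email": "contact"},
}


def to_canonical(source, record, metadata=CUSTOMER_METADATA):
    """Translate one source record into the shared customer model."""
    mapping = metadata[source]
    return {canonical: record[native] for canonical, native in mapping.items()}
```

Because the mapping lives in data rather than in application code, supporting a new source means adding one entry to the metadata, and every view built on the canonical model can reuse it.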
To sum up, EII is a data integration and virtualization technique aimed at providing a unified view of data — a single version of the truth. It does so by facilitating access to multiple or disparate data sources on demand in a secure, efficient manner.
What EII is Not
To understand EII, it's important to consider what it is not. I've discussed some of the key ways in which EII is different from ETL, a large-volume, batch-oriented approach to data movement and transformation. EII, on the other hand, is mainly about retrieving data on demand.
Does EII compete with data warehousing? Despite the buzz, vendors unanimously answer "no." EII supplements, rather than supplants, data warehousing. EII can help data warehousing by bringing in data from minor or nonstandard sources. It can also federate data from the warehouse itself, join it with data from other sources, and present it to the user or client application on demand. BI, the main consumer of data warehousing, is also an EII consumer.
How does EII differ from enterprise application integration? EAI and its emerging new generation, ESB, are "push" technologies. They use messaging and are geared for transactional application integration. EII is a "pull" technology: It supports SQL and other standard data access languages that send queries to the EII virtual data store. The EII engine then uses a federated approach to answer the query.
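The "pull" model can be illustrated with Python's built-in `sqlite3` module: the client sends ordinary SQL to what looks like one store, and the engine resolves it across separate databases. This is a toy stand-in under stated assumptions. Both sources here are SQLite databases attached to one connection, whereas a real EII engine federates heterogeneous systems, and the schema names `crm` and `billing`, the view `customer_balance` and the helper `query_virtual_store` are all invented for the example.

```python
import sqlite3

# One connection plays the role of the virtual data store.
conn = sqlite3.connect(":memory:")

# Two separate in-memory databases stand in for independent source systems.
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS billing")

conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(1, "Acme Corp"), (2, "Globex"), (3, "Initech")],
)
conn.executemany(
    "INSERT INTO billing.invoices VALUES (?, ?)",
    [(1, 150.0), (1, 25.0), (2, 40.0)],
)

# A TEMP view may span attached databases; clients query it like a table.
conn.execute(
    """
    CREATE TEMP VIEW customer_balance AS
    SELECT c.id, c.name, COALESCE(SUM(i.amount), 0) AS balance
    FROM crm.customers AS c
    LEFT JOIN billing.invoices AS i ON i.customer_id = c.id
    GROUP BY c.id, c.name
    """
)


def query_virtual_store(sql, params=()):
    """Client-facing entry point: send SQL, receive one result set."""
    return conn.execute(sql, params).fetchall()
```

A call such as `query_virtual_store("SELECT name, balance FROM customer_balance WHERE balance > ?", (100,))` pulls data only at the moment it is asked for; nothing is pushed to the client in advance, which is the distinction from messaging-based EAI drawn above.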
Customer data integration (CDI), exemplified by technology from newer vendors such as Siperian and Journee as well as from Oracle and other established providers, confuses matters a bit. CDI is also an on-demand (pull) solution and uses metadata and a model-driven approach to integration. Virtualization is also essential to CDI. Dig a little deeper ... and they still look awfully similar!
CDI and EII are essentially the same technology, packaged differently for their respective purposes. EII providers are pursuing demand for "middleware": that is, generic information integration. CDI focuses on customer-oriented applications, including marketing intelligence, call centers and other services, and other aspects of customer relationship management. There's no reason EII vendors couldn't refine their focus on integrating customer information or CDI vendors couldn't turn their attention to "product," "supplier" or other objects and develop a focused solution.
A final related field is reference, or "master," data management (RDM or MDM). IBM, Oracle, SAP and other information management vendors see this technology as a means to store, augment and consolidate structured and unstructured data from heterogeneous locations. RDM and MDM solutions ensure cross-system data consistency in much the same manner as EII — through a metadata layer, virtual views and federated querying. RDM and MDM are generally focused on product content management and cross-media catalog publishing and access via the Web.
RDM and MDM represent a specialized application of the broader EII approach. Some vendors, such as Razza, go further to make the approach easier by presenting meaningful interfaces to users, whereas EII generally operates behind the scenes as middleware.
EII is an emerging technology. No matter what shape it takes, the main driver is that coveted goal of a "single version of the truth." Whether the data is reference or transactional, EII's virtualization and metadata technologies let organizations create a single definition of data from disjointed and disparate information across the enterprise, including information held in ERP systems, spreadsheet files, data marts and warehouses, and Web services.
The single definition may be accessed and delivered through portals, dashboards, mobile devices and various applications. Mark Fulgham, executive director of worldwide strategic outsourcing at Hewlett-Packard Managed Services, says that EII helps his organization gain "nimble, operational, right-time insight into intraday business and service delivery functions." Fulgham is testing EII technology from Composite Software for both internal use and for a service for HP clients.
What if you're interested in implementing EII but already have one or more data warehouses, portals and dashboards in place and, understandably, little desire to disrupt this finely orchestrated data flow? The "Take Action" sidebar offers some relevant advice.
Data Integration, Reloaded
The days of laissez-faire data and information integration are behind us. Spaghetti code buried in multiple applications won't do, nor will expensive, inflexible ETL and data warehousing implementations. The addition of unstructured information (content in the form of text, e-mail, images and more) will challenge standard operating procedures. Far from replacing data warehousing, EII's premise of data on demand, from any source to any destination, will be a critical addition to ETL and data warehousing efforts. The potential is enormous, even if end users never know how it was achieved.
Business management's eyes are on BI and data warehousing. Demands for higher return on investment and reduced total cost of ownership have never been greater. Information integration is an important, strategic step toward greater competitive advantage. By efficiently sharing all forms of data, you can meet objectives such as improving customer and partner intelligence and creating self-service business processes.
IT hasn't yet delivered "The Matrix" vision, where all that users see of complex data access and sophisticated applications is the interface, the sophisticated appendage of an infrastructure supporting a cohesive and connected world. As it matures, however, EII might enable organizations to reach such an exalted stage.
Rajan Chandras is a principal consultant with the New York offices of CSC Consulting (www.csc.com). The opinions expressed here are his own.