As I’ve mentioned in a couple of previous posts, QlikView doesn’t have the built-in matching functions needed for customer data integration (CDI). This has left me looking for other ways to provide that service, preferably at a low cost. The problem is that the major CDI products like Harte-Hanks Trillium, DataMentors DataFuse and SAS DataFlux are fairly expensive.
One intriguing alternative is Infosolve Technologies. Looking at the Infosolve Web site, it’s clear they offer something relevant, since two flagship products are ‘OpenDQ’ and ‘OpenCDI’ and their tag line is ‘The Power of Zero Based Data Solutions’. But I couldn't figure out exactly what they were selling since they stress that there are ‘never any licenses, hardware requirements or term contracts’. So I broke down and asked them.
It turns out that Infosolve is a consulting firm that uses free open source technology, specifically the Pentaho platform for data integration and business intelligence. A Certified Development partner of Pentaho, Infosolve has developed its own data quality and CDI components on the platform and simply sells the consulting needed to deploy it. Interesting.
Infosolve Vice President Subbu Manchiraju and Director of Alliances Richard Romanik spent some time going over the details and gave me a brief demonstration of the platform. Basically, Pentaho lets users build graphical workflows that link components for data extracts, transformation, profiling, matching, enhancement, and reporting. It looked every bit as good as similar commercial products.
Two particular points were worth noting:
- the actual matching approach itself seems acceptable. Users build rules that specify which fields to compare, the methods used to measure similarity, and similarity scores required for each field. This is less sophisticated than the best commercial products, but field comparisons are probably adequate for most situations. Although setting up and tuning such rules can be time-consuming, Infosolve told me they can build a typical set of match routines in about half a day. More experienced or adventurous users could even do it for themselves; the user interface makes the mechanics very simple. A half-day of consulting might cost $1,000, which is not bad at all when you consider that the software itself is free. The price for a full implementation would be higher since it would involve additional consulting to set up data extracts, standardization, enhancement and other processes, but cost should still be very reasonable. You’d probably need as much consulting with other CDI systems where you'd pay for the software too.
- data verification and enhancement is done by calls to StrikeIron, which provides a slew of on-demand data services. StrikeIron is worth knowing about in its own right: it lets users access Web services including global address verification and corrections; consumer and business data lookups using D&B and Gale Group data; telephone verification, appends and reverse appends; geocoding and distance calculations; Mapquest mapping and directions; name/address parsing; sales tax lookups; local weather forecasts; securities prices; real-time fraud detection; and message delivery for text (SMS) and voice (IVR). Everything is priced on a per use basis. This opens up all sorts of interesting possibilities.
The Infosolve software can be installed on any platform that can run Java, which is just about everything. Users can also run it within the Sun Grid utility network, which has a pay-as-you-go business model of $1 per CPU hour.
I’m a bit concerned about speed with Infosolve: the company said it takes 8 to 12 hours to run a million record match on a typical PC. But that assumes you compare every record against every other record, which usually isn’t necessary. Of course, where smaller volumes are concerned, this is not an issue.
Bottom line: Infosolve and Pentaho may not meet the most extreme CDI requirements, but they could be a very attractive option when low cost and quick deployment are essential. I’ll certainly keep them in mind for my own clients.
Thursday, 29 November 2007
Low Cost CDI from Infosolve, Pentaho and StrikeIron
Posted on 18:56 by Unknown
Subscribe to:
Post Comments (Atom)
0 comments:
Post a Comment