

JUNE 2001

"Running into a burning building is organized chaos," says Kenn Devane, ceo of MineTech. "But how different is that than running a business?"

Kenn Devane MineTech
All hell is breaking loose. The ceiling is caving in on you. The floor's about to give way. You can't see a thing and you have to trust your instincts. Doubly nuts, no doubt about it. Charging into a burning building. For free.

One time Kenn-the-volunteer-fireman ran into a blaze in a plastics factory. It was about four o'clock in the morning. Dark. Wooden floors. "A guy was with me, thank God, who had been fighting fires for a few years," Kenn relays. "He said his knees felt hot. I was so busy with the hose, it never even occurred to me to think about something like that. Well, he took an axe and hit the floor. The axe went right through. The fire was in the basement. We backed out and the floor collapsed."

Oh, sure. Just another fire at the office. Feel the door, feel the floor, feel the wall. Get low. "Some things you have to figure out and decide on very quickly and other things you can put off for a while," Kenn supposes. "What are you going to use -- the inch-and-a-half hose? Deuce-and-a-half? Deck gun?"

When Kenn was 18 years old he just couldn't imagine anything more exciting than driving a fire truck. Today, just a few short years later, he can't imagine anything more exciting than driving a business he calls MineTech. His ardor for stamping out burning buildings may now be satisfied. All in all, he'd rather be fly-fishing. His passion for interrogating data, on the other hand, is fully stoked.

Interrogating data? It's the next big step beyond mining data, says Kenn.

"Data mining as a term has become like Xerox and Kleenex," he analogizes. "Data mining is a generic handle and everybody thinks they have a grip on it. But as soon as you scratch the surface, most marketers are not data mining; they're data shuffling. They're simply rearranging data and plugging it into pie charts, or into segments or clusters. They're not gaining true insights."

Kenn posits that data mining is limited in application because it is too mathematical and statistical in nature. Most marketers are not. "The overall goal of statistical data mining is to manually test and find variables that minimize the error factor to help you make better decisions. You drill down as far as you can go and most people think that's good enough. That's why you hear terms like 'fuzzy logic.' Gray is good -- or so some people like to think. But in today's competitive marketplace, good enough is not enough."



I'm into being shocked into action, to a point where there's no turning back.

Kenn Devane got into the data business after getting out of the television ad sales business. Following a ten-year career in TV that took him from Sioux City to Detroit to New York to Hartford, he landed back home on Long Island at a small database marketing company called Computerized Marketing Technologies (CMT). The company was fielding consumer surveys and using the results to build databases for companies like General Foods and Philip Morris. They pioneered the concept of household-encoded, variable-value coupons. That was fourteen years ago.

"That's where I got the bug for data," confesses Kenn, who subsequently carried his infection to Spectra, where he opened the East Coast office and ultimately became EVP of Sales. The Spectra technology enables marketers to build a consumer profile based on syndicated purchase data, geo-code it down to block group level, and then overlay it on a store's trading area. The result was an ability to determine a given store's propensity to sell a given product.

In late 1999, Kenn left Spectra to form MineTech. "My impetus to start MineTech was the possibility of analyzing and leveraging whatever internal data a company has -- that represents a proprietary point of difference for them to sustain a competitive advantage every day, in real time."



Kenn Devane, MineTech

Marketers today are drowning in data. The current systems, software and methods for analyzing this data can't possibly keep up with today's data streams. While the marketplace is moving at 100 miles per hour, traditional, statistical data mining processes permit the marketer to go only 30 miles per hour. That explains why, according to industry sources, only two percent of Web data gets analyzed and only 7 percent of offline data gets analyzed.

Another related issue is that marketers have basements full of information that is more relevant to them than syndicated data could ever be. There is a huge opportunity for brands to use their own information -- transaction, sales and manufacturing data, to create their own specific solutions.

In the consumer packaged goods industry, for example, nearly every marketer uses either Nielsen or IRI data. You can simply buy your competitor's data. So if I'm Minute Maid, I can get Tropicana's sales data and Tropicana can get Minute Maid's sales data. There's nothing unique about syndicated data, which means it's unlikely anyone is going to derive any significant competitive advantage from it. A company's own transaction and manufacturing data, on the other hand, is rich in possibility. There are plenty of unique, proprietary things about their data that a marketer can use to craft a competitive advantage.

To get the greatest value out of their data, however, marketers require a combination of experienced business managers who understand the product strategy, statisticians who understand all the available data mining, modeling and analysis techniques and software system experts who can wire them all together. Up until now, these different groups have not connected because, in effect, they speak different languages. We have assembled a skilled team that is fluent in all three areas.

The big difference between "shuffling" data and mining data is in the "interrogation" process. True data mining breaks data apart in search of hidden patterns that provide critical insight. Data shuffling, as in Online Analytical Processing (OLAP), is a user-driven query that manually confirms and reports on your data but does not break it apart and find new variable combinations. True data mining is an automatic, exploratory, data-driven process that is used to describe or predict results rather than report on the data.



I can't imagine telling my kids, ten years from now, that the data revolution passed me by.

Since most marketers know "what" happened, we focus on "why" it happened. To help us do that we use a technique called genetic programming. The idea is to go into the data and break apart its variable components or "DNA" in as many different ways as you can. Genetic data modeling automatically tests every variable combination and crunches, crosses, reproduces, mutates and copies them until it evolves into the best model solution for your specific objective.

Genetic programming finds new data combinations or "structure" that is not possible in traditional statistics because it tests infinitely more data variable calculations. Of course, there is always a statistical framework and rules that you maintain as you go through the data. But genetics, for the first time, enables the marketer to consider every variable and run "what if" drills in minutes versus weeks. After you've run literally thousands of models and looked at every possible data scenario, the strongest or "most fit" variables combine to produce the final "best" model solution. It's like a Darwinian process, a "survival of the fittest" scenario.
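The "cross, mutate, copy, keep the fittest" loop described above can be sketched in a few lines. This is a minimal illustration, not MineTech's actual method, which the article does not detail. It assumes a plain genetic algorithm over variable subsets (a simpler relative of genetic programming), a made-up fitness measure (correlation of the summed variables with the response, minus a small penalty per variable), and synthetic data. All function names and parameters are hypothetical.

```python
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def fitness(mask, rows, target, penalty=0.02):
    """Assumed fitness: how well the summed selected variables track the
    response, minus a small cost per variable so leaner models win."""
    if not any(mask):
        return 0.0
    combined = [sum(v for v, keep in zip(row, mask) if keep) for row in rows]
    return abs(pearson(combined, target)) - penalty * sum(mask)

def evolve(rows, target, generations=40, pop_size=30, mutation_rate=0.1, seed=0):
    """Evolve True/False masks that say which variables enter the model."""
    rng = random.Random(seed)
    n_vars = len(rows[0])
    population = [[rng.random() < 0.5 for _ in range(n_vars)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population,
                        key=lambda m: fitness(m, rows, target), reverse=True)
        survivors = ranked[:pop_size // 2]          # copy the fittest half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)         # pick two parents
            cut = rng.randrange(1, n_vars)          # cross them at one point
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]                # occasional mutation
            children.append(child)
        population = survivors + children
    return max(population, key=lambda m: fitness(m, rows, target))

def make_demo_data(n_rows=200, n_vars=6, seed=1):
    """Synthetic data: the response depends only on variables 0 and 3."""
    rng = random.Random(seed)
    rows = [[rng.gauss(0, 1) for _ in range(n_vars)] for _ in range(n_rows)]
    target = [row[0] + row[3] + rng.gauss(0, 0.1) for row in rows]
    return rows, target

rows, target = make_demo_data()
best = evolve(rows, target)
print("variables kept:", [i for i, keep in enumerate(best) if keep])
```

Because the fittest half of each generation is carried forward unchanged, the best model found never gets worse from one generation to the next. That is the "survival of the fittest" property in its simplest form.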

Data interrogation does not begin with any preconceptions. You simply let the data "do the talking." If you start out with preconceived ideas of what the best variables are, by definition you have eliminated some new variable combinations that could provide breakthrough insight.

With traditional data mining, the modeler manually tests the variables they believe are most important because they cannot possibly test every variable combination. This results in a model that is somewhat biased by their past modeling experience versus what the data is telling them. For example, a large magazine publisher recently used a response model largely driven by 55+ households to target subscription offers for a magazine aimed at women 18-24.

When the marketing director of the women's magazine objected to the 55+ age variable in the model, the statistician cited his experience, coefficients, correlations and analytical "mumbo jumbo" to argue that this was the best model for the promotion. The marketing manager was "statistically intimidated" into accepting the model for her promotion. Our modeling process would enable that marketing manager and statistician to quickly review which age break worked better in the model and automatically eliminate the weaker one without having to rebuild the model.

Data interrogation is a two-step process.
First, you "mine" the data to find the best individual variables and then combine them to form the best model. The model output is a code that then is used to score or rank your database of customers, prospects, stores, and so forth. In a recent project, a client wanted to predict which alumni (non-donors) were more likely to contribute a large gift to their capital campaign. By modeling the alumni donors who had given more than $500,000, we were able to identify their unique variables, or characteristics, and then score the entire alumni population by those same variables.

The more variables the non-donors had in common with the current donors, the higher their model score, and the more likely they were to make a donation. The client could now focus development resources on the top 10, 20 or 30 percent of their 60,000 alumni population, versus all of them, and generate much higher results. Our genetic programming process enabled us to complete this model in two days and provide the client with new insight on their alumni characteristics.
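The scoring step in the alumni example can be illustrated with a small sketch. It assumes the simplest possible scheme: each record is a set of characteristics, the score is the count of characteristics shared with the donor model, and the client keeps the top slice of the ranked list. The trait names, record layout and function names are invented for illustration; the article does not describe the actual model code.

```python
def score(record_traits, model_traits):
    """Score = number of characteristics a record shares with the donor model."""
    return len(record_traits & model_traits)

def top_percent(prospects, model_traits, percent):
    """Rank prospects by score (highest first) and keep the top `percent`."""
    ranked = sorted(prospects,
                    key=lambda p: score(p["traits"], model_traits),
                    reverse=True)
    keep = max(1, len(ranked) * percent // 100)  # integer math, at least one
    return ranked[:keep]

# Hypothetical characteristics found among large-gift donors.
donor_model = {"attended_reunion", "board_member", "graduate_degree"}

# Hypothetical non-donor alumni records.
alumni = [
    {"name": "A", "traits": {"attended_reunion", "board_member", "graduate_degree"}},
    {"name": "B", "traits": {"attended_reunion"}},
    {"name": "C", "traits": {"board_member", "graduate_degree", "varsity"}},
    {"name": "D", "traits": set()},
    {"name": "E", "traits": {"varsity"}},
]

shortlist = top_percent(alumni, donor_model, 40)
print([p["name"] for p in shortlist])
```

A production model would weight variables rather than count them equally, but the ranking-and-cutoff logic is the same: score everyone, sort, and spend development resources on the top of the list.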



The opportunity is to interrogate your own, existing, internal data to solve your specific business and marketing challenges.

The rapid development of computing and storage technology makes this an incredibly exciting time to be involved in the data business. It is estimated that the amount of new information doubles every year. Ninety percent of all information produced is now in a digital format and 90 percent of that data is text based or unstructured. Courtesy of the internet, researchers predict that the average person will soon be able to access virtually all recorded information. And, therein lies the problem! Finding what you need, when you need it so you can analyze it.

In response to that need, we have been focusing on technology that automates, simplifies and supports our modeling activities. Since 80 percent of the data mining and modeling process is locating, appending and prepping the data, we are always looking for ways to make this easier. We recently added an on-line survey capability, not to compete with the myriad of survey products already on the market, but rather to enable our clients to quickly develop and launch surveys to fill in missing data, ask specific questions of a specific target audience, or test an idea before committing major resources to it. Where and when necessary, all of the data acquired from the client surveys is incorporated into the future models we develop for them.

We have also introduced new search engine software that eliminates the need to query the endless "key word" driven batch searches delivered by popular search engines like Google and Yahoo. This, too, is an attempt to help our clients make better use of their own -- and third-party -- data by making it easier and faster for them to find whatever data they need, wherever it is.

We use a "context" based approach to search, versus keyword. We apply a 50,000 word library of linguistic elements (terminology; phraseology; grammatical forms, and so forth) to filter and personalize the search process based on the exact information you need. A snapshot of each document found in your search is displayed via an index listing the article's contents by subject. You simply select the specific subject you are interested in and the information relating to the subject chosen is highlighted within the document for your immediate review.

This technology can be applied to intranet documents or archives, web searches and even your e-mail. What was once hailed as the ultimate communication vehicle has slowed to a painful crawl courtesy of spam, viruses and firewalls. Our search technology enables you to set up specific "inboxes" by categories that are important to you. As each e-mail comes in, it is automatically read and sorted into one of your predefined inboxes, which you can then read based on your priorities and timetable. Automatic responses to specific e-mails can be pre-set and spam can be filtered before it overloads your e-mail system. Here, too, portions of this information can be used to enhance the productivity of your data mining and modeling efforts.
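The inbox-sorting idea can be shown with a deliberately simplified sketch. The context-based linguistic library described above is not specified in enough detail to reproduce, so this stand-in uses plain term overlap instead: each predefined inbox has a small vocabulary, and a message goes to the inbox whose vocabulary it matches most. The inbox names, term lists and spam threshold are all assumptions made for illustration.

```python
import re

# Hypothetical inbox categories and the terms that suggest each one.
INBOXES = {
    "clients": {"invoice", "contract", "proposal"},
    "team": {"meeting", "schedule", "standup"},
}

# Hypothetical spam vocabulary; two or more hits marks a message as spam.
SPAM_TERMS = {"lottery", "winner", "free"}

def route(message, inboxes=INBOXES, spam_terms=SPAM_TERMS):
    """Return the name of the inbox a message should be sorted into."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    if len(words & spam_terms) >= 2:
        return "spam"                      # filtered before it reaches an inbox
    best, hits = "unsorted", 0
    for name, terms in inboxes.items():
        overlap = len(words & terms)
        if overlap > hits:                 # keep the inbox with the most matches
            best, hits = name, overlap
    return best

print(route("Please review the attached contract and invoice"))
print(route("You are a lottery winner, claim your free prize"))
print(route("Lunch?"))
```

A context-based system would look at phrasing and grammar rather than isolated words, but the routing step at the end is the same: every incoming message is read, scored against each category, and filed without the reader's involvement.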

One of the big questions is how to begin interrogating your data. It's incorrectly assumed that you have to make large hardware and software investments. In fact, it is generally more effective to work with the data you already have, before you invest in additional resources. Then, you can add the exact information you need from a multitude of sources to mine, model and find your answers.

The first and sometimes hardest step to successful data mining is to identify your business problem. The second step is finding and processing your data for mining. However, technology is making it easier, faster and less expensive than ever to find the insight that will provide you with a sustainable competitive advantage in your specific marketplace.

At the end of the day, data is data. And the opportunity is to interrogate your own, existing, internal data to solve your specific business and marketing challenges.


©2002 reveries.com