Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel<sup>®</sup> with XLMiner<sup>®</sup>,9780470084854

Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel^® with XLMiner^®

by Galit Shmueli (Univ. of Maryland, College Park ); Nitin R. Patel (Massachusetts Institute of Technology ); Peter C. Bruce (statistics.com )

ISBN13: 9780470084854

ISBN10: 0470084855

Format: Hardcover

Pub. Date: 2006-12-01

Publisher(s): Wiley-Interscience

Other versions by this Author

This Item Qualifies for Free Shipping!*

*Excludes marketplace orders.

List Price: ~~$132.83~~

Rent Textbook

Select for Price

Add to Cart

There was a problem. Please try again later.

New Textbook

We're Sorry
Sold Out

Used Textbook

We're Sorry
Sold Out

eTextbook

We're Sorry
Not Available

Buy from our Marketplace starting at $11.94

Summary

In today's world, businesses are becoming more capable of accessing their ideal consumers, and an understanding of data mining contributes to this success. Data Mining for Business Intelligence, which was developed from a course taught at the Massachusetts Institute of Technology's Sloan School of Management, and the University of Maryland's Smith School of Business, uses real data and actual cases to illustrate the applicability of data mining intelligence to the development of successful business models. Featuring XLMiner, the Microsoft Office Excel add-in, this book allows readers to follow along and implement algorithms at their own speed, with a minimal learning curve. In addition, students and practitioners of data mining techniques are presented with hands-on, business-oriented applications. An abundant amount of exercises and examples are provided to motivate learning and understanding. Data Mining for Business Intelligence: * Provides both a theoretical and practical understanding of the key methods of classification, prediction, reduction, exploration, and affinity analysis * Features a business decision-making context for these key methods * Illustrates the application and interpretation of these methods using real business cases and data This book helps readers understand the beneficial relationship that can be established between data mining and smart business practices, and is an excellent learning tool for creating valuable strategies and making wiser business decisions.

Author Biography

GALIT SHMUELI, PHD, is Assistant Professor of Statistics in the Decision and Information Technologies Department of the Robert H. Smith School of Business at the University of Maryland.

NITIN R. PATEL, PHD, is Chairman, Founder, and Chief Technology Officer of Cambridge-based Cytel Incorporated and a Visiting Professor in the Engineering Systems Division at the Massachusetts Institute of Technology.

PETER C. BRUCE is President and owner of statistics.com, the leading provider of professional development courses in statistics.

Foreword

xiii

Preface

Acknowledgments

xvii

Introduction

(8)

What Is Data Mining?

(1)

Where Is Data Mining Used?

(1)

The Origins of Data Mining

(1)

The Rapid Growth of Data Mining

(1)

Why Are There So Many Different Methods?

(1)

Terminology and Notation

(2)

Road Maps to This Book

(3)

Overview of the Data Mining Process

(26)

Introduction

(1)

Core Ideas in Data Mining

(2)

Supervised and Unsupervised Learning

(1)

The Steps in Data Mining

(2)

Preliminary Steps

(8)

Building a Model: Example with Linear Regression

(6)

Using Excel for Data Mining

(8)

Problems

(4)

Data Exploration and Dimension Reduction

(18)

Introduction

(1)

Practical Considerations

(2)

Example 1: House Prices in Boston

(1)

Data Summaries

(1)

Data Visualization

(2)

Correlation Analysis

(1)

Reducing the Number of Categories in Categorical Variables

(1)

Principal Components Analysis

(12)

Example 2: Breakfast Cereals

(3)

Principal Components

(1)

Normalizing the Data

(3)

Using Principal Components for Classification and Prediction

(2)

Problems

(2)

Evaluating Classification and Predictive Performance

(22)

Introduction

(1)

Judging Classification Performance

(19)

Accuracy Measures

(3)

Cutoff for Classification

(4)

Performance in Unequal Importance of Classes

(1)

Asymmetric Misclassification Costs

(5)

Oversampling and Asymmetric Costs

(6)

Classification Using a Triage Strategy

(1)

Evaluating Predictive Performance

(3)

Problems

(1)

Multiple Linear Regression

(16)

Introduction

(1)

Explanatory vs. Predictive Modeling

(1)

Estimating the Regression Equation and Prediction

(5)

Example: Predicting the Price of Used Toyota Corolla Automobiles

(4)

Variable Selection in Linear Regression

(10)

Reducing the Number of Predictors

(1)

How to Reduce the Number of Predictors

(4)

Problems

(5)

Three Simple Classification Methods

(20)

Introduction

(1)

Example 1: Predicting Fraudulent Financial Reporting

(1)

Example 2: Predicting Delayed Flights

(1)

The Naive Rule

(1)

Naive Bayes

(10)

Conditional Probabilities and Pivot Tables

(1)

A Practical Difficulty

(1)

A Solution: Naive Bayes

(5)

Advantages and Shortcomings of the naive Bayes Classifier

100

(3)

k-Nearest Neighbors

103

(8)

Example 3: Riding Mowers

104

(1)

Choosing k

105

(1)

k-NN for a Quantitative Response

106

(1)

Advantages and Shortcomings of k-NN Algorithms

106

(2)

Problems

108

(3)

Classification and Regression Trees

111

(26)

Introduction

111

(2)

Classification Trees

113

(1)

Recursive Partitioning

113

(1)

Example 1: Riding Mowers

113

(7)

Measures of Impurity

115

(5)

Evaluating the Performance of a Classification Tree

120

(1)

Example 2: Acceptance of Personal Loan

120

(1)

Avoiding Overfitting

121

(9)

Stopping Tree Growth: CHAID

121

(4)

Pruning the Tree

125

(5)

Classification Rules from Trees

130

(1)

Regression Trees

130

(2)

Prediction

130

(1)

Measuring Impurity

131

(1)

Evaluating Performance

132

(1)

Advantages, Weaknesses, and Extensions

132

(5)

Problems

134

(3)

Logistic Regression

137

(30)

Introduction

137

(1)

The Logistic Regression Model

138

(8)

Example: Acceptance of Personal Loan

139

(2)

Model with a Single Predictor

141

(2)

Estimating the Logistic Model from Data: Computing Parameter Estimates

143

(1)

Interpreting Results in Terms of Odds

144

(2)

Why Linear Regression Is Inappropriate for a Categorical Response

146

(2)

Evaluating Classification Performance

148

(2)

Variable Selection

148

(2)

Evaluating Goodness of Fit

150

(3)

Example of Complete Analysis: Predicting Delayed Flights

153

(7)

Data Preprocessing

154

(1)

Model Fitting and Estimation

155

(1)

Model Interpretation

155

(1)

Model Performance

155

(2)

Goodness of fit

157

(1)

Variable Selection

158

(2)

Logistic Regression for More Than Two Classes

160

(7)

Ordinal Classes

160

(1)

Nominal Classes

161

(2)

Problems

163

(4)

Neural Nets

167

(20)

Introduction

167

(1)

Concept and Structure of a Neural Network

168

(1)

Fitting a Network to Data

168

(13)

Example 1: Tiny Dataset

169

(1)

Computing Output of Nodes

170

(2)

Preprocessing the Data

172

(1)

Training the Model

172

(4)

Example 2: Classifying Accident Severity

176

(1)

Avoiding overfitting

177

(4)

Using the Output for Prediction and Classification

181

(1)

Required User Input

181

(1)

Exploring the Relationship Between Predictors and Response

182

(1)

Advantages and Weaknesses of Neural Networks

182

(5)

Problems

184

(3)

Discriminant Analysis

187

(16)

Introduction

187

(1)

Example 1: Riding Mowers

187

(1)

Example 2: Personal Loan Acceptance

188

(1)

Distance of an Observation from a Class

188

(3)

Fisher's Linear Classification Functions

191

(3)

Classification Performance of Discriminant Analysis

194

(1)

Prior Probabilities

195

(1)

Unequal Misclassification Costs

195

(1)

Classifying More Than Two Classes

196

(1)

Example 3: Medical Dispatch to Accident Scenes

196

(1)

Advantages and Weaknesses

197

(6)

Problems

200

(3)

Association Rules

203

(16)

Introduction

203

(1)

Discovering Association Rules in Transaction Databases

203

(1)

Example 1: Synthetic Data on Purchases of Phone Faceplates

204

(1)

Generating Candidate Rules

204

(2)

The Apriori Algorithm

205

(1)

Selecting Strong Rules

206

(6)

Support and Confidence

206

(1)

Lift Ratio

207

(1)

Data Format

207

(2)

The Process of Rule Selection

209

(1)

Interpreting the Results

210

(1)

Statistical Significance of Rules

211

(1)

Example 2: Rules for Similar Book Purchases

212

(1)

Summary

212

(7)

Problems

215

(4)

Cluster Analysis

219

(22)

Introduction

219

(1)

Example: Public Utilities

220

(2)

Measuring Distance Between Two Records

222

(5)

Euclidean Distance

223

(1)

Normalizing Numerical Measurements

223

(1)

Other Distance Measures for Numerical Data

223

(3)

Distance Measures for Categorical Data

226

(1)

Distance Measures for Mixed Data

226

(1)

Measuring Distance Between Two Clusters

227

(1)

Hierarchical (Agglomerative) Clustering

228

(5)

Minimum Distance (Single Linkage)

229

(1)

Maximum Distance (Complete Linkage)

229

(1)

Group Average (Average Linkage)

230

(1)

Dendrograms: Displaying Clustering Process and Results

230

(1)

Validating Clusters

231

(1)

Limitations of Hierarchical Clustering

232

(1)

Nonhierarchical Clustering: The k-Means Algorithm

233

(8)

Initial Partition into k Clusters

234

(3)

Problems

237

(4)

Cases

241

(30)

Charles Book Club

241

(9)

German Credit

250

(4)

Tayko Software Cataloger

254

(4)

Segmenting Consumers of Bath Soap

258

(4)

Direct-Mail Fundraising

262

(3)

Catalog Cross-Selling

265

(2)

Predicting Bankruptcy

267

(4)

References

271

(2)

Index

273

Kids

Men

Women

For You

For Your Car

For Your Home

For Your Pet

For Your Tech

Artwork

Games

Gift Wraps

Holiday

Home Decor

Mascot

Office Decor

Outdoor/Recreation

Graduation Gear

Graduation Gifts

Art Supplies

For Your Office

For Your Tech

Office Supplies

School Supplies

Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel^® with XLMiner^®

Rent Textbook

New Textbook

Used Textbook

eTextbook

Summary

Author Biography

Table of Contents

Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel® with XLMiner®

Rent Textbook

New Textbook

Used Textbook

eTextbook

How Marketplace Works:

Summary

Author Biography

Table of Contents

Digital License

Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel^® with XLMiner^®