
Incorporating Knowledge Sources into Statistical Speech Recognition
by Sakti, Sakriani; Markov, Konstantin; Nakamura, Satoshi; Minker, Wolfgang
Table of Contents
Introduction and Book Overview
Automatic Speech Recognition - A Way of Human-Machine Communication
Approaches to Speech Recognition
Knowledge-based Approaches
Corpus-based Approaches
State-of-the-art ASR Performance
Studies on Incorporating Knowledge Sources
Sources of Variability in Speech
Existing Ways of Incorporating Knowledge Sources
Major Challenges to Overcome
Book Outline
Statistical Speech Recognition
Pattern Recognition Overview
Theory of Hidden Markov Models
Markov Chain
General form of an HMM
Principle Cases of HMM
Pattern Recognition for HMM-Based ASR Systems
Front-end Feature Extraction
HMM-Based Acoustic Model
Pronunciation Lexicon
Language Model
Search Algorithm
Graphical Framework to Incorporate Knowledge Sources
Graphical Model Representation
Probability Theory
Graphical Model
Junction Tree Algorithm
Procedure of GFIKS
Causal Relationship between Information Sources
Direct Inference on Bayesian Network
Junction Tree Decomposition
Junction Tree Inference
Practical Issues of GFIKS
Types of Knowledge Sources
Different Levels of Incorporation
Speech Recognition Using GFIKS
Applying GFIKS at the HMM State Level
Causal Relationship Between Information Sources
Inference
Enhancing Model Reliability
Training and Recognition Issues
Applying GFIKS at the HMM Phonetic-unit Level
Causal Relationship between Information Sources
Inference
Enhancing the Model Reliability
Deleted Interpolation
Training and Recognition Issues
Experiments with Various Knowledge Sources
Incorporating Knowledge at the HMM State Level
Incorporating Knowledge at the HMM Phonetic-unit Level
Experiments Summary and Discussion
Conclusions and Future Directions
Conclusions
Theoretical Issues
Application Issues
Experimental Issues
Future Directions: A Roadmap to a Spoken Language Dialog System
Speech Materials
AURORA TIDigit Corpus
TIMIT Acoustic-Phonetic Speech Corpus
Wall Street Journal Corpus
ATR Basic Travel Expression Corpus
ATR English Database Corpus
ATR Software Tools
Generic Properties of ATRASR
Data Preparation
SSS Data Generating Tools
Acoustic Model Training Tools
Language Model Training Tools
Recognition Tools
Composition of Bayesian Wide-phonetic Context
Proof using Bayes's Rule
Variants of Bayesian Wide-phonetic Context Model
Statistical Significance Testing
Statistical Hypothesis Testing
The Use of the Sign Test for ASR
References
Index
Table of Contents provided by Ingram. All Rights Reserved.