The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data

  • 8h 12m
  • James Sanger, Ronen Feldman
  • Cambridge University Press
  • 2007

Text mining tries to solve the crisis of information overload by combining techniques from data mining, machine learning, natural language processing, information retrieval, and knowledge management. In addition to providing an in-depth examination of core text mining and link detection algorithms and operations, this book examines advanced pre-processing techniques, knowledge representation considerations, and visualization approaches. Finally, it explores current real-world, mission-critical applications of text mining and link detection in such varied fields as M&A business intelligence, genomics research and counter-terrorism activities.

About the Authors

Dr. Ronen Feldman is a Senior Lecturer in the Mathematics and Computer Science Department of Bar-Ilan University and Director of the Data and Text Mining Laboratory. Dr. Feldman is cofounder, Chief Scientist, and President of ClearForest, Ltd., a leader in developing next-generation text mining applications for corporate and government clients. He also recently served as an Adjunct Professor at New York University’s Stern School of Business. A pioneer in the areas of machine learning, data mining, and unstructured data management, he has authored or coauthored more than 70 published articles and conference papers in these areas.

James Sanger is a venture capitalist, applied technologist, and recognized industry expert in the areas of commercial data solutions, Internet applications, and IT security products. He is a partner at ABS Ventures, an independent venture firm founded in 1982 and originally associated with technology banking leader Alex. Brown and Sons. Immediately before joining ABS Ventures, Mr. Sanger was a Managing Director in the New York offices of DB Capital Venture Partners, the global venture capital arm of Deutsche Bank. Mr. Sanger has been a board member of several thought-leading technology companies, including Inxight Software, Gomez Inc., and ClearForest, Inc.; he has also served as an official observer to the boards of AlphaBlox (acquired by IBM in 2004), Intralinks, and Imagine Software and as a member of the Technical Advisory Board of Qualys, Inc.

In this Book

  • Introduction to Text Mining
  • Core Text Mining Operations
  • Text Mining Preprocessing Techniques
  • Categorization
  • Clustering
  • Information Extraction
  • Probabilistic Models for Information Extraction
  • Preprocessing Applications Using Probabilistic and Hybrid Approaches
  • Presentation-Layer Considerations for Browsing and Query Refinement
  • Visualization Approaches
  • Link Analysis
  • Text Mining Applications
  • Bibliography