Scripting Intelligence: Web 3.0 Information Gathering and Processing

  • 5h 33m
  • Mark Watson
  • Apress
  • 2009

While Web 2.0 was about data, Web 3.0 is about knowledge and information. Scripting Intelligence: Web 3.0 Information Gathering and Processing offers the reader Ruby scripts for intelligent information management in a Web 3.0 environment, including extracting information from text, using Semantic Web technologies, gathering information (from relational database metadata, web scraping, Wikipedia, and Freebase), combining information from multiple sources, and applying strategies for publishing processed information. This book will be a valuable tool for anyone needing to gather, process, and publish web or database information across the modern web environment.

  • Text processing recipes, including part-of-speech tagging and automatic summarization
  • Gathering, visualizing, and publishing information from the Semantic Web
  • Information gathering from traditional sources such as relational databases and web sites

What you’ll learn

  • Gather and process information within the Web 3.0 environment.
  • See the flexibility of scripting with Ruby to gather and process information.
  • Extract text from various document formats.
  • Work with the RDF data model and SPARQL query language, the foundations of the Semantic Web.
  • Use GraphViz for data visualization.
  • Extract information from relational databases and web sites.

Who is this book for?

  • Anyone needing to gather and display information available in electronic formats
  • Programmers needing to tag, summarize, or publish information
  • Ruby programmers and computer enthusiasts interested in seeing what Ruby can do with information management and Semantic Web tools
  • Academic researchers needing to extract and organize information in a more automated way

About the Author

MARK WATSON is the author of 15 books on artificial intelligence (AI), software agents, Java, Common Lisp, Scheme, Linux, and user interfaces. He wrote the free chess program distributed with the original Apple II computer, built the world's first commercial Go-playing program, and developed commercial products for the original Macintosh and for Windows 1.0. He was an architect and a lead developer for the worldwide-distributed Nuclear Monitoring Research and Development (NMRD) project and for a distributed expert system designed to detect telephone credit-card fraud. He has worked on the AI for Nintendo video games and was technical lead for a Virtual Reality system for Disney. He currently works on text- and data-mining projects and develops web applications using Ruby on Rails and server-side Java.

In this Book

  • Parsing Common Document Types
  • Cleaning, Segmenting, and Spell-Checking Text
  • Natural Language Processing
  • Using RDF and RDFS Data Formats
  • Delving Into RDF Data Stores
  • Performing SPARQL Queries and Understanding Reasoning
  • Implementing SPARQL Endpoint Web Portals
  • Working with Relational Databases
  • Supporting Indexing and Search
  • Using Web Scraping to Create Semantic Relations
  • Taking Advantage of Linked Data
  • Implementing Strategies for Large-Scale Data Storage
  • Creating Web Mashups
  • Performing Large-Scale Data Processing
  • Building Information Web Portals