Database Journal

Posted Mar 9, 2010

IBM's BigSheets Text-mining the UK Web Archive

By Staff

According to a recent announcement, the UK Web Archive, with the help of IBM's BigSheets software and the company's decades of text-mining experience, will store and make accessible every site in the .uk top-level domain. The aim is to support dynamic research with capabilities such as classifying pages into categories, extracting entities as metadata, and offering several approaches to querying and visualizing the data.

Hadoop, the core technology within BigSheets, is a distributed data storage and processing system that is said to scale to billions of items while requiring less structure and space than a relational database. It handles large amounts of traffic through parallel processing, and it supports adding new servers, replication, fail-over, and load balancing.
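
As a rough illustration of the parallel-processing pattern Hadoop is known for (map, then reduce), the sketch below counts pages per category in plain Python. It is not BigSheets or Hadoop code: the sample URLs, the function names, and the idea of using the last two host labels (such as "co.uk") as a category are all assumptions made for this example.

```python
from collections import Counter
from functools import reduce
from urllib.parse import urlparse

# Hypothetical sample URLs; not taken from the UK Web Archive.
PAGES = [
    "http://example.co.uk/about",
    "http://shop.example.co.uk/",
    "http://example.org.uk/news",
    "http://www.example.ac.uk/research",
    "http://example.co.uk/contact",
]


def map_page(url):
    """Map step: emit a partial count of {category: 1} for one page."""
    host = urlparse(url).hostname or ""
    labels = host.split(".")
    # Assumption: the last two host labels serve as a crude category.
    category = ".".join(labels[-2:]) if len(labels) >= 2 else host
    return Counter({category: 1})


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def count_by_category(urls):
    """Apply the map step to every URL, then fold the partial counts."""
    return dict(reduce(reduce_counts, map(map_page, urls), Counter()))


if __name__ == "__main__":
    print(count_by_category(PAGES))
```

Because each page is mapped independently and the partial counts can be merged in any order, the map step could in principle be spread across many servers, which is the property that lets this style of processing grow by adding machines.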

