Database Journal
Posted Mar 9, 2010

IBM's BigSheets Text-mining the UK Web Archive

By DatabaseJournal.com Staff

In a recently announced project, the UK Web Archive, with the help of IBM's BigSheets software and the company's decades of experience in text mining, will store and make accessible every site in the .uk top-level domain. The aim is to support dynamic research with capabilities such as classifying pages into categories, extracting entities as metadata, and offering several approaches to querying and visualizing the data.

Hadoop, the core technology used within BigSheets, is a distributed data storage and processing framework that can scale to billions of items with less required structure and space than a relational database. It handles large amounts of traffic through parallel processing, and it supports the addition of new servers, replication, fail-over, and load balancing.
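
As a rough illustration of the parallel-processing model that Hadoop is known for, the sketch below mimics the map, shuffle, and reduce steps of a word count in plain Python. It is not BigSheets or Hadoop code: the function names and the sample pages are hypothetical, and everything runs in a single process rather than across a cluster.

```python
from collections import defaultdict

def map_page(page_text):
    """Map step: emit a (word, 1) pair for every word on one page."""
    return [(word.lower(), 1) for word in page_text.split()]

def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_counts(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(pages):
    # Each page is mapped independently of the others, which is the property
    # that lets a cluster spread the map work across many servers.
    pairs = [pair for page in pages for pair in map_page(page)]
    return reduce_counts(shuffle(pairs))

# Hypothetical sample pages, not data from the UK Web Archive.
sample_pages = ["British Library web archive", "UK web archive project"]
counts = word_count(sample_pages)
```

Because no page depends on another during the map step, adding servers lets more pages be processed at the same time, which is the scaling behavior the paragraph above describes.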
