Project Name

Intelligent Web Group

Members

Taiki Ito, Masahiro Kudo, Norihiko Kuratomi, Aratontitico, Daiki Higashiguchi, Masayuki Nakano

Keywords

Web Page Segmentation, Web mining, Web Page Layout, Semantic Web

Purpose

Web Page Segmentation and Analysis.

Outline

In recent years, a single Web page has come to present many kinds of content. The main aim of this project is Web page segmentation, which divides a Web page into several small content-blocks, and the analysis of those content-blocks.
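As a rough illustration of what dividing a page into content-blocks can look like, the following is a minimal sketch that splits HTML at block-level tags using Python's standard html.parser module. It is not the project's algorithm, which is vision based. The tag set BLOCK_TAGS, the class BlockSegmenter, and the function segment are all hypothetical names chosen for this example, and treating each block-level tag as a block boundary is an assumption.

```python
from html.parser import HTMLParser

# Assumption: these tags mark content-block boundaries. The project's own
# method is vision based and is not reproduced here.
BLOCK_TAGS = {"div", "section", "article", "nav", "aside",
              "header", "footer", "p", "ul", "ol", "table"}


class BlockSegmenter(HTMLParser):
    """Collects the text of each block-level element as one content-block."""

    def __init__(self):
        super().__init__()
        self.blocks = []   # finished (tag, text) pairs, in closing order
        self._stack = []   # currently open block tags with their text parts

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._stack.append((tag, []))

    def handle_data(self, data):
        # Text is assigned to the innermost open block only.
        text = data.strip()
        if text and self._stack:
            self._stack[-1][1].append(text)

    def handle_endtag(self, tag):
        if tag not in [open_tag for open_tag, _ in self._stack]:
            return
        # Close any unclosed inner blocks up to and including the match.
        while self._stack:
            open_tag, parts = self._stack.pop()
            if parts:
                self.blocks.append((open_tag, " ".join(parts)))
            if open_tag == tag:
                break


def segment(html):
    """Return a list of (tag, text) content-blocks found in the HTML."""
    parser = BlockSegmenter()
    parser.feed(html)
    parser.close()
    return parser.blocks


page = ("<html><body><nav>Home News</nav>"
        "<div><p>Main story text.</p><p>Second paragraph.</p></div>"
        "<footer>Contact</footer></body></html>")

for tag, text in segment(page):
    print(tag, "|", text)
```

A tag-based split like this ignores the rendered layout entirely, which is one reason a vision-based approach may find blocks that the markup alone does not reveal.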

Many Web applications can make use of the content-blocks of Web pages. For example, we propose a new browsing system that displays only the important content-blocks, making navigation and reading easier on mobile phones with small screens. Furthermore, taking into account the position of each content-block and the relations between content-blocks has great potential to improve the performance of current Web search engines. Content-blocks can also serve as a preprocessing step for automatic wrapper generation.

However, content-blocks are difficult to extract, because they are not described directly in the page and must be detected from the Web page layout. We propose a new vision-based page segmentation algorithm that draws on various methods. Experimental results show that the proposed method achieves higher precision than existing methods. Future work will be the implementation of various Web applications that use content-blocks.


Copyright (c) 2009 Shintani Lab. All rights reserved.