Implementation of the Vision-based Page Segmentation (VIPS) algorithm in Java.
The implementation uses CSSBox, an (X)HTML/CSS rendering engine written by Radek Burget.
VIPS and this implementation are described in my master's thesis (in Czech):
https://www.fit.vutbr.cz/study/DP/DP.php?id=14163&file=t
Original work by Microsoft
https://www.cad.zju.edu.cn/home/dengcai/VIPS/VIPS_July-2004.pdf
CSSBox
https://cssbox.sourceforge.net
Just pass the URL of the web page you want to analyze as an argument to the VipsTester class.
The implementation's preferences can also be changed there.
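
Since the tool is driven by a single URL argument, a launcher may want to check that argument before starting the analysis. The sketch below is only an illustration: the UrlArgument class and its normalize method are hypothetical and not part of this project, and the choice to default to "http://" when no scheme is given is an assumption.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of this project: one possible way to check
// the URL argument before handing it to the analyzer.
public class UrlArgument {

    // Returns the URL, adding a default "http://" scheme when none is given.
    // Throws IllegalArgumentException when the argument is missing or malformed.
    public static String normalize(String[] args) {
        if (args.length != 1 || args[0].trim().isEmpty()) {
            throw new IllegalArgumentException("Expected exactly one argument: the page URL");
        }
        String url = args[0].trim();
        // Assumption: a bare host such as "example.com" means plain HTTP.
        if (!url.matches("^[a-zA-Z][a-zA-Z0-9+.-]*://.*")) {
            url = "http://" + url;
        }
        try {
            URI uri = new URI(url);
            if (uri.getHost() == null) {
                throw new IllegalArgumentException("URL has no host: " + url);
            }
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Malformed URL: " + url, e);
        }
        return url;
    }

    public static void main(String[] args) {
        try {
            // Print the checked URL; a real launcher would pass it on instead.
            System.out.println(normalize(args));
        } catch (IllegalArgumentException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
}
```

The checked string could then be passed on as the single argument described above.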