Ridges based Curled Textline Region Detection from Grayscale Camera-Captured Document Images
Compared to scanners, cameras offer fast, flexible and non-contact document imaging, but introduce distortions such as uneven shading and warped page shape. Camera-captured document images therefore need preprocessing steps such as binarization and textline detection for dewarping, so that traditional document image processing steps can be applied to them. Previous approaches to binarization and curled textline detection are sensitive to these distortions and lose some crucial image information at each step, which badly affects dewarping and further processing. Here we introduce a novel algorithm for curled textline region detection directly from a grayscale camera-captured document image, in which a matched filter bank is used to enhance the textline structure and ridge detection is then applied to find the central line of each curled textline. The resulting ridges can potentially be used for binarization, dewarping or designing new techniques for camera-captured document image processing. Our approach is robust against bad shading and high degrees of curl. We have achieved around 91% detection accuracy on the dataset of the CBDAR 2007 document image dewarping contest.
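The two stages named above (filter-bank enhancement followed by ridge detection) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the bank is built from oriented anisotropic Gaussian kernels, ridges are picked with a Hessian-based test plus non-maximum suppression, and the function names (`oriented_gaussian`, `enhance_textlines`, `detect_ridges`, `synthetic_page`), the kernel sizes, the angle range and the threshold are all hypothetical.

```python
import numpy as np


def oriented_gaussian(sigma_long, sigma_short, theta, size):
    """Anisotropic Gaussian elongated along angle theta (radians), sum = 1."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    u = x * np.cos(theta) + y * np.sin(theta)    # along the assumed textline
    v = -x * np.sin(theta) + y * np.cos(theta)   # across the assumed textline
    k = np.exp(-0.5 * ((u / sigma_long) ** 2 + (v / sigma_short) ** 2))
    return k / k.sum()


def convolve_same(image, kernel):
    """FFT convolution with edge padding; output has the shape of `image`."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="edge")
    shape = (padded.shape[0] + kh - 1, padded.shape[1] + kw - 1)
    spec = np.fft.rfft2(padded, s=shape) * np.fft.rfft2(kernel, s=shape)
    full = np.fft.irfft2(spec, s=shape)
    return full[2 * ph:2 * ph + image.shape[0], 2 * pw:2 * pw + image.shape[1]]


def enhance_textlines(gray, thetas, sigma_long=6.0, sigma_short=2.0, size=25):
    """Maximum response over the oriented bank; text becomes a bright band."""
    inverted = 1.0 - gray  # assumes dark text on a light page, values in [0, 1]
    bank = [oriented_gaussian(sigma_long, sigma_short, t, size) for t in thetas]
    return np.max([convolve_same(inverted, k) for k in bank], axis=0)


def detect_ridges(enhanced, rel_thresh=0.3):
    """Boolean mask of ridge pixels: strong negative curvature + local maximum."""
    gy, gx = np.gradient(enhanced)
    hyy, _ = np.gradient(gy)
    hxy, hxx = np.gradient(gx)
    # Smallest Hessian eigenvalue; strongly negative across a bright ridge.
    lam_min = 0.5 * (hxx + hyy) - np.sqrt((0.5 * (hxx - hyy)) ** 2 + hxy ** 2)
    strength = -lam_min
    # theta is the direction of the larger eigenvalue (along the ridge);
    # the ridge normal is perpendicular to it.
    theta = 0.5 * np.arctan2(2.0 * hxy, hxx - hyy)
    dx = np.rint(-np.sin(theta)).astype(int)
    dy = np.rint(np.cos(theta)).astype(int)
    h, w = enhanced.shape
    rows, cols = np.indices(enhanced.shape)
    fwd = enhanced[np.clip(rows + dy, 0, h - 1), np.clip(cols + dx, 0, w - 1)]
    bwd = enhanced[np.clip(rows - dy, 0, h - 1), np.clip(cols - dx, 0, w - 1)]
    # Non-maximum suppression along the normal, then a relative threshold.
    is_max = (enhanced >= fwd) & (enhanced > bwd)
    return is_max & (strength > rel_thresh * strength.max())


def synthetic_page(height=60, width=120):
    """Toy page: one curled dark band under a left-to-right shading gradient."""
    xs = np.arange(width)
    center = 30.0 + 5.0 * np.sin(2.0 * np.pi * xs / width)
    rows = np.arange(height)[:, None]
    gray = np.tile(np.linspace(0.7, 1.0, width), (height, 1))
    gray[np.abs(rows - center[None, :]) <= 1.5] *= 0.2
    return gray, center


if __name__ == "__main__":
    gray, center = synthetic_page()
    thetas = np.deg2rad(np.arange(-30, 31, 10))
    ridges = detect_ridges(enhance_textlines(gray, thetas))
    print("ridge pixels found:", int(ridges.sum()))
```

The sketch works on the grayscale values directly and never binarizes the input, which mirrors the motivation stated in the abstract; the toy page only stands in for a real camera-captured image, and the parameters would need tuning to the actual text size and curl.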