Grid-based outlier detection in large data sets for combine harvesters
Outlier detection is one of the most widely used techniques for identifying abnormal behavior in raw data. Abnormal deviation here covers not only human-made or system errors that naturally occur as part of the data but also rarely occurring events. In this paper, we propose a new algorithm called Grid Based Outlier Detection (GBOD) to find hidden outliers in large data sets. In contrast to existing grid-based methods, which are limited to a few statistical approaches, the GBOD algorithm comes in two variants that detect different ranges of outliers depending on the interest of the user. First, the number of points in a local grid cell is used to decide whether a point is an outlier or not. In a second step, this approach is extended to a method that assigns an outlier score to each data point. The simple design makes this algorithm extremely efficient for large data sets.
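The two ideas described above, counting points per grid cell and deriving a per-point score from those counts, can be illustrated with a minimal sketch. This is not the GBOD algorithm itself, whose details are not given here. The uniform cell width, the `min_count` threshold, the score formula (one minus the cell count relative to the densest cell), and all function names are assumptions chosen for illustration.

```python
from collections import Counter
import math


def cell_of(point, cell_width):
    """Map a point to the integer index tuple of the grid cell containing it."""
    return tuple(math.floor(x / cell_width) for x in point)


def grid_counts(points, cell_width):
    """Count how many points fall into each occupied grid cell."""
    return Counter(cell_of(p, cell_width) for p in points)


def flag_outliers(points, cell_width, min_count):
    """Binary variant (assumed rule): a point is flagged as an outlier
    if its cell holds fewer than min_count points."""
    counts = grid_counts(points, cell_width)
    return [counts[cell_of(p, cell_width)] < min_count for p in points]


def outlier_scores(points, cell_width):
    """Score variant (assumed formula): 1 - count / densest_count,
    so points in sparse cells score close to 1 and points in the
    densest cell score 0."""
    if not points:
        return []
    counts = grid_counts(points, cell_width)
    densest = max(counts.values())
    return [1.0 - counts[cell_of(p, cell_width)] / densest for p in points]


if __name__ == "__main__":
    # Three points share one cell; the fourth sits alone in a distant cell.
    sample = [(0.1, 0.2), (0.3, 0.4), (0.5, 0.5), (5.2, 5.1)]
    print(flag_outliers(sample, cell_width=1.0, min_count=2))
    print(outlier_scores(sample, cell_width=1.0))
```

Both functions make a single pass to build the cell counts and a second pass to look them up, so the cost grows linearly with the number of points. This is consistent with the efficiency argument above, although the choice of cell width strongly affects which points are flagged.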