Downloads
RapidMiner Extensions |
---|
Datasets for Image and Video Analysis |
---|
This is a dataset of videos downloaded from YouTube for experiments with automatic video annotation (also called "concept detection"). |
A dataset of YouTube video clips tagged with 22 different concepts for experiments with automatic video annotation.
Datasets for Machine Learning |
---|
Datasets for Document Analysis |
---|
Here you will find the datasets generated at MADM for research purposes. Detailed information about each dataset is available on its own page. |
The data set contains genuine and forged doctor bills. The forgeries were made by re-engineering genuine documents.
The data set contains print-outs from color laser printers and copiers that show Machine Identification Codes (MIC), also known as “yellow dots” or counterfeit protection system codes.
The data set contains scanned invoices with color logos, color text and various kinds of stamps.
The dataset contains grayscale invoices from the same source, together with copies of genuine invoices, for detecting and measuring scanning distortions.
The dataset contains synthetic grayscale document images with single-column text in which the last paragraph is either rotated or misaligned. Different fonts and font sizes are used.