This document outlines the datasets used, including their sources, content descriptions, feature extraction methods, and data split statistics.