Title: Identifying High-Number-Cluster Structures in RFID Ski Lift Gates Entrance Data
Abstract: In this paper we identify groups of skiers in entry data from RFID ski lift gates. The data are real-life gate entry records covering a 5-year period from the largest Serbian ski resort, which has a ski lift capacity of 32,000 skiers per hour. We use three representative algorithms from the three most widely used clustering algorithm families (representative-based, hierarchical, and density-based) and produce 40 algorithm settings for clustering skiing groups. Ski pass sales data were used to validate the resulting clustering models, under the assumption that persons who bought ski tickets together are more likely to ski together. The adjusted mutual information (AMI) and adjusted Rand index (ARI) clustering validation measures are reported for each model. In addition, we evaluated the applicability of the proposed models to ski injury prevention by testing, for each clustering model, whether skiing in groups increases the risk of injury. Hierarchical clustering algorithms proved very efficient at finding the high-number-cluster structure (skiing groups) and at producing models suitable for injury prevention. Most of the tested clustering models supported the hypothesis that skiing in groups increases the risk of injury.
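
The validation step described in the abstract, comparing clusters found in gate entry data against groups implied by joint ticket purchases, can be illustrated with a small sketch. This is not the authors' implementation: it is a minimal pure-Python computation of the adjusted Rand index (ARI), and the function name `adjusted_rand_index`, the six-skier example, and all label values are assumptions made purely for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)

    # Contingency counts: how many items share each (true, predicted) label pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of item pairs grouped together in both labelings,
    # in the reference labeling, and in the predicted labeling.
    together_both = sum(comb(c, 2) for c in pair_counts.values())
    together_true = sum(comb(c, 2) for c in true_counts.values())
    together_pred = sum(comb(c, 2) for c in pred_counts.values())

    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: nothing to disagree on
    expected = together_true * together_pred / total_pairs
    maximum = 0.5 * (together_true + together_pred)
    if maximum == expected:
        # Degenerate case (e.g. both labelings are a single cluster,
        # or both put every item in its own cluster).
        return 1.0
    return (together_both - expected) / (maximum - expected)


# Hypothetical data: six skiers, the ticket-purchase group of each skier
# (reference) and the cluster assigned from gate entries (prediction).
ticket_groups = ["A", "A", "A", "B", "B", "C"]
cluster_labels = [0, 0, 1, 2, 2, 3]

ari = adjusted_rand_index(ticket_groups, cluster_labels)
```

Because ARI depends only on which items are grouped together, the label names in the two lists do not need to match. A value of 1 indicates identical groupings, and values near 0 indicate agreement no better than chance.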