ABSTRACT
Determining key performance indicators and accurately classifying players between competitive levels is a common classification challenge in sports analytics. A recent study applied the Random Forest algorithm to identify the variables that best classify rugby league players into academy and senior levels, achieving 82.0% and 67.5% accuracy for backs and forwards, respectively. However, limitations in the existing method leave room to improve classification accuracy. Therefore, this study aimed to introduce and implement a feature selection technique to identify key performance indicators in rugby league positional groups and to assess the performance of six classification algorithms. Of 157 performance indicators, the correlation-based feature selection method identified fifteen as key performance indicators for backs and fourteen for forwards, seven of which were common to both positional groups. Classification results show that, for both positional groups, models developed using the key performance indicators outperformed models developed using all performance indicators. 5-Nearest Neighbour produced the best classification accuracy for backs and forwards (85% and 77%, respectively), which is higher than the accuracies achieved by the previous method. When analysing classification questions in sport science, researchers are encouraged to evaluate multiple classification algorithms and to consider a feature selection method for identifying key variables.
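The workflow summarised above, selecting a subset of indicators and then classifying with 5-Nearest Neighbour, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the study's implementation: the data are synthetic, the selection step uses the standard correlation-based merit heuristic with Pearson correlation and a greedy forward search, accuracy is estimated by leave-one-out, and all function and variable names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(columns, labels, subset):
    """Merit of a subset: high feature-class and low feature-feature correlation."""
    k = len(subset)
    r_cf = sum(abs(pearson(columns[j], labels)) for j in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward(columns, labels):
    """Greedy forward search: add the best feature until the merit stops improving."""
    selected, best = [], 0.0
    remaining = set(range(len(columns)))
    while remaining:
        merit, j = max((cfs_merit(columns, labels, selected + [j]), j)
                       for j in sorted(remaining))
        if merit <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = merit
    return selected


def knn_loo_accuracy(rows, labels, k=5):
    """Leave-one-out accuracy of a k-nearest-neighbour classifier (binary labels 0/1)."""
    correct = 0
    for i, row in enumerate(rows):
        dists = sorted((math.dist(row, other), labels[j])
                       for j, other in enumerate(rows) if j != i)
        votes = sum(lab for _, lab in dists[:k])
        pred = 1 if votes * 2 > k else 0
        correct += pred == labels[i]
    return correct / len(rows)


# Synthetic stand-in data (assumption): 80 players, label 0 = academy, 1 = senior,
# two informative indicators followed by three pure-noise indicators.
random.seed(0)
labels = [i % 2 for i in range(80)]
rows = []
for y in labels:
    informative = [2 * y + random.gauss(0, 0.5), -2 * y + random.gauss(0, 0.5)]
    noise = [random.gauss(0, 5) for _ in range(3)]
    rows.append(informative + noise)
columns = [list(col) for col in zip(*rows)]

selected = cfs_forward(columns, labels)
acc_all = knn_loo_accuracy(rows, labels)
acc_sel = knn_loo_accuracy([[r[j] for j in selected] for r in rows], labels)
print("selected indicator indices:", selected)
print("5-NN accuracy, all indicators:", acc_all)
print("5-NN accuracy, selected indicators:", acc_sel)
```

On real data the indicators would normally be standardised before computing distances, and the same comparison would be repeated for each candidate classifier and each positional group.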
Disclosure statement
No potential conflict of interest was reported by the author(s).
Informed consent
The study received ethics approval from the Institutional Ethics Committee, and written informed consent was obtained from all participants, who are completely anonymized and cannot be identified through this study.