A novel bias-alleviated hybrid ensemble model based on over-sampling and post-processing for fair classification

Fang He (a), School of Information Management and Artificial Intelligence, Zhejiang University of Finance and Economics, Hangzhou, People’s Republic of China. Correspondence: [email protected]

Xiaoxia Wu (b), Department of Financial Accounting, Zhejiang Institute of Economics and Trade, Hangzhou, People’s Republic of China

Wenyu Zhang (a), School of Information Management and Artificial Intelligence, Zhejiang University of Finance and Economics, Hangzhou, People’s Republic of China

Xiaoling Huang (c), Library, Zhejiang University of Finance and Economics, Hangzhou, People’s Republic of China

Abstract

With the rapid development of machine learning for classification, fairness has become a research priority second only to prediction accuracy. However, the data bias and algorithmic discrimination that undermine fair classification have not been well resolved, and they may harm or favour specific groups defined by sensitive attributes (e.g. age, race, and gender). To alleviate the unfairness of classification models, this study proposes a novel bias-alleviated hybrid ensemble model (BAHEM) based on over-sampling and post-processing. First, a new clustering-based over-sampling method is proposed to reduce the data bias caused by imbalance in the label and the sensitive attribute. Then, a stacking-based ensemble learning method is employed to improve the performance and robustness of the BAHEM. Finally, a new post-processing method based on classification with alternating normalisation (CAN) is proposed to further improve fairness while maintaining the accuracy of the BAHEM. Three datasets with different sensitive attributes and four evaluation metrics were used to evaluate the prediction accuracy and fairness of the BAHEM. The experimental results show that the BAHEM achieves superior fairness with little reduction in accuracy.
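To make the notion of imbalance in the label and the sensitive attribute concrete, the following minimal sketch balances every (label, sensitive attribute) subgroup on a toy dataset and measures the resulting change in disparity. It rests on several assumptions that do not come from the study: plain random duplication stands in for the clustering-based over-sampling, statistical parity difference is used as an example fairness measure (the four metrics of the study are not named here), and the names `balance_subgroups`, `statistical_parity_difference`, and the toy records are hypothetical.

```python
import random
from collections import Counter, defaultdict


def balance_subgroups(rows, label_key, sensitive_key, seed=0):
    """Randomly duplicate rows until every (label, sensitive) subgroup
    matches the size of the largest one.

    Assumption: plain random over-sampling, used only as a stand-in.
    It is NOT the clustering-based method proposed in the study.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[(row[label_key], row[sensitive_key])].append(row)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Top up smaller subgroups with random duplicates.
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced


def statistical_parity_difference(outcomes, sensitive, privileged):
    """P(outcome = 1 | unprivileged) - P(outcome = 1 | privileged)."""
    def positive_rate(flags):
        selected = [o for o, keep in zip(outcomes, flags) if keep]
        return sum(selected) / len(selected)

    is_privileged = [s == privileged for s in sensitive]
    is_unprivileged = [not flag for flag in is_privileged]
    return positive_rate(is_unprivileged) - positive_rate(is_privileged)


# Hypothetical toy data: positive labels are concentrated in one group.
rows = (
    [{"label": 1, "sex": "male"}] * 6
    + [{"label": 0, "sex": "male"}] * 2
    + [{"label": 1, "sex": "female"}] * 1
    + [{"label": 0, "sex": "female"}] * 3
)

balanced = balance_subgroups(rows, "label", "sex")
counts = Counter((r["label"], r["sex"]) for r in balanced)

# Applied to the labels themselves, the measure captures data bias
# before any model is trained.
spd_before = statistical_parity_difference(
    [r["label"] for r in rows], [r["sex"] for r in rows], privileged="male"
)
spd_after = statistical_parity_difference(
    [r["label"] for r in balanced], [r["sex"] for r in balanced], privileged="male"
)
print(counts)
print(spd_before, spd_after)
```

In this toy case the disparity in the labels disappears once the four subgroups have equal size. A clustering-based over-sampler would presumably generate new samples rather than duplicate existing ones, so the sketch illustrates only the balancing target and not the sampling procedure itself.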

Disclosure statement

No potential conflict of interest was reported by the author(s).

Compliance with ethical standards

Conflicts of interest: The authors declare that there is no conflict of interest regarding the publication of this article.

Ethical standard: The authors state that this research complies with ethical standards. This research does not involve either human participants or animals.

Data availability statement

The datasets analysed during the current study are available in the UCI repository. The German dataset is available at https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/german, the Adult dataset at https://archive.ics.uci.edu/ml/machine-learning-databases/adult, and the Bank dataset at https://archive.ics.uci.edu/ml/machine-learning-databases/00222.

Notes

1 Available at https://github.com/yhefang/BAHEM.

Additional information

Funding

This work was supported by the Fundamental Research Funds for the Provincial Universities of Zhejiang Institute of Economics and Trade (No. 19YQ19), the National Natural Science Foundation of China (No. 51875503), the Zhejiang Natural Science Foundation of China (No. LZ20E050001), and the Zhejiang Key R&D Project of China (No. 2022C03166).
