Multi-class multi-label classification of social media texts for typhoon damage assessment: a two-stage model fully integrating the outputs of the hidden layers of BERT

Liwei Zou: (a) School of Geography and Planning, Sun Yat-sen University, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Guangzhou, People's Republic of China; (b) Guangdong Provincial Key Laboratory of Intelligent Urban Security Monitoring and Smart City Planning, Guangzhou, People's Republic of China

Zhi He (corresponding author): (a) School of Geography and Planning, Sun Yat-sen University, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Guangzhou, People's Republic of China; (b) Guangdong Provincial Key Laboratory of Intelligent Urban Security Monitoring and Smart City Planning, Guangzhou, People's Republic of China

ORCID: https://orcid.org/0000-0001-9568-7076

Chengle Zhou: (a) School of Geography and Planning, Sun Yat-sen University, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Guangzhou, People's Republic of China; (b) Guangdong Provincial Key Laboratory of Intelligent Urban Security Monitoring and Smart City Planning, Guangzhou, People's Republic of China

Wenbing Zhu: (a) School of Geography and Planning, Sun Yat-sen University, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Guangzhou, People's Republic of China; (b) Guangdong Provincial Key Laboratory of Intelligent Urban Security Monitoring and Smart City Planning, Guangzhou, People's Republic of China

ABSTRACT

With the development of social media, it has become increasingly important to quickly and accurately identify social media texts related to disasters (e.g. typhoons) to aid rescue and recovery efforts. Currently, multi-class classification and the pre-trained language model Bidirectional Encoder Representations from Transformers (BERT) are widely used for text classification. However, most studies on typhoon damage classification are multi-class single-label, which contradicts the reality that a single social media text may correspond to multiple types of damage. Moreover, the outputs of the hidden layers of BERT are not fully utilized. This paper proposes a two-stage multi-class multi-label classification method for typhoon damage assessment that fully integrates the outputs of the hidden layers of BERT. In the first stage, sentence vectors are used to identify typhoon damage-related texts. In the second stage, word matrices are used for multi-class multi-label classification, which further classifies the texts into five damage categories (i.e. transportation, public, electricity, forestry, and waterlogging). The two stages are trained end-to-end to identify typhoon damage from social media texts. Experiments on Sina Weibo texts posted during typhoon landfall in Chinese coastal regions demonstrate that the proposed method can effectively improve the accuracy of text classification and comprehensively assess typhoon damage.
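The difference between single-label and two-stage multi-label prediction described in the abstract can be illustrated with a minimal sketch. It covers inference only and is not the authors' implementation: the function names, the 0.5 threshold, and the logit values are all hypothetical, and the logits stand in for the outputs of classifier heads that would sit on top of BERT. Only the five category names come from the abstract.

```python
import math

# The five damage categories named in the abstract.
CATEGORIES = ["transportation", "public", "electricity", "forestry", "waterlogging"]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def two_stage_predict(relevance_logit, category_logits, threshold=0.5):
    """Stage 1 gates on damage relevance; stage 2 assigns zero or more labels.

    relevance_logit and category_logits are hypothetical stand-ins for the
    outputs of classifier heads on top of BERT; they are not from the paper.
    """
    # Stage 1: binary decision on whether the text is damage-related.
    if sigmoid(relevance_logit) < threshold:
        return []
    # Stage 2: each category gets an independent sigmoid, so several
    # labels can be active at once (multi-label), unlike softmax + argmax.
    return [
        name
        for name, logit in zip(CATEGORIES, category_logits)
        if sigmoid(logit) >= threshold
    ]


def single_label_predict(category_logits):
    """Single-label baseline for contrast: always exactly one category."""
    best = max(range(len(CATEGORIES)), key=lambda i: category_logits[i])
    return CATEGORIES[best]


# Made-up logits for a post that mentions flooded roads and a power cut.
logits = [2.1, -1.3, 1.7, -2.0, 0.4]
print(two_stage_predict(relevance_logit=3.0, category_logits=logits))
print(single_label_predict(logits))
print(two_stage_predict(relevance_logit=-2.0, category_logits=logits))
```

With these made-up values, the multi-label path reports every category whose score clears the threshold, whereas the single-label baseline is forced to pick one, which is the limitation the abstract attributes to multi-class single-label approaches. The end-to-end training of the two stages mentioned in the abstract is outside the scope of this sketch.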


Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data were derived from the following resources available in the public domain: https://weibo.com/.

Additional information

Funding

This research was supported in part by the Guangdong Provincial Key Laboratory of Intelligent Urban Security Monitoring and Smart City Planning under Grant No. GPKLIUSMSCP-2023-KF-02, the National Natural Science Foundation of China under Grant No. 42271325, the National Key Research and Development Program of China under Grant No. 2020YFA0714103, and the Innovation Group Project of Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai) under Grant No. 311022018.
