Research Article

Leveraging sentiment analysis of Arabic Tweets for the 2022 FIFA World Cup insights, incorporating the gulf region

Received 17 Sep 2023, Accepted 09 Apr 2024, Published online: 30 Apr 2024

ABSTRACT

Purpose

The study analyzes Arabic Tweets, with a focus on those originating from the Gulf Cooperation Council (GCC) region, to examine public perceptions of the 2022 FIFA World Cup.

Methodology

Twitter’s search API was used to gather a dataset of 157,756 tweets posted between 1 October 2022 and 1 January 2023. After data cleaning, a refined dataset of 43,658 tweets was categorized into positive, neutral, and negative sentiments based on sentiment scores.

Findings

Arabic tweets initially held a positive sentiment towards the World Cup. This positive sentiment increased during the tournament and was accompanied by a decrease in negative sentiment.

Research contribution

This study offers valuable insights into public sentiment regarding the 2022 FIFA World Cup among Arabic-speaking Twitter users, with a particular emphasis on those from the GCC region. Additionally, it lends support to FIFA's decision to host the event in the Arab World by highlighting the favorable public reception.

Originality

This research's originality lies in its examination of the transformation in public sentiment within the Arabic-speaking community throughout the 2022 FIFA World Cup. These insights hold the potential to inform future decisions related to the hosting of major sporting events, providing a unique perspective on the evolving dynamics of sentiment within this demographic.

Introduction

Many countries in the Arab World are increasingly investing in hosting mega sport events, using the latter as a strategic instrument to contribute to the multifaceted objectives of national development and the promotion of tourism (W. Ishac, Citation2020; D. Liu, Citation2016). Additionally, these events are recognized as vehicles for gaining global recognition and achieving geopolitical goals (Amara, Citation2005; Al-Thani, Citation2021).

Similar to its neighboring nations, Qatar strategically invests in the sports industry with the aim of becoming a key global and regional sports hub, in line with the goals of Qatar National Vision 2030 (QNV 2030). According to Scharfenort (Citation2012), QNV 2030 aims to enhance Qatar's regional standing; it focuses on sports to enhance aspects such as community cohesion, economic diversification, and environmental awareness (Qatar Olympic Committee, Citation2011). Hosting large-scale sport events, exemplified by the Qatar 2022 FIFA World Cup, reflects this strategic vision, aiming to achieve diverse objectives on both national and international fronts.

To assess whether a country is achieving its goals through hosting sport events, it is imperative to understand public sentiment. Researchers exploring the impact of hosting sports events have focused on two types of impact: tangible and intangible. Within the realm of intangible impacts, they have extensively examined the benefits, particularly the social impact (Gibson et al., Citation2014; Oja et al., Citation2018; Storm & Jakobsen, Citation2020), as well as the perceived impact associated with hosting sports events (Balduck et al., Citation2011; Oshimi et al., Citation2021). Similarly, Twitter has been a valuable platform for sentiment analysis in mega sport events; it provides individuals with a space to express and communicate freely, offering real-time insights into public sentiment (Rodriguez, Citation2017; Meier et al., Citation2021; Fan et al., Citation2020). Employing sentiment analysis provides a broader overview by capturing sentiments expressed on digital platforms, and a better understanding of sporting events (Stojanovski et al., Citation2015; Yu & Wang, Citation2015).

In the Gulf Cooperation Council (GCC) countries, there is a shortage of studies assessing hosting of mega-events. There are even fewer studies specifically focusing on the impact of these events. In the case of Qatar, for example, only a small number of studies have investigated residents’ perceptions of international sport events in Qatar (Ishac et al., Citation2018; Ishac & Swart, Citation2022). Additionally, there is limited research focused on understanding the impact of hosting the 2022 FIFA World Cup on Qatar’s residents before the actual event (Al-Emadi et al., Citation2022; Ishac et al., Citation2022). Furthermore, social media platforms were used to understand public perception regarding the 2022 FIFA World Cup; Dun et al. (Citation2022) analyzed pre-event Twitter activity, revealing that Qatar's branding and soft power strategy did not align with public sentiment, especially in the Global North. Dewi and Arianto (Citation2023) assessed Twitter sentiment across three periods – before and after Qatar's selection as the host, and during the event. The study's findings revealed that approximately 84% of the sentiment expressed was positive, while around 16% reflected negative sentiment. While a positive impact was anticipated prior to the event, the post-event sentiment and overall perception remain unclear.

Considering the limited number of studies pertaining to the sentiment of the Arabic-speaking public, including in the GCC region, the primary objective of this study is to build on the approach proposed by Ishac and Swart (Citation2022) and Ishac et al. (Citation2022) by investigating public sentiments on a wider sphere rather than focusing only on residents’ perception. This study examines a broader geographical perspective in the region by analyzing Arabic tweets generated before, during, and after the 2022 FIFA World Cup. The results offer valuable insights, enabling a better understanding of the public perceptions of Arabic-speaking individuals. Applying sentiment analysis could help organizations understand fan opinions and gain insight into how hosting sport events can impact communities (Aloufi et al., Citation2018; Lucas et al., Citation2017).

Literature review

Residents’ perception and sport events

The significance of public perceptions, mainly residents’ perceptions, is evaluated based on multiple factors, such as their political and social perspectives and their level of engagement and attachment to the community (Deccio & Baloglu, Citation2002), which in turn can influence residents’ reactions (Fredline & Faulkner, Citation2001). For example, scholars argue that residents of host communities perceive sport events positively, especially concerning their potential impact on tourism (Balduck et al., Citation2011; Ishac & Swart, Citation2022; Kim et al., Citation2015; Vetitnev & Bobina, Citation2017) and their role in promoting a favorable image of the place or nation (Kim et al., Citation2014; Kogoya et al., Citation2022). Moreover, sport events help improve a sense of community, community pride, and national identity (Ishac et al., Citation2022; Mourão et al., Citation2022). In this regard, Waitt (Citation2003) has pointed out that neglecting residents’ perceptions regarding sport events would undermine public confidence.

Sport events have been shown to create diverse emotional responses within society (Stojanovski et al., Citation2015) and could provide a rich data source for studying cognitive processes and emotions (Lucas et al., Citation2017). As a result, scholars focused on studying this particular field from various perspectives and contexts. For instance, Ishac et al. (Citation2022) used the psychic income theory to examine residents’ perceptions before the 2022 FIFA World Cup and found a significant correlation between their perceptions, community attachment, pride, and excitement for the event. Similarly, Kim and Petrick (Citation2005) conducted an empirical study regarding residents’ perceptions of the 2002 World Cup held in Korea and Japan, identifying five dimensions: “tourism resources development and urban revitalization,” “image enhancement and consolidation,” “economic benefits,” “interest in foreign cultures or countries;” and “development of tourism infrastructure.”

In a similar vein, Biscaia et al. (Citation2017) argued that favorable attitudes of consumers towards hosting the 2014 FIFA World Cup in Brazil positively impacted their hedonic and utilitarian value perceptions. The authors also emphasized that communicating the event’s positive outcomes to the host community could enhance its perceived value. Likewise, Sullivan (Citation2018) found that emotions associated with the 2010 FIFA World Cup illustrate distinctive qualities of emotions linked to the event and a variation of emotional enthusiasm across different forms of excitement examined. As a result, emotional entrainment strengthened national identification, while the perceived emotions of national symbols were reinforced after the tournament (Von Scheve et al., Citation2014).

Sentiment analysis and FIFA World Cup Tweets

The World Cup is one of the most popular sporting events in the world, and it attracts millions of fans from around the globe: an estimated 5 billion football fans worldwide (FIFA, Citation2023). The World Cup also marked a significant moment in the history of social media (Dredge, Citation2014). For instance, around 672 million Twitter messages were disseminated during the FIFA World Cup 2014 (Rogers, Citation2014), and more than 618,000 tweets were posted per minute during the final of the FIFA World Cup 2014 (Kabakus et al., Citation2018; Lucas et al., Citation2017). Furthermore, Twitter reported that users posted 115 billion tweets sharing their views during the 2018 FIFA World Cup (Bavishi & Filadelfo, Citation2018), and around 147 billion conversations were recorded on the platform throughout the 2022 FIFA World Cup (Moore, Citation2022). Twitter has become increasingly popular while users generate an immense amount of data.

In recent times, sentiment analysis has been employed to grasp the general perception of sporting events (Stojanovski et al., Citation2015; Yu & Wang, Citation2015). For instance, Rodriguez's (Citation2017) findings showed that Twitter provided individuals with more space to express themselves and communicate freely. In the 2018 FIFA World Cup, Meier et al. (Citation2021) assessed English Twitter messages associated with the host nation to evaluate whether the tweets portrayed a positive or negative sentiment toward Russia as a destination image. Results showed that negative messages declined at the event's start, leaving it to appear as a “normal” mega sport event on Twitter. Similarly, Twitter was used to examine the real-time sentiment of England team supporters during the 2018 FIFA World Cup. Fan et al. (Citation2020) investigated sport fan sentiment during England's matches against Croatia and Colombia by extracting data with location stamps from England and the United Kingdom. According to their findings, team identification, national identification, and sentiment were significantly higher when the English team led these games.

Furthermore, Barnaghi et al. (Citation2015, Citation2016) conducted a football sentiment analysis and created polarity classifiers to detect positive and negative sentiments. They used unigram and bigram features with a logistic regression algorithm. The algorithm was trained on a set of 4,162 manually labeled tweets. Afterward, they employed the sentiment classifier they developed to evaluate tweets collected during the 2014 FIFA World Cup and examined whether there was a connection between sentiment and key moments during the tournament. Along the same lines, Dewi and Arianto (Citation2023) examined the sentiment of Twitter users regarding Qatar hosting the 2022 FIFA World Cup across three distinct periods: prior to Qatar winning the bid, subsequent to Qatar winning the bid, and during the actual event.

Similarly, in an analysis of football-related tweets in Portugal during the 2013 FIFA Confederations Cup, Firmino Alves et al. (Citation2014) assessed the sentiment by manually categorizing a sample of 1,500 tweets according to their positive and negative emotions. Within a parallel framework, Yu and Wang (Citation2015) collected U.S. football fans’ tweets during five matches at the 2014 FIFA World Cup and classified them into positive, negative, or neutral categories. They examined the reactions soccer fans expressed in their tweets, especially when goals were scored by their team or the opposing team. Emotions like fear and anger were frequently highlighted during the games. Using Twitter to understand users’ emotions showed solid predictive accuracy. Likewise, Lucas et al. (Citation2017) utilized sentiment analysis to explore how people reacted during the 2014 World Cup; the study focused on how these emotions were shaped by contextual factors, such as prior expectations, and how these emotions changed over time.

Furthermore, Alrumaih et al. (Citation2020) analyzed the sentiment of social media comments by assessing the sentiment orientation of the textual features and emoji-based components in tweets posted in Arabic during the 2018 FIFA World Cup. The study results showed that emojis supported the sentiment orientation of the texts. Similarly, Patel and Passi (Citation2020) conducted a study focusing on Twitter data pertaining to the 2014 FIFA World Cup. Their research centered on identifying emotional words present in user tweets. Their methodology involved extracting emotional words from WordNet and assigning sentiment polarity based on the SentiWordNet dictionary. In cases where words in a sentence held contextual meaning within the text, the researchers referred to the lexicon and utilized the SentiWordNet dictionary in conjunction with the relevant part-of-speech (POS) information.

In an alternative study, Aloufi and El Saddik (Citation2018) developed a new sentiment lexicon, a football-specific sentiment classifier, using an automatically labeled football dataset. They used N-gram features and lexicon-based features to train their SVM classifier on the FIFA World Cup 2014 dataset, which was also automatically labeled and susceptible to errors in labeling. Their proposed method enabled them to categorize tweets into positive, negative, or neutral categories with an 85% accuracy rate. Previous research used lexicon-based or machine-learning approaches to analyze sentiment in football-related Twitter data (Barnaghi et al., Citation2016; Dun et al., Citation2022; Lucas et al., Citation2017; Patel & Passi, Citation2020; Yu & Wang, Citation2015). These sentiment lexicons have been derived from various sources and domains, including the FIFA World Cup datasets (Aloufi & El Saddik, Citation2018). Additionally, Nichols et al. (Citation2012) demonstrated that the volume of tweets per minute could automatically reconstruct and summarize the sequence of events in a football match. In light of the above, the current study aims to investigate Twitter users’ sentiments concerning the 2022 FIFA World Cup.

Research context

In the Arab region, specifically the GCC region, several countries have invested heavily in hosting mega sports events. However, limited research has been conducted to assess the public perception of these countries hosting international sporting events (Ishac et al., Citation2018; Ishac & Swart, Citation2022), and only a few studies have focused on measuring residents’ perceptions leading up to the 2022 FIFA World Cup (Alrumaih et al., Citation2020; Ishac et al., Citation2022). Among these studies are Dun et al. (Citation2022), who examined Twitter activity prior to the event to understand the public perception of Qatar's branding and soft power strategy, and Dewi and Arianto (Citation2023), who investigated Twitter users’ sentiment during three distinct periods: prior to Qatar securing the bid, subsequent to Qatar securing the bid, and throughout the actual event.

In light of the above, understanding whether there was a notable rise in user engagement with Arabic tweets related to the 2022 FIFA World Cup during the event can shed light on the impact and popularity of the event. Therefore, this study centers on analyzing tweets, as they provide valuable insights into public perceptions (Brown et al., Citation2021). Understanding the variation of sentiments within tweets across the different phases of the event will allow us to examine the event’s influence on online sentiment. It is important to note that the analysis focuses on tweets’ content, particularly the tone in which it is written (Liu, Citation2015). Sentiment analysis involves understanding tweets’ content in various fields to identify satisfaction levels regarding specific topics, products, or events (Curran et al., Citation2011; Jansen et al., Citation2009; Ranco et al., Citation2015). Furthermore, this study considers the language and geotags of tweets to understand public perceptions of Arabic-speaking users in general and in specific geographical regions before, during, and after the 2022 FIFA World Cup.

Assessing tweets within a specific geographical region can offer a better understanding of the impact of the event on a localized scale. Comparing sentiment and engagement levels across different regions will help identify regional patterns and differences in how the event is perceived, allowing for a more accurate assessment of public engagement and sentiment within the area. However, geotagged tweets introduce challenges in generalizing regional findings, primarily because fewer than 1% of Twitter users grant permission for geotagging (Cheng et al., Citation2010; Hecht et al., Citation2011). In addition, language plays a crucial role in enhancing the accuracy of these sentiment variations, according to Raghupathi et al. (Citation2020). Therefore, Arabic was chosen as the primary language of focus, as it helps better understand perceptions among Arabic-speaking users. To address this research gap, the study puts forth the following Research Question (RQ): How does the engagement of Arabic-speaking Twitter users with tweets related to the 2022 FIFA World Cup vary across different periods, and what patterns emerge in terms of the frequency of negative and positive tweets before, during, and after the event?

Methodology

Sample

Tweets were retrieved from twitter.com using the REST API before, during, and after the 2022 FIFA World Cup. The search API offered by Twitter provides programmatic access to read and write Twitter data, allowing retrieval of a random sample of approximately 1–2% of all tweets. To collect Twitter data, we used the Twitter Academic Research Library (R Package) and the Twitter API to extract Arabic tweets, including tweets geotagged in the Gulf region, that matched keywords and official hashtags related to the World Cup, e.g. “@Fifa2022, #Fifa2022, #Qatar2022, #Worldcup, #Football, #worldcup2022, #fifaworldcup, قطر#, قطر2022, #Qatar, #roadto2022, #roadto2022en, #roadto2022news, #roadto2022, مونديال_قطر, #boycottqatar2022, كاس_العالم_2022”.

Despite Twitter’s announcement of the substantial 147 billion conversations recorded on the platform throughout the competition (Moore, Citation2022), only 157,756 tweets were collected over three stages: before, during, and after the tournament, covering the period from October 1, 2022, to January 1, 2023. The analysis focused on Arabic-language tweets and Arabic-language tweets geographically posted from the GCC region. Consequently, each tweet in the dataset was assigned a location stamp indicating the country from which it originated.
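The three-stage split described above can be sketched in a few lines. This is only an illustration: the collection window is the one stated in the text, while the stage boundaries are an assumption taken from the tournament calendar (opening match on 20 November 2022, final on 18 December 2022), since the exact cut-off dates used in the study are not given. The function name `period_of` is hypothetical.

```python
from datetime import date

# Collection window as stated in the text.
COLLECTION_START = date(2022, 10, 1)
COLLECTION_END = date(2023, 1, 1)

# Assumption: stage boundaries follow the tournament calendar
# (opening match 20 Nov 2022, final 18 Dec 2022).
TOURNAMENT_START = date(2022, 11, 20)
TOURNAMENT_END = date(2022, 12, 18)


def period_of(day: date) -> str:
    """Return 'before', 'during', or 'after' for a tweet's posting date."""
    if not (COLLECTION_START <= day <= COLLECTION_END):
        raise ValueError(f"{day} is outside the collection window")
    if day < TOURNAMENT_START:
        return "before"
    if day <= TOURNAMENT_END:
        return "during"
    return "after"


if __name__ == "__main__":
    for d in (date(2022, 10, 15), date(2022, 12, 1), date(2022, 12, 25)):
        print(d.isoformat(), period_of(d))
```

Keeping the boundaries as named constants makes it easy to rerun the same labelling with different cut-offs.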

To ensure the accuracy and reliability of our data, we recognized that not all Arabic tweets labeled with a location stamp from GCC countries necessarily originated from individuals. To address this, we implemented several measures during the data-cleaning process. First, manual checks were conducted to identify and remove data generated by bots, announcements, or advertisements. This step aimed to ensure that the remaining dataset primarily consisted of tweets from genuine users expressing their opinions. Second, we applied data processing and cleaning techniques to eliminate irrelevant or noise-inducing elements from the data. These techniques involved removing stop words, hashtags, punctuation, special characters, symbols, and URLs. Eliminating irrelevant data enhanced the quality and reliability of the remaining tweets for analysis, resulting in 43,658 refined tweets. Subsequently, linguistic techniques, such as tokenization, stemming, and lemmatization, were employed to delve deeper into the sentiment expressed in these tweets. These techniques helped obtain the root forms of Arabic words, enabling a more accurate sentiment assessment. Accordingly, we employed the Mohammad and Turney (Citation2013) Arabic Sentiment Lexicon to categorize the sentiment of each tweet. Following this lexicon, tweets were categorized as positive, negative, or neutral based on the sentiment conveyed by the words used.
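A minimal sketch of the text-cleaning step is given below, written in Python rather than the R tooling used in the study. It covers only the removal of URLs, hashtags and mentions, punctuation, and stop words; stemming and lemmatization are left out. The stop-word set is a deliberately tiny, hypothetical sample, and the function name `clean_tweet` is illustrative rather than taken from the study.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word sample; a real run would
# load a complete Arabic stop-word list.
ARABIC_STOP_WORDS = {"في", "من", "على", "إلى", "عن", "و"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_RE = re.compile(r"[#@]\S+")  # hashtags and @mentions
# ASCII punctuation plus the Arabic comma, semicolon, and question mark.
PUNCT_RE = re.compile("[" + re.escape(string.punctuation + "،؛؟") + "]")


def clean_tweet(text: str) -> list[str]:
    """Strip URLs, hashtags/mentions, and punctuation, then drop stop words."""
    text = URL_RE.sub(" ", text)    # URLs first, before punctuation breaks them up
    text = TAG_RE.sub(" ", text)
    text = PUNCT_RE.sub(" ", text)
    return [tok for tok in text.split() if tok not in ARABIC_STOP_WORDS]


if __name__ == "__main__":
    print(clean_tweet("مونديال رائع في قطر! https://t.co/abc #Qatar2022"))
```

URLs are removed before punctuation so that a link is dropped as a whole instead of being split into fragments.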

Based on this examination, we found that 121 tweets originated in Bahrain, 418 in the United Arab Emirates, 585 in Oman, 968 in Kuwait, 3,242 in the Kingdom of Saudi Arabia (KSA), and 2,174 in Qatar, while 36,150 tweets were not geotagged (see Table 1).
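As a quick consistency check, the per-country counts above can be summed and compared with the 43,658 refined tweets reported earlier. The short sketch below uses only the figures given in the text; the variable names are illustrative.

```python
# Geotagged tweet counts per GCC country, as reported in the text.
geotagged = {
    "Bahrain": 121,
    "United Arab Emirates": 418,
    "Oman": 585,
    "Kuwait": 968,
    "Saudi Arabia": 3242,
    "Qatar": 2174,
}
not_geotagged = 36150

geotagged_total = sum(geotagged.values())
total = geotagged_total + not_geotagged
geotagged_share = geotagged_total / total

print(f"geotagged: {geotagged_total}, total: {total}")
print(f"share of geotagged tweets: {geotagged_share:.1%}")
```

The total matches the refined dataset size of 43,658, and roughly 17% of the refined tweets carry a GCC location stamp.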

Table 1. Tweets collected during the 2022 FIFA World Cup in the GCC.

Analysis and coding procedure

Building upon the research of Budiharto and Meiliana (Citation2018), this project examined the emotional aspects of each Arabic tweet, employing a lexicon-based model. The rationale behind this choice is related to the complexity of the Arabic language, known for its rich expressiveness and the variety of dialect nuances. To better understand the emotional tones in the tweets, we determined the frequency of each word in each tweet and categorized the tweets based on the sum of the sentiment scores of the words within them. If the sum of the sentiment scores of all the words in a tweet was greater than 0, the tweet was classified as positive. If the sum was less than 0, the tweet was classified as negative. Finally, if the sum was equal to 0, the tweet was classified as neutral. The process started with extracting the NRC lexical elements from a tweet using an application written in R, and the number of words that matched each of the eight emotion categories was used to generate the score for each emotion. Then, the degree of polarity was obtained by dividing the total score by the number of words carrying a sentiment value in each tweet. Table 2 presents a sample of tweets with their lexicon, emotions, and how they were coded. However, it was found that the Arabic lexicon dictionary proposed by Mohammad and Turney (Citation2013) did not include all the relevant words, especially those specific to the Khaliji dialect spoken by individuals in the GCC region. Therefore, the dictionary was extended to include more relevant terms for the analysis, such as “Irhabu” (ارحبو) and “Kaffou” (كفو), which mean “welcome” and “very good,” respectively. Additionally, a Khaliji dialect word, “فديتج” (which means “I sacrifice for you”), was also included.

To ensure the precision of the analysis, a manual labeling procedure was implemented to establish the association between the newly incorporated words and their respective emotions. The obtained tweets, originally written in Arabic, were translated into English by authors proficient in both languages. Table 2 exhibits the retrieved tweets after translation.
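The scoring rule described above can be sketched as follows. This is an illustration in Python, not the R application used in the study. It assumes each lexicon word carries a score of +1 (positive) or -1 (negative); the lexicon is a toy dictionary in which only the two Khaliji terms come from the text and the remaining entries are hypothetical. The function name `score_tweet` is also an assumption.

```python
# Toy lexicon. Assumption: +1 marks a positive word, -1 a negative word.
LEXICON = {
    "ارحبو": 1,   # "Irhabu" (welcome), Khaliji term added in the study
    "كفو": 1,     # "Kaffou" (very good), Khaliji term added in the study
    "رائع": 1,    # "wonderful", illustrative entry
    "سيء": -1,    # "bad", illustrative entry
}


def score_tweet(tokens, lexicon=LEXICON):
    """Return (label, total score, degree of polarity) for a tokenized tweet."""
    scores = [lexicon[tok] for tok in tokens if tok in lexicon]
    total = sum(scores)
    if total > 0:
        label = "positive"
    elif total < 0:
        label = "negative"
    else:
        label = "neutral"
    # Degree of polarity: total score divided by the number of words
    # that carry a sentiment value; 0.0 when no word matches the lexicon.
    polarity = total / len(scores) if scores else 0.0
    return label, total, polarity


if __name__ == "__main__":
    print(score_tweet(["كفو", "رائع", "سيء"]))  # two positive words, one negative
    print(score_tweet(["قطر"]))                 # no lexicon match
```

A tweet with two positive words and one negative word sums to +1 and is labelled positive, while a tweet with no lexicon match falls into the neutral class.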

Table 2. Lexicon sample.

The researchers used an application written in R to extract the NRC lexical elements from each tweet. Furthermore, they applied Budiharto and Meiliana's (Citation2018) approach by categorizing tweets based on their emotions. The model defines the sentiment of a tweet as positive if the sum of the sentiment scores of its words is greater than zero, negative if the sum is less than zero, and neutral if the sum is equal to zero (see ).

Results

The researchers employed a lexicon-based approach to assess the intensity of specific emotions. They counted the frequency of lexicon words in each tweet, allowing for quantitative emotion measurement. For example, if a tweet contained two words with positive sentiment and one with negative sentiment, the final score would be the sum of the positive and negative word scores, and the tweet would be classified as positive because that sum is greater than zero. Illustrates the findings regarding the number of tweets related to the 2022 FIFA World Cup. The data shows a gradual increase in tweets starting from October, with a significant surge in November, peaking in the middle of that month. These patterns indicate an escalating interest and active participation among Twitter users with regard to the event during this period . offers valuable insights into the tweet activity observed within the collected Arabic tweets and Arabic tweets geotagged from the GCC countries, demonstrating that, in most instances, there was an increase in tweet volume during the event compared to the pre-event period. These findings indicate an elevated level of Twitter activity and public engagement surrounding the 2022 FIFA World Cup among Arabic-speaking users, including those whose tweets were geotagged in the GCC. Consequently, these observations highlight the dynamic nature of public discourse on social media platforms and the growing enthusiasm and interest in the 2022 FIFA World Cup as the event unfolded.

Figure 1. Sentiment variation for Arabic Tweets.

Figure 2. Variation of positive sentiment over time by country.

Table 3. Arabic-Tweets Sentiment by Country and period.

Evidently, in the countries under observation, the emotions expressed were predominantly neutral rather than positive or negative in the period leading up to the event: 68.8% of the tweets observed in Bahrain were neutral, followed by 59.5% in the UAE and 49% in KSA. However, a significant shift in emotions was observed once the event began, with a notable increase in positive emotions: in Bahrain, an increase from 25% of positive tweets before the event to 43.8% of positive tweets during the event, from 25.7% to 43.4% in the UAE, from 39.2% before the event to 50.4% in KSA, and from 48.9% to 58.2% in Kuwait. On the other hand, in Qatar and Oman, positive tweets did not significantly change when comparing the two time periods (see ).

The study's findings suggest that there was a considerable increase in the proportion of positive tweets during the event compared to the period before it, with variations observed among countries. Specifically, the percentage of positive tweets ranged from 43.4% (UAE) to 58.2% (Kuwait) during the event, compared to a range from 25% (Bahrain) to 50.7% (Qatar) prior to it. These results provide valuable insight into the emotional dynamics of Twitter users in the context of a major sporting event and could have significant implications for social media and sentiment analysis research (see ).

The results show that the proportion of negative tweets decreased during the event compared to before. Leading up to the event, the highest percentage of negative tweets was 14.9% observed in the UAE, while the lowest was 6.2% observed in Bahrain. During the event, negative tweets decreased significantly, with a maximum of 12.4% observed in Bahrain and a minimum of 3.2% observed in Kuwait. These findings highlight a shift in the emotional tone of Twitter users towards more positive sentiments during a major sporting event and could have important implications for understanding the impact of such events on social media dynamics.

Figure 3 demonstrates the variation of negative tweets recorded throughout the entire duration of the event. Notably, the number of tweets identified during the event increased compared to the pre-event period. Therefore, when considering the percentage of positive and negative tweets, it is essential to consider the total number of tweets observed in each country to ensure accurate interpretation and analysis of the data. This is particularly relevant in the context of large-scale events like the FIFA World Cup, where the volume of tweets can be substantial and may influence the distribution of emotions observed.

Figure 3. Variation of negative sentiment over time by country.

The study revealed a noteworthy decline in the proportion of negative tweets and a rise in positive tweets during the event in contrast to the preceding timeframe. For instance, in Bahrain, no negative tweets were observed, while the percentage of positive tweets reached 68.8%. The UAE followed with 5.4% negative tweets and 41.1% positive tweets, while Qatar had the third lowest percentage of negative tweets at 5.9%, alongside 55.9% positive tweets. Interestingly, Oman had the highest percentage of negative tweets at 13.2%, alongside 60.5% positive tweets. The percentages of negative tweets in Kuwait and KSA were similar, at 7.1% and 7.8% respectively, while positive tweets accounted for 53.5% in Kuwait compared to 47.6% in KSA.

In addition to the previously discussed observations, the study also found interesting trends among Arabic-speaking users with non-geotagged tweets. For example, 11.2% of these tweets were negative before the event, which decreased to 10% during the event and 7.8% after the event (see ). Similarly, the proportion of positive tweets increased from 48% before the event to 61.1% after. Furthermore, the totals of tweets categorized by emotion show an overall decrease in negative tweets: negative tweets accounted for 11.11% before the event, compared to 8.68% during and 7.5% after the event. Conversely, positive tweets increased from 47.83% before the event to 55.25% after the event (see Table 4).
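Percentages of this kind can be derived from labelled tweets with a simple aggregation. The sketch below is illustrative only: the records are invented toy data, not the study's dataset, and `sentiment_shares` is a hypothetical helper name.

```python
from collections import Counter, defaultdict

LABELS = ("positive", "neutral", "negative")


def sentiment_shares(records):
    """Map each period to the percentage of tweets per sentiment label.

    `records` is an iterable of (period, label) pairs.
    """
    counts = defaultdict(Counter)
    for period, label in records:
        counts[period][label] += 1

    shares = {}
    for period, counter in counts.items():
        n = sum(counter.values())
        shares[period] = {
            label: round(100 * counter[label] / n, 2) for label in LABELS
        }
    return shares


if __name__ == "__main__":
    # Invented toy records, for illustration only.
    toy = [
        ("before", "positive"), ("before", "negative"),
        ("before", "neutral"), ("before", "positive"),
        ("during", "positive"),
    ]
    for period, share in sentiment_shares(toy).items():
        print(period, share)
```

Because each percentage is computed within its own period, differences in tweet volume between periods do not distort the shares, although periods or countries with few tweets still yield less stable percentages.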

Table 4. Arabic-Tweets percentage by period of time.

Discussion

The growing interest in research within the Arab region is sparked by the increased investments made in bidding for and hosting major sporting events. Understanding the underlying factors that drive these investments has become a significant motivation for researchers, as the economic, social, and cultural impacts associated with hosting such events are substantial. This study aims to address existing gaps in knowledge by investigating the dynamics of public perceptions related explicitly to the 2022 FIFA World Cup. It does so by focusing on Arabic-speaking users, along with Arabic tweets originating in the GCC countries, across the periods before, during, and after the 2022 FIFA World Cup. One noteworthy aspect of this research is its focus on the use of Arabic as the primary language of investigation. This approach deepens our insights and enables the targeted analysis of specific populations, as highlighted by Ishac et al. (Citation2022). Furthermore, this study builds upon the methodology introduced by Ishac et al. (Citation2022) by incorporating public perceptions across multiple geographical locations, going beyond a narrow focus limited to participants’ nationalities. By expanding the scope of analysis, this research provides valuable insights into the diverse perspectives and experiences within the Arabic-speaking community, including the GCC region, during the 2022 FIFA World Cup.

The study adopts the approach by Budiharto and Meiliana (Citation2018), where tweets are categorized based on emotional content. This categorization involved classifying tweets as negative if the sum of observed words in a tweet resulted in a negative value, as neutral if the sum of observed words equaled zero, and as positive if the sum of observed words surpassed zero. By employing this methodology, the study facilitated a systematic classification of tweets into specific emotional categories, enabling a comprehensive analysis of the sentiments conveyed within the collected data.

Our results indicate that sentiment analysis of tweets yields findings that align with our expectations and can be reasonably justified. This approach also enabled us to analyze a substantial volume of tweets from Arabic-speaking users, including Arabic tweets geotagged in the GCC region, associated with the 2022 FIFA World Cup. The findings shed further light on the research question, as the count of Arabic tweets, including those from the GCC countries, increased from the pre-event period to the event itself. These results align with and reinforce the findings of Ishac et al. (2022), suggesting that hosting the 2022 FIFA World Cup in Qatar can exert a substantial influence on the broader region. Additionally, our results align with Theodorakis et al. (2019), suggesting that the region exhibits a stronger association with football than with any other sport. In addition, the surge in tweet activity observed in the Kingdom of Saudi Arabia throughout the event indicates the game's popularity in the country. This observation supports the findings of Xu et al. (2022) and the projections made by Ishac et al. (2022) regarding the Kingdom of Saudi Arabia's potential co-hosting of the tournament.

The study employed a three-dimensional emotional assessment, categorizing tweets into positive, neutral, and negative sentiments. The neutral dimension adds to the credibility of the findings by capturing how Twitter users perceive the event without a clear positive or negative bias. For example, in Qatar, Al-Emadi et al. (2022) raised concerns that young residents were worried because of new implications in Qatar, while Ishac et al. (2022) reported that Qatar residents exhibited more positivity than concern towards the event. Our analysis of the geotagged tweets from Qatar speaks to both positions: it indicates a decrease in negative tweets and an increase in neutral tweets from the period before the event to the period during it, which is consistent with the findings of Al-Emadi et al. (2022) regarding the concerns and worries among Qatar residents before the event.

On the other hand, the study also revealed an increase in positive tweets across the periods before, during, and after the event, aligning with the predictions made by Ishac et al. (2022). The continuous rise in the positive sentiment expressed on Twitter suggests that the event positively affected public perception. In addition, the increase in positive tweets and decrease in negative tweets from the pre-event period to the event period can be attributed to the successful engagement of individuals, which aligns with previous findings (Inoue & Havard, 2014; Dewi & Arianto, 2023). Inoue and Havard (2014) suggest that sport events generate enthusiasm by showcasing renowned athletes from various nations and regions. This aspect likely contributed to the overall positive sentiment expressed on Twitter during the 2022 FIFA World Cup.

The study also revealed a decrease in the number of tweets posted by individuals from the period during the event to the period after it. Specifically, support and interest increased during the event itself but gradually declined after it concluded. This finding aligns with Ribeiro et al. (2022), who suggest that support for and interest in an event tend to change over time.

Overall, the study's results support the notion that the sentiment expressed on Twitter shifted towards more positive and fewer negative emotions during and after the event. Moreover, the decrease in the number of tweets posted by individuals after the event reflects a typical trend observed in studies such as Ribeiro et al. (2022), where interest and engagement gradually decrease after the event's conclusion.

Conclusion and recommendation

In conclusion, our study adopted a novel approach to analyze the emotions expressed in tweets regarding the 2022 FIFA World Cup before, during, and after the event, with a particular focus on Arabic tweets generated within the GCC countries. Our findings support the use of sentiment analysis of tweets for this purpose, as the results aligned with our expectations and can be reasonably explained. Furthermore, the extensive data analysis enabled us to examine a substantial volume of Arabic tweets collected from Arabic-speaking users, including tweets from the GCC countries, enhancing the robustness of our study.

The findings of our sentiment analysis demonstrated that the 2022 FIFA World Cup was perceived positively in a broader sphere. This study addresses a crucial aspect that has not been extensively covered in previous literature. By examining Arabic tweets, including tweets from the GCC region, decision-makers and sports organizers can extract valuable insights into users’ opinions, which can help enhance the strategic planning of mega sport events in the region.

Our results indicate that the positive opinions expressed by Arabic-speaking users, including those tweeting from the GCC countries, increased during and after the event compared to the period before the commencement of the 2022 FIFA World Cup. This result highlights the event's positive impact on the sentiments of individuals within the region. The study also underscores the significance of capturing and analyzing these sentiments to comprehend the societal shifts and transformations accompanying mega sporting events like the FIFA World Cup. Furthermore, our results reinforce the viewpoint presented by Ishac et al. (2022) that hosting a major event like the FIFA World Cup has the potential to extend its positive impact beyond the host country. Ishac et al. (2022) also anticipated that hosting the 2022 FIFA World Cup in the Arab region would contribute to dispelling negative stereotypes associated with Muslim countries and showcase Arab hospitality. The increase in positive tweets may have positively affected public perceptions and reshaped stereotypes, in line with these expectations. Building upon the findings of Tichaawa et al. (2015), who argued that the 2010 FIFA World Cup had the potential to contribute to the development of Africa, our study indicates a similar influence radiating across Arabic-speaking users and across the GCC region as a result of hosting the 2022 FIFA World Cup.

Moreover, our research emphasizes the significance of assessing public sentiment, which enables organizers to gauge the general population's feelings toward government policies and initiatives. In the context of regional development initiatives, particularly the substantial investment made to host the 2022 FIFA World Cup, our results highlight how positive public perception can provide a better understanding of government strategies.

Focusing on sentiment analysis allows dynamic shifts in individuals’ sentiments to be observed as they tweet about this specific event. However, future studies should consider measuring the impact of these sentiments on the residents of each of the countries involved. This approach can provide valuable insights into the implications of hosting mega sporting events. By assessing the impact associated with sentiment on individuals, researchers can better understand how the event influences residents’ attitudes, behaviors, and perceptions, which can indicate whether changes are occurring within societies and whether hosting the event contributes to promoting national pride, improving regional development, and/or challenging stereotypes.

However, it is important to acknowledge the limitations of this study. Firstly, the sample of geotagged users collected from the different countries in the GCC was small. Even though geotagged tweets can provide location-based information, they may not exclusively reflect residents’ perceptions. Future research should consider analyzing user profile information or conducting a time-zone analysis to understand users’ demographics and sentiments more comprehensively.

Despite the robustness of the sentiment analysis methodology employed in this study, other limitations must be acknowledged. The approach relies on a lexicon-based method, wherein tweets are broken down into individual lexical items and the cumulative polarity of those items determines the overall sentiment of the tweet. This approach has inherent constraints, as it fails to consider the nuanced contextual aspects of the tweets. Future research should consider other sentiment analysis tools that can help improve the accuracy of sentiment classification. In addition, Twitter no longer provides free access to its API. As a result, this study may not be reproducible by future researchers who do not have access to the previously available free API.

Finally, future research should consider evaluating the global impact of hosting the 2022 FIFA World Cup, specifically focusing on future co-hosting countries with distinct cultural backgrounds, such as the United States, Canada, and Mexico, which are due to co-host the 2026 tournament. By examining how the World Cup influences these diverse cultures, we can better understand its effects on individuals from different demographic backgrounds in co-hosting nations. This knowledge would be a valuable reference for future co-hosting countries, enabling them to anticipate and assess the potential impact of hosting such a major sporting event on their unique cultural landscapes.

Acknowledgements

Open Access funding provided by the Qatar National Library.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • Al-Emadi, A., Sellami, A. L., & Fadlalla, A. M. (2022). The perceived impacts of staging the 2022 FIFA world cup in Qatar. Journal of Sport & Tourism, 26(1), 1–20. https://doi.org/10.1080/14775085.2021.2017327
  • Aloufi, S., Alzamzami, F., Hoda, M., & El Saddik, A. (2018). Soccer fans sentiment through the eye of big data: The UEFA champions league as a case study. 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR).
  • Aloufi, S., & El Saddik, A. (2018). Sentiment identification in football-specific tweets. IEEE Access, 6, 78609–78621. https://doi.org/10.1109/ACCESS.2018.2885117
  • Alrumaih, A., Al-Sabbagh, A., Alsabah, R., Kharrufa, H., & Baldwin, J. (2020). Sentiment analysis of comments in social media. International Journal of Electrical & Computer Engineering (2088-8708), 10(6).
  • Al-Thani, M. (2021). Channelling soft power: the Qatar 2022 world cup, migrant workers, and international image. The International Journal of the History of Sport, 38(17), 1729–1752. https://doi.org/10.1080/09523367.2021.1988932
  • Alves, F., Baptista, A. L., Firmino, C. D. S., Oliveira, A. A., d, M. G., & Paiva, A. C. d. (2014). A Comparison of SVM versus naive-bayes techniques for sentiment analysis in tweets: A case study with the 2013 FIFA confederations cup. Proceedings of the 20th Brazilian Symposium on Multimedia and the Web.
  • Amara, M. (2005). 2006 Qatar Asian games: A ‘modernization’ project from above? Sport in Society, 8(3), 493–514. https://doi.org/10.1080/17430430500249217
  • Balduck, A.-L., Maes, M., & Buelens, M. (2011). The social impact of the tour de France: Comparisons of residents' Pre- and post-event perceptions. European Sport Management Quarterly, 11(2), 91–113. https://doi.org/10.1080/16184742.2011.559134
  • Barnaghi, P., Ghaffari, P., & Breslin, J. G. (2015). Text analysis and sentiment polarity on FIFA world cup 2014 tweets. Conference ACM SIGKDD.
  • Barnaghi, P., Ghaffari, P., & Breslin, J. G. (2016). Opinion mining and sentiment polarity on twitter and correlation between events and sentiment. 2016 IEEE second international conference on big data computing service and applications (BigDataService).
  • Bavishi, J., & Filadelfo, E. (2018). Insights into the 2018 #WorldCup conversation on Twitter. Twitter. https://blog.twitter.com/official/en_us/topics/events/2018/2018-World-Cup-Insights.html. Accessed, 23.
  • Biscaia, R., Correia, A., Santos, T., Ross, S., & Yoshida, M. (2017). Service quality and value perceptions of the 2014 FIFA World Cup in Brazil. Event Management, 21(2), 201–216. https://doi.org/10.3727/152599517X14878772869685
  • Brown, M. E., Dustman, P. A., & Barthelemy, J. J. (2021). Twitter impact on a community trauma: An examination of who, what, and why it radiated. Journal of Community Psychology, 49(3), 838–853. https://doi.org/10.1002/jcop.22330
  • Budiharto, W., & Meiliana, M. (2018). Prediction and analysis of Indonesia Presidential election from Twitter using sentiment analysis. Journal of Big Data, 5(1), 1–10. https://doi.org/10.1186/s40537-018-0164-1
  • Cheng, Z., Caverlee, J., & Lee, K. (2010). You are where you tweet: a content-based approach to geo-locating twitter users. Proceedings of the 19th ACM international conference on Information and knowledge management.
  • Curran, K., O’Hara, K., & O’Brien, S. (2011). The role of Twitter in the world of business. International Journal of Business Data Communications and Networking, 7(3), 1–15. https://doi.org/10.4018/jbdcn.2011070101
  • Deccio, C., & Baloglu, S. (2002). Nonhost community resident reactions to the 2002 Winter Olympics: The spillover impacts. Journal of Travel Research, 41(1), 46–56. https://doi.org/10.1177/0047287502041001006
  • Dewi, S., & Arianto, D. B. (2023). Twitter sentiment analysis towards Qatar as host of the 2022 world cup using textblob. Journal of Social Research, 2(2), 443–455. https://doi.org/10.55324/josr.v2i2.615
  • Dredge, S. (2014). World Cup was biggest event yet for Twitter with 672m tweets. The Guardian.
  • Dun, S., Rachdi, H., Memon, S. A., Pillai, R. K., Mejova, Y., & Weber, I. (2022). Perceptions of FIFA men’s world Cup 2022 host nation Qatar in the twittersphere. International Journal of Sport Communication, 1(aop), 1–10.
  • Fan, M., Billings, A., Zhu, X., & Yu, P. (2020). Twitter-based BIRGing: Big Data analysis of English national team fans during the 2018 FIFA World Cup. Communication & Sport, 8(3), 317–345. https://doi.org/10.1177/2167479519834348
  • FIFA Publications. (2023). The football landscape – The Vision 2020-2023. Available online at: https://publications.fifa.com/en/vision-report-2021/the-football-landscape/ (accessed April 18, 2023).
  • Fredline, E., & Faulkner, B. (2001). Variations in residents’ reactions to major motorsport events: Why residents perceive the impacts of events differently. Event Management, 7(2), 115–125. https://doi.org/10.3727/152599501108751524
  • Gibson, H. J., Walker, M., Thapa, B., Kaplanidou, K., Geldenhuys, S., & Coetzee, W. (2014). Psychic income and social capital among host nation residents: A pre–post analysis of the 2010 FIFA World Cup in South Africa. Tourism Management, 44, 113–122. https://doi.org/10.1016/j.tourman.2013.12.013
  • Hecht, B., Hong, L., Suh, B., & Chi, E. H. (2011). Tweets from Justin Bieber's heart: the dynamics of the location field in user profiles. Proceedings of the SIGCHI conference on human factors in computing systems.
  • Inoue, Y., & Havard, C. T. (2014). Determinants and consequences of the perceived social impact of a sport event. Journal of Sport Management, 28(3), 295–310. https://doi.org/10.1123/jsm.2013-0136
  • Ishac, W. (2020). Arab countries’ strategies to bid and to host major sport events. In F. Hong, & L. Zhouxian (Eds.), The routledge handbook of sport in Asia (pp. 437–446). Routledge.
  • Ishac, W., Sobry, C., Bouchet, P., & Cernaianu, S. (2018). The influence of hosting an international sport event on the young generation: the case of Qatar. International Sports Studies, 40(2), 19–33. https://doi.org/10.30819/iss.40-2.03
  • Ishac, W., & Swart, K. (2022). Social impact projections for Qatar youth residents from 2022: The case of the IAAF 2019. Frontiers in Sports and Active Living, 4, 922997. https://doi.org/10.3389/fspor.2022.922997
  • Ishac, W., Swart, K., & Mollazehi, M. (2022). Qatar Residents’ Perceptions of the 2022 FIFA World Cup: Projections for Future Co-hosting Countries.
  • Jansen, B. J., Zhang, M., Sobel, K., & Chowdury, A. (2009). Twitter power: Tweets as electronic word of mouth. Journal of the American Society for Information Science and Technology, 60(11), 2169–2188. https://doi.org/10.1002/asi.21149
  • Kabakus, A. T., Simsek, M., & Belenli, Y. (2018). The wisdom of the silent crowd: predicting the match results of world cup 2018 through twitter. International Journal of Computer Applications, 182, 40–45.
  • Kim, J., Kang, J. H., & Kim, Y.-K. (2014). Impact of mega sport events on destination image and country image. Sport Marketing Quarterly, 23(3).
  • Kim, S. S., & Petrick, J. F. (2005). Residents’ perceptions on impacts of the FIFA 2002 world cup: The case of Seoul as a host city. Tourism Management, 26(1), 25–38. https://doi.org/10.1016/j.tourman.2003.09.013
  • Kim, W., Jun, H. M., Walker, M., & Drane, D. (2015). Evaluating the perceived social impacts of hosting large-scale sport tourism events: Scale development and validation. Tourism Management, 48, 21–32. https://doi.org/10.1016/j.tourman.2014.10.015
  • Kogoya, K., Guntoro, T. S., & Putra, M. F. P. (2022). Sports event image, satisfaction, motivation, stadium atmosphere, environment, and perception: A study on the biggest multi-sport event in Indonesia during the pandemic. Social Sciences, 11(6), 241. https://doi.org/10.3390/socsci11060241
  • Liu, B. (2015). Sentiment analysis: mining sentiments, opinions, and emotions. Cambridge University Press.
  • Liu, D. (2016). Social impact of major sports events perceived by host community. International Journal of Sports Marketing and Sponsorship.
  • Lucas, G. M., Gratch, J., Malandrakis, N., Szablowski, E., Fessler, E., & Nichols, J. (2017). GOAALLL!: Using sentiment in the world cup to explore theories of emotion. Image and Vision Computing, 65, 58–65. https://doi.org/10.1016/j.imavis.2017.01.006
  • Meier, H. E., Mutz, M., Glathe, J., Jetzke, M., & Hölzen, M. (2021). Politicization of a contested mega event: The 2018 FIFA World Cup on Twitter. Communication & Sport, 9(5), 785–810. https://doi.org/10.1177/2167479519892579
  • Mohammad, S. M., & Turney, P. D. (2013). NRC emotion lexicon. National Research Council, Canada, 2, 234.
  • Moore, C. (2022). “Twitter delivered 147 billion impressions for World Cup 2022”. Worldsoccertalk. Retrieved from: https://worldsoccertalk.com/news/twitter-delivered-147-billion-impressions-for-world-cup-2022-20221220-WST-412893.html.
  • Mourão, T., Ribeiro, T., & Cunha de Almeida, V. M. (2022). Psychic income benefits of the Rio 2016 Olympic Games: comparison of host community pre- and post-Games perceptions. Journal of Sport & Tourism, 26(1), 21–41. https://doi.org/10.1080/14775085.2021.2023364
  • Nichols, J., Mahmud, J., & Drews, C. (2012). Summarizing sporting events using twitter. Proceedings of the 2012 ACM international conference on Intelligent User Interfaces.
  • Oja, B. D., Wear, H. T., & Clopton, A. W. (2018). Major sport events and psychic income: The social anchor effect. Journal of Sport Management, 32(3), 257–271. https://doi.org/10.1123/jsm.2016-0170
  • Oshimi, D., Yamaguchi, S., Fukuhara, T., & Taks, M. (2021). Expected and experienced social impact of host residents during rugby world cup 2019: A panel data approach. Frontiers in Sports and Active Living, 3, 628153. https://doi.org/10.3389/fspor.2021.628153
  • Patel, R., & Passi, K. (2020). Sentiment analysis on twitter data of world cup soccer tournament using machine learning. IoT, 1(2), 218. https://doi.org/10.3390/iot1020014
  • Qatar Olympic Committee. (2011). Sports Sector Strategy Map (2011-2016). Available online at: https://blogs.napier.ac.uk/qatar2022/wp-content/uploads/sites/29/2015/06/sports_sector_strategy_final-English.pdf (accessed March 12, 2023).
  • Raghupathi, V., Ren, J., & Raghupathi, W. (2020). Studying public perception about vaccination: A sentiment analysis of tweets. International Journal of Environmental Research and Public Health, 17(10), 3464. https://doi.org/10.3390/ijerph17103464
  • Ranco, G., Aleksovski, D., Caldarelli, G., Grčar, M., & Mozetič, I. (2015). The effects of Twitter sentiment on stock price returns. PLoS One, 10(9), e0138441. https://doi.org/10.1371/journal.pone.0138441
  • Ribeiro, T., Yoda, R., Papadimitriou, D. A., & Correia, A. (2022). Resident attitudes toward the Rio 2016 Olympic Games: A longitudinal study on social legacy and support behaviours. Journal of Hospitality and Tourism Management, 50, 188–198. https://doi.org/10.1016/j.jhtm.2022.02.018
  • Rodriguez, N. S. (2017). #FIFAputos: A Twitter textual analysis over “Puto” at the 2014 World Cup. Communication & Sport, 5(6), 712–731. https://doi.org/10.1177/2167479516655429
  • Rogers, S. (2014). Insights into the #WorldCup conversation on Twitter. Twitter Blog. Available online at: https://blog.twitter.com/en_us/a/2014/insights-into-the-worldcup-conversation-on-twitter (accessed April 14, 2023).
  • Scharfenort, N. (2012). Urban development and social change in Qatar: The Qatar national vision 2030 and the 2022 FIFA world cup. Journal of Arabian Studies, 2(2), 209–230. https://doi.org/10.1080/21534764.2012.736204
  • Stojanovski, D., Strezoski, G., Madjarov, G., & Dimitrovski, I. (2015). Emotion identification in FIFA world cup tweets using convolutional neural network. 2015 11th International Conference on Innovations in Information Technology (IIT).
  • Storm, R. K., & Jakobsen, T. G. (2020). National pride, sporting success and event hosting: An analysis of intangible effects related to major athletic tournaments. International Journal of Sport Policy and Politics, 12(1), 163–178. https://doi.org/10.1080/19406940.2019.1646303
  • Sullivan, G. B. (2018). Collective emotions: A case study of South African pride, euphoria and unity in the context of the 2010 FIFA World Cup. Frontiers in Psychology, 9, 1252. https://doi.org/10.3389/fpsyg.2018.01252
  • Theodorakis, N. D., Akindes, G., Wann, D. L., & Chadwick, S. (2019). Attitudes and consumption behaviors of football fans in the Middle East. Journal of Sport Behavior, 42(2), 225–250.
  • Tichaawa, T. M., Bama, H., & Swart, K. (2015). Community perceptions of the socio-economic legacies of the 2010 FIFA World Cup in Nelson Mandela Bay, Port Elizabeth, South Africa: A four-year post-event analysis. African Journal for Physical Health Education, Recreation and Dance, 21(4.2), 1376–1388.
  • Vetitnev, A. M., & Bobina, N. (2017). Residents’ perceptions of the 2014 Sochi Olympic Games. Leisure Studies, 36(1), 108–118. https://doi.org/10.1080/02614367.2015.1105857
  • Von Scheve, C., Beyer, M., Ismer, S., Kozłowska, M., & Morawetz, C. (2014). Emotional entrainment, national symbols, and identification: A naturalistic study around the men’s football World Cup. Current Sociology, 62(1), 3–23. https://doi.org/10.1177/0011392113507463
  • Waitt, G. (2003). Social impacts of the Sydney Olympics. Annals of Tourism Research, 30(1), 194–215. https://doi.org/10.1016/S0160-7383(02)00050-6
  • Xu, Z., Wu, C., & Li, X. (2022). Residents’ perceptions and behavioral intentions towards mega-sports events: A case study of Beijing 2022 Olympic winter games. Sustainability, 14(22), 14955. https://doi.org/10.3390/su142214955
  • Yu, Y., & Wang, X. (2015). World Cup 2014 in the Twitter World: A big data analysis of sentiments in U.S. sports fans’ tweets. Computers in Human Behavior, 48, 392–400. https://doi.org/10.1016/j.chb.2015.01.075