
Blockchain-based privacy-preserving multi-tasks federated learning framework

Article: 2299103 | Received 05 Jul 2023, Accepted 20 Dec 2023, Published online: 08 Jan 2024

Abstract

Federated learning (FL), an effective method for solving the “data island” problem, has attracted widespread attention in recent years. However, as FL is deployed in practical applications, the growing number of FL tasks makes training management more complex and the trade-off among multiple tasks difficult. To overcome this weakness, this work proposes a privacy-preserving multi-task FL framework based on a partitioned blockchain, which can run several different FL tasks issued by multiple requesters. First, a temporary committee is formed for each FL task to facilitate the visualisation, organisation and management of secure aggregation. Second, the proposed framework combines Paillier homomorphic encryption with the Pearson correlation coefficient to protect users' privacy and ensure the accuracy of the global model. Finally, a new blockchain-based reward method is presented to inspire participants to share their valuable data. The experimental results show that the global model accuracy of our proposed framework reaches 98.43%, making the framework well suited to practical application environments, especially in the industrial field.

1. Introduction

Over the past decades, Artificial Intelligence (AI) has been widely applied to various areas, such as the Industrial Internet of Things (Hu et al., Citation2020; Kang et al., Citation2019; Zhang et al., Citation2021), the Internet of Vehicles (Ayaz et al., Citation2022; Otoum et al., Citation2020), smart healthcare (Kumar et al., Citation2021; Yang et al., Citation2023) and smart cities (Ullah et al., Citation2020). In practice, AI derives valuable insights from large amounts of high-quality data scattered across different groups or individuals. However, these data generally contain private or sensitive information, which makes data owners hesitant to expose or upload them, and this limitation has hampered the development of AI. To overcome this weakness, federated learning (FL) (Yang et al., Citation2021; Zhao et al., Citation2021), a distributed machine learning framework, has attracted extensive interest from academia and industry.

FL overcomes the problem of “data silos” and makes data “available but invisible”: a central server creates a shared global model from gradients instead of original data. Clearly, this is an effective way to enhance the use and sharing of data. However, the central mode can overload computation and communication as the number of clients and tasks grows. Besides, the central server may favour some clients for its own benefit, which degrades the accuracy of the global model. Blockchain (Kim et al., Citation2020; Ma et al., Citation2022; Majeed & Hong, Citation2019), a decentralised technology with attractive properties such as openness, autonomy, immutability and traceability, provides a solution (Qi et al., Citation2021; Wan et al., Citation2022a) to these weaknesses. Nevertheless, a transparent blockchain enables every node to obtain participants' gradients and aggregate a valuable global model, and such gradient leakage (Wang et al., Citation2021; Wei et al., Citation2020) defeats the original intention of FL. Hence, it is urgent to protect participants' gradients.

To avoid gradient leakage, a large number of privacy-preserving FL frameworks have been proposed. These frameworks can generally be categorised into four types: Differential Privacy-based (DP), Secure Multi-Party Computation-based (SMPC), Random Mask-based (RM) and Homomorphic Encryption-based (HE). These methods have proven effective and have made significant contributions to privacy protection in FL, and each has distinct advantages and disadvantages. DP protects the privacy of gradients by adding noise to the local gradient model, which has a negative effect on model accuracy (Geyer et al., Citation2017; He et al., Citation2023; Wei et al., Citation2020). SMPC distributes the local model parameters of each participant to the others, then encrypts and decrypts the data to ensure confidentiality during computation, but it requires heavy communication cost (Sayyad, Citation2020; Sotthiwat et al., Citation2021). RM adds a random number to local gradients, but it is vulnerable to participants dropping offline (Bonawitz et al., Citation2017; Kursawe et al., Citation2011; Li et al., Citation2021). HE encrypts the local model parameters and computes directly on the encrypted data, guaranteeing both data integrity and the privacy of participants' local datasets; moreover, HE allows clients to disconnect without affecting the accuracy of the global model in the FL environment (Phong et al., Citation2018; Zhang et al., Citation2020). These observations call for a balance between privacy and efficiency in the design of blockchain-based FL. Considering the privacy disclosure of participants and the poor convergence of the global model caused by low-accuracy local models, the problems addressed in this paper, we choose homomorphic encryption as the essential tool and propose a privacy-preserving FL framework using the HE method.

Recently, Hao et al. (Citation2019) proposed an efficient privacy-preserving FL scheme based on stochastic gradient descent (SGD) (Lian et al., Citation2017). The scheme employs lightweight additive HE and Laplace-based DP to enhance privacy; it can tolerate user dropout, and experimental results show that the global model has high accuracy. Mo et al. (Citation2021) proposed a privacy protection method using a trusted execution environment (TEE), with layer-wise training and aggregation in FL to work around the limited memory of TEEs. Kanagavelu et al. (Citation2020) proposed a two-phase SMPC-enabled FL framework to protect privacy, in which a subset of FL members is selected to exchange secret shares of data; this requires fewer secret-sharing and exchange operations than traditional SMPC and improves its practicality. However, these frameworks share two problems: single point of failure and lack of a reward mechanism.

To overcome these problems, a series of blockchain-based FL frameworks (Qi et al., Citation2021; Wan et al., Citation2022a) have been proposed. Bao et al. (Citation2019) proposed FLchain, which includes a distributed trust mechanism that evaluates the reliability and contribution of local models for fair reward distribution to participants. Chai et al. (Citation2021) proposed FedMF, a secure matrix factorisation framework under the federated learning setting whose core component is user-level distributed matrix factorisation, combined with additive HE for increased security. However, these frameworks do not consider malicious participants (Zhang et al., Citation2018) and cannot resist data poisoning attacks. Later, Hao et al. (Citation2020) proposed re-encrypting local models with BGV fully HE and additive HE to solve this problem; specifically, their scheme embeds the ciphertext of gradients into augmented learning with errors to achieve a noninteractive aggregation protocol. In 2021, Liu et al. (Citation2021) proposed the privacy-enhanced federated learning scheme (PEFL) based on HE, which effectively resists two typical poisoning attacks in FL: label flipping and backdoor attacks. More recently, Fu et al. (Citation2022) proposed a method that protects user privacy in FL using Lagrange interpolation and blinding techniques. These frameworks provide various methods to protect model aggregation and greatly improve security and participant privacy.

The challenges faced by federated learning in industrial applications fall into two main categories: model training and system security. Beyond these, there is an additional challenge that cannot be ignored: with the arrival of the big data era, the volume of data has grown significantly, making it necessary to consider multi-task scenarios. However, the frameworks above consider only a single FL task.

To solve the above-mentioned problems, this work proposes a privacy-preserving FL framework that runs multiple FL tasks on a blockchain. The proposed framework builds a group for each FL task, system users are free to choose their role in a group, and multiple groups run simultaneously. The main contributions can be summarised as follows:

  • The proposed blockchain-based privacy-preserving FL framework features flexibility and autonomy and can perform multiple tasks in parallel.

  • The proposed framework combines Paillier homomorphic encryption and the Pearson correlation coefficient (PCC) to achieve a balance between privacy protection and high-accuracy global models.

  • Our framework proposes an incentive mechanism tied to quality detection results, which guarantees reward fairness and encourages clients to participate actively in FL.

The remaining sections are organised as follows. Section 2 introduces the related work. In Section 3, we describe the related technologies used in our framework. Section 4 presents the system model and security requirements in FL. Then the details of the system design are presented in Section 5. Section 6 describes the security analysis of the framework. Section 7 presents the experiment and performance analysis. Finally, the conclusion of this paper can be found in Section 8.

Table 1. Notations.

2. Related work

The FL framework faces many challenges, and a key one among them is privacy. To address this issue, Xu et al. (Citation2019) proposed HybridAlpha, which implements SMC by introducing a trusted third party and adopts functional encryption. Hao et al. (Citation2020) proposed PEFL to enhance FL privacy protection: a noninteractive secure aggregation protocol that uses improved BGV HE to encrypt local model gradients and embeds the ciphertext into an augmented learning with errors (A-LWE) term. Fang et al. (Citation2022) proposed a nondestructive accuracy compression method and a privacy-protecting secure aggregation protocol that integrates verifiable secret sharing, polynomial commitment and gradient masking. In addition, So et al. (Citation2023) proposed Multi-RoundSecAgg to avoid multi-round privacy leakage in FL by introducing a new metric that captures long-term privacy; their experimental results show that Multi-RoundSecAgg provides better long-term privacy protection while preserving global model accuracy (Table 2).

Table 2. Comparison of the schemes in the introduction and related work sections.

These schemes include a centralised server, which may lead to a single point of failure. Furthermore, data owners are reluctant to participate in FL tasks because training local models consumes the computation and communication resources (Fang et al., Citation2014) of their local devices. Blockchain, a distributed ledger technology that incorporates a reward mechanism, presents a suitable resolution to these issues. At the same time, the openness and transparency of blockchain enable any node to access the data on the chain, so a blockchain-based FL framework places higher demands on privacy protection.

To cope with this challenge, Wan et al. (Citation2022b) proposed a blockchain-based FL privacy-preserving scheme for B5G-driven edge computing, which protects the privacy of FL local model parameters by integrating a Wasserstein Generative Adversarial Network with DP. Qi et al. (Citation2021) use DP to protect vehicle locations and prevent errors or low-quality models uploaded by malicious clients; in addition, they effectively protect user privacy by introducing encryption mechanisms.

Encrypting the gradients of local models effectively protects user privacy, but it has the drawback that the aggregator cannot assess the accuracy of the local models contributed by participants. In FL, the accuracy of local models is closely related to the accuracy of the global model; even worse, a large proportion of low-accuracy local models can cause the global model to fail to converge.

Considering that lazy clients may upload fake local models, Rückel et al. (Citation2022) use a non-interactive zero-knowledge proof to enable clients to verify whether other clients are actually training on their local datasets. Based on the assumption that participants are dishonest, Bao et al. (Citation2019) propose the federated learning chain (FLchain). FLchain removes incorrect calculations by recording information about the trainers and FL models and comparing it with the on-chain computation results. Wang et al. (Citation2022) propose the double local disturbance localised differential privacy (DLD-LDP) algorithm, which balances data quality loss and resource consumption while protecting privacy; they also propose a reputation calculation algorithm (Sig-RCU) that selects high-reputation clients to improve global model accuracy. In 2019, Zhao et al. (Citation2019) proposed an exponential-based trust and reputation evaluation system called ETRES to weaken malicious attacks within wireless sensor networks. Recently, Wang et al. (Citation2022) proposed a blockchain-based privacy-preserving federated learning scheme named BPFL, which adopts Multi-Krum, a variant of the Byzantine-resilient aggregation rule Krum, to filter out malicious updates. BPFL contains a set of validation nodes and several aggregation nodes: the validation nodes filter out local models with malicious updates, while the aggregation nodes aggregate the global model.

An abundance of data is generated every day, relating to various domains, and different industries take significant advantage of FL to serve their own needs. Consequently, the number of FL tasks is increasing rapidly, with complex and interlaced aspects. An FL framework capable of running multiple tasks in parallel is therefore urgently needed, yet the frameworks above have not taken this critical requirement into consideration.

To solve the above-mentioned problems, this paper proposes a public blockchain-based FL framework. We personalise the selection of the committee and clients for each FL task and establish a separate group around the publisher. When selecting group members, we take users' wishes into consideration and build a flexible and free FL system.

3. Preliminary

In this section, we present the techniques used in our framework, including blockchain, homomorphic encryption, the Pearson correlation coefficient and the noise-adding functions.

3.1. Blockchain

The blockchain is a distributed record-keeping ledger that comprises a sequence of blocks. Blockchains can be classified into three types: public blockchains, consortium blockchains and private blockchains. Our framework is specifically designed for use with public blockchains.

In our framework, the blockchain storage space is divided into two areas: the publish area and the client area (as shown in Figure 1). Users are granted different permissions based on their roles in the different areas. The publish area stores the FL task parameters published by the publisher, whereas the client area saves the local model parameters uploaded by the clients. Splitting the areas lets users participate in collaborative FL training more efficiently while better organising multiple parallel FL tasks. Users who want to obtain a global model can publish an FL task and reward details in the publish area; users who want to earn rewards by training on their local datasets can view the publish area and choose FL tasks to participate in. More importantly, users who want to receive rewards without involving their local datasets can campaign to become committee members, using only local computing and communication resources to achieve their goals. This approach mobilises the initiative of system users and significantly enhances the flexibility of the system.

Figure 1. Blockchain.


We adopt a committee consensus mechanism to validate, detect and aggregate local models. The members of the committee are composed of endorsement nodes. The selection of endorsement nodes adopts the delegated proof of stake (DPOS) (Xu et al., Citation2020) election mechanism. During the FL task process, committee members need to reach an internal consensus, which can be achieved when more than 1/2 of endorsement nodes agree.

3.2. Federated learning

Federated learning is a distributed machine learning paradigm proposed to address the privacy leakage caused by data owners directly sending their local datasets. It keeps the data in place and moves the model instead, thereby eliminating data owners' privacy concerns and connecting data silos. Multi-task FL in our paper refers to multiple FL tasks executing simultaneously in our system, as shown in Figure 2, where four FL tasks execute in parallel at time t2.

Figure 2. Multi-task federated learning execution time graph.


3.3. Homomorphic encryption

Homomorphic encryption, which supports ciphertext computation, is often used for privacy protection. Our framework adopts Paillier homomorphic encryption (Paillier, Citation1999) to realise the privacy protection of clients. The Paillier encryption algorithm consists of the following three parts:

  • $HE.KeyGen(L) \rightarrow (H_{pk}, H_{sk})$: Let $L$ denote the security parameter. Randomly select two large primes $p$ and $q$ of the same bit length $L$, and compute $n = pq$ and $\lambda = \mathrm{lcm}(p-1, q-1)$, where $\mathrm{lcm}(\cdot)$ denotes the least common multiple of two integers. Randomly select $g \in \mathbb{Z}^{*}_{n^{2}}$. The homomorphic private key is $H_{sk} = \lambda$ and the homomorphic public key is $H_{pk} = (n, g)$.

  • $HE.Enc(H_{pk}, x) \rightarrow c$: $Enc$ is the encryption function, $c$ is the ciphertext encrypted under $H_{pk}$ and $x$ is the corresponding plaintext. The user selects a random number $r$ and encrypts $x$ as follows: (1) $[[x]]_{H_{pk}} = g^{x} r^{n} \bmod n^{2}$.

  • $HE.Dec(H_{sk}, c) \rightarrow x$: $Dec$ is the decryption function. We define $L(d) = (d-1)/n$. Upon receiving the ciphertext $c$, a user holding $H_{sk}$ can directly decrypt it to obtain $x$: (2) $x = \big(L(c^{\lambda} \bmod n^{2}) / L(g^{\lambda} \bmod n^{2})\big) \bmod n$.

Paillier homomorphic encryption has the linear properties in Equations (3a) and (3b): given two plaintexts $x_1$ and $x_2$, a random number $r$ and a public key $H_{pk}$, we have (3a) $[[x_1]]_{H_{pk}} \cdot [[x_2]]_{H_{pk}} = [[x_1 + x_2]]_{H_{pk}}$ and (3b) $[[x_1]]_{H_{pk}}^{r} = [[r x_1]]_{H_{pk}}$.
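To make these operations concrete, the following is a minimal pure-Python sketch of textbook Paillier with the standard simplification $g = n + 1$; the toy 16-bit primes, helper names and test values are ours for illustration, and a real deployment needs cryptographically sized primes and a vetted library.

```python
# Minimal Paillier sketch (textbook parameters only, not production crypto).
import math
import random

def keygen(bits=16):
    def rand_prime(b):
        while True:
            p = random.getrandbits(b) | (1 << (b - 1)) | 1   # b-bit odd number
            if all(p % d for d in range(3, int(p ** 0.5) + 1, 2)):
                return p
    p, q = rand_prime(bits), rand_prime(bits)
    while p == q:
        q = rand_prime(bits)
    n = p * q
    lam = math.lcm(p - 1, q - 1)      # lambda = lcm(p-1, q-1), as in KeyGen
    g = n + 1                         # standard simplified choice of g
    return (n, g), lam

def encrypt(pk, x):
    n, g = pk
    r = random.randrange(2, n)        # random mask; gcd(r, n) = 1 w.h.p.
    return pow(g, x, n * n) * pow(r, n, n * n) % (n * n)     # Eq. (1)

def decrypt(pk, lam, c):
    n, g = pk
    L = lambda d: (d - 1) // n
    mu = pow(L(pow(g, lam, n * n)), -1, n)   # 1 / L(g^lambda mod n^2) mod n
    return L(pow(c, lam, n * n)) * mu % n    # Eq. (2)

pk, sk = keygen()
c1, c2 = encrypt(pk, 15), encrypt(pk, 27)
assert decrypt(pk, sk, c1 * c2 % (pk[0] ** 2)) == 42    # Eq. (3a): 15 + 27
assert decrypt(pk, sk, pow(c1, 3, pk[0] ** 2)) == 45    # Eq. (3b): 3 * 15
```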

3.4. Pearson correlation coefficient

The Pearson correlation coefficient (PCC) is a statistical measure of the linear correlation between two variables $X$ and $Y$; its value falls between −1 and 1 and is computed by Formula (4). In our work, we employ the PCC to assess the correlation between the local gradient model and the global model. (4) $\rho(X, Y) = \dfrac{Cov(X, Y)}{\sigma(X)\,\sigma(Y)}$
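As a quick sanity check of Formula (4), the following sketch (with made-up sample values) confirms that the covariance-over-standard-deviations form matches NumPy's built-in Pearson coefficient:

```python
# Tiny check that Formula (4) matches numpy's built-in Pearson coefficient.
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([2.1, 3.9, 6.2, 8.0])
rho = np.cov(X, Y, bias=True)[0, 1] / (X.std() * Y.std())  # Cov / (sigma_X sigma_Y)
assert np.isclose(rho, np.corrcoef(X, Y)[0, 1])
print(f"rho = {rho:.4f}")
```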

3.5. Noise-adding functions

We define $A(\cdot)$ and $B(\cdot)$ as noise-adding functions based on additive and multiplicative homomorphism, respectively. They are computed as follows: (5) $A(c, r) = c \cdot [[r]]_{H_{pk}} = [[x + r]]_{H_{pk}} = [[\hat{x}]]_{H_{pk}}$ and (6) $B(c, r) = c^{r} = [[r x]]_{H_{pk}} = [[\tilde{x}]]_{H_{pk}}$, where $r$ is a random number and $c$ is the homomorphic ciphertext of the plaintext $x$, i.e. $c = [[x]]_{H_{pk}}$. We denote by $\hat{x}$ and $\tilde{x}$ the additively and multiplicatively noised results, respectively. $A(\cdot)$ is used in the global model aggregation phase and $B(\cdot)$ in the quality detection phase; more details are given in Section 5. We have $\rho([[\tilde{X}]]_{H_{pk}}, [[\tilde{Y}]]_{H_{pk}}) = \rho([[X]]_{H_{pk}}, [[Y]]_{H_{pk}})$. The correctness proof is as follows.

Correctness proof: The covariance of $\tilde{X}$ and $\tilde{Y}$ can be transformed as follows: $Cov(\tilde{X}, \tilde{Y}) = Cov(r_1 X, r_2 Y) = E[(r_1 X - r_1\bar{X})(r_2 Y - r_2\bar{Y})] = r_1 r_2 E[(X-\bar{X})(Y-\bar{Y})] = r_1 r_2\, Cov(X, Y)$, where $E(\cdot)$ denotes the expectation and $\bar{X}$ is the mean of the variable $X$. At the same time, the standard deviation of $\tilde{X}$ satisfies $\sigma(\tilde{X}) = \sigma(r_1 X) = \sqrt{E((r_1 X)^2) - (E(r_1 X))^2} = r_1 \sqrt{E(X^2) - (E(X))^2} = r_1\, \sigma(X)$. Similarly, $\sigma(\tilde{Y}) = r_2\, \sigma(Y)$. Combining the above derivations, we obtain $\rho(\tilde{X}, \tilde{Y}) = \dfrac{Cov(\tilde{X}, \tilde{Y})}{\sigma(\tilde{X})\,\sigma(\tilde{Y})} = \dfrac{r_1 r_2\, Cov(X, Y)}{r_1 \sigma(X)\, r_2 \sigma(Y)} = \dfrac{Cov(X, Y)}{\sigma(X)\,\sigma(Y)} = \rho(X, Y)$. We combine Paillier and the Pearson coefficient and apply them to our scheme: taking the random numbers $r_1$ and $r_2$ as noise, we add the noise to the parameters using the homomorphic property of Paillier. Consequently, $\rho([[\tilde{X}]]_{H_{pk}}, [[\tilde{Y}]]_{H_{pk}}) = \rho([[X]]_{H_{pk}}, [[Y]]_{H_{pk}})$. More details on the application of this formula are given in Section 5.
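The invariance can be checked numerically. The sketch below, with synthetic vectors standing in for flattened model parameters, assumes positive noise $r_1, r_2$, since the derivation above uses $\sigma(r_1 X) = r_1\sigma(X)$:

```python
# Check of Section 3.5's claim: scaling two vectors by positive random
# noise r1, r2 leaves their Pearson correlation coefficient unchanged.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=1000)                         # stand-in for a local model
Y = 0.8 * X + rng.normal(scale=0.3, size=1000)    # correlated "average" model
r1, r2 = rng.uniform(1, 100, size=2)              # positive multiplicative noise

rho = lambda a, b: np.corrcoef(a, b)[0, 1]
assert np.isclose(rho(X, Y), rho(r1 * X, r2 * Y))
print(f"rho = {rho(X, Y):.4f} (unchanged after masking)")
```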

4. System model and security requirement

In this section, we first present the system model of our proposed system and describe the threat model of the FL task. Then we introduce the design goals. Table 1 lists the notations used in our paper.

4.1. System model

We propose a blockchain-based FL secure aggregation framework to protect the privacy of clients' datasets. Figure 3 shows our system model, which includes three roles: publisher, client and committee.

Figure 3. Framework.


Publisher: The publisher P is responsible for publishing the FL task and paying the reward. When P needs a global model, it submits a request to the publish area of the blockchain. P also helps the committee compute some data. After the FL task completes, P obtains the global model.

Client: A client Ui trains local model gradient parameters on its local dataset and receives a reward after the FL task completes. Users download training tasks from the publish area, select the task they want to join and submit their intentions to the client area.

Committee: The committee detects the quality of local models and aggregates the global model. It consists of a set of endorsement nodes, selected in the proposed framework through the DPOS mechanism: based on their previous records of participation in FL tasks, nodes with stable network connections and high reputations are selected as endorsement nodes.

It is worth noting that all three roles are registered users of our system; the system assigns an identity to each user according to their needs.

4.2. Threat model

Following the threat models in prior work (Melis et al., Citation2019; Nasr et al., Citation2019; Zamani et al., Citation2018), the adversary's purpose is to reduce the accuracy of the global model, prevent it from converging, or obtain user privacy. The most common method is to mount various attacks on users' local models or local datasets. In our proposed framework, we assume that system users are semi-honest: they honestly perform all processes, such as encryption, decryption, uploading and sending, but try to learn the dataset features or other meaningful information of other users. Our framework follows the standard threat model of many blockchain systems (Cai et al., Citation2019; Zamani et al., Citation2018), where at least 2/3 of all users are assumed to be honest. For the committee members selected from the blockchain, we assume that more than 2/3 of them perform their work properly. As the core executors of global model aggregation, they are certainly vulnerable to attack, and a successfully attacked committee member will produce incorrect calculations; we assume that even under attack, more than half of the committee still behaves normally. Similarly, some clients may be attacked. As the owner of the global model, the publisher is assumed to be honest, since it wants a high-accuracy model. We further assume no collusion among the committee, the publisher and the clients, and that clients and committee members can securely obtain the correct initial global model parameters. Under these assumptions, we can design and implement effective countermeasures against potential security threats in FL.

4.3. Security requirements

The security requirements of federated learning in applications mainly include the following:

  • Resistance to gradient leakage attack: FL was originally proposed to prevent the privacy leakage caused by direct transmission of local data. However, studies have shown that an attacker can obtain private and sensitive information from the local model parameters uploaded by clients. Therefore, an FL framework should take further measures to prevent privacy leakage.

  • Safeguard global model accuracy: The global model is the result of collective effort, involving massive communication and computational overhead. However, an adversary can compromise some clients' computers and inject wrong or mislabelled data (Bagdasaryan et al., Citation2020; Nuding & Mayer, Citation2020) into their local datasets, which degrades the accuracy of the local models and results in an unusable global model. An FL framework should avoid this kind of malignant result.

  • Confidentiality: In our framework, the global model is not shared; it is purchased by the publisher, so in the end it is available only to the publisher, and this must be guaranteed throughout the collaboration among all users. In addition, if the publisher obtains the local model gradients of individual participants, the clients also face a risk of privacy leakage. We therefore need to ensure the confidentiality of data during transmission, so that only the intended party can obtain the original data.

  • System robustness: Users may disconnect during the FL process, the cloud server responsible for aggregation may consciously favour some participants, and malicious users may exist. All of these affect the accuracy of the global model, so the stability and correctness of the framework must be ensured.

5. The proposed framework

This section describes the process of our FL framework.

5.1. System setup

Step 1: The publisher P generates the homomorphic private key $H_{sk} = \lambda$ and the corresponding public key $H_{pk} = (n, g)$ using HE.KeyGen(). P then computes a signature $S_P = Sign(sk_P, Inf)$, where $Inf = (H_{pk} \,\|\, address \,\|\, m \,\|\, k \,\|\, \omega_G^0 \,\|\, Max_t \,\|\, money \,\|\, \theta)$; here address is the address of P, m and k are the numbers of endorsement nodes and clients respectively, $\omega_G^0$ is the initial global model, $Max_t$ is the maximum number of iterations, and money and $\theta$ are the amount to be paid and the reward proportion of the clients, respectively. Finally, P submits the requirement $Req = (Inf, S_P)$ to the publish area of the blockchain.

Step 2: The smart contract (SC) of the blockchain checks $Ver(PK_P, Inf) \stackrel{?}{=} true$. If the signature is correct, SC checks the account balance of P; if it is enough to pay, SC freezes the prepaid amount and selects committee members (endorsement nodes) $EN_j$ and clients $U_i$ subject to the disjointness conditions (7a)-(7c): (7a) $\{P\} \cap \{U_i\} = \emptyset$; (7b) $\{P\} \cap \{EN_j\} = \emptyset$; (7c) $\{U_i\} \cap \{EN_j\} = \emptyset$. Otherwise, SC rejects the requirement. Finally, SC assigns a unique FL task identification serial number FID to the task and sets the task state FID.signal = 0, where FID.signal is 0 or 1, representing FID in progress and completed, respectively. Later, SC publishes $FL = (FID \,\|\, FID.signal \,\|\, P \,\|\, Req)$ to the publish area. Figure 4 shows the system setup process; a minimal sketch of the role check follows the figure.

Figure 4. System setup.

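The following is a minimal sketch of the role-disjointness check in conditions (7a)-(7c); the function name and data types are our illustrative choices, not the contract's actual interface:

```python
# Sketch of the smart contract's role check: the publisher, the clients and
# the endorsement nodes of one FL task must be pairwise disjoint.
def roles_disjoint(publisher: str, clients: set, endorsers: set) -> bool:
    return (publisher not in clients and
            publisher not in endorsers and
            clients.isdisjoint(endorsers))

assert roles_disjoint("P", {"u1", "u2"}, {"e1", "e2"})   # valid selection
assert not roles_disjoint("P", {"u1"}, {"u1"})           # violates (7c)
```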

5.2. Training phase

Step 1: If iteration t = 1, $U_i$, $i \in [1, k]$, downloads the global model $\omega_G^0$ from the publish area. Otherwise, $U_i$ receives the global model ciphertext $[[\omega_G^{t-1}]]_{pk_i}$ from P and decrypts it with $sk_i$ to obtain $\omega_G^{t-1}$. $U_i$ then trains on the local dataset and gets the local model $\omega_i^t$.

Step 2: $U_i$ executes $HE.Enc(H_{pk}, \omega_i^t)$ and gets $[[\omega_i^t]]_{H_{pk}}$. $U_i$ then selects a random number $a$, calculates $A = g^a$ and $c_i = [[\omega_i^t]]_{H_{pk}} \oplus H(CPK^a)$, where $\oplus$ denotes the XOR operation. Then $U_i$ calculates the signature $S_i = Sign(sk_i, c_i)$ and sends $Model_i = (S_i \,\|\, c_i \,\|\, A)$ to the committee; a sketch of this masking step follows below.
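The masking in Step 2 is a Diffie-Hellman-style agreement: the client masks with $H(CPK^a)$ and the committee removes the mask with $H(A^{CSK})$, since both equal $H(g^{a \cdot CSK})$. The sketch below uses a toy group and a stand-in ciphertext, both of which are illustrative assumptions:

```python
# Sketch of the XOR masking in Step 2 / Section 5.3: client and committee
# derive the same shared value g^(a*CSK), hash it and XOR-mask the ciphertext.
import hashlib

p, g = 2**61 - 1, 5                 # toy group parameters (illustrative only)
CSK = 123456789                     # committee private key
CPK = pow(g, CSK, p)                # committee public key, CPK = g^CSK

def mask(value: int, key_material: int) -> int:
    digest = hashlib.sha256(str(key_material).encode()).digest()
    return value ^ int.from_bytes(digest[:8], "big")    # 64-bit XOR pad

# Client side: pick a, publish A = g^a, mask the HE ciphertext.
a = 987654321
A = pow(g, a, p)
he_ciphertext = 0xDEADBEEF          # stand-in for [[w_i^t]]_Hpk
c_i = mask(he_ciphertext, pow(CPK, a, p))

# Committee side: recover the same pad from A and CSK, then unmask.
assert mask(c_i, pow(A, CSK, p)) == he_ciphertext
```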

5.3. Quality detect

Step 1: The committee computes $Ver(pk_i, c_i) \stackrel{?}{=} true$; if verification fails, the committee discards $Model_i$. Otherwise, the committee calculates $[[\omega_i^t]]_{H_{pk}} = c_i \oplus H(A^{CSK})$, where $A^{CSK} = CPK^a = g^{a \cdot CSK}$, and CPK and CSK are the public/private key pair generated by the committee, satisfying $CPK = g^{CSK}$.

Step 2: The committee detects the quality of $\omega_i^t$ by computing PCCs. The specific calculation process is as follows:

  1. Add noise to $[[\omega_i^t]]_{H_{pk}}$: the committee randomly selects a nonzero integer $r_i$ and executes $A([[\omega_i^t]]_{H_{pk}}, r_i)$ to get $[[\hat{\omega}_i^t]]_{H_{pk}}$, where $A(\cdot)$ is defined in Section 3.5.

  2. The committee sends the noised result $[[\hat{\omega}_i^t]]_{H_{pk}}$ to P.

  3. P decrypts it, obtains the noised result $\hat{\omega}_i^t$ and calculates the average $\hat{Ave}$.

  4. P encrypts $\hat{Ave}$ with $H_{pk}$ (see Equation (1)) and sends the ciphertext $[[\hat{Ave}]]_{H_{pk}}$ to the committee.

  5. The committee eliminates the noise (see Equation (8)) and gets the ciphertext of the average, $[[Ave]]_{H_{pk}}$: (8) $[[Ave]]_{H_{pk}} = [[\hat{Ave}]]_{H_{pk}} \cdot [[\frac{1}{k}\sum_{i=1}^{k} r_i]]_{H_{pk}}^{-1}$. Algorithm 1 describes the details of the average computation; a numeric sketch follows below.
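The following numeric sketch traces this blind-averaging exchange in the clear (plaintext arithmetic stands in for the Paillier ciphertexts, and the sample values are ours), showing that removing $\frac{1}{k}\sum_i r_i$ recovers the true average as in Equation (8):

```python
# Numeric sketch of Algorithm 1 / Eq. (8): the committee adds noise r_i to
# each local model, P averages the noised values, and the committee removes
# (1/k) * sum(r_i) to recover the average without P seeing the raw models.
w = [0.30, 0.50, 0.40]     # stand-ins for local model parameters
r = [7.0, -2.0, 4.0]       # committee's random noise
k = len(w)

noised = [wi + ri for wi, ri in zip(w, r)]   # A(c_i, r_i), here in the clear
ave_hat = sum(noised) / k                    # computed by publisher P
ave = ave_hat - sum(r) / k                   # Eq. (8): noise eliminated
assert abs(ave - sum(w) / k) < 1e-12
```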

Step 3: P assists the committee in calculating PCCs. The details of the process are as follows:

  1. Add noise to $[[Ave]]_{H_{pk}}$ and $[[\omega_i^t]]_{H_{pk}}$: the committee randomly selects two nonzero integers $r_1$ and $r_2$ and executes $B([[Ave]]_{H_{pk}}, r_1)$ and $B([[\omega_i^t]]_{H_{pk}}, r_2)$ to get $[[\tilde{Ave}]]_{H_{pk}}$ and $[[\tilde{\omega}_i^t]]_{H_{pk}}$, respectively, where $B(\cdot)$ is defined in Section 3.5.

  2. The committee sends $[[\tilde{Ave}]]_{H_{pk}}$ and $[[\tilde{\omega}_i^t]]_{H_{pk}}$ to P.

  3. P decrypts $[[\tilde{Ave}]]_{H_{pk}}$ and $[[\tilde{\omega}_i^t]]_{H_{pk}}$ with $H_{sk}$ and gets $\tilde{Ave}$ and $\tilde{\omega}_i^t$.

  4. Later, P calculates the PCC $\rho_i$ between $\tilde{Ave}$ and $\tilde{\omega}_i^t$, then uploads the transaction $TRA = \{t \,\|\, (pk_1, \rho_1) \,\|\, (pk_2, \rho_2) \,\|\, \dots \,\|\, (pk_k, \rho_k)\}$ to the client area.

    Algorithm 2 describes the details of computing PCCs.

5.4. Filter out unqualified local models

The committee downloads TRA and calculates the weight $W_i$ of $U_i$ as shown in Equation (9): (9) $W_i = \max\{0, \ln\frac{1+\rho_i}{1-\rho_i} - 0.5\}$. It is not difficult to see that the PCC and the weight are positively related. The minimum value of $W_i$ is 0: if $\ln\frac{1+\rho_i}{1-\rho_i} - 0.5 < 0$, so that $W_i = 0$, we consider the accuracy of $\omega_i^t$ too low and the quality detection unqualified; $\omega_i^t$ then takes no part in the global model aggregation, and $U_i$ receives no reward for this iteration. Later, the committee uploads $Sig_W = Sign(CSK, W_{list})$ to the client area, where $W_{list} = (t \,\|\, W_1 \,\|\, W_2 \,\|\, \dots \,\|\, W_k)$. A sketch of this weight rule follows below.
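A small sketch of Equation (9), with example correlation values of our choosing; note that $\ln\frac{1+\rho}{1-\rho} = 2\,\mathrm{artanh}(\rho)$, so any local model with $\rho_i$ below $\tanh(0.25) \approx 0.245$ gets weight 0 and is filtered out:

```python
# Weight rule of Eq. (9): W_i = max(0, ln((1 + rho) / (1 - rho)) - 0.5).
import math

def weight(rho: float) -> float:
    return max(0.0, math.log((1 + rho) / (1 - rho)) - 0.5)

for rho in (0.1, 0.3, 0.9):
    print(f"rho = {rho:.1f} -> W = {weight(rho):.3f}")
# rho = 0.1 -> W = 0.000 (filtered out); 0.3 -> 0.119; 0.9 -> 2.444
```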

5.5. Aggregate local model

The committee aggregates the local models to get $[[\omega_{sum}^t]]_{H_{pk}}$, as shown in Equation (10): (10) $[[\omega_{sum}^t]]_{H_{pk}} = \prod_{i=1}^{k} [[\omega_i^t]]_{H_{pk}}$ $(W_i \neq 0)$. Then the committee generates the signature $S_c = Sign(CSK, [[\omega_{sum}^t]]_{H_{pk}})$ and submits $S_c$ and $[[\omega_{sum}^t]]_{H_{pk}}$ to the client area.

5.6. Update phase

P validates the signature $S_c$, downloads $[[\omega_{sum}^t]]_{H_{pk}}$, executes $Dec(H_{sk}, [[\omega_{sum}^t]]_{H_{pk}})$ and gets $\omega_{sum}^t$. P then updates the global model $\omega_G^t$ according to Formula (11), where $k_1$ denotes the number of clients with $W_i > 0$, $i \in [1, k]$: (11) $\omega_G^t = \omega_G^{t-1} - \eta\,\frac{\omega_{sum}^t}{k_1}$. Later, P evaluates the accuracy of $\omega_G^t$. If the accuracy requirement is satisfied or $t = Max_t$, P sets FID.signal to completed and SC executes the reward distribution. Otherwise, P calculates $[[\omega_G^t]]_{pk_i} = Enc(pk_i, \omega_G^t)$, $i \in [1, k]$, and sends it to $U_i$. Finally, P sets t = t + 1 and returns to the training phase.

5.7. Reward distribution phase

After the FL task is complete, both the clients and the committee members should get rewards, and we adopt different allocation schemes for the different identities. Clients are rewarded according to their weight ratios across iterations: $U_i$ gets the reward $Rew_i$ shown in Equation (12). Here we ignore computation and communication costs and the differences between datasets, and take the correlation between Ave and $\omega_i^t$ as the sole standard for reward distribution. For the committee members $EN_j$, the workload of each is the same, so the reward is distributed equally: $EN_j$ gets the reward $Rew_j$ shown in Equation (13), where T denotes the total number of iterations. (12) $Rew_i = money \cdot \theta \cdot \frac{\sum_{t=0}^{T} W_i}{\sum_{i=1}^{k}\sum_{t=0}^{T} W_i}$ $(i = 1, 2, \dots, k)$; (13) $Rew_j = \frac{money \cdot (1-\theta)}{m}$ $(j = 1, 2, \dots, m)$. Algorithm 3 describes the details of the SC reward distribution, where Tx represents the reward distribution transaction and Tx = (sender, receiver, value); a sketch of the split follows below.
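A sketch of the split in Equations (12)-(13); the money, $\theta$, weights and client names are example values of our choosing, not figures from the paper:

```python
# Reward split: clients share money*theta in proportion to their accumulated
# weights (Eq. (12)); m committee members split money*(1-theta) equally (Eq. (13)).
money, theta, m = 1000.0, 0.7, 5
W = {"u1": [0.5, 0.4, 0.6],        # per-iteration weights W_i^t of client u1
     "u2": [0.8, 0.5, 0.7]}        # ... and of client u2

total = sum(sum(ws) for ws in W.values())
client_reward = {u: money * theta * sum(ws) / total for u, ws in W.items()}
committee_reward = money * (1 - theta) / m
print(client_reward, committee_reward)   # u1: 300.0, u2: 400.0; each EN_j: 60.0
```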

6. Security analysis

Resistance to gradient leakage attack:

Theorem

Our scheme discloses to no system user any information about the local model gradient $\omega_i^t$.

Proof.

We encrypt the local model gradient $\omega_i^t$ using a combination of Paillier HE and a hash function. The client sends the ciphertext $c_i = [[\omega_i^t]]_{H_{pk}} \oplus H(CPK^a)$ to the committee. The committee can easily obtain $[[\omega_i^t]]_{H_{pk}}$, but to recover $\omega_i^t$ it would have to solve the large-integer factorisation problem underlying Paillier HE (Zhou et al., Citation2015); thus our scheme is CPA-secure, and committee members cannot obtain any plaintext information. Furthermore, the hash function is collision-resistant and irreversible, which prevents the publisher P from learning any information about $\omega_i^t$.

Safeguard global model accuracy: In our framework, we assume that attackers have successfully compromised and tampered with some clients' local datasets, so these clients will train low-accuracy or erroneous local models and send them to the committee. In our framework, however, quality detection is achieved through PCCs, and a local model generated from a tampered dataset differs significantly from one generated from a normal dataset, so it is filtered out during the quality detection phase. With the help of the publisher, the committee computes directly on homomorphic ciphertext, and only calculation results that reach consensus are recognised, so committee members cannot forge detection results. Low-accuracy or erroneous local models are therefore inevitably filtered out, which prevents their impact on the global model and guarantees its accuracy.

Confidentiality: The confidentiality of clients' local models has been explained under resistance to gradient leakage attack; here we consider the confidentiality of the global model. As the details of the local model aggregation stage show, the committee computes directly on homomorphic ciphertext, which means that throughout the entire calculation process the committee cannot obtain any information about the global model. This security is provided by Paillier HE.

System robustness: For user disconnection, our framework allows a disconnected user to rejoin. For malicious users and poisoning attacks, quality detection effectively screens them out to preserve global model accuracy. Against committee members who maliciously attempt to affect the accuracy of the global model, we adopt a consensus mechanism to reject malicious results. The robustness of our framework is guaranteed through these mechanisms.

Through the above security analysis of the framework, we know that our framework meets the security requirements in FL.

7. Experiment and performance analysis

7.1. Experimental methodology

Our framework is built on the Ethereum public chain. We do not modify the operating mechanism of the blockchain, only the operational content (SC, committees, etc.), and our focus is privacy protection for FL. Therefore, the experiments test only FL performance.

We use PyTorch (version 1.9.1+cu111) and Python (version 3.9.11) to build the FL running environment. After local model training, each participant uploads its local model parameters; the committee then calculates the PCCs and weights and aggregates the global model. After the global model is updated, each participant uses the new global model to train its local model.

Experiments are conducted on Windows 11 with a GeForce MX250 2GB GPU and 8GB of RAM. We use the MNIST image dataset of handwritten digits, which contains 60,000 training samples and 10,000 test samples. We first randomly divide the training dataset into 100 parts and assign each part to a participant as its local dataset, so each participant has 600 training samples. We randomly select 1 to 10 participants from the 100 for the experiments. We use a Convolutional Neural Network (CNN) as the training model, consisting of a convolutional layer, a pooling layer, a dropout layer and a fully connected layer, trained with the Stochastic Gradient Descent (SGD) algorithm; the learning rate η is set to 0.01 and the momentum to 0.5. We perform 10, 20, 30 and 40 iterations, and in each iteration every participant trains on its 600 training samples to get local gradient model parameters and tests accuracy on 128 randomly selected test samples.
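A plausible PyTorch sketch of the described model and optimiser follows; the paper does not give exact layer sizes, so the channel counts, kernel size and dropout rate below are illustrative assumptions:

```python
# One conv layer, pooling, dropout and a fully connected layer, as described
# in Section 7.1; trained with SGD (lr = 0.01, momentum = 0.5).
import torch
import torch.nn as nn

class MnistCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 16, kernel_size=5)   # 28x28 -> 24x24
        self.pool = nn.MaxPool2d(2)                   # 24x24 -> 12x12
        self.drop = nn.Dropout(0.5)
        self.fc = nn.Linear(16 * 12 * 12, 10)         # 10 digit classes

    def forward(self, x):
        x = self.pool(torch.relu(self.conv(x)))
        x = self.drop(torch.flatten(x, 1))
        return self.fc(x)

model = MnistCNN()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
```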

7.2. Performance evaluation

We evaluate the performance of FL from two aspects: execution time and accuracy. Table 3 presents the detailed experimental data.

Table 3. Execution time and global model accuracy for different iterations with different number of clients.

Evaluate execution time: The execution time includes the training and testing time of the model. The experiments run FL with different numbers of participants (i.e. clients) for 10, 20, 30 and 40 iterations (as shown in the upper portion of Table 3). The data show that with six participants, 10, 20, 30 and 40 iterations require 177.43 s, 356.80 s, 530.30 s and 664.28 s, respectively. For a fixed number of participants, the execution time grows with the number of iterations and is linearly and positively correlated with it. Figure 5 provides a visual histogram of the execution time for different numbers of participants under distinct iteration counts; it is not difficult to see that the more iterations, the larger the slope of the execution-time curve.

Figure 5. Execution time cost of participants.


Evaluate the global model accuracy: We evaluated the global model accuracy for different numbers of participants running different numbers of iterations; the experimental data are shown in the lower portion of Table 3. First, we compare the data within a column: with 3 participants, for example, 10, 20, 30 and 40 iterations give global model accuracies of 96.48%, 97.27%, 97.85% and 98.20%, respectively. The other columns behave similarly, so when the number of participants is constant, the global model accuracy increases with the number of iterations; with 10 participants and 40 iterations the accuracy reaches its highest value of 98.42%. Second, we compare the data across columns. Comparing the columns with 1, 4 and 10 participants (the coloured columns in Table 3), we find that the global model accuracy improves as the number of participants increases. A few specific data points do not conform to this pattern: at 30 iterations, for instance, the global model trained by 6 participants reaches 98.13%, while that trained by 7 participants reaches only 98.03%, a decrease of 0.10%. Across all the experimental data, however, the accuracy of the global model trends upward as the numbers of participants and iterations increase.

In particular, FL degenerates into traditional machine learning when only one participant trains, since all data come from the same place and there are no other participants to aggregate with. The experiments show that more participants means more samples available for training and more effective features for the model, so the accuracy increases accordingly. Figure 6 shows the variation of global model accuracy as a line graph.

Figure 6. Global model accuracy of participants.


7.3. Overhead analysis

We analyse the computational and communication overhead complexity from the perspective of the different system user roles. Note that in this part we neglect disconnected users and carry out the complexity analysis with n online clients. The symbols we use are described in Table 4.

Table 4. Sample of computational overhead analysis.

Computation overhead:

Publisher: In the quality detection phase, the publisher P decrypts the local model gradients separately, so its computation cost is $3nT_{HDe} + T_{HEn}$. In the update phase, P incurs $T_{vsig} + T_{HDe} + nT_{ECC}$. Therefore, the computational overhead of P is $(3n+1)T_{HDe} + nT_{ECC} + T_{HEn} + T_{vsig}$.

Client: The client needs to decrypt the updated global model before starting a new iteration, conduct local training, and then encrypt the local model and send it to the committee. The total computational cost is $T_{DECC} + T_{HEn} + T_{exp} + T_{hash} + T_{xor} + T_{sig}$.

Committee: Committee members act as the link in the FL task, interacting with both the publisher P and the clients. When interacting with clients, the computational cost is $n(T_{vsig} + T_{exp} + T_{hash} + T_{xor})$; in the quality detection phase, the cost of interacting with P is $nT_{HA} + (2n+1)T_{HM} + T_{sig}$. The total computational cost of the committee is therefore $n(T_{vsig} + T_{exp} + T_{hash} + T_{xor} + T_{HA}) + (2n+1)T_{HM} + T_{sig}$.

Communication overhead:

We suppose the homomorphic ciphertext length is $C_{HE}$ bits, the signature length is $C_{sig}$ bits, and the length of A in the training phase is $C_A$ bits. Then the sending cost of each client is $C_{Rcli} = m(C_{HE} + C_{sig} + C_A)$. The total communication cost consists of sending and receiving costs, so the total communication cost of the clients is $C_{Tcli} = n((m+1)C_{HE} + m(C_{sig} + C_A))$. Let $C_{Scom}$ denote the committee's receiving cost; we have $C_{Scom} = (3n+2)C_{HE}$, and the total communication cost of the committee is $C_{Tcom} = nm(C_{HE} + C_{sig} + C_A) + (3n+2)C_{HE}$. Similarly, the total communication cost of the publisher is $C_{Tpub} = C_{HE} + nC_{ECC}$, where $C_{ECC}$ denotes the bit length of the ECC output.

The above computational and communication complexity analyses cover a single FL iteration; a complete FL task contains T iterations.

8. Conclusion

This article proposes a blockchain-based multi-task FL framework with privacy protection. The framework achieves synchronous execution of multiple FL tasks through the SC. In addition, we combine Paillier HE and PCCs to achieve privacy protection and quality detection, while integrating an incentive mechanism. We construct a group for each FL task, consisting of three types of roles that collaborate to complete the learning process, with all members sourced from system users. Unlike many previous frameworks that pursue only FL privacy protection and secure aggregation, we are more concerned with building an FL system that enables user collaboration and sharing: we only formulate the rules and do not participate in any FL learning process. Our security analysis shows that the framework complies with FL's security requirements. The experiments conducted on MNIST show that the accuracy of the global model can reach up to 98.42%, and our analysis of computational and communication complexity shows that these costs are acceptable to users.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the National Natural Science Foundation of China [62202390, 62177019] and the Science and Technology Fund of Sichuan Province [2022NSFSC0556, 2023YFG0306].

References

  • Ayaz, F., Sheng, Z., Tian, D., & Guan, Y. L. (2022). A blockchain based federated learning for message dissemination in vehicular networks. IEEE Transactions on Vehicular Technology, 71(2), 1927–1940. https://doi.org/10.1109/TVT.2021.3132226
  • Bagdasaryan, E., Veit, A., Hua, Y., Estrin, D., & Shmatikov, V. (2020). How to backdoor federated learning. In International conference on artificial intelligence and statistics (pp. 2938–2948). PMLR.
  • Bao, X., Su, C., Xiong, Y., Huang, W., & Hu, Y. (2019). FLChain: A blockchain for auditable federated learning with trust and incentive. In: 2019 5th international conference on big data computing and communications (BIGCOM) (pp. 151–159). https://doi.org/10.1109/BIGCOM.2019.00030
  • Bonawitz, K., Ivanov, V., Kreuter, B., Marcedone, A., Mcmahan, H. B., Patel, S., Ramage, D., Segal, A., & Seth, K. (2017). Practical secure aggregation for privacy-preserving machine learning. In: Proceedings of the 2017 ACM SIGSAC conference on computer and communications security (pp. 1175–1191).
  • Cai, C., Zheng, Y., Zhou, A., & Wang, C. (2019). Building a secure knowledge marketplace over crowdsensed data streams. IEEE Transactions on Dependable and Secure Computing, 18(6), 2601–2021. https://doi.org/10.1109/TDSC.2019.2958901
  • Chai, D., Wang, L., Chen, K., & Yang, Q. (2021). Secure federated matrix factorization. IEEE Intelligent Systems, 36(5), 11–20. https://doi.org/10.1109/MIS.2020.3014880
  • Fang, C., Guo, Y., Ma, J., Xie, H., & Wang, Y. (2022). A privacy-preserving and verifiable federated learning method based on blockchain. Computer Communications, 186, 1–11. https://www.sciencedirect.com/science/article/pii/S0140366422000081. https://doi.org/10.1016/j.comcom.2022.01.002
  • Fang, W., Li, Y., Zhang, H., Xiong, N., Lai, J., & Vasilakos, A. V. (2014). On the throughput-energy tradeoff for data transmission between cloud and mobile devices. Information Sciences, 283, 79–93.
  • Fu, A., Zhang, X., Xiong, N., Gao, Y., Wang, H., & Zhang, J. (2022). VFL: A verifiable federated learning with privacy-preserving for big data in industrial IoT. IEEE Transactions on Industrial Informatics, 18(5), 3316–2022. https://doi.org/10.1109/TII.2020.3036166
  • Geyer, R. C., Klein, T., & Nabi, M. (2017). Differentially private federated learning: A client level perspective. arXiv preprint, arXiv:1712.07557.
  • Hao, M., Li, H., Luo, X., Xu, G., Yang, H., & Liu, S. (2020). Efficient and privacy-enhanced federated learning for industrial artificial intelligence. IEEE Transactions on Industrial Informatics, 16(10), 6532–6542. https://doi.org/10.1109/TII.2019.2945367
  • Hao, M., Li, H., Xu, G., Liu, S., & Yang, H. (2019). Towards efficient and privacy-preserving federated deep learning. In: ICC 2019–2019 IEEE International Conference on Communications (ICC) (pp. 1–6). https://doi.org/10.1109/ICC.2019.8761267
  • He, Z., Wang, L., & Cai, Z. (2023). Clustered federated learning with adaptive local differential privacy on heterogeneous IoT data. IEEE Internet of Things Journal, 1–1. https://doi.org/10.1109/JIOT.2023.3299947
  • Hu, W.-J., Fan, J., Du, Y.-X., Li, B.-S., Xiong, N., & Bekkering, E. (2020). MDFC–ResNet: An agricultural IoT system to accurately recognize crop diseases. IEEE Access, 8, 115287–115298. https://doi.org/10.1109/Access.6287639
  • Kanagavelu, R., Li, Z., Samsudin, J., Yang, Y., Yang, F., Goh, R. S. M., Cheah, M., Wiwatphonthana, P., Akkarajitsakul, K., & Wangz, S. (2020). Two-phase multi-party computation enabled privacy-preserving federated learning. In: 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID) (pp. 410–419). https://doi.org/10.1109/CCGrid49817.2020.00-52
  • Kang, L., Chen, R.-S., Xiong, N., Chen, Y.-C., Hu, Y.-X., & Chen, C.-M. (2019). Selecting hyper-parameters of gaussian process regression based on non-inertial particle swarm optimization in internet of things. IEEE Access, 7, 59504–59513. https://doi.org/10.1109/Access.6287639
  • Kim, H., Park, J., Bennis, M., & Kim, S. L. (2020). Blockchained on-device federated learning. IEEE Communications Letters, 24(6), 1279–1283. https://doi.org/10.1109/LCOMM.2019.2921755
  • Kumar, R., Khan, A. A., Kumar, J., Golilarz, N. A., Zhang, S., Ting, Y., Zheng, C., & Wang, W. (2021). Blockchain-federated-learning and deep learning models for COVID-19 detection using CT imaging. IEEE Sensors Journal, 21(14), 16301–16314. https://doi.org/10.1109/JSEN.2021.3076767
  • Kursawe, K., Danezis, G., & Kohlweiss, M. (2011). Privacy-friendly aggregation for the smart-grid. In: Privacy Enhancing Technologies: 11th International Symposium, PETS 2011, Waterloo, ON, Canada, July 27–29, 2011. proceedings (pp. 175–191). Springer.
  • Li, A., Sun, J., Zeng, X., Zhang, M., Li, H., & Chen, Y. (2021). FedMask: Joint computation and communication-efficient personalized federated learning via heterogeneous masking. In: Proceedings of the 19th ACM Conference on Embedded Networked Sensor Systems (pp. 42–55).
  • Lian, X., Zhang, C., Zhang, H., Hsieh, C. J., Zhang, W., & Liu, J. (2017). Can decentralized algorithms outperform centralized algorithms? a case study for decentralized parallel stochastic gradient descent. Advances in Neural Information Processing Systems, 30, arXiv:1705.09056.
  • Liu, X., Li, H., Xu, G., Chen, Z., Huang, X., & Lu, R. (2021). Privacy-enhanced federated learning against poisoning adversaries. IEEE Transactions on Information Forensics and Security, 16, 4574–4588. https://doi.org/10.1109/TIFS.2021.3108434
  • Ma, C., Li, J., Shi, L., Ding, M., Wang, T., Han, Z., & Poor, H. V. (2022). When federated learning meets blockchain: A new distributed learning paradigm. IEEE Computational Intelligence Magazine, 17(3), 26–33. https://doi.org/10.1109/MCI.2022.3180932
  • Majeed, U., & Hong, C. S. (2019). FLchain: Federated learning via MEC-enabled blockchain network. In: 2019 20th Asia-Pacific Network Operations and Management Symposium (APNOMS) (pp. 1–4). https://doi.org/10.23919/APNOMS.2019.8892848
  • Melis, L., Song, C., De Cristofaro, E., & Shmatikov, V. (2019). Exploiting unintended feature leakage in collaborative learning. In: 2019 IEEE symposium on security and privacy (SP) (pp. 691–706). https://doi.org/10.1109/SP.2019.00029
  • Mo, F., Haddadi, H., Katevas, K., Marin, E., Perino, D., & Kourtellis, N. (2021). PPFL: Privacy-preserving federated learning with trusted execution environments. In: Proceedings of the 19th annual international conference on mobile systems, applications, and services (pp. 94–108)
  • Nasr, M., Shokri, R., & Houmansadr, A. (2019). Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning. In: 2019 IEEE symposium on security and privacy (SP) (pp. 739–753). https://doi.org/10.1109/SP.2019.00065
  • Nuding, F., & Mayer, R. (2020). Poisoning attacks in federated learning: An evaluation on traffic sign classification. In: Proceedings of the tenth ACM conference on data and application security and privacy (pp. 168–170).
  • Otoum, S., Al Ridhawi, I., & Mouftah, H. T. (2020). Blockchain-supported federated learning for trustworthy vehicular networks. In: Globecom 2020 – 2020 IEEE global communications conference (pp. 1–6). https://doi.org/10.1109/GLOBECOM42002.2020.9322159
  • Paillier, P. (1999). Public-key cryptosystems based on composite degree residuosity classes. In: International conference on the theory and applications of cryptographic techniques (pp. 223–238). Springer.
  • Phong, L. T., Aono, Y., Hayashi, T., Wang, L., & Moriai, S. (2018). Privacy-preserving deep learning via additively homomorphic encryption. IEEE Transactions on Information Forensics and Security, 13(5), 1333–1345. https://doi.org/10.1109/TIFS.2017.2787987
  • Qi, Y., Hossain, M. S., Nie, J., & Li, X. (2021). Privacy-preserving blockchain-based federated learning for traffic flow prediction. Future Generation Computer Systems, 117(2946), 328–337. https://doi.org/10.1016/j.future.2020.12.003
  • Rückel, T., Sedlmeir, J., & Hofmann, P. (2022). Fairness, integrity, and privacy in a scalable blockchain-based federated learning system. Computer Networks, 202, 108621. https://doi.org/10.1016/j.comnet.2021.108621
  • Sayyad, S. (2020). Privacy preserving deep learning using secure multiparty computation. In: 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA) (pp. 139–142). https://doi.org/10.1109/ICIRCA48905.2020.9183133
  • So, J., Ali, R. E., Güler, B., Jiao, J., & Avestimehr, A. S. (2023). Securing secure aggregation: Mitigating multi-round privacy leakage in federated learning. In: Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 37, pp. 9864–9873).
  • Sotthiwat, E., Zhen, L., Li, Z., & Zhang, C. (2021). Partially encrypted multi-party computation for federated learning. In: 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet computing (CCGRID) (pp. 828–835). https://doi.org/10.1109/CCGrid51090.2021.00101
  • Ullah, Z., Al-Turjman, F., Mostarda, L., & Gagliardi, R. (2020). Applications of artificial intelligence and machine learning in smart cities. Computer Communications, 154, 313–323. https://doi.org/10.1016/j.comcom.2020.02.069
  • Wan, Y., Qu, Y., Gao, L., & Xiang, Y. (2022a). Privacy-preserving blockchain-enabled federated learning for B5G-Driven edge computing. Computer Networks, 204, 108671. https://doi.org/10.1016/j.comnet.2021.108671
  • Wan, Y., Qu, Y., Gao, L., & Xiang, Y. (2022b). Privacy-preserving blockchain-enabled federated learning for B5G-Driven edge computing. Computer Networks, 204, 108671. https://www.sciencedirect.com/science/article/pii/S1389128621005454. https://doi.org/10.1016/j.comnet.2021.108671
  • Wang, N., Yang, W., Wang, X., Wu, L., Guan, Z., Du, X., & Guizani, M. (2022). A blockchain based privacy-preserving federated learning scheme for internet of vehicles. Digital Communications and Networks. https://www.sciencedirect.com/science/article/pii/S2352864822001134. https://doi.org/10.1016/j.dcan.2022.05.020
  • Wang, W., Wang, Y., Huang, Y., Mu, C., Sun, Z., Tong, X., & Cai, Z. (2022). Privacy protection federated learning system based on blockchain and edge computing in mobile crowdsourcing. Computer Networks, 215, 109206. https://www.sciencedirect.com/science/article/pii/S1389128622002936. https://doi.org/10.1016/j.comnet.2022.109206
  • Wang, Y., Fang, W., Ding, Y., & Xiong, N. (2021). Computation offloading optimization for UAV-assisted mobile edge computing: a deep deterministic policy gradient approach. Wireless Networks, 27(4), 2991–3006. https://doi.org/10.1007/s11276-021-02632-z
  • Wei, K., Li, J., Ding, M., Ma, C., & Poor, H. V. (2020). Federated learning with differential privacy: Algorithms and performance analysis. IEEE Transactions on Information Forensics and Security, 15, 3454–3469. https://doi.org/10.1109/TIFS.2020.2988575
  • Wei, W., Liu, L., Loper, M., Chow, K. H., Gursoy, M. E., Truex, S., & Wu, Y. (2020). A framework for evaluating gradient leakage attacks in federated learning. arXiv preprint, arXiv:2004.10397.
  • Xu, G., Liu, Y., & Khan, P. W. (2020). Improvement of the DPoS consensus mechanism in blockchain based on vague sets. IEEE Transactions on Industrial Informatics, 16(6), 4252–4259. https://doi.org/10.1109/TII.2019.2955719
  • Xu, R., Baracaldo, N., Zhou, Y., Anwar, A., & Ludwig, H. (2019). HybridAlpha: An efficient approach for privacy-preserving federated learning. In: Proceedings of the 12th ACM workshop on artificial intelligence and security (pp. 13–23).
  • Yang, Q., Fan, L., Tong, R., & Lv, A. (2021). IEEE federated machine learning. IEEE Federated Machine Learning – White Paper 1-18.
  • Yang, Z., Chen, Y., Huangfu, H., Ran, M., Wang, H., Li, X., & Zhang, Y. (2023). Dynamic corrected split federated learning with homomorphic encryption for U-shaped medical image networks. IEEE Journal of Biomedical and Health Informatics, 27(12), 5946. https://doi.org/10.1109/JBHI.2023.3317632
  • Zamani, M., Movahedi, M., & Raykova, M. (2018). Rapidchain: Scaling blockchain via full sharding. In: Proceedings of the 2018 ACM SIGSAC conference on computer and communications security (pp. 931–948).
  • Zhang, C., Li, S., Xia, J., Wang, W., Yan, F., & Liu, Y. (2020). BatchCrypt: Efficient homomorphic encryption for Cross-Silo federated learning. In: 2020 USENIX annual technical conference (USENIX ATC 20) (pp. 493–506).
  • Zhang, W., Lu, Q., Yu, Q., Li, Z., Liu, Y., Lo, S. K., Chen, S., Xu, X., Zhu, L. (2021). Blockchain-based federated learning for device failure detection in industrial IoT. IEEE Internet of Things Journal, 8(7), 5926–5937. https://doi.org/10.1109/JIOT.2020.3032544
  • Zhang, W., Zhu, S., Tang, J., & Xiong, N. (2018). A novel trust management scheme based on Dempster–Shafer evidence theory for malicious nodes detection in wireless sensor networks. The Journal of Supercomputing, 74(4), 1779–1801. https://doi.org/10.1007/s11227-017-2150-3
  • Zhao, C., Gao, Z., Wang, Q., Xiao, K., & Mo, Z. (2021). AFL: An adaptively federated multi-task learning for model sharing in industrial IoT. IEEE Internet of Things Journal, 1–1. https://doi.org/10.1109/JIOT.2021.3125989
  • Zhao, J., Huang, J., & Xiong, N. (2019). An effective exponential-based trust and reputation evaluation system in wireless sensor networks. IEEE Access, 7, 33859–33869. https://doi.org/10.1109/ACCESS.2019.2904544
  • Zhou, J., Cao, Z., Dong, X., & Lin, X. (2015). PPDM: A privacy-preserving protocol for cloud-assisted e-healthcare systems. IEEE Journal of Selected Topics in Signal Processing, 9(7), 1332–1344. https://doi.org/10.1109/JSTSP.2015.2427113