![MathJax Logo](/templates/jsp/_style2/_tandf/pb2/images/math-jax.gif)
Abstract
Remote sensing is considered to be a key technology for the development of smart cities. By analysing the data obtained by remote sensing, it is effective to manage and develop smart cities. However, with the rapid urbanisation and the increase in data capacity, how to store and retrieve remote sensing data which have more potential value and privacy with a more effective and safe method has become an imminent issue. In this paper, we propose an efficient verifiable and dynamic multi-keyword searchable encryption scheme for remote sensing data in smart cities. Moreover, we construct a double-chain index to support a multi-keyword search, where the Verification Index Chain is used to identify the users and the Keyword Index Chain is used to store the information of keywords. In addition, we adopt extended bitmap technology to realise update operations and verify the correctness and integrity of the search results. Experiments show that compared to the related existing schemes, the search time of the proposed scheme is reduced by 35% at most, the verification time is reduced by 14% at most and the update time is reduced by 57% at most.
Subject classification code:
1. Introduction
In recent years, improving cities, especially metropolises, is an urgent and pressing demand as the urban problems come amid rapidly increasing urbanisation, such as environmental pollution, traffic jams, massive construction, etc (Sun et al., Citation2021; Zhang et al., Citation2022). The practice of smart cities will help alleviate the increasing burden caused by urbanisation and improve the quality of all aspects of the residents’ life. In the process of building a smart city, remote sensing technology plays a very important role in spatial information detection.
Remote sensing technology can be used to detect the city’s change information and capture images of the landscape from which a large amount of valuable data can be obtained, such as where the traffic is heavy, where there is a fire point and so on. These remote sensing data can be used for city management, urban planning, resource exploration and so on (S. C. Liu et al., Citation2022; Ye et al., Citation2021). But as the amount of data increases, local storage systems can not meet the massive storage requirements and cloud storage is seen as the proper method because of its affordability and convenience. However, how to ensure the security of the remote sensing data stored in the cloud, which may contain some private and valuable information, has become an important issue. Although encryption technology guarantees the confidentiality of data, it also poses some new challenges to the management of private data, such as ciphertext search. The proposed searchable encryption (SE) technology solves this problem well, which can obtain the desired data from huge amounts of encrypted data by querying keywords without retrieving all the encrypted data (Yan et al., Citation2022).
Due to the confidentiality and privacy of remote sensing data, it has higher requirements for searchable encryption schemes in terms of security, efficiency and functionality (Tang et al., Citation2021; Yan et al., Citation2022). In terms of efficiency, it will be essential to build indexes to improve search efficiency which is related to the number of keywords rather than the number of files. In terms of functionality, most of the previous SE schemes (Fu et al., Citation2019; Y. X. Li et al., Citation2018; Shao et al., Citation2022) are about static databases, but in practical applications, users need to update the privacy data in the cloud. At the same time, if the cloud is malicious, it may execute the portion of the search request or output part of the search results. Therefore, the correctness and integrity of the search results, it needs to be verified (Chen et al., Citation2021; H. Y. Wang et al., Citation2020b). In terms of security, when the index and database are updated, the previous search trapdoor can continue to query data and leak the updated information, so forward security needs to be considered (Song et al., Citation2020; Tang et al., Citation2021).
In some existing schemes (Q. Liu et al., Citation2017, June; Wu et al., Citation2019; Yu et al., Citation2013), the index structure is based on vectors or lists, which causes a drastic decrease in search efficiency as the number of files increases. Schemes (Q. Liu et al., Citation2017, June; Wu et al., Citation2019) require the cloud to access all files of the index to acquire the results. However, these schemes require a lot of time to access each file index, which leads to a linear increase in search time as the number of files increases. The indexes of schemes (Fu et al., Citation2019; Yu et al., Citation2013) are based on vectors that also traverse all file vectors during the search, resulting in a positive correlation between the search time and the number of files. In recent years, due to frequent incorrect or incomplete search results, some verifiable SE schemes have been proposed (Q. Liu et al., Citation2019; H. Wang et al., Citation2020a; Zhu et al., Citation2018). Schemes (Chen et al., Citation2021; Q. Liu et al., Citation2019) are both accumulator-based verification schemes, but the accumulator requires modulo-power operations which cost a large time overhead in verification. Schemes (H. Wang et al., Citation2020a; Zhu et al., Citation2018) are based on the Merkle tree, which results in a huge time overhead to maintain and manage the indexes. In addition, searchable encryption schemes (Hoang et al., Citation2021; Y. X. Li et al., Citation2018) supporting dynamic updates have been proposed, which can add and delete the uploaded files, but security problems come along, for example, scheme (Y. X. Li et al., Citation2018) does not consider whether the previous trapdoor is still valid after update operation.
In summary, verifiable SE schemes tend to require large time or space overheads and dynamic SE schemes either do not have guaranteed security or consume large local storage overheads. We introduce an efficient, verifiable and dynamic multi-keyword searchable encryption (EVDMSE), which optimises the search efficiency and achieves forward security. Then, the contributions of our scheme are presented as follows.
Storing and protecting the remote sensing data. The paper proposes a scheme for the storage and management of remote sensing data, in which the attackers and unwanted third parties can’t obtain the sensitive data arbitrarily.
Dynamic update. The proposed scheme supports the dynamic update of the files and also ensures forward security. When the user submits the updated information, the relative address information of the KIC changes accordingly. This way ensures that the address information contained in the previous search trapdoor is invalidated and the previous trapdoor cannot access the index.
Multi-keyword and verifiable search. We adopt extended bitmap technology to realise multi-keyword joint query and HMAC function to compress the bitmap for verifying the correctness and integrity of the search results (Shao et al., Citation2022). The extended bitmap technology is modified to support multi-keyword query and update operations compared with the traditional bitmap technology (Cong et al., Citation2019). HMAC function is utilised to compress the keywords and extended bitmap to verify whether the results are correct.
Efficient access. It is proved by theoretical analysis and simulation experiments that our scheme has obvious advantages in efficiency, such as saving at most 6 ms in search time, saving 5.9 ms in verification time and saving 0.16 ms in update time compared to the scheme (F. Li et al., Citation2021).
The organisation of the rest paper is as follows. We introduce the related work in Section 2. In Section 3, we introduce the preliminaries. We show the problem formulation in Section 4. In Section 5, the details of the algorithmic construction are described. Then, we analyse the security of the proposed scheme in Section 6. Next, the performance analysis of the program is detailed in Section 7. Ultimately, we summarise our article in Section 8.
2. Related work
Based on the remote sensing data, we propose a forward secure multi-keyword SE scheme in which the verification of search results and dynamic update of the database are implemented.
Dynamic update means that the user is allowed to add and delete the data after uploading it. To meet the demand for dynamic updates, Chang et al. (Chang & Mitzenmacher, Citation2005) first proposed a dynamic SE scheme that implements update operations in cloud servers. The drawback of the scheme is that it would leak additional information due to the newly added files associated with the previous query trapdoor. To reduce and even avoid such leakage, Stefanov et al. (Stefanov et al., Citation2014) proposed the notion of forward security in 2014. The goal of forward security is that the malicious cloud cannot obtain any information from the existing database during the update phase. They also proposed a forward security scheme similar to ORAM (Oblivious Random Access Machine), which explains that a recently added file may be associated with a previous trapdoor and the same trapdoor can leak the information when searching again. In 2016, Bost (Bost, Citation2016) proposed a forward secure SE scheme with low communication cost and low computational overhead, which hides the relation between file identifiers and keywords by trapdoor substitution. In recent years, with the widespread implementation of forward secure search encryption, the technique has been used in more scenarios. In 2020, Du and Wang (Citation2020, December) applied the blockchain to forward secure search encryption, which not only improves the query efficiency but also achieves verification with the help of blockchain technology. In 2021, Tang et al. (Citation2021) proposed a multi-user forward secure SE scheme in P2P, which solves the problems of ciphertext search and data update in the P2P environment. However, the index of the scheme relied on the number of files rather than the number of keywords, so the search efficiency decreased as the number of files increased.
Verification means how to verify the correctness and integrity of the query results returned by the cloud. Soleimanian et al.(Soleimanian & Khazaei, Citation2019) introduced a verifiable symmetric encryption scheme with the help of a third-party server. Q. Liu et al. (Citation2019) proposed a verifiable SE scheme using bitmap and RSA accumulator, which has a performance penalty in modular exponentiation during the verification phase. Wang and Cheng (Citation2019) proposed an efficient aggregated-key verifiable SE scheme that greatly saves computational overhead in the search and verification phases. However, it is anchored on public-key encryption, which requires more time costs and does not support dynamic updates. H. Wang et al. (Citation2020a) introduced a verifiable and dynamic SE scheme that uses Merkle trees and a time-stamp chain to verify the results, but it requires a large computational cost to update the Merkle trees.
Based on the problems above, features such as verifiable multi-keyword search, supporting dynamic updates, ensuring update security and relying on the keyword are desirable in a SE scheme. We list the functions of some mentioned schemes (Du, Citation2020, December; Tang, Citation2021; Soleimanian, Citation2019; Wang and Cheng, Citation2019; Wang, Citation2020a) in , it can be seen that the proposed scheme can satisfy all the required features.
Table 1. Functionality in various schemes.
3. Preliminaries
3.1. Extended bitmap index
The extended bitmap index is a modification of the bitmap index that uses bits to represent a file instead of one bit. When there are
files in a dataset, we use
bits to represent a file and a
-length string
to denote the dataset. If file
exists in the dataset, the strings of extended bitmap from
to
represent
, where the
-th is set to 1. For example, when
,
, we present the creation, addition, deletion and joint search for the database in the form of an extended bitmap, as shown in Figure . (a) shows the creation of an extended bitmap index, which includes files
,
and
; Figure (b) presents the add operation, which adds the file
to the index; Figure (c) illustrates the delete operation, which removes the file
from the index; Figure (d) indicates the joint search when the query keywords include the keywords
,
and
, each keyword corresponds to an extended bitmap index
. After the joint query, the three
corresponding to
,
and
are retrieved to do add operation, and if there exists the
-th bit to
-th bit is “011”, it means that the file
contains both keywords
,
and
. In Figure (d), the files
and
satisfy the query, and then they are returned.
3.2. A double-chain index
We construct a double-chain index structure in the proposed scheme. One is called a Verification Index Chain (VIC), which applies a one-way hash function to encrypt the user’s identity and establishes a one-way chain to control the access and modification rights to the index for the user. The other chain is called Keyword Index Chain (KIC) which is used to establish the index between the keywords and the files. In addition, the KIC is attached to the VIC and the position of the KIC is recorded with the counter . VIC utilises a one-way hash function to encrypt the user’s identity to realise that the current address cannot be inferred from the previous address. Each node of the KIC stores the current node address
, the next node address
and information about the keyword
including keyword, bitmap and verification function, where the last node points to null, as shown in Figure .
When an update operation is performed, the position of the KIC is moved to the next node of the VIC, that is, the address of the first node of the KIC and the counter is modified. Due to the one-way hash function, the previous trapdoor cannot access the updated KIC.
Based on the user’s identity, the system applies a hash function to create a VIC of length .
is determined by the system, as shown in Equation (1).
(1)
(1) where
means the user’s identity and
means the one-way hash function, as shown in Equation (2).
(2)
(2)
Since is a one-way hash function, when
is known, the hash value before
cannot be inferred, i.e.
cannot be inferred (He et al., Citation2020).
KIC is attached to a node of VIC, initially attached to the VIC’s first node. When updated, the KIC moves backward and the update counter records the position of the KIC. The KIC can be accessed only after the address of the KIC’s first node is obtained, and each node on the KIC contains keywords and related information. When updated, the address of the KIC is modified and the location of the KIC is recorded using . Then, according to the update request, the node of the index is modified.
4. Problem formulation
4.1. System model
Figure presents the system model of the management of remote sensing data. The cluster head receives data from sensor nodes which can collect information about the surrounding situation, and then it sends the remote sensing data to the satellite through electromagnetic waves. The users gather the data obtained by the satellite, extract the data into the form of a file and then perform searchable encryption to ensure its security and privacy. In the management of remote sensing data, it is assumed that the cloud server is semi-honest and can record and analyse information about file ciphertext and trapdoors. In the encryption phase, users extract the index of the files about remote sensing data, encrypt the files and then upload them to the cloud server together. In this case, neither the cloud server nor the attackers nor unwanted third parties would obtain the sensitive data. When a user wants to retrieve the files, he can submit the query trapdoor generated by the query keywords. And then the cloud server performs the search operation and returns the search results that matched the query to the user.
4.2. Algorithm definition
Since the proposed scheme is for a two-party model, the data user is the same as the data owner who uploaded the data. In the model, the data owner shares the data with the cloud server to save local storage, and then they can access their own data and also add or delete their uploaded data when there is a need. The proposed scheme consists of five algorithms: ,
,
,
and
, which are defined as follows.
. The algorithm is executed by the system. The system selects security parameters
as inputz and output keyz
,
,
, private key
.
.The algorithm is executed by the user. The user encrypts the files set
based on the identity
to generate the ciphertexts set
and the
.
. The algorithm is executed by the user. The user enters
,
and the keywords set
and outputs the search trapdoor
.
. The algorithm is executed by the cloud server. The cloud server enters
, outputs the node information
, ciphertext
and the encrypted keywords set match successfully
.
. The algorithm is executed by the user. The user enters
,
and
and the algorithm would output the result
, 1 denotes the user accepts the search result, otherwise, rejects it. (reject/accept).
.The algorithm is executed by the user. The user inputs the operation type
, the file
, the keywords in the file
the user’s identity
and outputs the address of KIC
and information about each update node
.
4.3. Security model
There is an “honest but curious” cloud server in the proposed scheme that will execute search commands honestly as required, but will also record private data or analyse the search pattern. We demonstrate the security of the scheme through the real game and the simulated game
, where the leakage function is defined as
. We define a query
and an update operation
, where
denotes the operation time,
represents the keyword and
means the extended bitmap of files. We define the search pattern as
which can distinguish the same query and the resulting pattern as
which can leak the search results. The update time leakage function for each keyword is defined as
, where
denotes the adaptive adversary and
is a probabilistic polynomial time simulator. The definitions of the two games are as follows.
: Adversary
selects files and then the system executes the encryption algorithm and returns ciphertext to
. The adversary
can send three adaptive queries: the search query, the add query and the delete query. Eventually,
outputs
as result.
: The adversary
selects files, and then the simulator
generates simulated ciphertext based on the leakage function
, and then sends the ciphertext to
. The adversary
can also send a polynomial number of adaptive queries. Based on the queries and leakage function
and
,
generates search commands, adds commands deletes commands and sends them to
. Eventually,
outputs
as result.
Definition 4.1:
For a polynomial-time adversary , a scheme satisfies adaptive security, if
can make
, where the function
is negligible.
4.4. Forward security
Forward security is a powerful property defined in dynamic symmetric searchable encryption (Tang et al., Citation2021). In a word, the cloud cannot query updated information using a previous search trapdoor.
Definition 4.2:
The adaptive secure scheme is said to satisfy forward security if the leakage function in the update phase is , where
denotes the type of operation,
denotes the pair of updated file-keyword and
denotes the number of keywords corresponding to the file
.
In the proposed scheme, since the update is based on an extended bitmap index, the add/delete operations are performed as a Modular Addition function so the cloud finds hard to distinguish the operations. The leakage function is defined as , where
is the number of updated keywords.
5. Detailed scheme
: The system generates the keys
and the symmetric key
, where
is a secure symmetric encryption algorithm, and then sends
to the user. The system generates a pseudo-random function
and a hash function
.
:The system confirms the identity of all users
, and the user retrieves the keyword set
from the file set
. The user encrypts
to get a set of ciphertexts
with the symmetric encryption algorithm
. The user
generates VIC and KIC with the method in Section 3.2. The VIC contains several nodes
, of which
is determined by the scale of files
indicating the position of KIC in VIC,
is the counter. The node of KIC, shown in Figure , contains three types of information: the first is the current node address
, the second is the next node address
and the third is the information about the keyword
which includes the encrypted keyword
, extended bitmap index
, bitmap verification function
and the file verification function set
. In detail, the verification function
is used to verify the validity of
. For each file
, generate the file verification function
and store it in the set
for verifying the completeness of the results, in which
denotes the encrypted file set containing
. The user
generates exclusive KIC
which includes the keyword node
. The detailed index generation is shown as Algorithm 1. Eventually, the user
sends the index
and ciphertext set
to the cloud, in which
and
denote exclusive VIC and KIC for the user
.
Table
: The algorithm is executed by the user and the detail is shown in Algorithm 2. The user applies the identity
and the counter
to generate
, which locates the address of the VIC to access the KIC. The user encrypts each query keyword to generate a set
. Set the threshold
, which is used to detect which files on
meet the query requirement in Algorithm 3. Finally, the user sends the query trapdoor
to the cloud.
Table
: The algorithm is executed by the cloud and the details are shown in Algorithm 3. First, the cloud finds the corresponding VIC according to the
, and then executes the search process with the nodes in turn in the associated KIC. Second, the cloud accesses each node according to the node address
and determines whether the keyword of the node is a query keyword. If the match is successful, the keywords would be stored in the set
, and the node information
is retrieved; otherwise, no node information is recorded and the next node is accessed according to
. Then, the cloud server does the joint operation on all the retrieved
(See Section 3.1), if the associated bits in the computed
are equal to
, it means that the file that the matched bit represents satisfies the query requirement, and then stores the encrypted file in the set
. For each encrypted keyword
and ciphertext
matched successfully, the function
is retrieved from the
and stored in the set
. For each
, the cloud server returns
to the user.
Table
: When the user receives the search results returned by the cloud, he can execute the verification algorithm, as shown in Algorithm 4. First, for each
of the returned results, verify whether Equation (2) holds. If the equation holds, it means that the returned
is correct.
(3)
(3) Then perform the joint operation on
about each
and determine whether the result is the same as
from the returned
. Finally, for each
and
, verify whether Equation (4) holds. This program is to determine whether the files corresponding to the keyword are complete.
(4)
(4) If the equation holds, the user accepts the results.
Table
:
Add: When the user wants to add the file
, extracts the keywords set
from the file, and does the updating operation as Algorithm 5. The user first modifies the VIC according to the identity
, then modifies the node position of the KIC with the
, meanwhile alters the counter
and the address of the first node
in the KIC. For each
, the user adds the
about the new file to the original
and also generates
. And then user generates a file verification function
and adds it to the new authentication set
. At last, the user sends the updated address
and
to the cloud.
Delete : The operation is the same as add operation. The user first modifies the VIC and KIC. For each
, the user performs an addition of the
after complement operation to obtain the updated
and modifies
. The user removes
from
and obtains the new authentication set
. Finally, the user sends the updated address
and
to the cloud.
Table
6. Security analysis
6.1. Adaptive security
The proposed scheme satisfies adaptive security under the random oracle model.
Theorem 6.1:
Given a CPA-secure symmetric encryption algorithm, a pseudo-random function , a hash function
and a one-way hash function
, if the leakage function
satisfies the following definitions:
,
,
, EVDMSE presented in section 5 is the adaptive security.
Proof:
A simulator is used to simulate the real algorithm, and no polynomial-time adversary can distinguish between the real and simulated algorithms.
simulates the algorithm according to the leakage functions
.
Game :
represents the real EVDMSE security game
, so that
Game
:
simulates Algorithm 1
according to the leakage function
. When generating the key
,
picks a random string instead of calling
(see line 5 of Algorithm 1). For each user’s identity
,
encrypts the identity with
and records the result
with a table
. When the update is performed, the adversary
can use the content in
. If the encrypted identity exists, it directly calls it; if not,
generates encrypted identity and places it in
.
and
are indistinguishable by the indistinguishability between
and random function. Therefore, there is an adversary
meeting:
Game
:
simulates the Algorithm 5
according to the leakage function
. Since the leakage functions of both add and delete operations are
, which only leaks the number of keywords, so we discussed them together.
is exactly as
except that we apply random strings instead of the hash values
and
in the update phase, and record the value with the dictionaries
and
which are used to record the results of
and
, respectively in random queries, and the table
is used to store the update requests and results. For each update operation,
first generates
for the update file and random key
, then invokes the HMAC function to generate the authentication functions for
and keywords and finally deposits the value into the table
. Under the random oracle model, if there is an adversary
that can distinguish between
and
, it can also distinguish between hash functions and random functions, which means
can make:
Game
:
simulates the query trapdoor based on the leakage function
. In addition, two tables
and
are used to record query trapdoors and search results, and the table
records the hash value
during the trapdoor generation. During the search process, the simulator determines the duplication of the same keyword queries based on the leak function
, but it works only before the update operation, because after the occurrence of the update, the previous trapdoors is invalid. The result pattern leak function is
, which only leaks the file identifier in the binary form
and then
is used to record the file identifiers matching the query keywords. During the trapdoor generation, if the query keyword is issued for the first time,
uses a random number instead of the encrypted identity
and then performs hash operations based on the counter
instead of directly obtaining
, finally stores the hash results in
. Meanwhile,
generates the position of KIC
with random numbers and random keys
, encrypts the query keywords with symmetric encryption algorithms, stores the query trapdoors in
and the search results in
. When performing the search,
first checks table
. If there is a corresponding trapdoor, directly employ the trapdoor in
and get the search result from
; if not, generate a new query trapdoor and place it in the table
. The main difference between
and
is that the query trapdoor is generated randomly which is indistinguishable by the adversary
, so that we can make:
Simulator
: The difference between
and
is that the simulator
in
uses the leak function
to generate a query trapdoor without using the encrypted identity
. For the adversary
, the simulator generates a search trapdoor of the same size which is perfectly indistinguishable from
, and thus
Conclusion: Based on the above games and the simulator
,
is a pseudo-random function and both
and
are hash functions, the following equation can be obtained:
In summary, Theorem 1 stands.
6.2. Forward security
Corollary 6.1:
When is a pseudo-random function and
and
are hash functions, the
adaptive scheme satisfies forward security when the leakage function satisfies
.
Proof:
In the search protocol, the cloud can learn the encrypted keywords from the search trapdoor. But the keywords are encrypted by a secure symmetric encryption algorithm, so the cloud server can learn nothing except the number of encrypted keywords. In addition, the cloud can obtain the encrypted identity of the user during the search process, and since the identity is processed through a hash function, no information other than the encrypted identity is disclosed. In the update protocol, the cloud can obtain the number of updated keywords and the extended bitmap of the updated file. However, the first node address of KIC points to the next node of VIC and the first node address is processed by the one-way hash function, so that the current address cannot be inferred from the previous node address. Due to the change of address, the previous trapdoor is invalid after updating, which effectively avoids the association between the generated search trapdoor and the new file. In summary, the proposed scheme satisfies forward security.
7. Performance
7.1. Theoretical analysis
The proposed scheme will be compared with Chen’s scheme (Chen et al., Citation2021), Liu’s scheme (Q. Liu et al., Citation2019), Li’s scheme (F. Li et al., Citation2021) and Bost’s scheme (Bost, Citation2016) in terms of functional aspects such as keyword types, verifiability and time overheads such as search cost and update cost, as well as storage overheads such as client storage and index size, as shown in Table .
Table 2. Comparison of functionality and overhead.
and
denote the number of all keywords and all files, respectively;
indicates the average number of keywords contained in a file, where
;
represents the number of query keywords;
represents the average number of files corresponding to a keyword, where
;
means the pseudo-random function operation;
means the hash function operation;
denotes the HMAC function operation;
indicates the trapdoor function operation and its inverse operation;
means the modulo exponentiation operation;
means the obtaining address operation;
is the symmetric encryption.
Functionality: (1) Search type. Bost’s scheme, Chen’s scheme and Liu’s scheme only support the single-keyword search, while Li’s scheme and the proposed scheme support the multi-keyword search. (2) Verifiability. Bost’s scheme does not support the verification, while other schemes support the verification of search results. Unlike the above three verifiable schemes, the proposed scheme does not use an RSA accumulator but employs the HMAC function which reduces some verification costs.
Time overhead: (1) Search overhead. From Table , it can be seen the search overhead of the proposed scheme is much smaller than that of Bost’s scheme and Chen’s scheme, and the same as that of Liu’s scheme. However, the time of Liu’s scheme only spends on searching a single keyword, while our scheme takes the same amount of time to finish a multi-keyword search. The search cost of Li’s scheme is associated with the number of query keywords, which is far smaller than the number of all keywords, but it has to traverse the keyword index for every query keyword, which requires hash function operations and modulo exponentiation operations that is very time-consuming. The required search time increases linearly with the number of query keywords. During the search process, the proposed scheme traverses the keyword nodes in KIC and retrieves information on the nodes that satisfy the query keywords, which does not require additional modulo exponentiation operations compared to Li’s scheme. (2) Update overhead. Bost’s scheme requires trapdoor function operation for all keywords in an index, while the other schemes are all concerned with the average number of keywords contained in a file, so the updating overhead is much more efficient than Bost’s scheme. Chen’s scheme is similar to Liu’s scheme, which performs a hash operation for each update keyword and then generates the corresponding update information for each update keyword, but Liu’s scheme, which requires more hash operations and addresses fetching operations, requires relatively more time overhead. Li’s scheme requires pseudo-random function operation, hash function operation and modulo exponentiation operation for each update keyword. The proposed scheme performs symmetric encryption and HMAC operations for each update keyword while hashing the user’s identity and calculating the address of the KIC.
Storage overhead: (1) User storage. Except for the proposed scheme, other schemes need to store files contained in each keyword, so for all keywords, the storage overhead can reach
. In our scheme, the user only stores the keyword information for verification and the rest of the verification information is from the cloud. In the update phase, the user needs to store numeric strings
of each keyword, which saves the storage overhead from
to
compared with other schemes. (2) Index size. In Bost’s scheme, the index was built according to all the keywords and the file information, which was processed by the hash function, was stored together with keywords. In Li’s scheme, the index stores information on all keywords and the verification information of all documents. The proposed scheme is similar to Liu’s scheme and Chen’s scheme in that the index needs to store each keyword and the corresponding file information, so that index size is determined by the number of keywords.
In summary, through the above analysis, our scheme has certain merits in terms of functionality, time overhead and storage overhead compared with the above four schemes.
7.2. Experimental analysis
7.2.1. Experimental environment
NSF Basic Research Awards which contains 129,000 files from 1990 to 2003 is used as the test dataset in this experiment. 5000 files are selected as test data and the 10 keywords with the highest frequency in each file that has been excluded. The experiments are based on AMD Ryzen 5 4600H CPU and 16G DDR4 RAM. The programs are implemented in Python 3.7 and compiled using PyCharm 2020.2.3.
7.2.2. Experimental results
Through the theoretical analysis in Table , it is clear that our scheme has certain advantages in terms of time overhead. For more accuracy in estimating the performance, we analyse the efficiency from the search time, update time and verification time through simulation experiments in this section.
Figure (a) indicates the trend of search time with an increasing number of files for the single-keyword search. Our scheme is compared with Bost’s scheme and Chen’s scheme. The search time of Bost’s scheme increases linearly with the number of files in the database, and also has the fastest growth rate. The search time of Chen’s scheme also grows linearly with the number of files, but the rate is lower than that of Bost’s scheme. The search time of the proposed scheme is less affected by the number of files but related to the number of keywords, so the search time increases smoothly. As the number of documents grows, the search time grows slower. We can see that the rate from 2000 to 5000 is lower than the rate from 1000 to 2000.
Figure 5. Search time, (a) search time for single keyword, (b) search time with file set size = 1000, and (c) search time with file set size = 5000.
![Figure 5. Search time, (a) search time for single keyword, (b) search time with file set size = 1000, and (c) search time with file set size = 5000.](/cms/asset/db95d8db-9823-4ce7-aefa-d0f90820daab/ccos_a_2165638_f0005_oc.jpg)
Figure (b,c) show the trend of the search time with the increase in the number of query keywords. Li’s scheme searches the index once and performs a modulo exponentiation operation for each query keyword, so the search time increases rapidly with increasing query keywords. The proposed scheme only traverses the index once during the whole search process, and when the query keyword is matched, the cloud accesses the verification information, so the search time grows slowly with increasing query keywords. With the number of query keywords increasing, the gap between the proposed scheme and Li’s scheme is widening, in which the gap can reach 6 ms when the number of query keywords is 5. When searching a single keyword, the search time of the proposed scheme increases from 9.4 to 16 ms when moving from 1000 to 5000 database files. When searching with 5 query keywords, the search time will increase from 10.2 to 17.3 ms. The growth of the search time for the proposed scheme stays within 7–8 ms as the number of files increases.
Figure shows the verification time. Figure (a) indicates the trend of the verification time for the single-keyword search with the increasing number of files. With the same number of documents, both Liu’s scheme and Chen’s scheme take more time to verify than the proposed scheme. When the number of files increases, the verification time of Liu’s scheme and Chen’s scheme grows rapidly, while our scheme grows at a slow rate because only the HMAC operations are required for the verification process.
Figure 6. Verification time, (a) verification time for single keyword, (b) verification time with file set size = 1000, and (c) verification time with file set size = 5000.
![Figure 6. Verification time, (a) verification time for single keyword, (b) verification time with file set size = 1000, and (c) verification time with file set size = 5000.](/cms/asset/ba43b785-a3c8-4c74-a3ab-6943bb25c640/ccos_a_2165638_f0006_oc.jpg)
Figure (b,c) present the changing trend of the verification time with the different query keywords. The gap between the proposed scheme and Li’s scheme is widening with the increase of query keywords. When the number of database files is 1000, the gap goes from 1.2 to 5.9 ms. When the number of database files is 5000, the gap goes from 0.5 ms to 3.6 ms. When the number of keywords is 1 and the number of database files is 1000, Li’s scheme and the proposed scheme take 27 and 25.8 ms, respectively; when the number of database files is 5000, it takes 33 and 32.5 ms, respectively. When searching with 5 keywords in 1000 database files, the verification time needs 43 and 37.1 ms. When the number of database files grows up to 5000, the verification time needs 46.2 and 42.6 ms. We can see that the growth of the verification time in the proposed scheme stays within 6–7 ms when the number of files increases. With the number of query keywords increasing, the gaps between the proposed scheme and Li’s scheme are widening.
Figure indicates the trend of update time as the number of extracted keywords increases. Liu’s scheme takes more verification time than other schemes because it requires inserting the updated information into the verification matrix, it also needs to expand and move the array. Li’s scheme establishes an index for each updated keyword and uses the modulo exponentiation operation to create verification functions for the keywords, so it takes about the same time as the proposed scheme when the number of keywords is less, but the verification time grows rapidly as the number of keywords increases. In Chen’s scheme, buffers are used to store the updated information, thus it takes less time than that of Liu’s scheme. The proposed scheme modifies the extended bitmap index for each updated keyword node and generates a verification function when updating, which saves much time overhead.
8. Conclusion
In this paper, we construct a verifiable forward security multi-keyword searchable encryption scheme which is based on remote sensing data in smart cities. The index of the proposed scheme not only achieves authentication but also improves search efficiency. To allow a multi-keyword search, we introduce an extended bitmap index that also supports updating. In addition, we prove the proposed scheme meets adaptive forward secure under the random oracle model. Last, through the theoretical and experimental analysis, the proposed scheme has certain advantages in efficiency compared to similar schemes. The scheme is suitable for ensuring the safety of remote sensing data and retrieving data from the cloud server effectively. However, our paper also has some limitations, such as when the keyword of the added file is not in the index, the added operation will fail because of inserting a new node into the index.
In future, we will continue the research on multi-keyword searchable encryption. We plan to store the index in a trusted service (e.g. blockchain), store the ciphertext in the cloud server, and make the trusted service manage the index. When the added keyword is not in the index, the trusted service can generate the node of the KIC, which ensures that the information is not leaked to the cloud server.
Acknowledgements
We would like to express our gratitude to the foundation for their generous support, and also thank the editors and reviewers for their efforts.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Additional information
Funding
References
- Bost, R. (2016). ∑Oφo: Forward secure searchable encryption. The 2016 ACM SIGSAC Conference, Vienna Austria, October 24-28, 2016.
- Chang, Y. C., & Mitzenmacher, M. (2005, June 7–10). Privacy preserving keyword searches on remote encrypted data. Applied Cryptography and Network Security, Third International Conference, ACNS 2005, New York, NY, USA, Proceedings.
- Chen, C. M., Tie, Z., Wang, E. K., Khan, M. K., Kumar, S., & Kumari, S. (2021). Verifiable dynamic ranked search with forward privacy over encrypted cloud data. Peer-to-Peer Networking and Applications, 14(5), 2977–2991. https://doi.org/10.1007/s12083-021-01132-3
- Cong, Z., Sun, S.-F., Liu, J., Shao, J., Pieprzyk, J., Pieprzyk, J., Shao, J., Liu, J., Sun, S.-F., & Zuo, C. (2019). Dynamic searchable symmetric encryption with forward and stronger backward privacy. Springer.
- Du, R., & Wang, Y. (2020, December). Verifiable blockchain-based searchable encryption with forward and backward privacy. 16th International Conference on Mobility, Sensing and Networking (MSN, Japan, December 17-19, 2020.), Japan, December 17-19, 2020.
- Fu, Z., Huang, F., Sun, X., Vasilakos, A. V., & Yang, C. (2019). Enabling semantic search based on conceptual graphs over encrypted outsourced data. IEEE Transactions on Services Computing, 12(5), 813–823. https://doi.org/10.1109/TSC.2016.2622697
- He, K., Chen, J., Zhou, Q., Du, R., & Xiang, Y. (2020). Secure dynamic searchable symmetric encryption with constant client storage cost. IEEE Transactions on Information Forensics and Security, 16(1), 1–1. https://doi.org/10.1109/tifs.2020.3033412
- Hoang, T., Yavuz, A. A., & Guajardo, J. (2021). A secure searchable encryption framework for privacy-critical cloud storage services. IEEE Transactions on Services Computing, 14(6), 1675–1689. https://doi.org/10.1109/TSC.2019.2897096
- Li, F., Ma, J., Miao, Y., Jiang, Q., Liu, X., & Choo, K. K. R. (2021). Verifiable and dynamic multi-keyword search over encrypted cloud data using bitmap. IEEE Transactions on Cloud Computing, 28(5). https://doi.org/10.1109/TCC.2021.3093304
- Li, Y. X., Zhou, F. C., Qin, Y. H., Lin, M. Q., & Xu, Z. F. (2018). Integrity-verifiable conjunctive keyword searchable encryption in cloud storage. International Journal Of Information Security, 17(5), 549–568. https://doi.org/10.1007/s10207-017-0394-9
- Liu, Q., Nie, X., Liu, X., Peng, T., & Wu, J. (2017, June). Verifiable ranked search over dynamic encrypted data in cloud computing. IEEE/ACM 25th International Symposium on Quality of Service (IWQoS, Barcelona, June 14–16, 2017.), Barcelona, June 14-16,2017.
- Liu, Q., Tian, Y., Wu, J., Peng, T., & Wang, G. (2019). Enabling verifiable and dynamic ranked search over outsourced data. IEEE Transactions on Services Computing, 15(1), 69–82. https://doi.org/10.1109/TSC.2019.2922177
- Liu, S. C., Bovolo, F., Bruzzone, L., Tong, X. H., & Du, Q. (2022). Editorial foreword to the special issue on recent advances in multitemporal remote-sensing data processing. Ieee Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 15(1), 776–778. https://doi.org/10.1109/JSTARS.2022.3140594.
- Shao, J., Lu, R., Guan, Y., & Wei, G. (2022). Achieve efficient and verifiable conjunctive and fuzzy queries over encrypted data in cloud. IEEE Transactions on Services Computing, 15(1), 124–137. https://doi.org/10.1109/TSC.2019.2924372
- Soleimanian, A., & Khazaei, S. (2019). Publicly verifiable searchable symmetric encryption based on efficient cryptographic components. Designs, Codes and Cryptography, 87(1), 123–147. https://doi.org/10.1007/s10623-018-0489-y
- Song, X., Dong, C., Yuan, D., Xu, Q., & Zhao, M. (2020). Forward private searchable symmetric encryption with optimized I/O efficiency. IEEE Transactions on Dependable and Secure Computing, 17(5), 912–927. https://doi.org/10.1109/TDSC.2018.2822294
- Stefanov, E., Papamanthou, C., & Shi, E. (2014). Practical dynamic searchable encryption with small leakage. Network & Distributed System Security Symposium, 24(2), 765–774. https://doi.org/10.14722/ndss.2014.23298
- Sun, W., Liu, K., Ren, G., Liu, W., Yang, G., Meng, X., & Peng, J.. (2021). A simple and effective spectral-spatial method for mapping large-scale coastal wetlands using China ZY1-02D satellite hyperspectral images. International Journal of Applied Earth Observation and Geoinformation, 104(1), 102–572. https://doi.org/10.1016/j.jag.2021.102572
- Tang, Y., Li, J., Yan, X., & Zhao, Q. J. M. I. S. (2021). Edge-cloud-assisted multiuser forward secure searchable encryption (EMFSSE) scheme in the P2P networking environment. Encryption (EMFSSE) Scheme in the P2P Networking Environment, 2021(6), 1–14. https://doi.org/10.1155/2021/7003252
- Wang, H., Fan, K., Li, H., Ya Ng, Y. & Applications. (2020a). A dynamic and verifiable multi-keyword ranked search scheme in the P2P networking environment. Peer-to-Peer Networking, 13(16), 2342–2355. https://doi.org/10.1007/s12083-020-00912-7
- Wang, H. Y., Fan, K., Li, H., & Yang, Y. T. (2020b). A dynamic and verifiable multi-keyword ranked search scheme in the P2P networking environment. Peer-to-Peer Networking and Applications, 13(6), 2342–2355. https://doi.org/10.1007/s12083-020-00912-7
- Wang, X., & Cheng, X. (2019). Efficient verifiable key-aggregate keyword searchable encryption for data sharing in outsourcing storage. IEEE Access, 99(8), 11732–11742. https://doi.org/10.1109/ACCESS.2019.2961169
- Wu, C. F., Ti, Y. W., Kuo, S. Y., & Yu, C. M. (2019). Benchmarking dynamic searchable symmetric encryption with search pattern hiding. 2019 International Conference on Intelligent Computing and Its Emerging Applications (ICEA, China, August 3-6, 2019.), China, August 3–6, 2019.
- Yan, X. X., Yin, P., Tang, Y. L., & Feng, S. W. (2022). Multi-keywords fuzzy search encryption supporting dynamic update in an intelligent edge network. Connection Science, 34(1), 511–528. https://doi.org/10.1080/09540091.2021.2023097
- Ye, M., Ruiwen, N., Chang, Z., He, G., Tianli, H., Shijun, L., Yu, S., Tong, Z., & Ying, G. (2021). A lightweight model of VGG-16 for remote sensing image classification. Ieee Journal Of Selected Topics In Applied Earth Observations and Remote Sensing, 99(14), 6916–6922. https://doi.org/10.1109/JSTARS.2021.3090085
- Yu, J., Lu, P., Zhu, Y., Xue, G., & Li, M. (2013). Toward secure multikeyword top-k retrieval over encrypted cloud data. IEEE Transactions on Dependable and Secure Computing, 10(4), 239–250. https://doi.org/10.1109/TDSC.2013.9
- Zhang, N. H., Wang, Y. B., & Feng, S. (2022). A lightweight remote sensing image super-resolution method and its application in smart cities. Electronics, 11(7), 1050. https://doi.org/10.3390/electronics11071050
- Zhu, J., Li, Q., Wang, C., Yuan, X., Wang, Q., & Ren, K. (2018). Enabling generic, verifiable, and secure data search in cloud services. Ieee Transactions on Parallel and Distributed Systems, 29(8), 1721–1735. https://doi.org/10.1109/TPDS.2018.2808283