Spatial-temporal Offshore Current Field Forecasting Using Residual-learning Based Purely CNN Methodology with Attention Mechanism


ABSTRACT

Spatial-temporal current forecasting is indispensable for ocean engineering and marine science: it aids the conservation and protection of marine ecosystems, supports the planning of shipping routes and the estimation of voyage length and fuel consumption, and yields deeper insight into the distribution of heat flux within the ocean, which is vital for understanding climate change. Most existing studies focus on forecasting at a single location or grid cell; such methods are site-specific and neglect spatial-temporal fidelity. Furthermore, the Recurrent Neural Network-based methods employed previously converge too slowly for practical engineering purposes, and numerical weather models are time-consuming and computationally expensive. An improved Unet-based model using residual learning with an attention strategy is proposed for more efficient 2D sea surface current (SSC) velocity prediction. Several machine-learning methods were adopted for performance comparison. The proposed neural-learning method outperforms the other established approaches, with spatially resolved mean RMSE less than 0.009 m/s and 0.006 m/s. As a promising surrogate for SSC prediction, the proposed methodology has strong potential in operational marine monitoring and engineering construction.

Introduction

Coastal and nearshore sea surface variability forecasting plays a significant role in ocean state monitoring and offshore engineering, such as coastal storm surge and flooding forecasting (Röhrs et al. Citation2021), marine oil spill and pollution trajectories (Tamtare, Dumont, and Chavanne Citation2021), safety of sea-going navigation (Wen, Yang, and Wang Citation2021), and marine accident rescue activities (Shen et al. Citation2019). Accurate and efficient ocean surface current prediction can provide invaluable benefits for marine fishing, ocean shipping, the tourism industry, and coastal communities. In addition, the regions around coastal tidal energy-based power stations can vary from flood-dominant to ebb-dominant over small spatial scales; it is therefore very important to understand the variability of the tidal and sea surface current (SSC)-based green resources, especially from a complete spatial perspective, before choosing a safe and economic location for a potential tidal energy project (Khare and Bhuiyan Citation2022; Monahan, Tang, and Adcock Citation2023).

Yet sea surface current forecasting can be challenging: unlike tidal level variability, the ocean surface current exhibits stochastic and non-linear behavior coupled with turbulence, bathymetric interactions, waves, and meteorological forcing (Neill, Hashemi, and Lewis Citation2014), and in semi-enclosed regions with complicated topography the current can vary sharply over short distances (Sarkar, Osborne, and Adcock Citation2019). In addition, sea surface variability in semi-enclosed shallow-water regions can be significantly distorted by nonlinear oceanic and meteorological interactions that greatly intensify the dynamics (Röhrs et al. Citation2021), especially for complex tidal signal propagation. Signal processing approaches, such as the wavelet transform and the Fourier transform, are commonly used to analyze sequential time series, and harmonic analysis, one of the most classical methods, has been used in tidal analysis for decades (Monahan, Tang, and Adcock Citation2023). Yet these techniques for sea level and sea surface current analysis can only produce grid-point-based forecasts, and one has to resort to numerical simulation models to explore the variability from a complete spatial perspective. Such models, however, are time-consuming and computationally expensive (Kalinić et al. Citation2017).
Data assimilation techniques, including variational methods, Kalman filters, and particle filters, have been used to improve ocean environment estimation, for example by enhancing the accuracy of sea surface current velocity predictions through the best use of available observations. Yet preparing a data assimilation approach usually requires significant computational cost, for instance for the adjoint model in four-dimensional variational (4D-VAR) data assimilation and for the computation of model background errors in Kalman filters (Ren, Hu, and Hartnett Citation2018).

With the development of artificial neural network (ANN) and deep learning techniques, these methods have become popular for accurate and efficient prediction in ocean science (Ali et al. Citation2021, Citation2023; Ma and Chen Citation2023). Sarkar, Osborne, and Adcock (Citation2018) combined machine learning techniques within a Bayesian framework for tidal current prediction and demonstrated that the Bayesian machine learning approach can achieve better forecasts than traditional ones. Qian et al. (Citation2022) developed a hybrid machine learning methodology using long short-term memory (LSTM) and a hierarchical extreme learning machine (H-ELM) to predict tidal current at specific locations. Aly (Citation2020) established several hybrid models based on clustering approaches for predicting harmonic tidal current constituents; the experimental results showed that intelligent machine learning approaches can improve the performance of both the hybrid system and the current predictions. Immas, Do, and Alam (Citation2021) developed two deep learning predictive models, a Transformer and an LSTM Recurrent Neural Network, for real-time in-situ ocean current prediction at different locations. Xie et al. (Citation2023) established several attention-based deep learning methods for current prediction and demonstrated that the attention strategy can improve model performance.

However, although the aforementioned machine-learning-based models have achieved satisfactory performance, limitations still exist:

Most of the aforementioned implementations focus on forecasting the ocean current at a single location or grid cell, which ignores the spatial inter-correlations between adjacent grid cells and can lose spatial co-variability information. It is therefore difficult to investigate surface current dynamics from a fully spatial perspective using only one or a few grid points. Spatial-temporal 2D ocean current speed forecasting is more beneficial for the overall wind power regulation with dynamic planning and for resource management (Manucharyan, Siegelman, and Klein Citation2021; Sinha and Abernathey Citation2021).
Many of the aforesaid studies rely on RNN/LSTM modules. However, RNN/LSTM-based neural-learning models for sea surface current prediction, trained with backpropagation through time, are very inefficient and computationally costly (Bradbury et al. Citation2016; Yu, Gonzalez, and Li Citation2021), which is a prominent deficiency for operational forecasting. Moreover, although RNN/LSTM networks may excel at reproducing temporal dependency from sequential time series, they are not good at exploiting the inter-correlated high-dimensional features of adjacent grid cells in continuous, complete 2D sequential fields. By contrast, CNN-based neural-learning models, whose inherent 2D convolution kernels provide spatial feature mapping capability, show remarkable advantages in both spatially correlated feature extraction and model convergence efficiency (LeCun et al. Citation2015).
Previous forecasting methods involved relatively burdensome signal decomposition, which can introduce additional signal processing errors. Traditional machine learning-based approaches require redundant manual feature selection and pre-processing, and as the amount of data increases the workload grows and model performance can deteriorate noticeably (Liu and Wang Citation2021); these disadvantages hinder practical operational ocean environment prediction in the real world.

A residual-learning-based mechanism was developed for deep ocean current forecasting (Manucharyan, Siegelman, and Klein Citation2021). While ResNet can be trained very quickly, its deeper architecture with more neural layers also requires much more computational resources, and there is still a risk of overfitting, especially when the training sample is not representative of the true underlying data distribution. Sinha and Abernathey (Citation2021) trained machine-learning models on multiple ocean variables, including sea surface height, sea surface temperature, and wind stress, to predict the ocean current; yet involving more variables may increase the computational burden and the data preparation work. A hybrid model combining complete ensemble empirical mode decomposition (CEEMD) and empirical orthogonal function (EOF) analysis with an ANN was developed for sea surface multivariable prediction, including the ocean current (Shao et al. Citation2021); yet the signal decomposition strategy might introduce additional signal processing errors. A convolutional denoising autoencoder (CDAE) model (Gibbs, Bingham, and Paiement Citation2023) was established to estimate the relatively large-scale ocean surface current, but it did not introduce attention or residual learning blocks, which could better capture the local features of current fields. These afore-mentioned studies focused merely on tidal current forecasting at single grid points or locations, which could ignore the spatial interactions between adjacent grid points and lose co-variability information. Zhang, Stanev, and Grayek (Citation2020) proposed a CNN-based deep learning model to reconstruct the sea surface level variability in the basin-wide North Sea, which demonstrated that the machine learning method can accurately reproduce the spatial-temporal ocean surface dynamics and can greatly save computational resources.

To alleviate the aforementioned drawbacks, which concern both the methodologies and the complex nonlinearities of the ocean dynamics, we propose a purely CNN-based approach that combines residual learning with a multi-head attention strategy (MHA-ResUnet) for better 2D sea surface current forecasting. An hourly current field dataset for the semi-enclosed North Sea was obtained from a reanalysis dataset produced by an ocean numerical model with data assimilation and tides at 7 km horizontal resolution. This reanalysis dataset was then adopted as the model training and testing sample. The proposed purely 2D CNN model was built only from 2D CNN blocks, without physical equation constraints, as a direct end-to-end, user-friendly model: all up-sampling layers in the raw Unet were replaced by Conv2DTranspose layers, all MaxPooling2D layers in the encoder and all BatchNormalization layers were removed, and two multi-head attention blocks were incorporated into this forecasting model for the first time. RNN/LSTM-based methods have been shown to be computationally costly, whereas the purely 2D CNN framework can greatly improve model efficiency. The attention strategy can query and project more embedded features during refinement and can augment the original feature maps through skip-connection operations. The deeper feature augmentation and refinement implemented by the attention strategy can improve the final field forecast. In addition, with the residual-learning mechanism used for model training and validation, the internal residual modules integrate skip-feature connections to alleviate the vanishing gradient problem caused by increasing depth, which can effectively enhance the forecasting performance of deep networks.
The sequential sliding-window mechanism introduced in the data-preparation stage further helps the forecasting model select the highly correlated historical current-field time-lags, so that the temporal-dependent features are preserved inside the input tensors (Yin and Wang Citation2021).

Accurate ocean surface current predictions can help the shipping industry optimize navigational plans and routes, save fuel, and further ensure navigation safety by avoiding areas with complex current variability or severe marine weather. The proposed neural-learning models can use large amounts of historical data to estimate future surface current patterns in advance, which enables the marine shipping industry to schedule voyages more efficiently. Moreover, with regard to ocean renewable energy exploration (Nezhad et al. Citation2024), accurate current field forecasting can be used to deploy and regulate the operation of wave and tidal current-based energy-harvesting facilities (Neshat et al. Citation2022; Wu, Liang, and Gao Citation2023, Wu et al. Citation2023).

The main contributions of this study are as follows:

A purely CNN-based approach combining an attention strategy with residual learning is developed for spatial-temporal sea surface current field prediction in a semi-enclosed sea region.
Autocorrelation analysis was adopted to derive the inter-correlation and inter-connection of the historical ocean current field time series and thereby determine the optimal input time-lags.
Instead of burdensome signal-based decomposition, only historical current velocity characteristics were used to forecast the future evolution of the ocean current.
Experiments were performed for both the northward and eastward current velocity components with different lead-time predictions.
The proposed approach can improve surface current prediction accuracy remarkably by capturing and preserving spatial co-variabilities and temporal dependencies.
Multi-step-ahead field forecasting was further elaborated in detail to better validate the proposed approach and to better understand the spatial distributions and variabilities.
This study provides a promising alternative perspective for improving offshore ocean dynamic predictions in terms of both accuracy and efficiency by using purely CNN-based neural-learning methods.

The remainder of this manuscript is arranged as follows. Section 2 introduces the proposed machine learning approaches. Section 3 presents the study area, dataset, and details of the experimental results with discussions. Finally, the conclusions are presented in Section 4.

Methodology

The Unet Framework

In the proposed method, we aim to establish a practical end-to-end learning model for efficient and accurate 2D ocean current forecasting, in which both the input and output tensors are assembled directly as 2D matrices (an imagery-like format). The Unet is a CNN-based model that was originally developed for image processing tasks and has been widely employed in various imagery fields, including medical image analysis, satellite image processing, and video object tasks. In addition, the Unet backbone exhibits several advantages in the field of 2D fluid dynamics (Liu et al. Citation2023; Wu et al. Citation2019; Yu, Chen, and Wei Citation2023). The Unet is considered a classical encoder-decoder model with a relatively concise architecture, which makes it easier to train and usually less demanding in computational resources than other deep-learning approaches. Its relatively simple architecture can also be conveniently modified or extended for specific tasks, which makes it a good choice for current flow speed prediction, where the input and target output tensors may vary with the specific application. More importantly, the Unet structure is known for its capability to generalize well to new datasets, especially when trained on relatively small datasets (Ronneberger, Fischer, and Brox Citation2015). This is beneficial for ocean surface current prediction, where measured data can be scarce or expensive to obtain. Because of these advantages, the Unet backbone is employed as the foundation framework in this study. Yet there are still some drawbacks. First, the Unet cannot fully reproduce temporal information, since it is primarily designed for image segmentation tasks and does not inherently capture sequential temporal features.
In time-series prediction or video-series forecasting, an additional strategy is essential for preserving the inherent temporal dependencies. That is why we incorporated a sequential sliding-window strategy in the sea surface current field data preparation stage: it captures the optimal inter-correlation of the historical current time-lags so that the relatively short-term temporal dependencies are preserved inside the input tensor. Second, the Unet is constructed from several CNN blocks and is a deep neural network, which means that it can be computationally intensive to train. This can be a barrier to its application, especially when computational resources are limited. For this reason, in this study we removed all the BatchNormalization and MaxPooling2D layers and replaced all the UpSampling2D layers of the raw Unet with Conv2DTranspose layers using a small 3 × 3 convolutional kernel, to improve model efficiency.

The Multi-head Attention Module

The multi-head attention mechanism was developed on the basis of the single scaled dot-product attention module, which has been demonstrated to capture more of the underlying high-dimensional features of spatial-temporal dynamic fields (Niu, Zhong, and Yu Citation2021). The schematic diagram of the multi-head attention module is presented in Figure 1.

Figure 1. Schematic diagram of multi-head attention module.

Let C denote the number of features of the input temporal sequential tensor, which also represents the sequential window length carrying potential temporal dependencies, and let H and W denote the height and width of each input matrix. From the input sequential tensor $X = [x_{1}, \dots, x_{N}] \in \mathbb{R}^{H \times W \times C}$, three corresponding terms, the key matrix K, the query matrix Q, and the value matrix V, are derived by dot products (Vaswani et al. Citation2017) with three projection weight matrices $W_{k} \in \mathbb{R}^{D_{x} \times D_{k}}$, $W_{q} \in \mathbb{R}^{D_{x} \times D_{q}}$, and $W_{v} \in \mathbb{R}^{D_{x} \times D_{v}}$, respectively:

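$$
\begin{aligned}
K &= X W_{k} \in \mathbb{R}^{H \times W \times D_{k}}, \\
Q &= X W_{q} \in \mathbb{R}^{H \times W \times D_{k}}, \\
V &= X W_{v} \in \mathbb{R}^{H \times W \times D_{v}}
\end{aligned}
\tag{1}
$$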

Note that the key and query matrices must have identical dimensions. The similarity between the ith query feature $q_{i} \in \mathbb{R}^{D_{k}}$ and the jth key feature $k_{j} \in \mathbb{R}^{D_{k}}$ is evaluated by a normalization function $\xi(q_{i}^{T} k_{j}) \in \mathbb{R}^{1}$. The similarities $\xi(q_{i}^{T} k_{j})$ and $\xi(q_{j}^{T} k_{i})$ are calculated by different layers and are consequently not symmetric in general. The attention weights at position i are derived by the dot-product module, which aggregates the similarity features from all pairs of tensor positions using a weighted summation:

$$
D(K, Q, V) = \xi\left(\frac{Q K^{T}}{\sqrt{d_{k}}}\right) V \tag{2}
$$

where $d_{k}$ denotes the dimension of the query and key matrices. Higher-dimensional input sequential tensors lead to a correspondingly larger $d_{k}$ and hence to dot products of larger magnitude, which drive the softmax normalization function employed in this study into a region with extremely small gradients (Vaswani et al. Citation2017). The scaling term $\frac{1}{\sqrt{d_{k}}}$ in the weighted summation therefore alleviates the vanishing gradient issue.
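As an illustration of Eq. (2), the following is a minimal NumPy sketch of scaled dot-product attention with softmax as the normalization function. It assumes the H × W grid has been flattened into N positions; the sizes, the random weights, and the function names are hypothetical and are not taken from the study.

```python
import numpy as np

def softmax(scores, axis=-1):
    # Subtract the row maximum for numerical stability; the result is unchanged.
    shifted = scores - scores.max(axis=axis, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q, K: (N, d_k); V: (N, d_v). Eq. (2) with xi chosen as softmax.
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k), axis=-1)
    return weights @ V, weights

rng = np.random.default_rng(0)
N, d_x, d_k, d_v = 16, 12, 8, 8   # illustrative sizes only (assumption)
X = rng.standard_normal((N, d_x))
W_q, W_k, W_v = (rng.standard_normal((d_x, d)) for d in (d_k, d_k, d_v))

out, weights = scaled_dot_product_attention(X @ W_q, X @ W_k, X @ W_v)
print(out.shape, weights.shape)
```

Each row of `weights` sums to one, and because queries and keys come from different projections the weight matrix is generally not symmetric, in line with the remark above.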

Instead of performing a single attention operation with $d_{tensor}$-dimensional queries, keys, and values, the three attention matrices were linearly projected multiple times, here h = 3 times, on the sequential sea surface current tensors, and the attention outputs for all projected queries, keys, and values are computed in parallel. The resulting projected outputs are concatenated to yield the final refined features.
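The parallel projection and concatenation with h = 3 heads can be sketched in NumPy as follows. This is a hedged illustration rather than the authors' implementation: the per-head width `d_head`, the output projection `W_o` (taken from the standard formulation of Vaswani et al.), and all sizes are assumptions.

```python
import numpy as np

def softmax(scores):
    shifted = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=-1, keepdims=True)

def multi_head_attention(X, W_q, W_k, W_v, W_o):
    # X: (N, d_x); W_q, W_k, W_v: (h, d_x, d_head); W_o: (h * d_head, d_out).
    Q = np.einsum("nd,hde->hne", X, W_q)   # (h, N, d_head)
    K = np.einsum("nd,hde->hne", X, W_k)
    V = np.einsum("nd,hde->hne", X, W_v)
    d_head = Q.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)   # (h, N, N)
    heads = softmax(scores) @ V                           # (h, N, d_head)
    # Concatenate the h head outputs along the feature axis.
    concat = heads.transpose(1, 0, 2).reshape(X.shape[0], -1)
    return concat @ W_o

rng = np.random.default_rng(0)
h, N, d_x, d_head, d_out = 3, 16, 12, 4, 12   # h = 3 as in the text; the rest is illustrative
X = rng.standard_normal((N, d_x))
W_q, W_k, W_v = (rng.standard_normal((h, d_x, d_head)) for _ in range(3))
W_o = rng.standard_normal((h * d_head, d_out))

out = multi_head_attention(X, W_q, W_k, W_v, W_o)
print(out.shape)
```

All heads are evaluated in one batched matrix product, which mirrors the statement that the projections are computed in parallel.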

The ith row of weight matrix derived according to formula (2) can be expressed as:

$$
D(K, Q, V)_{i} = \frac{\sum_{j = 1}^{N} e^{q_{i}^{T} k_{j} / \sqrt{d_{k}}} \, v_{j}}{\sum_{j = 1}^{N} e^{q_{i}^{T} k_{j} / \sqrt{d_{k}}}} \tag{3}
$$

The above formula (3) can be rewritten referring to different normalization functions:

$$
D(K, Q, V)_{i} = \frac{\sum_{j = 1}^{N} nor(q_{i}, k_{j}) \, v_{j}}{\sum_{j = 1}^{N} nor(q_{i}, k_{j})} \tag{4}
$$

The correlated similarity between q_i and k_j is quantified by nor(q_i, k_j).

A specific constraint term must, in principle, be incorporated into the normalization function nor() to ensure that the selected attention function is non-negative. This constraint can be expressed as a kernel $k(x, y): \mathbb{R}^{2 \times F} \to \mathbb{R}_{+}$. Given such a kernel with a feature indication function $\delta(x)$, the above formula can be written as:

$$
D(K, Q, V)_{i} = \frac{\sum_{j = 1}^{N} \delta(q_{i})^{T} \delta(k_{j}) \, v_{j}}{\sum_{j = 1}^{N} \delta(q_{i})^{T} \delta(k_{j})} \tag{5}
$$

The above formula (5) can be simplified when the numerator is written in vector form:

$$
\left(\delta(Q) \, \delta(K)^{T}\right) V = \delta(Q) \left(\delta(K)^{T} V\right) \tag{6}
$$

The linearized attention operation based on formula (6) reduces the memory and computational requirements and can generate or project sequences with linear complexity, which makes it a valuable reference for sequential time series mapping problems (Katharopoulos et al. Citation2020).
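The reordering in formula (6) can be checked numerically. The sketch below compares the quadratic form, which builds the full N × N similarity matrix, with the linear form, which only stores small summaries. The feature map elu(x) + 1 is an assumption borrowed from Katharopoulos et al.; the text does not state which feature function was used, and the sizes are illustrative.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: strictly positive, so every similarity is non-negative.
    return np.where(x > 0, x + 1.0, np.exp(x))

def quadratic_attention(Q, K, V):
    # Left-hand ordering of Eq. (6): an explicit N x N similarity matrix.
    sim = feature_map(Q) @ feature_map(K).T
    return (sim @ V) / sim.sum(axis=1, keepdims=True)

def linear_attention(Q, K, V):
    # Right-hand ordering of Eq. (6): only (d_k, d_v) and (d_k,) summaries are kept.
    phi_q, phi_k = feature_map(Q), feature_map(K)
    kv = phi_k.T @ V          # (d_k, d_v)
    z = phi_k.sum(axis=0)     # (d_k,)
    return (phi_q @ kv) / (phi_q @ z)[:, None]

rng = np.random.default_rng(0)
N, d_k, d_v = 64, 8, 8        # illustrative sizes (assumption)
Q, K, V = (rng.standard_normal((N, d)) for d in (d_k, d_k, d_v))

same = np.allclose(quadratic_attention(Q, K, V), linear_attention(Q, K, V))
print(same)
```

Both functions return the same values; the linear form avoids the N × N intermediate, which is where the memory saving comes from.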

The Residual Learning Based Unet with Multi-head Attention Module

In the Unet-like CNN-based framework, high-dimensional feature maps of the sequential SSC tensors, composed of low-level nonlinear information, are extracted by the encoder, while high-level semantic information is derived in the decoder. Yet the ordinary skip-connection mechanism in the original Unet architecture may explore the underlying contextual and semantic information insufficiently (Wang et al. Citation2022; Yang et al. Citation2023), especially for temporal sequential pattern mapping. We therefore incorporated two multi-head attention blocks into the raw Unet structure and employed a residual learning mechanism (Manucharyan, Siegelman, and Klein Citation2021) for model training and forecasting; both the low- and high-level spatial-temporal sequential features are further fused and refined through the two attention blocks (as shown in the attention module subplot of Figure 2). One of the key advantages of ResNet is that it adopts residual blocks as its basic backbone, which allows the network to be deeper while mitigating the vanishing gradient issue that usually occurs in very deep neural networks. This increased depth helps the network learn more complicated features and further improves its performance. While ResNet can be trained very quickly, the deeper architecture with more neural layers also requires much more computational resources. There is also still a risk of overfitting, especially if the training dataset is not representative of the true underlying distribution.

Figure 2. The diagram of MHA-ResUnet structure.

The main difference from Manucharyan’s model is that our proposed model is built on the popular Unet-like framework, which has shown great potential for spatial-temporal feature mapping (Zhang, Stanev, and Grayek Citation2020). Moreover, we removed all BatchNormalization and Dropout layers between the CNN blocks, and in the decoder we replaced all UpSampling2D layers of the raw Unet with Conv2DTranspose layers using a small 3 × 3 convolutional kernel. More importantly, we combined multi-head attention blocks with residual-learning blocks. We also placed the residual-learning blocks only in the decoder, which could improve computational efficiency compared with Manucharyan’s model. Our forecasting method outputs the 2D current field directly, whereas Manucharyan employed a fully connected layer at the end of their model, which makes a further tensor transformation unavoidable. Furthermore, by introducing a sequential sliding window strategy with strict chronological order into the input tensor preparation stage, the proposed model can accurately map the temporal variability based on the continuous SSC evolution dependencies. Specifically, the spatial covariance of the sea surface current is extracted and learned from the sequential current tensors by the 2D spatial convolutional kernels, and the temporal dependencies of the current variability at each grid point are preserved and captured by the sequential sliding window mechanism. The structure of MHA-ResUnet is illustrated in Figure 2.

Within the established MHA-ResUnet structure, only two attention modules were incorporated into the original Unet architecture (as indicated in the ResUnet module panel of Figure 2): one module was inserted in the last CNN layer of the decoder, and the other was located just after the bottleneck layer in the decoder. Note that introducing more attention blocks into the forecasting model increased the computational cost markedly, while the prediction accuracy remained almost unchanged or even deteriorated, which might be caused by model overfitting. The bottleneck layer compresses the extracted high-dimensional features into a flattened-like vector, and the attention module added at the bottleneck layer can then query and project more embedded features with a refinement process. The second attention block is capable of augmenting the original input feature maps between each encoder and decoder layer through the skip-connection operation; the deeper augmentation and refinement of spatial features implemented by the attention mechanism will improve the final forecasting outputs (Vaswani et al. Citation2017).

Experiments and Discussions

The Semi-enclosed Sea

The North Sea, as part of the North Atlantic Ocean, has one of the most densely populated coastlines in the world; potential coastal flooding and storm surges caused by global climate change can trigger natural disasters in these low-lying coastal regions (Wahl et al. Citation2013). The complicated shallow water hydrodynamics and climatological interactions can also significantly affect marine engineering and offshore exploration in the North Sea. We therefore select this semi-enclosed North Sea region (shown within the red square in Figure 3) as a hot-spot case study to explore and predict its surface current variability. The bathymetric topography of the North Sea is illustrated in Figure 3. Here, we implement SSC field prediction with deep learning using a reanalysis dataset. The hourly SSC reanalysis dataset was validated against radar-based estimates, and the validation experiments demonstrated good agreement with the radar observations (O’Dea et al. Citation2012). The reanalysis dataset is publicly available at https://data.marine.copernicus.eu/products. One year of hourly SSC data from 2017 was selected for our forecasting experiments; each SSC map is a 128 × 128 2D matrix. The hourly data from the first half-year were used as the training dataset, and the independent testing/forecasting dataset covers the following 3 months (about 1800 hourly SSC snapshots).

Figure 3. Topography of the North Sea.

Sea Surface Current Velocity Field Prediction

Temporal Sequential Sliding Window

The northward and eastward SSC signal propagations are illustrated in Figure 4. Both the northward and eastward current velocities exhibit quasi-periodic propagation patterns at each grid point within the 48 hours shown, with about 12 hours per quasi-cycle, which helps us determine a suitable sequential sliding-window length. For instance, one quasi-cycle of current propagation, as indicated in Figure 4, consists of about 12 hourly grid-point-based current time-series values or field-based snapshots, and potential temporal interdependencies and interconnections could be coupled and entangled within one quasi-period.

Figure 4. Time versus grid-point-based surface current velocity from the reanalysis dataset. Panels (a) and (c) show the northward and eastward current propagation, respectively; land grid-points with missing values in the selected area are replaced by 0 in both subplots so that the current signal propagation appears smooth. The right-hand panels (b) and (d) are the same as (a) and (c), except that white blanks indicate land grid-points without sea surface current propagation.

To fully explore the appropriate input time-lags for the forecasting models, autocorrelation analysis was further carried out for both the northward and eastward current velocities. Figure 5 shows the autocorrelation analysis used to determine the historical input time-lags. The shaded blue band in each subplot indicates the upper and lower bounds of the 95% confidence interval. Correlations that fall within the confidence interval indicate that the corresponding time lags are not significant. Panel (a) in Figure 5 shows the ACF and PACF of the spatially averaged northward SSC time series, and panel (b) shows the same for the spatially averaged eastward SSC time series. The autocorrelation analysis of the surface current time series shows that the PACF falls within the confidence interval at time lag 12 and lag 11, respectively, as indicated in Figure 5. Note that visualizing the autocorrelation analysis for both the northward and eastward velocities provides a comprehensive perspective on the overall sequential lag dependencies of the SSC field. Thus, combining the current propagation analysis in Figure 4 with the autocorrelation results in Figure 5, the sequential SSC series from t‒1 to t‒12 are finally determined as the main auto-correlated historical time-lags, which cover one propagation-based quasi-period. The sequential sliding window (SSW) with fixed length 12 for preparing the input SSC tensors is defined as SSW_t = (SSC_t‒12, … , SSC_t‒2, SSC_t‒1). Each training input consists of 12 SSC snapshots in strict sequential order.
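The sliding-window construction SSW_t can be sketched as follows. This is an illustrative NumPy version under stated assumptions: the helper name `make_windows`, the channels-last layout, and the `lead` argument (number of steps ahead of the last input snapshot) are hypothetical choices, not details given in the text.

```python
import numpy as np

def make_windows(fields, window=12, lead=1):
    # fields: (T, H, W) snapshots in strict chronological order.
    # Returns inputs (n, H, W, window) and targets (n, H, W, 1).
    T = fields.shape[0]
    n = T - window - lead + 1
    if n <= 0:
        raise ValueError("time series too short for this window and lead")
    # Sample i holds snapshots i .. i+window-1, i.e. SSC_{t-12} .. SSC_{t-1}.
    inputs = np.stack([fields[i:i + window] for i in range(n)])
    # The target is the snapshot `lead` steps after the last input snapshot.
    targets = np.stack([fields[i + window + lead - 1] for i in range(n)])
    return np.moveaxis(inputs, 1, -1), targets[..., None]

# Toy field whose value equals the time index, so the ordering is easy to inspect.
fields = np.arange(20, dtype=float)[:, None, None] * np.ones((1, 4, 4))
x1, y1 = make_windows(fields, window=12, lead=1)
x3, y3 = make_windows(fields, window=12, lead=3)
print(x1.shape, y1.shape, x3.shape)
```

With the toy data, the first input window contains the time indices 0 to 11 and its one-step-ahead target is index 12, which makes the strict chronological order explicit.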

Figure 5. Autocorrelation analysis of SSC variability. Panel (a) denotes the autocorrelation function (ACF) and partial autocorrelation function (PACF) of the spatial-averaged northward SSC time series, while panel (b) indicates the same as (a), but for eastward current.
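For readers who wish to reproduce this kind of lag analysis, a minimal sketch of the sample ACF with an approximate 95% band is given below. The synthetic series with a 12-hour cycle is a stand-in for the spatially averaged SSC time series and is purely an assumption; the PACF is not computed here.

```python
import numpy as np

def autocorrelation(series, max_lag):
    # Sample ACF: lag-k autocovariance divided by the lag-0 autocovariance.
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

# Synthetic hourly series with a 12-hour quasi-cycle plus noise (assumption).
rng = np.random.default_rng(0)
hours = np.arange(24 * 30)
series = np.sin(2 * np.pi * hours / 12) + 0.1 * rng.standard_normal(hours.size)

acf = autocorrelation(series, max_lag=24)
band = 1.96 / np.sqrt(series.size)   # approximate 95% band under a white-noise null
significant_lags = np.flatnonzero(np.abs(acf[1:]) > band) + 1
print(significant_lags)
```

Lags whose ACF lies outside the band are treated as significant; for this synthetic series the correlation peaks again at lag 12, one full cycle.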

In the context of fluid field dynamics (Bhatnagar et al. 2019), flow field snapshots are a series of 2D matrices or images that represent the velocity field of a specific fluid, for instance the current field, at different time steps. These aggregated snapshots serve as training samples from which a 2D CNN captures the spatial interdependencies in the temporal evolution of the flow field. A 2D CNN can learn and reproduce the spatial hierarchies of field features from the input samples, which helps in capturing spatial interdependencies (Bhatnagar et al. 2019). As illustrated in Figure 6, 12 SSC snapshots in strict sequential order are aggregated into the model input tensor, so that relatively short-term spatial SSC patterns can be captured by the 2D kernel-based filters. A 2D kernel is a small matrix that learns to detect specific regional current patterns in the input tensor. For example, a filter could learn to detect edges or sharp changes in surface current direction, and can therefore reproduce the correlations among local regions. Specifically, the feature maps produced by the 2D kernels capture the spatial interdependencies between adjacent grid cells by representing local current patterns and relationships (Dai et al. 2023).
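
To make the kernel operation concrete, the following sketch applies one filter to a stack of 12 toy snapshots. It is a plain-Python illustration, not the study's implementation: the operation is the cross-correlation used by deep learning frameworks, without padding, stride, bias, or activation, and the function name, grid size, and averaging kernel are all assumptions chosen for readability.

```python
def conv2d_multichannel(x, kernel):
    """Valid 2D cross-correlation summed over channels.

    x      : list of C channels, each an H x W list of lists.
    kernel : list of C kernels, each a k x k list of lists.
    Returns an (H-k+1) x (W-k+1) feature map.
    """
    channels = len(x)
    height, width = len(x[0]), len(x[0][0])
    k = len(kernel[0])
    out = []
    for i in range(height - k + 1):
        row = []
        for j in range(width - k + 1):
            acc = 0.0
            # Each output cell mixes a k x k neighbourhood of all channels.
            for c in range(channels):
                for di in range(k):
                    for dj in range(k):
                        acc += x[c][i + di][j + dj] * kernel[c][di][dj]
            row.append(acc)
        out.append(row)
    return out


# 12 toy snapshots on a 5 x 5 grid; snapshot c holds the constant value c.
snapshots = [[[float(c)] * 5 for _ in range(5)] for c in range(12)]
# One 3 x 3 averaging kernel per channel.
kernels = [[[1.0 / 9.0] * 3 for _ in range(3)] for _ in range(12)]
feature_map = conv2d_multichannel(snapshots, kernels)
```

Each output value depends on a 3 × 3 neighbourhood in all 12 snapshots at once, which is how a single filter can respond to local spatial structure and its short-term evolution together.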

Figure 6. The convolutional operation in the 2D sea surface current input tensor.

Each SSC snapshot is composed of Height × Width grid points (128 × 128). With the fixed sliding-window length of 12 given above, each training or testing sample consists of a 12 × Height × Width input tensor of sequential SSC fields in strict chronological order, paired with one output tensor at the given lead time. To speed up model training and convergence, the prepared dataset is normalized to the range [−1, 1] before the experiments. The learning rate of the Adam optimizer was set to 1e-4, the batch size was set to 120, and training was terminated early once the validation loss had stopped improving for seven consecutive steps. The loss function employed in the proposed model is the Huber loss, which measures the mismatch between the predicted SSC field and the reanalysis-based SSC field and was minimized with a gradient descent algorithm.
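
The normalization and early-termination settings above can be sketched as follows. The text does not state which scaling formula was used, so min-max scaling is an assumption here, as are the names `scale_to_unit_range` and `EarlyStopping`; in practice the minimum and maximum would typically be taken from the training data only.

```python
def scale_to_unit_range(values):
    """Linearly map values to [-1, 1] using their min and max."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        raise ValueError("cannot scale a constant series")
    return [2.0 * (v - lo) / span - 1.0 for v in values]


class EarlyStopping:
    """Signal a stop after `patience` checks without a lower validation loss."""

    def __init__(self, patience=7):
        self.patience = patience
        self.best = float("inf")
        self.stale = 0

    def should_stop(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience


scaled = scale_to_unit_range([-0.4, 0.0, 0.1, 0.6])

# Validation loss improves three times, then stalls.
stopper = EarlyStopping(patience=7)
losses = [0.5, 0.4, 0.3] + [0.35] * 10
stopped_at = None
for step, loss in enumerate(losses):
    if stopper.should_stop(loss):
        stopped_at = step
        break
```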

L_{ς}(O, Ψ(X)) = \begin{cases} \frac{1}{2} (O - Ψ(X))^{2}, & |O - Ψ(X)| \le ς \\ ς |O - Ψ(X)| - \frac{1}{2} ς^{2}, & \text{otherwise} \end{cases}

(7)

where Ψ denotes the established prediction model, X its input, and O the reanalysis targets. The Huber loss is quadratic for residuals no larger than ς in magnitude and linear for larger residuals (Meyer 2021), which makes it less sensitive to outliers than the L2 loss. It behaves like a scaled L1 loss as ς tends to 0 and approaches the L2 loss as ς tends to positive infinity; after testing, ς was set to 1.5 in this study.
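
The piecewise form of the Huber loss in equation (7) can be written out directly. The sketch below uses the standard definition, quadratic for residuals up to the threshold and linear beyond it, with the threshold of 1.5 reported in the text; the function names and the sample residuals are illustrative assumptions.

```python
def huber(residual, delta=1.5):
    """Huber loss for one residual r = O - Psi(X), with threshold delta."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual * residual
    return delta * a - 0.5 * delta * delta


def mean_huber(targets, predictions, delta=1.5):
    """Average Huber loss over paired targets and predictions."""
    total = sum(huber(o - p, delta) for o, p in zip(targets, predictions))
    return total / len(targets)


small = huber(1.0)         # quadratic branch: 0.5 * 1.0**2
large = huber(3.0)         # linear branch: 1.5 * 3.0 - 0.5 * 1.5**2
at_threshold = huber(1.5)  # both branches agree at the threshold
```

A residual of 3.0 costs 3.375 here rather than the 4.5 it would cost under the pure quadratic term, which is the sense in which large errors are penalized less severely.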

The research workflow of the established neural-learning method, consisting of data pre-processing, model training, the optimal-model saving strategy, and the final forecasting procedure, is illustrated in Figure 7.

Figure 7. The research flowchart of the developed method.

We implemented the spatial-temporal SSC velocity forecasting experiments in the TensorFlow deep learning framework on a single NVIDIA A100 GPU. One complete training session took less than 3 minutes, which saves computational resources and improves forecasting efficiency significantly. The work was carried out at the National Supercomputer Center in Tianjin, and the calculations were performed on TianHe-HPC.

Prediction Experiments

The Spatial-resolved Pearson Correlation Coefficient (PCC), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE) were employed as assessment metrics to evaluate the prediction performance of the established forecasting models. Several machine learning methodologies, including the Encoder-Decoder method, the purely Unet-based deep learning approach, CNN-LSTM, RNN and LSTM, were established for a comprehensive methodological comparison.

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} (X_{prediction, i} - Y_{observation, i})^{2}}

(8)

PCC = \frac{\sum_{i = 1}^{n} (X_{prediction, i} - \bar{X}) (Y_{observation, i} - \bar{Y})}{\sqrt{\sum_{i = 1}^{n} (X_{prediction, i} - \bar{X})^{2}} \sqrt{\sum_{i = 1}^{n} (Y_{observation, i} - \bar{Y})^{2}}}

(9)

MAE = \frac{1}{n} \sum_{i = 1}^{n} |X_{prediction, i} - Y_{observation, i}|

(10)

where X_{prediction,i} represents the predicted SSC snapshots, Y_{observation,i} indicates the reanalysis SSC data, and the overbars denote their respective means. Note that higher values of the spatial-resolved PCC represent better temporal covariance between the forecasts and the reanalysis target.
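
Equations (8) to (10) translate directly into code. The following plain-Python sketch evaluates them for one pair of series; the sample values are made up for illustration, and in the spatial-resolved setting the same functions would be applied to the time series at each grid point.

```python
import math


def rmse(pred, obs):
    """Root mean square error, equation (8)."""
    n = len(pred)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / n)


def pcc(pred, obs):
    """Pearson correlation coefficient, equation (9)."""
    n = len(pred)
    mp, mo = sum(pred) / n, sum(obs) / n
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    return cov / (sp * so)


def mae(pred, obs):
    """Mean absolute error, equation (10)."""
    n = len(pred)
    return sum(abs(p - o) for p, o in zip(pred, obs)) / n


# Made-up velocities in m/s, for illustration only.
prediction = [0.10, 0.20, 0.30, 0.40]
reanalysis = [0.12, 0.18, 0.33, 0.37]
scores = (rmse(prediction, reanalysis),
          pcc(prediction, reanalysis),
          mae(prediction, reanalysis))
```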

We applied each of the established methodologies to six individual SSC prediction experiments, at 1-hour, 3-hour, 6-hour, 9-hour, 12-hour, and 24-hour lead times, to fully assess the prediction performance of the proposed methodology. The derived assessment metrics are displayed in Figures 8–10. All SSC quantities in this study, including error metrics, reanalysis, and model predictions, are in meters per second (m/s). The experimental results and figures below are all based on the testing dataset of 1800 hourly SSC fields.

Figure 8. Spatial-averaged PCC derived based on different methods with different prediction leading steps.

Figure 9. Spatial-averaged MAE derived based on different methods with various prediction leading steps.

Figure 10. Spatial-averaged RMSE derived based on different methods with different prediction leading steps.

As illustrated by the three evaluation metrics (MAE, RMSE, and PCC) in Figures 8–10, the established multi-head attention Unet approach delivers the best forecasting performance among all the deep-learning methods employed. The proposed model obtained the lowest MAE and RMSE at all lead times. The highest spatial-resolved area-mean PCC across the lead times indicates that the proposed MHA-ResUnet achieves the best temporal covariance mapping between predictions and reanalysis. This performance also shows that the sequential sliding-window strategy used to prepare the training dataset preserves the temporal SSC interdependency within one quasi-period.

The spatial-averaged PCC derived using the different machine learning methods at different prediction lead times is further shown as a table chart in Figure 8. As illustrated in Figure 8, the proposed MHA-ResUnet method achieved a better PCC than the other approaches at every lead time.

The visual comparison of the spatial-averaged MAE among the established machine learning methods for SSC forecasting at different lead times is displayed in Figure 9. As can be seen in Figure 9, the proposed MHA-ResUnet method yields the lowest MAE across all lead times.

The visual comparison of the spatial-averaged RMSE is shown in Figure 10. As with the MAE comparison, the proposed Unet-based methodology achieved the smallest RMSE, which again demonstrates the superiority of the MHA-ResUnet. It can thus be concluded that the proposed deep learning methodology outperforms the other selected models in mapping basin-wide SSC variability patterns in the semi-enclosed sea. The spatial-averaged RMSE for 1-hour and 24-hour ahead forecasting is 0.0099 m/s and 0.0385 m/s, respectively, an error range acceptable for tracking and forecasting oil spills and ocean garbage trajectories and for sea surface rescue operations in marine accidents.

Furthermore, to check the similarity of the forecast distribution patterns in terms of the spatial-averaged indices, the distribution similarity between the predictions and the reanalysis target was quantified with the two-sample Kolmogorov-Smirnov test (Hodges 1958), as indicated in Table 1. The smaller the KS statistic, the more likely it is that the two distributions are similar. Table 1 shows that the smallest KS statistic was achieved by the established MHA-ResUnet, which confirms that this new neural-learning model outperforms the other techniques in SSC speed field trend forecasting.
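
For readers unfamiliar with the test, the two-sample KS statistic is the largest vertical gap between the empirical cumulative distribution functions of the two samples. The sketch below computes only that statistic in plain Python; it does not compute the p-value, for which a library routine such as SciPy's `scipy.stats.ks_2samp` would normally be used. The function name and the sample values are illustrative assumptions.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between the ECDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    na, nb = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < na and j < nb:
        x = min(a[i], b[j])
        # Advance past every value equal to x in both samples (handles ties).
        while i < na and a[i] == x:
            i += 1
        while j < nb and b[j] == x:
            j += 1
        # i / na and j / nb are the two ECDF values just after x.
        d = max(d, abs(i / na - j / nb))
    return d


identical = ks_statistic([0.1, 0.2, 0.3], [0.1, 0.2, 0.3])
shifted = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
disjoint = ks_statistic([1, 2, 3], [4, 5, 6])
```

A statistic near 0 means the two empirical distributions almost coincide, while a value of 1 means the samples do not overlap at all.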

Table 1. Kolmogorov-Smirnov test statistic and p-value quantifying the similarity between the distributions of the SSC observation and forecast means (under the significance test, a computed p-value < 0.01 indicates that the two data distributions are very likely different).

Spatial-resolved Predictions Using MHA-ResUnet

To further explore the prediction performance of the developed MHA-ResUnet approach from a spatial perspective, snapshots of the spatial-resolved SSC velocity field predictions at different lead times, together with the corresponding deviation maps, are illustrated in Figure 11. Comparing the reanalysis in panel (b) with the forecasts in panel (c) of Figure 11 shows that the proposed method predicts the sequential SSC variability patterns well, although discrepancies gradually appear at longer lead times. The small deviations in panel (a) across the lead times indicate that the general spatial patterns were well captured and reproduced in the forecasting experiments.

Figure 11. Snapshots of spatial-resolved northward SSC velocity field predictions with different leading steps. Panel (a) indicates the deviation between reanalysis (b) and predictions (c).

The spatial-resolved error metrics between the reanalysis and the MHA-ResUnet predictions are illustrated in Figure 12. Figure 12 shows that the proposed approach achieves favorable forecasting performance up to 24 hours ahead; the forecasting error grows gradually with longer lead times, yet the spatial area-averaged RMSE remains less than 0.04 m/s for 24-hour ahead predictions.
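
A spatial-resolved error map of this kind evaluates the metric along the time axis at every sea grid point, and the area-averaged value is the mean of that map over sea cells. The sketch below shows this for RMSE in plain Python; the land-mask convention (True on land, None in the output) and the function names are assumptions made for illustration, not the study's code.

```python
import math


def rmse_map(pred, obs, land_mask):
    """Per-grid-point RMSE over time.

    pred, obs : lists of T fields, each an H x W list of lists.
    land_mask : H x W list of lists, True where the cell is land.
    Returns an H x W map with None on land cells.
    """
    steps = len(pred)
    height, width = len(land_mask), len(land_mask[0])
    out = [[None] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            if land_mask[i][j]:
                continue  # no current on land, so no error is defined
            sq = sum((pred[t][i][j] - obs[t][i][j]) ** 2 for t in range(steps))
            out[i][j] = math.sqrt(sq / steps)
    return out


def area_mean(field_map):
    """Mean of a map over sea cells only (None entries are skipped)."""
    values = [v for row in field_map for v in row if v is not None]
    return sum(values) / len(values)


# Toy 2 x 2 grid with one land cell and two time steps.
mask = [[False, False], [True, False]]
truth = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
forecast = [[[0.01, 0.02], [0.0, 0.03]] for _ in range(2)]
error_map = rmse_map(forecast, truth, mask)
mean_error = area_mean(error_map)
```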

Figure 12. Spatial-resolved Pearson correlation coefficient distribution maps (a), mean absolute error maps (b) and Root mean square error distribution maps (c) between northward SSC velocity field forecasting and reanalysis, derived with different leading steps using the MHA-ResUnet method.

As indicated in Figure 12, the more leading steps are set, the lower the prediction accuracy becomes; nevertheless, the general SSC spatial patterns are still captured and preserved by the proposed method, as illustrated in Figure 11. For the area-mean SSC time series calculated from the field predictions, PCCs of 0.99 and 0.90 were obtained for 1-hour and 24-hour ahead predictions, respectively, with corresponding RMSEs of 0.00190 and 0.03423 m/s. In addition, the area-mean-based quantitative forecasting assessment of the northward current is shown in Figure 13 and Table 2.

Figure 13. Scatter plots for the area-mean northward current velocity between reanalysis (indicated as observation on the y-axis of each subplots) and machine learning model predictions. The histogram distributions for reanalysis and forecasting results are introduced as additional plots on the y-axis and x-axis of each subplot, respectively.

Table 2. Area-averaged PCC of ocean current field prediction with different prediction leading steps.

The eastward SSC velocity field forecasting experiments were further carried out with the proposed Unet-based machine learning method, which also offers a way to validate the generalization capability of the proposed method in spatial-temporal SSC prediction. The experimental results are presented as follows.

Figure 14 shows that the developed MHA-ResUnet approach achieved accurate predictions: the spatial patterns of the eastward current velocity field in panel (c) of Figure 14 reproduce well the reanalysis target in panel (b).

Figure 14. The same as Figure 11, but for the eastward SSC velocity predictions.

The spatial-resolved forecasting metrics for the eastward current velocity field are displayed in Figure 15. The developed deep learning methodology also predicts the eastward current velocity field well, with spatial-averaged PCCs of 0.99 and 0.83 for 1-hour and 24-hour ahead predictions and spatial-averaged RMSEs below 0.006 m/s and 0.02 m/s, respectively. Such accurate current velocity predictions can provide valuable guidance for fishing, the shipping industry, the tourism industry, and ocean-research-based activities.

Figure 15. The same as Figure 12, but for the eastward SSC velocity predictions.

As in the northward current field forecasting experiments, for the eastward current predictions the more leading steps are set, the lower the prediction accuracy becomes, as displayed in Figure 15; still, the general SSC spatial patterns are preserved by the proposed method, as illustrated in Figure 14. The area-mean-based quantitative forecasting assessment of the eastward current is shown in Figure 16 and Table 3.

Figure 16. The same as Figure 13, but for the eastward SSC velocity predictions.

Table 3. The same as Table 2, but for eastward SSC.

Furthermore, for the area-mean time series derived from the field predictions, PCCs of 0.999 and 0.947 were achieved for 1-hour and 24-hour ahead predictions, respectively, with corresponding RMSEs of 0.00319 and 0.02949 m/s.

Based on the northward and eastward current velocity forecasting experiments, the proposed Unet-based deep learning model achieves accurate spatial-temporal current field predictions, which demonstrates the superiority of the purely CNN-based technique with the introduced attention mechanism.

Spatial-temporal 2D sea surface current field variability is of great interest and importance for marine shipping, cleaner ocean energy, and offshore engineering construction, and is especially important for urgent management in extreme weather scenarios. This study established an efficient 2D ocean current pattern estimation model to bridge the aforementioned research gaps in 2D ocean current field prediction: the use of burdensome signal decomposition approaches, the focus on current velocity time-series predictions at single grid points only, and the reliance on relatively time-consuming RNN/LSTM-based structures. The proposed neural-learning approach realized accurate and efficient ocean current field forecasting and can provide valuable guidance for nearshore and offshore marine engineering. However, this new method was implemented within a semi-enclosed sea region; it would be worthwhile to further test its practicality in a larger open-sea region.

Conclusions

Spatial-temporal ocean surface current predictions can provide invaluable benefits for offshore engineering and marine research activities, for instance oil spill trajectory tracking, marine accident rescue operations, tidal current-based green energy exploration, and navigation planning of autonomous underwater vehicles. Yet most present studies focus only on single grid-cell and location-based current forecasting without a full spatial perspective, and regional numerical ocean models are time-consuming and computationally expensive. Furthermore, accurate prediction of surface current variability in semi-enclosed offshore seas is particularly important but has so far received less attention. In this study, a Unet-based deep learning methodology with two multi-head attention blocks and residual learning was proposed for spatial-temporal sea surface current velocity forecasting in a semi-enclosed sea region, the basin-wide North Sea. Several additional machine learning methods, consisting of an encoder-decoder model, a purely Unet-like method, CNN-LSTM, RNN, and LSTM, were employed for performance comparison with the proposed approach. The forecasting was implemented as individual 1-hour, 3-hour, 6-hour, 9-hour, 12-hour, and 24-hour ahead experiments, using a reanalysis dataset extracted from the AMM7 model. Both northward and eastward current velocities were employed in the forecasting experiments, and the final experimental results demonstrated that the proposed multi-head attention ResUnet approach outperforms the other machine learning methodologies. Highly accurate results were achieved, with RMSE less than 0.009 m/s and 0.006 m/s for northward and eastward velocity predictions, respectively.

As a promising surrogate model for SSC forecasting, the purely CNN-based model established in this study can quickly and accurately predict offshore SSC variability. Furthermore, it would be valuable to predict other sea surface parameters, such as the surface wave field, directly from remote sensing records. In addition, a fully spatial surface current prediction provides a complete spatial perspective for better tidal energy planning. The proposed method therefore has strong potential in marine engineering applications. However, potential limitations may arise when employing the proposed method; for instance, attention mechanisms can be sensitive to noise in the input data, which would lead to incorrect attention weights and degraded performance. Thus, exploring and testing a more general ocean current field forecasting method is an essential next step.

The experiments in this study showed that a purely CNN-based deep learning method can achieve better performance in mapping complex spatial-temporal SSC variability, with very high computational efficiency and forecasting accuracy, which can help to solve real-world problems such as operational marine monitoring and ocean environmental forecasting. CNN-based models already exhibit prominent advantages in computer vision tasks, with very high computational efficiency, and it would be promising to explore their applicability to a wider range of sequential ocean and meteorological field data mining problems. A future prospect is to employ observed surface current variability to predict subsurface current dynamics, which would be extremely helpful for the safe navigation of submarine vehicles and for submarine engineering exploration.

Disclosure Statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Funding

This work was supported by the National Natural Science Foundation of China under Grants 52271361 and 52231014, the Special Projects of Key Areas for Colleges and Universities in Guangdong Province under Grant 2021ZDZX1008, and the Natural Science Foundation of Guangdong Province of China under Grant 2023A1515010684.

References

Ali, M., R. Prasad, Y. Xiang, M. Jamei, and Z. M. Yaseen. 2023. Ensemble robust local mean decomposition integrated with random forest for short-term significant wave height forecasting. Renewable Energy 205:731–28. doi:10.1016/j.renene.2023.01.108.
Ali, M., R. Prasad, Y. Xiang, A. Sankaran, R. C. Deo, F. Xiao, and S. Zhu. 2021. Advanced extreme learning machines vs. deep learning models for peak wave energy period forecasting: A case study in Queensland Australia. Renewable Energy 177:1031–44. doi:10.1016/j.renene.2021.06.052.
Aly, H. H. H. 2020. A novel approach for harmonic tidal currents constitutions forecasting using hybrid intelligent models based on clustering methodologies. Renewable Energy 147:1554–64. doi:10.1016/j.renene.2019.09.107.
Bhatnagar, S., Y. Afshar, S. Pan, K. Duraisamy, and S. Kaushik. 2019. Prediction of aerodynamic flow fields using convolutional neural networks. Computational Mechanics 64 (2):525–45. doi:10.1007/s00466-019-01740-0.
Bradbury, J., S. Merity, C. Xiong, and R. Socher. 2016. Quasi-recurrent neural networks. arXiv preprint arXiv:1611.01576.
Dai, G., W. Kong, Y. Liu, Y. Ge, and S. Zhang. 2023. Multi-perspective convolutional neural networks for citywide crowd flow prediction. Applied Intelligence 53 (8):8994–9008. doi:10.1007/s10489-022-03980-9.
Gibbs, L., R. J. Bingham, and A. Paiement. 2023. A novel filtering method for geodetically determined ocean surface currents using deep learning. Environmental Data Science 2:e44. doi:10.1017/eds.2023.41.
Hodges Jr., J. L., 1958. The significance probability of the Smirnov two-sample test. Arkiv för matematik 3 (5):469–86. doi:10.1007/BF02589501.
Immas, A., N. Do, and M. R. Alam. 2021. Real-time in situ prediction of ocean currents. Ocean Engineering 228:108922. doi:10.1016/j.oceaneng.2021.108922.
Kalinić, H., H. Mihanović, S. Cosoli, M. Tudor, and I. Vilibić. 2017. Predicting ocean surface currents using numerical weather prediction model and Kohonen neural network: A northern Adriatic study. Neural Computing and Applications 28 (S1):611–20. doi:10.1007/s00521-016-2395.
Katharopoulos, A., A. Vyas, N. Pappas, and F. Fleuret. 2020. Transformers are RNNs: Fast autoregressive transformers with linear attention. In International conference on machine learning, pp. 5156–5165. PMLR.
Khare, V., and M. A. Bhuiyan. 2022. Tidal energy-path towards sustainable energy: A technical review. Cleaner Energy Systems 3:100041. doi:10.1016/j.cles.2022.100041.
LeCun, Y., Y. Bengio, and G. Hinton. 2015. Deep learning. Nature 521 (7553):436–44. doi:10.1038/nature14539.
Liu, L., and J. Wang. 2021. Super multi-step wind speed forecasting system with training set extension and horizontal-vertical integration neural network. Applied Energy 292:1–13. doi:10.1016/j.apenergy.2021.116908.
Liu, G., Q. Zhou, X. Xie, and Q. Yu. 2023. Dual conditional GAN based on external attention for semantic image synthesis. Connection Science 35 (1):2259120. doi:10.1080/09540091.2023.2259120.
Ma, J., and J. Chen. 2023. Reconstructing higher-resolution four-dimensional time-varying volumetric data. Connection Science 35 (1):2289837. doi:10.1080/09540091.2023.2289837.
Manucharyan, G. E., L. Siegelman, and P. Klein. 2021. A deep learning approach to spatiotemporal sea surface height interpolation and estimation of deep currents in geostrophic ocean turbulence. Journal of Advances in Modelling Earth Systems 13 (1):e2019MS001965. doi:10.1029/2019MS001965.
Meyer, G. P. 2021. An alternative probabilistic interpretation of the Huber loss. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 5261–9.
Monahan, T., T. Tang, and T. A. A. Adcock. 2023. A hybrid model for online short-term tidal energy forecasting. Applied Ocean Research 137:103596. doi:10.1016/j.apor.2023.103596.
Neill, S. P., M. R. Hashemi, and M. J. Lewis. 2014. The role of tidal asymmetry in characterizing the tidal energy resource of Orkney. Renewable Energy 68:337–50. doi:10.1016/j.renene.2014.01.052.
Neshat, M., M. M. Nezhad, N. Y. Sergiienko, S. Mirjalili, G. Piras, and D. A. Garcia. 2022. Wave power forecasting using an effective decomposition-based convolutional Bi-directional model with equilibrium nelder-mead optimizer. Energy 256:124623. doi:10.1016/j.energy.2022.124623.
Nezhad, M. M., M. Neshat, G. Sylaios, and D. A. Garcia. 2024. Marine energy digitalization digital twin’s approaches. Renewable and Sustainable Energy Reviews 191:114065. doi:10.1016/j.rser.2023.114065.
Niu, Z., G. Zhong, and H. Yu. 2021. A review on the attention mechanism of deep learning. Neurocomputing 452:48–62. doi:10.1016/j.neucom.2021.03.091.
O’Dea, E. J., A. K. Arnold, K. P. Edwards, R. Furner, P. Hyder, M. J. Martin, J. R. Siddorn, D. Storkey, J. While, J. T. Holt, et al. 2012. An operational ocean forecast system incorporating NEMO and SST data assimilation for the tidally driven European North-West shelf. Journal of Operational Oceanography 5 (1):3–17. doi:10.1080/1755876X.2012.11020128.
Qian, P., B. Feng, X. Liu, D. Zhang, J. Yang, Y. Ying, C. Liu, and Y. Si. 2022. Tidal current prediction based on a hybrid machine learning method. Ocean Engineering 260:111985. doi:10.1016/j.oceaneng.2022.111985.
Ren, L., Z. Hu, and M. Hartnett. 2018. Short-term forecasting of coastal surface currents using high frequency radar data and artificial neural networks. Remote Sensing 10 (6):850. doi:10.3390/rs10060850.
Röhrs, J., G. Sutherland, G. Jeans, M. Bedington, A. K. Sperrevik, K.-F. Dagestad, Y. Gusdal, C. Mauritzen, A. Dale, and J. H. La-Casce. 2021. Surface currents in operational oceanography: Key applications, mechanisms, and methods. Journal of Operational Oceanography 6 (1):60–88. doi:10.1080/1755876X.2021.1903221.
Ronneberger, O., P. Fischer, and T. Brox. 2015. U-net: Convolutional networks for biomedical image segmentation. In Medical image computing and Computer-Assisted Intervention–MICCAI 2015: 18th international conference, ed. N. Nassir, J. Hornegger, W. M. Wells, and A. F. Frangi, 234–41. October 5–9, 2015 Proceedings, Part III vol. 18. Munich, Germany: Springer International Publishing.
Sarkar, D., M. A. Osborne, and T. A. A. Adcock. 2018. Prediction of tidal currents using Bayesian machine learning. Ocean Engineering 158:221–31. doi:10.1016/j.oceaneng.2018.03.007.
Sarkar, D., M. A. Osborne, and T. A. A. Adcock. 2019. Spatiotemporal prediction of tidal currents using Gaussian processes. Journal of Geophysical Research Oceans 124 (4):2697–715. doi:10.1029/2018JC014471.
Shao, Q., G. Hou, W. Li, G. Han, K. Liang, and Y. Bai. 2021. Ocean reanalysis data‐driven deep learning forecast for sea surface multivariate in the South China Sea. Earth and Space Science 8 (7):e2020EA001558. doi:10.1029/2020EA001558.
Shen, Y. T., J. W. Lai, L. G. Leu, Y. C. Lu, J. M. Chen, H. J. Shao, H. W. Chen, K. T. Chang, C. T. Terng, Y. C. Chang, et al. 2019. Applications of ocean currents data from high-frequency radars and current profilers to search and rescue missions around Taiwan. Journal of Operational Oceanography 12(sup2):S126–S36. doi:10.1080/1755876X.2018.1541538.
Sinha, A., and R. Abernathey. 2021. Estimating ocean surface currents with machine learning. Frontiers in Marine Science 8:672477. doi:10.3389/fmars.2021.672477.
Tamtare, T., D. Dumont, and C. Chavanne. 2021. Extrapolating Eulerian ocean currents for improving surface drift forecasts. Journal of Operational Oceanography 14:71–85. doi:10.1080/1755876X.2019.1661564.
Vaswani, A., N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. 2017. Attention is all you need. Advances in Neural Information Processing Systems 30:5998–6008.
Wahl, T., I. D. Haigh, P. L. Woodworth, F. Albrecht, D. Dillingh, J. Jensen, R. J. Nicholls, R. Weisse, and G. Woppelmann. 2013. Observed mean sea level changes around the North Sea coastline from 1800 to present. Earth-Science Reviews 124:51–67. doi:10.1016/j.earscirev.2013.05.003.
Wang, H., P. Cao, J. Wang, and O. R. Zaiane. 2022, June. Uctransnet: Rethinking the skip connections in u-net from a channel-wise perspective with transformer. Proceedings of the AAAI Conference on Artificial Intelligence 36(3):2441–49. doi:10.1609/aaai.v36i3.20144.
Wen, J., J. Yang, and T. Wang. 2021. Path planning for autonomous underwater vehicles under the influence of ocean currents based on a fusion heuristic algorithm. IEEE Transactions on Vehicular Technology 70:8529–44. doi:10.1109/TVT.2021.3097203.
Wu, Z., Y. Gao, L. Li, J. Xue, and Y. Li. 2019. Semantic segmentation of high-resolution remote sensing images using fully convolutional network with adaptive threshold. Connection Science 31 (2):169–84. doi:10.1080/09540091.2018.1510902.
Wu, H., Y. Liang, and X. Z. Gao. 2023. Left-right brain interaction inspired bionic deep network for forecasting significant wave. Energy 278:127995. doi:10.1016/j.energy.2023.127995.
Wu, H., Y. Liang, X. Z. Gao, P. Du, and S. P. Li. 2023. Human-cognition-inspired deep model with its application to ocean wave height forecasting. Expert Systems with Applications 230:120606. doi:10.1016/j.eswa.2023.120606.
Xie, C., P. Chen, T. Man, and J. Dong. 2023. Stcanet: Spatiotemporal coupled attention network for ocean surface Current prediction. Journal of Ocean University of China 22:441–51. doi:10.1007/s11802-023-5269-2.
Yang, S., X. Zhang, Y. Chen, Y. Jiang, Q. Feng, L. Pu, and F. Sun. 2023. UcUNet: A lightweight and precise medical image segmentation network based on efficient large kernel U-shaped convolutional module design. Knowledge-Based Systems 278:110868. doi:10.1016/j.knosys.2023.110868.
Yin, J., and N. Wang. 2021. Predictive trajectory tracking control of autonomous underwater vehicles based on variable fuzzy predictor. International Journal of Fuzzy Systems 23 (6):1809–22. doi:10.1007/s40815-020-00898-7.
Yu, Y., S. Chen, and H. Wei. 2023. Modified UNet with attention gate and dense skip connection for flow field information prediction with porous media. Flow Measurement and Instrumentation 89:102300. doi:10.1016/j.flowmeasinst.2022.102300.
Yu, W., J. Gonzalez, and X. Li. 2021. Fast training of deep LSTM networks with guaranteed stability for nonlinear system modelling. Neurocomputing 422:85–94. doi:10.1016/j.neucom.2020.09.030.
Zhang, Z., E. V. Stanev, and S. Grayek. 2020. Reconstruction of the basin-wide sea-level variability in the North Sea using coastal data and generative adversarial networks. Journal of Geophysical Research Oceans 125:e2020JC016402. doi:10.1029/2020JC016402.
