Research Article

Using machine learning for cutting tool condition monitoring and prediction during machining of tungsten

Pages 747-771 | Received 22 Dec 2022, Accepted 08 Aug 2023, Published online: 15 Sep 2023

ABSTRACT

Machining of single-phase tungsten, used as a plasma facing material in fusion energy reactors, is commonly associated with rapid tool wear and short tool life. Conventional methods of monitoring tool wear or changing cutting tools after a predetermined period are inefficient and can lead to unnecessary tool changes or risk damaging the workpiece. Tool wear can adversely affect the surface finish and dimensional tolerances of machined parts. Predicting its onset can avoid this critical damage whilst ensuring maximum tool life is utilised. In this paper, firstly, tool life results in end milling single-phase tungsten using different cutting tool geometries and cutting speeds are provided for the first time. A novel method is then proposed by combining sensor signal prediction and classification machine learning models. It works by forecasting the cutting tool bending moment signal, which is then used for predicting the future cutting tool condition in end milling of pure dense tungsten. A series of machining experiments, covering the whole life of a cutting tool, were performed to collect the sensor signals. The current time series signal from the sensory tool holder is employed to forecast the future signal by training a 1D convolutional neural network (1D CNN) and an artificial neural network (ANN). The forecasted signal is then used to predict the state of the cutting tool in the future. Machine learning classifiers, namely random forest (RF), support vector machine (SVM) and extreme gradient boosting (XGBoost) supervised learning models, were trained and validated on actual sensor signals to correlate the tool conditions with specific sensor signal features. The investigations revealed that the 1D CNN performed best in forecasting the time series sensor signal, achieving a mean absolute error of 3.37. In addition, the RF, when trained on Wavelet Scattering features, resulted in the most accurate classification of sensor signals for tool condition detection.
The analysis showed that the combination of 1D CNN signal forecasting, feature extraction through statistical analyses and the RF classifier performs best in predicting the state of a cutting tool in the near future. Using this method allows decisions on changing the tool to be made whilst ensuring that the maximum useful life of a cutting tool is utilised. It also prevents undesired damage to the machined surface due to late detection of tool wear or delays in taking appropriate actions. The application of this method can reliably reduce the manufacturing costs and resource consumption associated with cutting tools for machining tungsten and minimise tool wear induced damage to the workpiece.

1. Introduction

The increasing demand for clean energy and the requirements for operational safety and energy security have been the driving factors behind the development of fusion energy production. Fusion energy production addresses many of the issues associated with operating fission energy reactors. However, due to high costs and technical limitations, the deployment of fusion energy reactors still remains limited to research (Tikhonchuk Citation2020). Other constraints in the reactor interior, including high operational temperature, enormous thermal shock from repeated plasma strikes and prolonged exposure to irradiation damage due to neutron bombardment, preclude the use of the majority of conventional materials (Haag et al. Citation2023). Materials selected for this harsh environment must perform reliably over the long term. Due to their high-temperature properties, refractory metals, and specifically tungsten, are used as plasma facing material components in these reactors (Ueda et al. Citation2014).

In order to scale up the operation of fusion reactors, tungsten parts need to be machined to specific geometries. However, owing to the properties of tungsten, such as its brittleness and high material strength and hardness, its machining is mainly characterised by short tool life (Edstrom et al. Citation1980). This in turn requires frequent tool changes, which can be very inefficient and uneconomical in terms of increased production time and manufacturing costs. Olsson et al. (Citation2021) investigated different tool materials, namely ceramics, coated and uncoated tungsten carbide, cermet, PCD and PcBN, in turning single-phase tungsten. They found that all tools apart from PCD and PVD-coated tungsten carbide tools failed within a few seconds of machining. They noted poor surface quality and subsurface damage caused by cracking and built-up-edge formation. A thorough review of the machining and processing of tungsten has indicated that there is a lack of knowledge on the machining performance of tungsten and its alloys (Omole et al. Citation2022). Specifically, there are no reports on the machining performance of tungsten in intermittent cutting processes such as milling. In machining critical and high-value products, such as the tungsten parts used in fusion reactors, conservative measures are taken to prevent costly damage to the parts. This means that cutting tools are often discarded prior to reaching the end-of-life criterion, leading to high manufacturing and environmental costs. By implementing automated tool condition monitoring (TCM), this conservative approach can be avoided and cutting tool utilisation can be maximised whilst preventing damage to the parts.

Research on TCM spans many decades and covers the implementation of various sensor systems and the identification of specific features in sensor signals which may relate to certain incidents during machining, such as tool failure, wear or chipping (Byrne et al. Citation1995). Dynamometers, accelerometers and acoustic emission sensors have been used to collect data during machining, which are further analysed for TCM. Aitor et al. (Citation2022) compared various sensor signals for TCM in drilling Inconel 718 and concluded that cutting forces are the best predictors for TCM. Boud and Gindy (Citation2008) reported that cutting forces, sound pressure and machine table displacement signals were most sensitive to tool wear. In order to reduce the complexity of dealing with the large amounts of data collected from sensors during machining, various dimensionality reduction and feature extraction methods have been used. Wang et al. (Citation2017) investigated a variety of dimensionality reduction techniques and their impact on predictive performance. The techniques included kernel principal component analysis, locally linear embedding, isometric feature mapping (ISOMAP) and minimum redundancy maximum relevance. Kernel principal component analysis performed best in terms of sensing accuracy. Kong et al. (Citation2018) used an integrated radial basis function-based kernel principal component analysis to extract features from multi-domain signals to predict tool wear using a Gaussian process regression model. The effectiveness of the proposed method was attributed to the ability of the technique to remove the negative effects of noise in the signals. Benkedjouh et al. (Citation2015) applied expectation-maximisation principal component analysis and ISOMAP reduction techniques to fit a nonlinear regression model using support vector regression (SVR).
Although the authors did not make a comparison between both methods to determine which performed better, the proposed approach was found to be suitable for assessing tool wear evolution of cutting tools and predicting the remaining useful life.

The use of machine learning methods in place of conventional statistical methods and manual feature detection has gained popularity in recent years (Serin et al. Citation2020). Machine learning methods can overcome the difficulties of processing large volumes of sensor data. Shankar et al. (Citation2019) trained an ANN using cutting force and sound pressure signals to detect tool condition in machining of 7075-T6 aluminium and reported successful detection of tools with flank wear in excess of 300 µm. Cho et al. (Citation2010) extracted a range of domain-specific features from multiple sensor signals, i.e. cutting forces, vibrations, acoustic emissions and spindle power. These were combined to classify tool wear using a multilayer perceptron, radial basis function network and support vector machine. Hassan et al. (Citation2021) proposed using a Wavelet Scattering Convolutional Neural Network to extract stable representations of signals in the time-frequency domain and classify them for TCM. The authors reported 100% accuracy in detecting tool failure in machining Al7075-T6 and Ti6Al4V alloys. This network configuration produces data representations which minimise intra-class variability whilst preserving inter-class discriminability. Wu et al. (Citation2017) extracted statistical features from cutting force, vibration and acoustic emission signals in a multi-sensor fusion approach to predict tool wear using artificial neural networks, SVR and random forests. Wang et al. (Citation2017) performed a multi-domain analysis of multi-sensor data comprising cutting force and vibration signals to improve the performance of an SVR model. The literature indicates that the two prominent sensor signals used for TCM are cutting forces from a dynamometer and acoustic emissions. Dynamometers, whilst effective, are expensive and restrict the size of the workpiece that can be machined.
Acoustic emission sensors generate large volumes of data even for small cuts due to their very high-frequency band, making data processing computationally expensive. Therefore, both sensors are only suitable for laboratory use.

The methods in the aforementioned studies have been utilised to estimate the current tool condition with reasonable effectiveness. However, tool wear progression tends to be nonlinear and can have detrimental effects, especially as the coating layer starts to degrade. Whilst monitoring the current tool condition is advantageous, the capability to also anticipate future tool conditions is vital. Sun et al. (Citation2020) proposed an approach to TCM using long short-term memory (LSTM) and residual CNN networks to enable the early detection of tool condition and support decision-making for tool changing. The LSTM was used to forecast tool wear values whilst the CNN was built for current tool condition monitoring. Cheng et al. (Citation2022) developed a method which uses a parallel CNN followed by a Bi-directional LSTM network for TCM. A multi-step approach was proposed in which a dense residual neural network was subsequently used to predict the tool wear into the near future. Wang et al. (Citation2019) developed a deep heterogeneous model, comprising a Bi-directional and a Unidirectional gated recurrent unit (GRU), to predict future tool wear with reasonable performance. Hall et al. (Citation2022) presented a framework for forecasting sensor signals for TCM. They implemented a deep learning method using convolutional LSTM to forecast sensor signals based on future-frame predictions of scalogram representations of raw acceleration signals.

The monitoring of the current tool condition has been the focus of most studies in the literature, with limited research on the capability to predict the future condition. In addition, the studies have been conducted using multiple external sensors, such as dynamometers, accelerometers and acoustic emission sensors, which may not be suitable for industrial adoption. Whilst the fusion of multiple sensors provides more insight into specific events during machining for scientific research, it does not necessarily enhance the ability for tool condition monitoring. Aitor et al. (Citation2022) suggested that cutting forces provide sufficient information for TCM. However, most studies rely on a dynamometer for TCM, which is not practical for industrial application.

In the absence of specialist tools for machining tungsten, this paper reports tool life results from end milling of single-phase dense tungsten used in fusion energy production, for the first time, using a range of rake angles and cutting speeds. A new method is presented for independently detecting the current tool condition and predicting the future condition of an end milling tool used for machining tungsten, based on the tool bending moment signal from a sensory tool holder. A number of machine learning models have been trained and tested for predicting the future time series bending moment signal as the machining, and hence the tool wear, progresses. Machine learning classifier networks are used to classify the current and predicted bending moment signals for detecting the current tool condition and predicting the future tool condition, respectively. The variation of cutting tool geometry and cutting speed enhances the robustness of the models. The capability to monitor and predict tool conditions helps to maximise cutting tool utilisation whilst minimising tool wear-induced damage to the workpiece.

Following this introduction, Section 2 provides the research methodology used in this study, including the experimental procedure for machining data collection as well as the frameworks of the different machine learning models used in this paper. Section 3 presents and discusses the experimental results followed by the findings from training and testing of various algorithms for future bending moment signal prediction and both the current tool condition monitoring and future tool condition prediction. Finally, Section 4 presents the conclusions and possible future works as a result of the findings from this paper.

2. Methodology

This section presents the experimental and theoretical methodologies for data collection and processing for tool condition monitoring and prediction in end milling tungsten. The experimental setup used to acquire the sensor signals during machining in an end milling operation is first described. The collected data was then pre-processed for training and testing machine learning models.

The proposed TCM method is illustrated in Figure 1 and includes the bending moment signal forecasting, current tool condition detection and future tool condition prediction stages. Here, the current tool condition detection and future tool condition prediction are two separate tasks using the same trained classifiers. The signal forecast stage includes a multi-step forecast of the sensor signals using neural networks, i.e. a 1D CNN and an ANN. This stage comprises the following steps: data acquisition, signal preparation and pre-processing, training and validation, and forecasting. In the current tool condition detection stage, the classifiers, i.e. RF, SVM and XGBoost, were trained and validated using features extracted from the data. In the future tool condition prediction stage, features were extracted from the forecasted signals (from the signal forecast stage) and input into the already-trained classifier models to predict the future tool state.

Figure 1. The proposed approach for detecting current tool conditions and predicting future conditions.


2.1. Experimental setup for data collection

The machining experiments involved monitoring the tool flank wear and tool life in end milling of a 99.99% pure tungsten workpiece. These experiments were conducted on an XYZ vertical milling centre with a 13 kW spindle in dry conditions. A two-flute, 12 mm diameter solid carbide end mill with a Balzers Balinit Latuma PVD AlTiN coating and a 34° helix angle was used for each experiment. Four different rake angles of 8°, 10°, 12° and 14° were tested. In addition, two cutting speeds of 40 m/min and 60 m/min were used. Table 1 shows the cutting parameters used in this study. A full combination of these parameters, with at least one repeat, resulted in 306 cutting passes. The experimental setup is illustrated in Figure 2, with the Spike sensory tool holder, cutting tool and tungsten block shown. The Spike sensory tool holder wirelessly measures the bending moments in the x and y directions in the cutting tool’s coordinate system at a 2.5 kHz sampling rate. The bending moment signals of the cutting tool in the x and y directions in each machining pass were used for training and validation of the machine learning models. Each cutting pass was performed in the longitudinal direction of the workpiece along its 40 mm length, as shown in Figure 2. The flank wear was measured routinely using a digital microscope following the ISO 8688-2 criterion for localised wear on any individual tooth.

Figure 2. Illustration of experimental setup for data collection.


Table 1. The geometric features of the cutting tool, workpiece dimensions and cutting parameters.

2.2. Sensor signal forecasting

There are various methods of forecasting time series data, including statistical methods such as the autoregressive integrated moving average (ARIMA) and recurrent neural network-based algorithms, including LSTM and GRU, which can capture long-range dependencies in time series data. Despite the prevalence of recurrent neural network-based algorithms, CNNs are also suitable for such tasks due to their ability to detect and preserve useful patterns in sequential data (Babu, Zhao, and Xiao-Li Citation2016). The relative simplicity of the CNN, as well as the ANN, serves as an advantage in time series forecasting tasks in terms of reducing computational complexity. In this study, the widely used 1D CNN and ANN have been employed to forecast time-domain sensor signals in a supervised learning task. The structures of the networks are illustrated in Figure 3.

Figure 3. Neural network architectures: (a) 1D CNN; (b) ANN.


The 1D CNN can detect patterns across a sequence by convolving the data using kernels. In this way, the network (Figure 3(a)) can learn to preserve useful information whilst ignoring unimportant details (Jin, Cruz, and Goncalves Citation2020). ANNs have been shown to be effective for predictive tasks due to their excellent ability to capture and map complex relationships between input features and outputs. The network architecture is characterised by the number of layers and the number of neurons in each layer, as illustrated in Figure 3(b). The network learns through backpropagation, which involves an update of the weight and bias parameters in each layer (Zheng et al. Citation2018). The TensorFlow framework was used for the implementation of both algorithms in this study (Abadi et al. Citation2016). Unlike the more complex recurrent neural network-based LSTM and GRU algorithms, the 1D CNN and ANN have been chosen for their computational simplicity, which supports the aim of this study to develop a relatively straightforward method for tool condition monitoring.

Prior to using the signals for training the learning algorithms and forecasting, the data was pre-processed. Two signals were collected for the tool bending moment in the x and y directions. As shown in Figure 4, the resultant bending moment was then calculated as the Euclidean norm of the two components. The raw signal from the sensors includes non-cutting data before the tool engages with the workpiece and after it disengages at the end of a cutting pass. When the tool engages with the workpiece, there is a sudden increase in the bending moment of the tool. A thresholding method was therefore used to detect the moments at which the tool engages with and disengages from the workpiece, and the resultant bending moment signal was trimmed accordingly, as depicted in Figure 4(c). This enables the training to focus on the sensor signal during material cutting. This thresholding approach was applied consistently throughout this paper.
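The resultant computation and threshold-based trimming described above can be sketched in a few lines of NumPy. The function name and the threshold value here are hypothetical, chosen only for illustration:

```python
import numpy as np

def trim_cutting_signal(mx, my, threshold=1.0):
    """Compute the resultant bending moment from the x/y components and trim
    the non-cutting portions before tool engagement and after disengagement.
    `threshold` is an illustrative engagement level, not a value from the paper."""
    resultant = np.sqrt(np.asarray(mx) ** 2 + np.asarray(my) ** 2)
    above = np.nonzero(resultant > threshold)[0]
    if above.size == 0:                      # tool never engaged the workpiece
        return resultant[:0]
    return resultant[above[0] : above[-1] + 1]

# Example: a signal that is quiet, rises during cutting, then falls again.
mx = np.array([0.1, 0.2, 3.0, 4.0, 3.5, 0.2, 0.1])
my = np.array([0.1, 0.1, 2.0, 2.5, 2.0, 0.1, 0.1])
trimmed = trim_cutting_signal(mx, my, threshold=1.0)
```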

Figure 4. Example bending moment signal from machining in (a) x-direction, (b) y-direction and (c) the resultant bending moment indicating the cutting tool engagement with the workpiece.


For signal forecasting, the sensor signals collected from the experiments explained in Section 2.1 were split into 70% training, 15% validation and 15% testing. The sensor signals from the previous machining passes were used for training the networks and forecasting the sensor signal for the future pass. The signals were accumulated in each cutting pass and pre-processed in the same way, i.e. signals in pass 1 were pre-processed and used to train the neural networks to forecast the signals in pass 2; signals from passes 1 and 2 were merged, pre-processed and used for training to forecast signals in pass 3; signals from passes 1, 2 and 3 were merged to forecast pass 4 signals; and so on up to the final pass, as depicted in Figure 5. The pre-processing step specific to signal forecasting involved the creation of a variable windowed dataset using TensorFlow, such that the data points within a predefined window form the input features and the next data point in the series is the output. The window sizes were 100, 200, 300 and 400 data points, representing 0.04 s, 0.08 s, 0.12 s and 0.16 s of machining time, respectively. These sizes were chosen to test the effect of the input feature length on the stability of the algorithms in making predictions. Furthermore, each window was shifted by one data point at a time across the entire signal to collate input features and output values, which were stacked vertically. This shifting was repeated until there were not enough data points remaining to fill the window.
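The windowed-dataset construction can be illustrated with a plain NumPy sketch; the study itself uses TensorFlow's tf.data windowing utilities, and the function name here is hypothetical:

```python
import numpy as np

def make_windowed_dataset(signal, window_size):
    """Slide a window of `window_size` points one step at a time across the
    signal; each window is an input sample and the point immediately after
    it is the corresponding output value."""
    X, y = [], []
    for start in range(len(signal) - window_size):
        X.append(signal[start : start + window_size])
        y.append(signal[start + window_size])
    return np.stack(X), np.array(y)

signal = np.arange(10, dtype=float)   # toy stand-in for a bending moment signal
X, y = make_windowed_dataset(signal, window_size=4)
```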

Figure 5. Illustration of sensor signals measured and forecasted in each cutting pass.


Training during the forecast stage was performed on the data available up to the current cutting pass; for example, to forecast signals in the next pass, the data in all passes leading to the current pass were accumulated, pre-processed, and used to train the ANN and 1D CNN. The ANN has two hidden dense layers, with the ReLU activation applied, and an output layer with one neuron for the output. The optimised hyperparameters of the ANN are 128 and 16 units in the first and second dense layers, respectively, with a learning rate of 1×10^−4. The 1D CNN architecture has a convolutional layer and two hidden dense layers. A rectified linear unit (ReLU) activation function was used in all layers except the output. An output layer with one neuron was used to predict the signal. The hyperparameters of the network which were optimised include: the number of neurons in the dense layers, which are 480 in layer 1 and 448 in layer 2; the number of filters, set as 256; a kernel size of 12; a stride of 3; and an optimiser learning rate of 1×10^−5. The KerasTuner framework (O’Malley et al. Citation2019), which supports the search for optimal hyperparameters, was used to optimise both algorithms. The training sets were always batched (with a batch size of 32) and prefetched for efficient training, which was performed for 100 epochs. The training objective was to minimise the mean absolute error (MAE) by evaluating the model performance on the validation set. Each model was then deployed on a held-out test set that it had not seen during training to further assess performance.
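A Keras sketch of the two forecasting networks, using the optimised hyperparameters reported above, might look as follows. The exact layer ordering and the use of a Flatten layer are assumptions, as the text does not fully specify the architectures:

```python
import numpy as np
import tensorflow as tf

WINDOW = 100  # one of the window sizes tested (100, 200, 300 or 400 points)

# 1D CNN: one convolutional layer (256 filters, kernel 12, stride 3) followed
# by two hidden dense layers of 480 and 448 units, and a single-neuron output.
cnn = tf.keras.Sequential([
    tf.keras.Input(shape=(WINDOW, 1)),
    tf.keras.layers.Conv1D(256, kernel_size=12, strides=3, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(480, activation="relu"),
    tf.keras.layers.Dense(448, activation="relu"),
    tf.keras.layers.Dense(1),
])
cnn.compile(optimizer=tf.keras.optimizers.Adam(1e-5), loss="mae")

# ANN: two hidden dense layers of 128 and 16 units with ReLU activations.
ann = tf.keras.Sequential([
    tf.keras.Input(shape=(WINDOW,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])
ann.compile(optimizer=tf.keras.optimizers.Adam(1e-4), loss="mae")
```

Both models would then be fitted on the batched, prefetched windowed dataset with `fit(..., epochs=100)`.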

The forecast follows a multi-step sequential approach in which the last data points of the accumulated signals up to the current cutting pass, with length equal to the window size, were fed into each trained model to predict the next value. The window was then shifted by one point to include the newly forecasted value. The new windowed data points were used to make another prediction, and this was repeated until a number of data points equal to the window length had been forecasted. In this way, the cutting pass signals were forecasted, starting from the second cutting pass up to the last, as illustrated in Figure 5.
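The recursive multi-step forecast can be sketched as below, with a trivial stand-in for the trained network; `naive_model` is hypothetical and simply repeats the last observed value:

```python
import numpy as np

def multistep_forecast(history, window_size, predict_one):
    """Recursively forecast `window_size` future points: predict the next
    value from the last `window_size` points, append it to the buffer,
    slide the window and repeat. `predict_one` stands in for a trained
    1D CNN or ANN that maps a window to the next value."""
    buffer = list(history[-window_size:])
    forecast = []
    for _ in range(window_size):
        next_value = predict_one(np.array(buffer[-window_size:]))
        forecast.append(next_value)
        buffer.append(next_value)
    return np.array(forecast)

# Hypothetical one-step "model": repeats the last value in the window.
naive_model = lambda window: float(window[-1])
history = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
future = multistep_forecast(history, window_size=3, predict_one=naive_model)
```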

2.3. Sensor signal classification

Supervised learning with classifiers can learn specific features in a sensor signal and associate the signal with a specific class. There are many different classifier models, such as logistic regression, decision tree, K-nearest neighbour, naïve Bayes, SVM, RF, XGBoost and even ANNs and CNNs, with different levels of complexity, training data and computational requirements. In this study, RF, SVM and XGBoost have been selected due to their relative simplicity, reliability and ease of computation. Figure 6 shows the frameworks of these classifiers. The RF and XGBoost have been specifically chosen for the classification tasks in this study due to their ensemble learning method of combining multiple predictors.

Figure 6. Frameworks of machine learning classifier algorithms: (a) RF; (b) SVM and (c) XGBoost.


The RF is an ensemble learning method consisting of several decision trees, with each predictor tree trained on a random subset of the dataset. The core idea is that performance improves when multiple predictors are combined compared to a single-predictor algorithm (Breiman Citation2001). In training the RF, different subsets of the data are generated through bootstrap aggregating, where sampling is performed with replacement (Sheykhmousa et al. Citation2020). The RF algorithm also introduces extra randomness when growing the trees by selecting the best feature among a random subset of features, and this creates greater tree diversity, which helps reduce variance and avoid overfitting (Wu et al. Citation2017). The RF has been chosen in particular because of its versatility in handling different tasks with few pre-processing requirements and efficient computation. It is also robust to outliers and non-linear data.

The SVM takes an entirely different approach, its main motivation being to separate a dataset into classes with a surface that maximises the margin between them. The algorithm does this by transforming the original feature space into a higher-dimensional space in which an optimal hyperplane, which maximises the separation distances among the classes, is determined. It is one of the most widely used algorithms for classification tasks as it can handle very large feature spaces with acceptable generalisation capability (Cervantes et al. Citation2020). In this study, the SVM was used to separate the tool conditions into different classes through a kernel function chosen via grid search optimisation using the Scikit-learn library (Pedregosa et al. Citation2011). Aside from being effective in handling classification tasks, the algorithm is also memory efficient since it only uses a subset of the training data as support vectors; however, scaling of the input features is required, which has been done in this study through standardisation of the data.

Like the RF, boosting is an ensemble learning method which combines several weak predictors into a strong learner. The main idea is to sequentially train predictors, with each predictor trying to correct its predecessor by fitting itself to the residual errors of the previous predictor (Gao et al. Citation2019). The XGBoost overcomes the speed limitations of conventional boosting through its parallel implementation of the algorithm. Unlike the RF, the XGBoost inherently introduces a regularisation term to improve generalisation and prevent overfitting (Chen and Guestrin Citation2016). The implementations of all classifiers, including hyperparameter tuning and optimisation, were carried out using the Scikit-learn library (Pedregosa et al. Citation2011).
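A minimal scikit-learn sketch of the three classifiers on placeholder data is given below. `GradientBoostingClassifier` is used here only as a stand-in for XGBoost, whose `xgboost.XGBClassifier` offers the same fit/predict interface; all data and parameter values are illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(90, 6))          # placeholder feature matrix
y = rng.integers(0, 3, size=90)       # placeholder wear-class labels

rf = RandomForestClassifier(n_estimators=100, random_state=0)

# SVM: features are standardised and the kernel chosen by grid search,
# mirroring the procedure described in the text.
svm = GridSearchCV(
    make_pipeline(StandardScaler(), SVC()),
    param_grid={"svc__kernel": ["linear", "rbf"]},
    cv=3,
)

# Gradient boosting stand-in for xgboost.XGBClassifier.
gbt = GradientBoostingClassifier(random_state=0)

for model in (rf, svm, gbt):
    model.fit(X, y)
```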

2.3.1. Feature extraction

Machine learning classifiers learn specific features from data which can be used for classifying the signals. There are various methods of extracting features from time series signals, including statistical analyses in the original time domain or transformation of the signals into the time-frequency domain using methods such as the continuous wavelet transform (CWT) and wavelet scattering (WS). The cutting tool bending moment signals collected from the machining experiments explained in Section 2.1, with various tool geometries and cutting speeds, were used for feature extraction. The effects of feature extraction were tested using the CWT and WS transforms of the signals into the time-frequency domain, in addition to the extraction of statistical parameters from the original signal in the time domain, as shown in Figure 7.

Figure 7. Illustration of the tool condition classification with the original and transformed signals.


i- Statistical parameter feature extraction

For statistical parameter feature extraction, the mean, median, maximum, standard deviation, kurtosis, and root mean square (RMS) values were selected as statistical parameters from the original signal, xi, as defined in Table 2. Furthermore, φx1, φx2, φy1 and φy2 were extracted as features representing the phase differences in the signals, as a function of the spindle speed, for tooth 1 in the x-direction, tooth 2 in the x-direction, tooth 1 in the y-direction and tooth 2 in the y-direction, respectively. The features were extracted from varied window sizes, with the rake angle and cutting speed also encoded as features using one-hot encoding. Standardisation was used to rescale each feature to have zero mean and unit variance.

Table 2. List of statistical features extracted from signals.
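The statistical parameters listed in Table 2 can be computed from a signal window as in the sketch below; the kurtosis convention used here (non-excess) is an assumption, as Table 2 defines the exact formulas:

```python
import numpy as np

def statistical_features(x):
    """Extract the statistical features listed in Table 2 from a signal
    window: mean, median, maximum, standard deviation, kurtosis and RMS."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    # Pearson (non-excess) kurtosis; guard against a constant window.
    kurtosis = np.mean((x - x.mean()) ** 4) / std ** 4 if std > 0 else 0.0
    return {
        "mean": x.mean(),
        "median": np.median(x),
        "max": x.max(),
        "std": std,
        "kurtosis": kurtosis,
        "rms": np.sqrt(np.mean(x ** 2)),
    }

features = statistical_features([1.0, 2.0, 3.0, 4.0])
```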

ii- Continuous wavelet transform

The CWT compares the signal to a shifted and compressed or stretched wavelet to determine the transform coefficients for time-frequency localisation. The CWT is defined for a signal, \(f(t)\); wavelet analysing function, \(\psi(t)\); position parameter, \(u\); and scale parameter, \(s\;(>0)\), as (Zhu, San Wong, and Soon Hong Citation2009):

(1) \( W(u,s;f,\psi) = \dfrac{1}{\sqrt{s}} \displaystyle\int_{-\infty}^{\infty} f(t)\,\psi^{*}\!\left(\dfrac{t-u}{s}\right) \mathrm{d}t \)

The CWT was implemented using PyWavelets (Lee et al. Citation2019), an open-source wavelet transform library for Python. The Morlet wavelet with scales ranging from 1 to 100 was used to compute the coefficients. The first two principal components were extracted, which, taken together across the dataset, account for an average explained variance of 96%. These two components were then flattened and stacked across the set to train the classifier algorithms with the tool conditions as labels.
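A possible PyWavelets implementation of this step is sketched below on a synthetic signal. The axis along which the principal components are taken, and the SVD-based PCA, are assumptions made for illustration:

```python
import numpy as np
import pywt

# Toy stand-in for a trimmed bending moment signal.
t = np.linspace(0, 1, 512)
signal = np.sin(2 * np.pi * 25 * t) \
    + 0.1 * np.random.default_rng(0).normal(size=t.size)

# CWT with a Morlet wavelet over scales 1-100, as described in the text.
scales = np.arange(1, 101)
coefficients, _ = pywt.cwt(signal, scales, "morl")   # shape: (100, 512)

# First two principal components of the coefficient matrix, computed here
# with a plain SVD (the paper does not specify the PCA implementation).
centred = coefficients - coefficients.mean(axis=0)
_, _, vt = np.linalg.svd(centred, full_matrices=False)
components = centred @ vt[:2].T      # project each scale row onto 2 components
features = components.flatten()      # flattened feature vector for the signal
```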

iii- Wavelet scattering

The WS transform of the signals provides stable representations of the acquired signals in the time-frequency domain such that low-variance features, which are insensitive to translations of the inputs, are extracted. Features are generated iteratively, where the output of one stage is fed as input to the next. The three operations performed in each stage are: convolution, where the input signal is transformed by each wavelet filter; a nonlinearity, taking the modulus of the filtered outputs; and averaging each of the moduli with a scaling filter to generate the scattering coefficients. These operations can be mathematically represented for the first stage as (Hassan, Sadek, and Attia Citation2021):

(2) \( S_1 x(t,\lambda_1) = \left| x * \psi_{\lambda_1} \right| * \phi_J \)

where \(S_1 x(t,\lambda_1)\) are the corresponding first-order coefficients, \(x\) is the input signal, \(\psi_{\lambda_1}\) are the wavelet filters in the first filter bank, and \(\phi_J\) is the scaling function. The second-order coefficients are similarly generated in the second stage by applying the same operations to each filtered output from the first stage using the second set of filters, \(\psi_{\lambda_2}\), to produce the second-order coefficients, \(S_2 x(t,\lambda_1,\lambda_2)\), as shown (Hassan, Sadek, and Attia Citation2021):

(3) \( S_2 x(t, \lambda_1, \lambda_2) = \big||x * \psi_{\lambda_1}| * \psi_{\lambda_2}\big| * \phi_J \)

The energy of the coefficients rapidly dissipates with the stage iterations through the network and converges to 0, i.e. higher-order scattering coefficients have lower energies than lower-order coefficients (Bruna and Mallat Citation2013). Networks with two wavelet filter banks have been found to be sufficient for most applications; as a result, the second-order scattering coefficients have been chosen to extract features in this study.

The WS transform was implemented using the Kymatio package (Andreux et al. Citation2020) in Python. The NumPy frontend was called to compute the 1D first-order scattering coefficients, with the plot shown in . The parameters for the transform include the number of samples, given by the length of the original signal; the averaging scattering scale, which is specified as a power of 2 and set as 6 to get an averaging, or maximum, scale of 64; and the number of wavelets per octave, which is set as 16 to resolve frequencies at a resolution of 1/16 octaves. Similarly, the first two principal components, which account for an average explained variance of 89%, were extracted and used for training.

2.3.2. Classifier training and testing for current tool condition detection and future tool condition prediction

The tool condition monitoring approach in this study is divided into two independent stages: i) detecting the current state of the cutting tool based on the sensor signal as it is collected during machining and ii) determining the future state of the tool based on the forecasted signal, as explained in Section 2.2. As shown in Section 2, both stages are classification tasks which require the training of classifier models.

The sensor signals from the machining experiments in Section 2.1 were labelled and used for training the classifier models, namely the RF, SVM and XGBoost. The signals for each tool condition were classed, based on the maximum flank wear (VBmax), as ‘minor’, ‘medium’ and ‘severe’ for the purpose of training. The thresholds used to class the signals were VBmax ≤ 100 μm for ‘minor’, 100 μm < VBmax ≤ 300 μm for ‘medium’, and VBmax > 300 μm for ‘severe’. In this way, the classifier algorithms were trained by mapping the extracted features from the signals to the wear class, as indicated earlier.
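The labelling rule reduces to a simple threshold function (VBmax in micrometres):

```python
def wear_class(vb_max_um):
    """Map maximum flank wear VBmax (micrometres) to the training label,
    using the thresholds stated above."""
    if vb_max_um <= 100:
        return 'minor'
    if vb_max_um <= 300:
        return 'medium'
    return 'severe'
```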

The dataset comprises 306 cutting pass instances collected from the experiments described in Section 2.1. This data was split into 80% training and 20% testing. Prior to training each classifier algorithm, the GridSearchCV method in Scikit-learn, with cross-validation, was used to determine the optimal hyperparameters within predefined search spaces. The tuned hyperparameters of the RF were the number of estimators and the bootstrap setting: GridSearchCV selected 2000 estimators with bootstrapping disabled. The kernel of the SVM was found to be linear with a C value of unity. Similarly, the optimal number of estimators for the XGBoost was found to be 6000.
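The tuning step can be sketched as follows; the paper reports only the selected optima (e.g. 2000 estimators with bootstrapping disabled for the RF), so the search grid and the synthetic stand-in data here are assumptions, reduced for brevity:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the extracted signal features (three wear classes)
X, y = make_classification(n_samples=120, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)

# Illustrative (reduced) search space over the RF hyperparameters named above
param_grid = {'n_estimators': [10, 50], 'bootstrap': [True, False]}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=5)  # cross-validated grid search
search.fit(X, y)
best_params = search.best_params_
```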

A 5-fold cross-validation was first used to assess performance and test convergence of the classifier models before testing on the held-out test set. From a tool condition monitoring perspective, predicting the severe wear condition and, especially, determining its onset are the most important outcomes. The predictions are summarised in a confusion matrix from which the precision, recall, F1-score and overall accuracy can be computed as concise metrics for each tool condition class. The F1-score combines precision and recall into a single metric and is a more representative measure of predictive performance. As a result, the F1-score for the severe tool condition has been given more weight when assessing the classifier models. The metrics are computed as follows:

(4) \( \text{Precision} = \frac{TP}{TP + FP} \)
(5) \( \text{Recall} = \frac{TP}{TP + FN} \)
(6) \( \text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \)
(7) \( \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \)

where TP is the number of true positives, TN is the number of true negatives, FN is the number of false negatives and FP is the number of false positives.
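Equations (4)-(7) translate directly into code; for example:

```python
def classification_metrics(tp, tn, fp, fn):
    """Precision, recall, F1-score and accuracy from confusion-matrix
    counts, following Equations (4)-(7)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, f1, accuracy
```

For instance, 8 true positives, 8 true negatives, 2 false positives and 2 false negatives give precision = recall = F1 = accuracy = 0.8.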

The same classifier models that were used for current tool condition monitoring were used for future tool condition prediction. The models were trained on actual sensor signals as explained earlier; for the classification, however, the forecasted signals from Section 2.2 were used. In the future tool condition prediction stage, statistical features were extracted from the signals forecasted for each pass and used to predict the tool wear states with the trained classifiers. The features extracted from forecasted signals of a particular window size were input into classifiers trained on data of the corresponding window size. For example, the features from a forecasted signal of 0.04 s window size were input into a classifier trained on the equivalent 0.04 s window size to make predictions of the future tool states.
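The routing of forecasted-signal features to the matching classifier can be sketched as below; the statistical feature set and the function names are illustrative (the paper's full feature set, which also includes phase-difference features, is not reproduced here):

```python
import numpy as np

def stat_features(sig):
    # Illustrative statistical features of a (forecasted) signal window
    return np.array([sig.max(), sig.min(), sig.mean(), sig.std(),
                     np.sqrt(np.mean(sig ** 2))])  # last entry is the RMS

def predict_future_state(forecast, window_s, classifiers):
    """Predict the future tool state by routing the forecasted signal to
    the classifier trained on the same window size (dict key in seconds)."""
    feats = stat_features(forecast).reshape(1, -1)
    return classifiers[window_s].predict(feats)[0]
```

Here `classifiers` would map each window size (0.04, 0.08, 0.12, 0.16 s) to its trained RF, SVM or XGBoost model.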

3. Results and discussion

Some of the results from the machining experiments are presented in Section 3.1, with the following sections showing and discussing the results for the tool monitoring tasks. The tool bending moment signals from the machining experiments were used to train an ANN and a 1D CNN to forecast the future sensor signals. The forecasted signals were then validated against the actual signals (ground truth) from the machining experiments. The results from training, testing and forecasting the bending moment signals are detailed in Section 3.2. The bending moment signals from the machining experiments were labelled, and different feature extraction methods, namely statistical parameters, CWT and WS, were used to train a number of machine learning classifiers (RF, SVM and XGBoost) as explained in Section 2.3. The trained models were used for detecting the current state of the cutting tools during machining using the actual bending moment signals and for predicting the future conditions of the cutting tools based on the forecasted signals from Section 2.2. The results from training and validating the classifier models for detecting the current condition and predicting the future condition of the cutting tools are explained in detail in Section 3.3.

3.1. Machining experiments

The tool wear growth in each machining experiment is shown in Figure 8. The longest tool life in machining pure dense tungsten was about 14 min, achieved with the 14° rake angle tool at a cutting speed of 40 m/min. Irrespective of the cutting speed, the tool with 14° rake angle performed best.

Figure 8. Tool wear plots during machining of tungsten at (a) 40 m/min and (b) 60 m/min cutting speed.


The average bending moment signal for the first machining pass of each experiment is shown in Figure 9. The results indicate that the tool with 12° rake angle, when machining at 60 m/min, has the lowest bending moment, and thus the lowest cutting force during machining. This means that this tool experienced the least loading during machining, and it would be reasonable to expect that this experiment would also result in the longest tool life. However, from Figure 8, this is not the case. The tool life for this experiment was about 8 min, even though its average bending moment was about 12% lower than when machining with the best rake angle and cutting speed combination (14° and 40 m/min). In addition, the experiment with the highest average bending moment (8° and 40 m/min) has an eventual tool life of about 11 min, which is even higher than when machining with the lowest initial bending moment signal (12° and 60 m/min).

Figure 9. Average peak bending moment for first machining pass.


This shows that a visual plot of cutting signals is not sufficient to understand the machining of tungsten. These signals cannot merely be extrapolated as the basis to monitor the tool condition at any point during machining. The approach proposed in this paper helps to mitigate this challenge as the signals are correlated to the actual tool performance by using machine learning algorithms to learn useful features from the signals.

3.2. Forecasts of bending moment signals

The bending moment signals from the machining experiments were used to train the 1D CNN and ANN to predict the future bending moment signals. In this scenario, the bending moment signal from the 1st pass is used to predict the bending moment in the 2nd pass; thereafter, the signals from the 1st and 2nd passes are used to predict the signal in the 3rd pass, and so on. Figure 10 shows the evaluation of the training, validation and testing of the 1D CNN and ANN, with the MAE used as the metric for comparing predictions of the bending moment signal with the ground truth signal during training and validation. Additionally, an example is provided of the ability of each network to predict the bending moment signals compared with experimental results during testing.
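A minimal sketch of such a forecaster is given below. The paper's actual architecture and hyperparameters are not reproduced here; the layer sizes, window length and function name are assumptions, with only the MAE loss taken from the text:

```python
import numpy as np
from tensorflow import keras

def build_forecaster(window=64):
    # One-step-ahead regression: a window of past bending moment samples
    # in, the next sample out; trained with MAE, the metric used in the paper
    model = keras.Sequential([
        keras.layers.Input(shape=(window, 1)),
        keras.layers.Conv1D(32, kernel_size=3, activation='relu'),
        keras.layers.Conv1D(32, kernel_size=3, activation='relu'),
        keras.layers.GlobalAveragePooling1D(),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer='adam', loss='mae')
    return model
```

Training windows would be sliced from the accumulated passes so that, as described above, signals up to pass n serve to predict pass n + 1.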

Figure 10. Performance evaluations of the 1D CNN and ANN on the validation and test sets: (a) learning curve of the 1D CNN; (b) test set prediction of the 1D CNN; (c) learning curve of the ANN; (d) test set prediction of the ANN.


A representative evaluation of the 1D CNN on the different datasets is shown in Figure 10(a,b). Figure 10(a) evaluates its performance on the training and validation sets in terms of the MAE. This learning curve depicts how the algorithm learns over time, during each training epoch, whilst also showing that there was no overfitting to the training set. The pre-processed accumulated test data is presented with the equivalent time axis; the step changes in the data are due to the multi-pass accumulation of the signals. Figure 10(b) shows the prediction of the test set, with an MAE of 3.37. Similarly, Figure 10(c,d) demonstrates the evaluation of the ANN on the training, validation and test sets, with an MAE of 2.36 on the test set. Although the ANN achieved a lower aggregate MAE, the 1D CNN was superior in reproducing the dynamics of the time series sensor signals, as discussed below.

The results of each network’s prediction on selected passes (2, 5, 8, 11, 14, 17, 19 and 20) are shown in Figures 11 and 12. They clearly demonstrate that the 1D CNN can model the variability of the original signals for all passes. This is also evidenced in Figure 10(b), where the algorithm’s prediction matched the test set signal undulations best. Whilst the ANN shows relatively poor variability in its predictions, its MAE per pass is comparable to that of the 1D CNN.

Figure 11. Forecasted signals using 1D CNN and ANN compared with measured signal for (a) pass 2; (b) pass 5; (c) pass 8; (d) pass 11.


Figure 12. Forecasted signals using 1D CNN and ANN compared with measured signal for (a) pass 14; (b) pass 17; (c) pass 19; (d) pass 20.


Figure 13(a) shows the ground truth mean resultant bending moment per pass compared to the mean forecasts per algorithm, while Figure 13(b) indicates the errors made on the ground truth per pass. The capability of the 1D CNN to match the signal variability, coupled with its low prediction errors, makes it suitable for further trials, hence justifying its selection for the classification tasks in Section 3.3.3. Nevertheless, based on the predicted data in Figures 11 and 12 and the mean bending moment in Figure 13(a), both the 1D CNN and ANN underpredict the bending moment in comparison with the experimental data.

Figure 13. Mean bending moment for each machining pass for (a) ground truth and prediction per algorithm; (b) prediction error per algorithm.


This performance of the 1D CNN agrees with the findings of Van et al. (Citation2020), who compared the 1D CNN with the ANN and LSTM as well as with traditional models such as the autoregressive integrated moving average (ARIMA) and seasonal ARIMA. They showed that the 1D CNN outperformed these algorithms and concluded that it may be more suitable for time series modelling. In the present study, the algorithm was also able to better match the peaks in the observed data despite its underestimations. Based on these results, the 1D CNN was chosen for forecasting the future bending moment signal for future tool condition prediction in Section 3.3.3.

3.3. Tool wear classification

This section details the results from training the classifier algorithms, namely the RF, SVM and XGBoost, for current tool condition monitoring as well as for the prediction of future tool conditions based on forecasted signals. In Section 3.3.1, the results from training and testing the classifier algorithms for detecting the current tool conditions are presented. The impact of signal length (window size) on detection performance was investigated for four window sizes: 0.04 s, 0.08 s, 0.12 s and 0.16 s. Statistical features were extracted from the signals and form the basis of the training and testing in this section. The performances of the classifier algorithms in classifying the current tool conditions, when trained on features extracted using CWT and WS of the whole signal, are discussed in Section 3.3.2, where the best performing feature extraction method is also identified. In Section 3.3.3, the classifier algorithms trained on experimental data are tested for predicting the future tool conditions using the forecasted sensor signals from the 1D CNN, as explained in Section 3.2.

3.3.1. Current tool condition detection using statistical features

The results from training and testing the classifier algorithms on statistical features for detecting current tool conditions are presented in this section. The confusion matrices in Figures 14–16 compare the test set predictions with the actual tool conditions for the RF, SVM and XGBoost, respectively. All classifiers correctly detect all the severe cases for the 0.16 s window; however, the SVM has the highest F1-score of 96%. The maximum F1-scores of the RF and XGBoost are 93% and 90%, respectively. In no case did a classifier detect a severe tool condition when the actual condition was minor, or vice-versa; both minor and severe conditions were only ever misclassified as the intermediate medium condition.

Figure 14. Confusion matrices for the classification of tool conditions by the RF for (a) 0.04 s window; (b) 0.08 s window; (c) 0.12 s window; and (d) 0.16 s window.


Figure 15. Confusion matrices for the classification of tool conditions by the SVM for (a) 0.04 s window; (b) 0.08 s window; (c) 0.12 s window; and (d) 0.16 s window.


Figure 16. Confusion matrices for the classification of tool conditions by the XGBoost for (a) 0.04 s window; (b) 0.08 s window; (c) 0.12 s window; and (d) 0.16 s window.


The classification results suggest that extracting features from a larger window might improve performance. However, as shown in Figure 17, the models trained on the statistical features extracted from the whole signal resulted in less accurate classification relative to those trained on the much shorter window sizes presented in Figures 14–16. Training the RF, SVM and XGBoost models on the features extracted from the whole signals resulted in F1-scores of 85%, 86% and 86%, respectively.

Figure 17. Confusion matrices with/without rake angle and cutting speed as encoded features for RF, SVM and XGBoost when trained on statistical features.


The inclusion of the cutting tool rake angle and cutting speed as one-hot encoded features (four encodings for the rake angle and two for the cutting speed) shows that these bear relatively low importance and have the lowest scores of all input features. The combined feature importance of the six encoded features, computed using the in-built feature importance attribute in Scikit-learn, is 3.49% (0.58% on average) compared to an average of 3.69% per contributing phase difference feature. On the other hand, xmax has the maximum feature importance of 23.19%.
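The importance scores referred to above can be read from a fitted forest's `feature_importances_` attribute; a sketch on synthetic stand-in data (the actual bending moment feature matrix is not public):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the statistical + one-hot encoded feature matrix
X, y = make_classification(n_samples=200, n_features=10, n_informative=4,
                           random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances; they sum to 1 across all input features
importances = rf.feature_importances_
top_feature = int(np.argmax(importances))
```

Summing the entries for the encoded rake-angle and cutting-speed columns would give the combined importance quoted in the text.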

Retraining each algorithm with the rake angle and cutting speed excluded as features shows that performance is not significantly degraded, even with the potential loss of information. The results are shown in Figure 17, where the F1-scores are 83%, 87% and 77% for the RF, SVM and XGBoost, respectively.

The algorithms trained on the whole signal were further validated on unseen data. The results are presented in Figure 18, where the RF has the best performance with a 100% F1-score. The algorithm also correctly detects the onset of severity at the 15th machining pass. The SVM and XGBoost, on the other hand, were not able to detect the onset of severity.

Figure 18. Confusion matrices on validation set for (a) RF, (b) SVM, and (c) XGBoost when trained on statistical features.


3.3.2. Current tool condition detection using continuous wavelet transform and wavelet scattering features

In addition to statistical feature extraction, the effect of applying the continuous wavelet transform (CWT) and wavelet scattering (WS) to the whole signals has been tested. Equivalent comparisons have been made with the performances of the algorithms when trained on statistical features as in Section 3.3.1, with the rake angle and cutting speed excluded as features. The confusion matrices in Figure 19 compare the detected tool conditions against the actual conditions for the different models. The results from the CWT features are shown in Figure 19(a)–(c). Both the RF and XGBoost have F1-scores of 89%, which are higher than their performances when trained on statistical features (83% and 77%, respectively). However, the F1-score of the SVM is 85% compared to 87% on the statistical features.

Figure 19. Confusion matrices of each algorithm on test set for each algorithm when trained on (a)-(c) CWT features and (d)-(f) WS features.


Figure 19(d)–(f) presents the performances of the RF, SVM and XGBoost algorithms when trained on WS features. The performances have greatly improved compared with training on either statistical or CWT features. The F1-scores have increased to 92% for all classifiers, signifying that they can be more reliable when severe tool conditions are detected.

The performance of each algorithm was also tested on unseen data from the validation set to further assess the reliability of the prior predictions. The results are presented in Figure 20 for each algorithm when trained on CWT and WS features. With the CWT features (Figure 20(a)–(c)), the RF and XGBoost have 100% F1-scores compared to 73% for the SVM. The RF and XGBoost were also able to detect the onset of severity, unlike the SVM. Figure 20(d)–(f) shows the results with the WS features, where the RF and SVM have F1-scores of 100% whilst the XGBoost has 92%. Based on these performances, the RF and SVM could detect the onset of severity.

Figure 20. Confusion matrices on validation set for each algorithm when trained on (a)-(c) CWT features and (d)-(f) WS features.


Whilst the performance of SVM significantly increased when using WS, the RF trained on WS features demonstrated the best performance in correctly detecting the current cutting tool condition. Similar findings have been reported in the literature on the superior performance of the RF. Wu et al. (Citation2017) found that the RF produced more accurate predictions of tool wear relative to the SVM. In the study by Park et al. (Citation2019), the RF and XGBoost outperformed the SVM in classifying tool wear state. Das et al. (Citation2019) also found that the RF showed better classification results compared to the SVM. The ensemble approach of combining multiple decision trees in the RF could explain this superior performance compared to other non-ensemble methods.

It can also be deduced that training the classifiers on WS-extracted features offers stability in performance. Compared to training on statistical features as in Section 3.3.1, the CWT and WS features help the classifiers perform better, as the F1-scores and overall accuracies are increased. The capability of the CWT and WS transforms to provide features which simultaneously preserve both spectral and temporal information explains the improved performances. Furthermore, the superior performance of the classifiers on WS features can be attributed to the stable transformation of the signals into features which limit the intra-class variability whilst also maximising inter-class differences. This stable representation of signals by the WS transform was the subject of a study by Hassan et al. (Citation2021), where distortions in the signals were minimised to achieve a 98% accuracy in tool condition prediction.

3.3.3. Future tool condition prediction from forecasted signals

The results obtained from predicting tool conditions ahead of time are presented in this section where the classification results are shown from pass 2. Being able to predict the condition of the tool ahead of machining enables decision making for changing the cutting tool or adjusting the cutting parameters.

The forecasted signals in Section 3.2 are of finite intervals and it is not practical to transform them into the time-frequency domain using either the CWT, WS or other transformation methods. This is because of boundary effects or distortions which are common in processing time series signals of finite length (Su, Liu, and Jingsong Citation2012). The signals need to be extended to alleviate these distortions. Common extension methods including zero padding, periodic extension and symmetric extension are based on assumptions about the signal characteristics. They fail to produce satisfactory results in terms of preserving the time-varying characteristics of the signals (Strand and Nguyen Citation1996). As a result of this challenge, only statistical analyses have been performed to extract features from the forecasted signals to make predictions about the future tool conditions.
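The three extension methods mentioned above can be illustrated with NumPy's pad modes:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])

zero_padded = np.pad(x, 2, mode='constant')   # zero padding:  0 0 1 2 3 4 0 0
periodic    = np.pad(x, 2, mode='wrap')       # periodic:      3 4 1 2 3 4 1 2
symmetric   = np.pad(x, 2, mode='symmetric')  # symmetric:     2 1 1 2 3 4 4 3
```

Each mode fabricates boundary samples from a different assumption about the signal, which is why none of them reliably preserves the time-varying character of real sensor data.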

Statistical features were extracted from the signals forecasted per cutting pass in Section 3.2 by the 1D CNN for a whole set of experiments. These features were input into the trained classifier of the corresponding window size. The results are presented in Figures 21–23 for the RF, SVM and XGBoost, respectively. In Figure 21, the RF makes similar predictions of the severe cases for both the 0.04 s and 0.12 s windows. Similar observations can be made for the 0.08 s and 0.16 s windows.

Figure 21. Prediction of tool conditions by the RF per cutting pass for all window sizes.


Figure 22. Prediction of tool conditions by the SVM per cutting pass for all window sizes.


Figure 23. Prediction of tool conditions by the XGBoost per cutting pass for all window sizes.


The SVM in Figure 22 has similar predictions of the severe cases across all windows, suggesting a response that is more invariant to window size. The XGBoost in Figure 23 performs similarly to the RF and SVM for all window sizes except the 0.16 s window, which has the worst prediction, missing three of the severe cases. These analyses show that none of the algorithms has its performance significantly altered by the window size. However, the RF with the 0.16 s window resulted in the best performance in terms of overall accuracy, prediction of severe cases and detection of the onset of severity at pass 14. The SVM with the 0.04 s window was also able to detect the onset of severity, but with relatively poorer performance. These results show that the RF performs better than both the SVM and XGBoost and can be reliably deployed for future tool condition prediction tasks such as in this study.

4. Discussion and future work

Machining of single-phase dense tungsten is increasingly required in order to scale fusion energy reactors and reduce the overall costs associated with fusion energy generation. Beyond fusion energy, tungsten has applications in various fields including aviation, automotive, aerospace, electronics, medicine, chemicals and sports (Lassner and Schubert Citation2005). A series of end milling experiments was performed for the first time using 12 mm diameter solid carbide tools with AlTiN coating and various rake angles. The investigations showed that a tool with 14° rake angle performed best in terms of tool life among the tested geometries. Whilst varying the cutting tool geometry exposes the learning models to variations in cutting tools, other tool microgeometries can be further explored to enhance the performance of milling single-phase tungsten.

Cutting forces have been identified by various researchers as the best predictor for TCM (Aitor et al. Citation2022; Cho, Binsaeid, and Asfour Citation2010). However, using a dynamometer is not practical in industrial applications and its use is limited to laboratory tests. Wireless sensory tool holders allow unintrusive data collection for TCM, which can be applied in production scenarios and retrofitted to existing machine tools. In this study, a Spike sensory tool holder was used for collecting the cutting tool bending moment signals during machining. The bending moment signals were used for detecting and predicting the cutting tool condition, both of which are independent tasks. Instead of predicting the future tool wear or tool condition directly, as is common in the literature, the cutting tool bending moment signal was first forecasted using 1D CNN and ANN models, and the 1D CNN was selected for further investigation. Whilst this model achieved an MAE of 3.37 in predicting the future bending moment signals, there are many other learning algorithms that could also be tested. For instance, Hall et al. (Citation2022) used a ConvLSTM to predict future transformed signal frames rather than the actual sensor signals. Other potential learning algorithms for time series prediction are sequence models such as the LSTM and GRU, which can capture long-range dependencies in the data. Such models could match the measured signals more closely and further reduce the errors of the forecasted signals; however, they require large datasets for training and validation and are computationally expensive. Three classifiers, namely the RF, SVM and XGBoost, were used for classifying sensor signals for current tool condition detection as well as future condition prediction. Amongst these, the RF performed best in detecting the current tool conditions with an accuracy of 95%. The RF also achieved 89.5% accuracy in predicting the future conditions.
There are other deep learning classifier models, such as the CNN and ANN, for detecting small features in sensor signals; however, these models also require large datasets for training and validation. Future work will focus on assessing and comparing the performance of these models in detecting specific events during machining such as chipping and nose breakage. Feature engineering could also benefit future studies, where new features engineered from the signals could aid training and improve classifier performance. The short tool life experienced in machining tungsten makes conventional tool wear monitoring methods for deciding when to change the tool extremely inefficient. Detecting the current state of the tool might not leave enough time to take appropriate action before a worn tool damages the workpiece. Being able to predict the condition of the tool in the near future enables timely decision making for changing the tool.

5. Conclusion

Tool life results from end milling single-phase, fully dense tungsten, as used in fusion energy reactors, have been reported for the first time, and tool bending moment signals were collected for tool condition monitoring using a sensory tool holder. A method for independently detecting the current tool condition as well as predicting the future condition in end milling of single-phase tungsten has been tested and validated. In this method, the cutting tool bending moment from a sensory tool holder during machining is used for detecting and predicting the cutting tool condition. The sensor signals from the experimental data have been used to train a number of classifier models, namely random forest (RF), support vector machine (SVM) and extreme gradient boosting (XGBoost). For detecting the current tool condition, the trained models have been used to classify the actual measured signals. The method also combines two stages for predicting the future cutting tool condition. Firstly, a 1D convolutional neural network (1D CNN) and an artificial neural network (ANN) have been trained and validated on historical data to forecast the future time series tool bending moment. Secondly, the same trained classifier models have been used to classify the forecasted bending moment signal to predict the future condition of the cutting tool. In this way, the capabilities of different classifiers in detecting the current tool condition from the actual bending moment, and in predicting the future tool condition from the forecasted bending moment signal, have been assessed. Overall, the following conclusions were drawn:

  • Tool bending moment signals collected from a sensory tool holder have been successfully used for detecting the current tool condition by training classifiers to map the measured signals to the actual tool conditions during machining.

  • Both the 1D CNN and ANN can be used for forecasting time series bending moment signals in the near future. However, the 1D CNN outperformed the ANN in its forecasts by closely matching the actual signals with an MAE of 3.37.

  • All classifier models tested can detect the current tool condition with varying degrees of accuracy. The RF achieved 95% accuracy in detecting the current cutting tool condition when trained on features extracted from the bending moment signal using wavelet scattering.

  • Classifying the forecasted sensor signal to determine the future cutting tool condition has been shown to successfully predict the future tool condition with 89.5% accuracy. Future tool conditions can be predicted by extracting statistical features from the forecasted signals. In terms of predicting the severe wear condition of the cutting tool, the combination of the 1D CNN for forecasting the bending moment signal and the RF classifier can be used to classify the tool states whilst also predicting the onset of severity.

Cost-effective and reliable manufacturing of parts made from refractory metals such as tungsten is necessary for the successful implementation of fusion energy reactors. The tools and methods proposed in this paper can help industry machine tungsten parts and predict when a cutting tool is expected to suffer severe wear before it damages the workpiece, removing the need for constant process interruption or premature tool changes.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the UK Atomic Energy Authority with funding from EPSRC under grant EP/T012250/1; EPSRC under grant EP/R513155/1 project 2297674; and EPSRC under grant EP/V055011/1.
