Research Article

Study on visual localization and evaluation of automatic freshwater fish cutting system based on deep learning framework

Pages 516-531 | Received 20 Nov 2023, Accepted 09 Mar 2024, Published online: 22 Mar 2024

ABSTRACT

Pre-treatment processing technology plays a crucial role in the overall freshwater fish processing procedure, and automatic head and tail cutting stands out as a significant pre-treatment technique within the industry. The system for removing the head and tail of freshwater fish comprised a Cartesian coordinate manipulator, a fish transfer device, a control system, and an image acquisition device. In the vision system, five image segmentation methods were compared for fish head and tail image segmentation: U-Net (U-shaped Deep Neural Network), DeepLabV3, PSPNet (Pyramid Scene Parsing Network), Fast-SCNN (Fast Semantic Segmentation Network), and ICNet (Image Cascade Network). Among the tested methods, ICNet demonstrated the best segmentation capability. The experimental results indicated a segmentation accuracy of 99.01%, a mean intersection over union (MIoU) of 82.50%, and an image processing time of 15.25 ms. The results showed that the fish head and tail were successfully cut off using this model for recognition together with a circular knife. Consequently, the segmentation model employed in the machine vision system of this study proved applicable to automatically cutting the heads and tails of freshwater fish of various sizes.

Introduction

Fish is rich in animal protein and unsaturated fatty acids, and consumers can obtain excellent nutrition from fish products in their diet. The health benefits of consuming fish create a favorable consumer profile, so worldwide demand for fish is constantly increasing, and there is growing interest in fish processing.[Citation1–3] As an important link in freshwater fish processing, pretreatment processing technology affects the modernization of aquatic product processing. Moreover, separating the head and tail from the fish body is an important pretreatment step: good separation technology can reduce the waste rate of fish meat and increase the income from deep processing of freshwater fish.[Citation4] At present, the heads and tails of fish are usually removed manually in China in order to ensure the meat yield for different sizes of fish.[Citation5–7] Given how labor-intensive manual head and tail removal is, it should be replaced by a fully intelligent and automated processing system.[Citation8] Therefore, obtaining the cutting positions of the head and tail for fish of different sizes is a critical task.

Currently, automation systems are widely used in fish processing, and there have been numerous studies on automatic fish orientation and cutting.[Citation9–12] For instance, Andrzej et al. demonstrated the potential influence of the fish-orientation system, which is closely connected with the deheading yield, since straight cutting simplifies the precise orientation of the fish relative to the cutting knives.[Citation13] Alejandro et al. designed a fish bone separator machine that was smaller in size and output but capable of processing hard-boned fish up to 3 kg in weight, and obtained a 13% higher processing yield.[Citation14] Bhushan et al. developed a meat-bone separator for small-scale fish and evaluated Tilapia processing in terms of capacity, yield, percentage yield, bone content, color, and power consumption.[Citation15] The use of automated machinery is therefore essential for fish processing production, and the present study focused on processing freshwater fish using automated equipment.

However, there is little literature on the simultaneous cutting of the heads and tails of freshwater fish of different sizes according to actual need. Most studies focus on processing fish heads or tails with little variation in size.[Citation16–18] For example, Liu et al. designed a high-yield head-cutting experimental prototype, which performed pretreatment line operations using shearing technology from the field of agricultural product processing combined with the body size parameters of bream (0.4 ± 0.1 kg) and grass carp (1.2 ± 0.3 kg).[Citation19] Chen et al. reported on mechanical deheading of typical small marine fish, finding that toothless disc knives with straight and forward cutting are suitable for Lepidotrigla abyssalis, Synodus macrops, and Trachurus japonicus.[Citation20] Zhao et al. developed an automatic deheading machine for dace, which used a lever-type automatic adjustment mechanism to determine the cutting position of the fish head by contacting the fish body with a roller.[Citation21] Automatic adjustment of the cutting line according to the size of the fish was therefore considered a key technology in the design of fish cutting machines, as it avoids unnecessary waste.

Besides, machine vision is also widely used in food processing, and several studies have applied this technology to fish processing.[Citation12,Citation22–28] For instance, White et al. described an experiment to identify and measure different species of fish using computer vision techniques. The test results showed that the machine could measure fish with a length deviation of 1.2 mm and had a classification reliability of 99.8% for seven species of fish.[Citation29] Dowlati et al. reported on the application of machine vision and imaging techniques for fish quality assessment. Machine vision can be used to identify the freshness and composition of fish, assess size and volume, estimate weight, measure shape parameters, and analyze skin and fillets of different color shades.[Citation30] Hong et al. introduced detection methods for size, shape, and color measurement using machine vision systems, and also reported on the color of fillets, flesh, skin, and shrimp, as well as color changes in fish that had undergone special treatment.[Citation31] All of the above methods used conventional machine vision to measure the length, size, freshness, shape, and color of fish to obtain the desired target features. Conventional machine vision relies on hand-designed feature extraction, a process that requires specialized knowledge and cannot adapt to complex and diverse scenarios. It is also often sensitive to environmental changes such as lighting, shadows, and occlusion, which must be accounted for in the design phase. On the other hand, traditional machine vision technology has been developed and applied for a long time, with mature algorithms and tool libraries, making it stable and reliable.
Currently, deep learning is a popular field in machine vision, capable of mimicking the human brain's mechanisms for analyzing and learning from images, sounds, and text. Compared with traditional machine learning models, deep learning models do not require careful hand-selection of object features. Additionally, trained on large sample datasets, they generalize well and achieve greater accuracy. For instance, Zhu et al. used the MobileNetV3-Small neural network to classify the feeding state of sea bass, achieving an accuracy of 99.60% on the test set; this research provides a crucial reference for efficient and intelligent feeding of sea bass.[Citation32] Fang et al. used an Hourglass-Net-based deep learning model to accurately measure the length of catfish, reducing measurement errors to less than 4%.[Citation33] Convolutional neural networks can automatically learn and extract features from input images without a manually designed feature extractor, which reduces reliance on domain expertise and allows the model to adaptively learn the best features from the data. However, such models typically have high hardware requirements for training, which may limit their application in resource-constrained environments. From these studies, machine vision technology, especially deep learning, appears well suited to detecting the head and tail of freshwater fish.

In the literature, cutting technology based on machine vision has been widely used, but little information is available on processing freshwater fish using deep learning. The objective of this project is to develop a new approach to cutting freshwater fish, involving the acquisition of freshwater fish for the experiment, the construction of experimental equipment, and the capture and collection of freshwater fish images. These images are then used to train and evaluate various deep learning algorithms in order to select the most accurate for the vision system. Subsequently, different knives are used to cut the fish at the head and tail to verify the design, and the accuracy and cutting performance of each knife are recorded to provide corresponding recommendations. The crucial aspect of this study is the capacity to cut the fish head and tail in accordance with the size of the fish.

Materials and methods

Experimental materials and mechanisms

Freshwater fish were procured from local farmers' markets for this experiment. Once slaughtered, the fish had their internal organs removed and were washed before being used in the study. Silver carp, a commonly found species, was chosen as the subject of this research. The fish were placed on the conveyor and their heads and tails were automatically removed, with the different parts collected into designated sorting bins ().

Figure 1. Structure description of the freshwater fish processing system. 1-image acquisition system, 2-rail, 3-controller, 4-bracket.


To mitigate the strenuous labor involved, a Cartesian coordinate manipulator was employed for automated cutting rather than manual removal. The freshwater fish processing system comprised a Cartesian coordinate manipulator, a fish conveying device, a PC, and an image acquisition system (). As shown in , the body section of the fish was preserved to the greatest possible extent after removal of the head and tail, giving the highest possible meat yield. The accuracy of the cutting line for the head and tail hinged predominantly on the reliability of the machine vision system.

Figure 2. Fish head and tail cutting device.


Figure 3. Schematic diagram of fish head and tail.


In order to detect freshwater fish on the conveyor belt, an industrial camera (MV-CA050-20 GM) was positioned 300 mm above the conveyor belt. This camera possessed a high dynamic range, well suited to the prevailing natural lighting conditions. A machine vision lens (MVL-KF1228M-12MP) was used to ensure adequate coverage across the conveyor's width (800 mm). The focal length of the lens was 12 mm, and the pixel size of the sensor was 4.8 µm × 4.8 µm. The industrial camera effectively minimized stereo errors, enabling precise localization of the fish bodies while maintaining the necessary field of view. Consequently, the fish bodies could be accurately identified within the excellent visual environment provided. The industrial camera captured images of the freshwater fish, which were transmitted to the computer for further processing. Convolutional neural network algorithms were employed to automatically analyze the distinctive features of the fish images, facilitating the identification and localization of the captured freshwater fish. By extracting the fish's head and tail positions from the image processing results, the corresponding coordinate information was obtained. The pixel coordinates of the cutting point were then converted into actual coordinates and relayed to the programmable logic controller (PLC) register. The computer converted the data into the format accepted by the microcontroller and sent it to the microcontroller within the control device via serial communication. Eventually, the motor rotation signal associated with the tool movement was transmitted to the motor driver. Once initialization was finalized, the cutter proceeded to remove the fish head and tail based on the provided position information. Simultaneously, the recognition results and corresponding coordinate information were displayed in the software interface.
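The pixel-to-actual coordinate conversion described above can be sketched as follows. This is an illustrative example, not the authors' implementation: it assumes a fixed camera height, a calibrated image origin, and a single millimetres-per-pixel scale derived by similar triangles from the stated optics (4.8 µm pixels, 12 mm focal length, 300 mm working distance). The function and parameter names are hypothetical.

```python
# Hypothetical sketch: mapping a cutting point from pixel coordinates to
# conveyor (world) coordinates in mm, assuming a fixed camera height and a
# pre-calibrated origin. Names and the origin value are illustrative.

def pixel_to_world(u, v, mm_per_px, origin_px=(0, 0)):
    """Map a pixel coordinate (u, v) to world coordinates in mm,
    relative to a calibrated origin on the conveyor belt."""
    u0, v0 = origin_px
    x_mm = (u - u0) * mm_per_px
    y_mm = (v - v0) * mm_per_px
    return x_mm, y_mm

# By similar triangles, one 4.8 um pixel at 300 mm working distance with a
# 12 mm lens covers (4.8e-3 mm) * (300 / 12) = 0.12 mm on the belt.
MM_PER_PX = 4.8e-3 * (300 / 12)

x, y = pixel_to_world(1296, 1024, MM_PER_PX)  # image-centre pixel of a 2592 x 2048 frame
print(x, y)  # cutting-point position in mm relative to the image origin
```

A real system would replace this single scale factor with a full camera calibration (intrinsics plus a belt-plane homography), but the sketch shows the data flow from segmentation output to PLC coordinates.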

Dataset description

To mitigate the risk of overfitting and ensure the generalization ability of the trained model, the fresh silver carp purchased from the market were randomly divided into six groups of 200 fish each. Fish lengths ranged from 400 mm to 600 mm, each group including 130 silver carp measuring 400 mm to 500 mm and 70 measuring 500 mm to 600 mm. The experimental equipment used for image acquisition was shown in . An industrial camera was used to capture 1200 images of silver carp samples at 2592 × 2048 pixels. Because of the high resolution and large memory footprint of the collected images, using them directly as model input would greatly increase the number of computational nodes required, causing memory overflow and halting training. Therefore, the original fish images from the industrial camera were downscaled while keeping the aspect ratio constant. Manual screening was then conducted to remove distorted images, and 1000 fish images were selected as the raw dataset. Finally, the dataset was augmented with three image transformations: rotation, translation, and exposure adjustment.[Citation34] This augmentation expanded the manually selected dataset from 1000 to 2000 samples. The images were manually annotated using the labelme software (a cross-platform image annotation tool). Following the 6:2:2 principle, with reference to the VOC (Visual Object Classes) dataset, 60% of the images were used as the training set, 20% as the validation set, and the remaining 20% as the test set. An example from the dataset was shown in .
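The 6:2:2 split above can be sketched in a few lines. This is a minimal illustration of the described procedure, not the authors' code; the file names and random seed are hypothetical.

```python
# Illustrative 6:2:2 train/validation/test split of the 2000-image
# augmented dataset described above. File names are hypothetical.
import random

def split_dataset(samples, ratios=(0.6, 0.2, 0.2), seed=42):
    """Shuffle and split samples into train/validation/test subsets."""
    rng = random.Random(seed)
    shuffled = samples[:]          # copy so the input list is untouched
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

samples = [f"fish_{i:04d}.png" for i in range(2000)]
train, val, test = split_dataset(samples)
print(len(train), len(val), len(test))  # 1200 400 400
```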

Figure 4. Example of the original dataset of fish images.


Table 1. Parameters related to the experimental environment.

Image real-time segmentation network

Deep learning models commonly used in image segmentation include DeepLabV3, U-Net, PSPNet, FCN, and so on. DeepLabV3 is suitable for image segmentation in static scenes, U-Net is often applied to medical image segmentation, PSPNet is generally used for large-scale scenes, and FCN is often applied in real-time applications on mobile and embedded devices.[Citation35–37] All of these models can achieve good segmentation accuracy, but their segmentation time is long and not real-time.[Citation38] Therefore, the lightweight semantic segmentation model ICNet was developed, which uses a low-resolution branch to quickly capture semantics and a high-resolution branch to recover details, combined in a cascaded network structure to obtain optimal results in efficient time.[Citation34] The ICNet network structure has three branches that feed differently scaled inputs into the model, as shown in . The first branch downsampled the original image to 1/4 size; after three downsampling convolutions, the resolution became 1/32 of the original image, and dilated convolution layers were then used to enlarge the receptive field without reducing the size, finally outputting a feature map at 1/32 of the original size. The second branch took the image at 1/2 resolution as input and reduced it by convolution to a feature map at 1/16 of the original size, which was then fused with the low-resolution features from the first branch through the cascade feature fusion module. The third branch took the original image as input and produced a feature map at 1/8 of the original size by convolution, whose output was then fused with the output of the second branch. In this way, ICNet could refine the segmentation result with low computation and strike a balance between speed and accuracy, guaranteeing segmentation accuracy while increasing speed.
Given these advantages, the ICNet model appeared well suited to segmenting the head and tail of the fish in the current study. Pre-trained weights based on the Cityscapes dataset (a public dataset) were loaded when training the ICNet model, and channel attention was incorporated during downsampling.[Citation36] This allowed weighted processing of the extracted effective features without increasing the number of network parameters, thereby enhancing the weight of effective features in the network. The channel attention also helped avoid the loss of edge detail information that convolutional neural networks can cause. During training, the model used the Adam optimizer. The initial learning rate is a relatively important hyperparameter that affects the speed and quality of model convergence and is generally set between 0.01 and 0.001. Setting the learning rate decay coefficient too small makes the gradient descend too slowly, while setting it too large makes it difficult for the model to converge. The weight decay coefficient reduces model overfitting to a certain extent and is generally set between 0.0001 and 0.001. In this paper, based on the performance of the dataset in pre-training, the initial learning rate was set to 0.001, the learning rate decay coefficient to 0.1, the weight decay coefficient to 0.0005, the batch size to 4, the confidence threshold to 0.5, and the upper limit on iterations to 20,000. In addition, a data augmentation module was added during training, and the best-performing model was automatically saved and used as the final model for semantic segmentation of freshwater fish images.
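The hyperparameters above can be summarized in a small configuration sketch with a step-decay learning-rate schedule. Note a stated assumption: the paper gives the initial rate (0.001) and the decay coefficient (0.1) but not the decay interval, so the 10,000-iteration step used here is hypothetical.

```python
# Sketch of the reported training hyperparameters. The decay interval
# (every 10,000 iterations) is an assumption; the paper only gives the
# initial rate and the decay coefficient.
HPARAMS = {
    "initial_lr": 0.001,
    "lr_decay": 0.1,
    "weight_decay": 0.0005,
    "batch_size": 4,
    "confidence_threshold": 0.5,
    "max_iterations": 20000,
}

def learning_rate(iteration, step=10000):
    """Step decay: lr = initial_lr * decay^(iteration // step)."""
    return HPARAMS["initial_lr"] * HPARAMS["lr_decay"] ** (iteration // step)

print(learning_rate(0), learning_rate(10000), learning_rate(19999))
```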

Figure 5. Architecture of three branches for Image Cascade Network.


Accuracy metrics

To evaluate the performance of the models for semantic segmentation of the freshwater fish image dataset, two evaluation metrics, pixel accuracy (PA) and mean intersection over union (MIoU), were calculated in this study, and the segmentation results were analyzed and compared against manually labeled images. PA denoted the ratio of pixels whose predicted class exactly matched the label to the total number of pixels in the image.

PA = \frac{\sum_{i} n_{ii}}{\sum_{i} f_i} \quad (1)

where N was the number of semantic classes, n_{ii} denoted the number of pixels of class i correctly predicted as class i, n_{ij} denoted the number of pixels belonging to class i that were predicted as class j, and f_i = \sum_{j} n_{ij} denoted the total number of pixels belonging to class i. MIoU represented the ratio between the intersection and the union of the prediction and the ground truth, averaged over classes; it was often used as the main basis for performance evaluation of image semantic segmentation models because it directly reflected the segmentation quality of the model and was highly representative.

MIoU = \frac{1}{N} \sum_{i} \frac{n_{ii}}{f_i + \sum_{j} n_{ji} - n_{ii}} \quad (2)
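Equations (1) and (2) can be computed directly from a confusion matrix. The sketch below is a minimal illustration with made-up counts for a three-class (head/tail/background) toy example; it is not the evaluation code used in the study.

```python
# Minimal sketch of Eqs. (1) and (2) from a confusion matrix C, where
# C[i, j] counts pixels of true class i predicted as class j.
import numpy as np

def pixel_accuracy(conf):
    """Eq. (1): correctly classified pixels over all pixels."""
    return np.trace(conf) / conf.sum()

def mean_iou(conf):
    """Eq. (2): per-class IoU = n_ii / (f_i + sum_j n_ji - n_ii), averaged."""
    tp = np.diag(conf)        # n_ii
    row = conf.sum(axis=1)    # f_i = sum_j n_ij (all pixels of true class i)
    col = conf.sum(axis=0)    # sum_j n_ji (all pixels predicted as class i)
    iou = tp / (row + col - tp)
    return iou.mean()

# Toy 3-class example (head / tail / background), illustrative counts only:
C = np.array([[50,  2,  3],
              [ 4, 40,  1],
              [ 2,  3, 95]])
print(pixel_accuracy(C))  # 0.925
print(mean_iou(C))
```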

Software systems

Compared with the older GUIDE (Graphical User Interface Development Environment), the App Designer development tool provided convenient layout design and excellent background model management. The software interface was developed in the Matlab environment. As shown in , the intelligent head and tail positioning system for freshwater fish contained three major modules: an image acquisition module, an image positioning module, and an image information output module.

Figure 6. Freshwater fish head and tail intelligent positioning system interface.


The image acquisition module handled capturing and loading photos. The camera parameters were initialized first, and then the freshwater fish images were acquired and displayed in the first image box. The image positioning module called the ICNet model via the “fish head and fish tail positioning” button; the fish head and tail were then located in sequence and displayed in the second and third image boxes. The image information output module displayed the coordinates of the cutting points of the fish head and tail. In , X1, Y1 and X2, Y2 are the coordinates of the two end points of the head cutting line after coordinate transformation, while x1, y1 and x2, y2 are the coordinates of the two end points of the tail cutting line.

Results and discussion

Model training based on migration learning

The freshwater fish images captured by the vision system did not contain semantic labels, so the images had to be manually annotated to meet the model training requirements. The labelme annotation tool was used to manually label three fish parts, head, tail, and body, following the Cityscapes dataset format. Transfer learning migrates shallow features learned by a model from previous samples to the task at hand; it can suppress overfitting and speed up model convergence on small datasets. To compare the performance of different deep learning methods on fish segmentation, several popular image semantic segmentation models were applied to detection of the fish head and tail: Fast-SCNN, DeepLabV3, PSPNet, and U-Net. The overall accuracy, MIoU, and average single-image processing time for these models were shown in . The Fast-SCNN model was slightly faster than the ICNet model, but its MIoU and segmentation accuracy for the head and tail were worse. The DeepLabV3, PSPNet, and U-Net models differed little from ICNet in MIoU and segmentation accuracy for the head and tail, but their detection times were much longer, so they could not meet the real-time requirements of the production line. Therefore, the ICNet model, with its real-time characteristics, was selected for segmentation and detection of the fish head and tail.

Table 2. Segmentation results of four segmentation models.

ICNet segmentation results

The ICNet model was trained and tested on the fish dataset, yielding the MIoU and accuracy for the fish head and tail. The trend of the ICNet loss value with the number of iterations was shown in . The loss decreased rapidly at the beginning of training, began to decrease slowly with small oscillations at around 20 iterations, and essentially converged to 0.018 after 52 iterations. The trend of recognition accuracy during ICNet training was shown in . The recognition accuracy for the fish head and tail increased rapidly at the beginning of training, began to increase slowly with small oscillations at around 15 iterations, and remained essentially at 99.01% after 50 iterations.

Figure 7. The loss value changes with the number of iterations during ICNet training.


Figure 8. The change curve of accuracy rate during ICNet training.


To demonstrate the real-time capability of this model, the processing time for a single image was recorded and averaged. and display the accuracy, MIoU, and average processing time for a single image in the semantic segmentation of the fish head and tail.

Figure 9. Example of partial segmentation result of validation set.


Table 3. Segmentation results of fish head and tail images based on ICNet.

According to and , the fish head and tail images were accurately segmented and each part was clearly distinguished, while over-segmentation and under-segmentation were rarely present: in 100 test images, no more than 5 showed either. The edges of the fish head and tail, despite their complex features, were clearly recognized, which might be attributed to the deeper convolution layers of ICNet's low-resolution branch; the multi-layer convolution operations ensured the extraction of detailed abstract features. In addition, ICNet's multiple upsampling feature fusion also helped improve the recognition accuracy of the model. On the validation set, the overall accuracy and overall MIoU of the ICNet model reached 99.01% and 82.50%, respectively, and the average processing time for a single image was 15.25 ms, indicating that ICNet could achieve semantic segmentation of fish head and tail images with a degree of real-time performance.

Algorithm comparison test

The training parameters of the above five models were kept consistent during training on the dataset, with the optimal model automatically saved, and the validation set was tested after training. As shown in , the U-Net-based fish image semantic segmentation model attained the highest segmentation accuracy and MIoU, reaching 99.15% and 83.01%, respectively. These were only 0.03%, 0.14%, 0.5%, and 1.04% (accuracy) and 0.27%, 0.51%, 1.05%, and 4.87% (MIoU) higher than DeepLabV3, ICNet, PSPNet, and Fast-SCNN, indicating that the five segmentation models did not differ significantly in segmentation quality and could all accurately segment the fish head and tail. However, the MIoU of Fast-SCNN was lower, possibly because of the shallow depth of the Fast-SCNN network and its use of a shallow learning-to-downsample module for multi-branch low-level feature extraction. With the limited scale of the fish image data, it was difficult to extract the deep abstract features in the images for network learning, which hindered later feature localization. In contrast, the U-Net, DeepLabV3, and PSPNet models extracted rich semantic features and recovered the edge information of objects thanks to their deeper network levels and encoder-decoder structures. In addition, DeepLabV3 has an atrous spatial pyramid pooling structure, and ICNet and PSPNet have pyramid pooling modules that let the model obtain more contextual information and multi-scale features to ensure segmentation accuracy.

In summary, only four models, U-Net, DeepLabV3, ICNet, and PSPNet, could meet the accuracy requirements of fish image segmentation. In terms of real-time performance, the time consumed per image by U-Net, DeepLabV3, ICNet, and PSPNet was 304 ms, 103 ms, 15.25 ms, and 658 ms, respectively. Among them, ICNet took the shortest time, 94.98%, 85.19%, and 97.68% shorter than U-Net, DeepLabV3, and PSPNet, respectively. ICNet achieved a segmentation accuracy of 99.01% and used cascade feature fusion units and cascade label guidance to iteratively refine and revise segmentation predictions at low computational expense, conserving both memory and time. Conversely, PSPNet employed deeply supervised optimization to enhance segmentation accuracy, which increased network depth and segmentation time. The results indicated that ICNet achieved both high segmentation accuracy and strong real-time performance, thereby meeting the precision cutting requirements of practical production.
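The quoted time reductions follow directly from the per-image times; a quick arithmetic check:

```python
# Verify the per-image time reductions of ICNet (15.25 ms) relative to the
# other three models, using the times reported above.
times_ms = {"U-Net": 304, "DeepLabV3": 103, "PSPNet": 658}
icnet_ms = 15.25

for model, t in times_ms.items():
    reduction = (t - icnet_ms) / t * 100
    print(f"{model}: {reduction:.2f}% shorter")
# U-Net: 94.98% shorter, DeepLabV3: 85.19% shorter, PSPNet: 97.68% shorter
```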

Fish head and tail cutting coordinates acquisition

Before the fish head and tail were cut, the coordinates of the cutting positions needed to be obtained by the image processing system. In this study, the cutting coordinates could be displayed through the software interface. The “Image Acquisition” button was clicked to acquire a freshwater fish image, which was displayed in the input box. The “Fish Head and Tail Positioning” button was then clicked, and the head and tail positions were displayed in the image box after detection by the trained neural network model. Finally, after the “Cut point coordinates” button was clicked, the cutting point positions of the fish head and tail were output in the text box, with the overall interface displayed as shown in . To test the reliability of the system, 30 silver carp were purchased from the fresh market as test material and tested on the prototype. The test results indicated that the system could output the coordinates of the cutting points of the fish head and tail. Based on these coordinates, the fish head and tail could be completely positioned and removed, satisfying the requirements of the head and tail positioning technique. Therefore, the algorithm proposed in this study was able to perform the expected task.

Figure 10. The display interface of cutting coordinates of fish head and tail.


Evaluation of cutting effect

In this test, 30 silver carp ranging in size from 400 to 600 mm were processed for head and tail cutting under the guidance of the vision system (), and three types of knives were selected: flat knives, circular knives, and toothed circular knives. All knives cut the fish body from the top down at a cutting speed of 200 mm/min, with 10 fish cut by each knife. Additionally, the five segmentation models proposed in this study were applied to these 30 fish to evaluate segmentation accuracy, and the results are presented in and .

Figure 11. Results of cutting the fish head and tail with three different knives. (a) circular knife, (b) circular knife with teeth, (c) flat Knife.


Table 4. Segmentation accuracy of different models for different sizes of fish.

Table 5. Test results of cutting process.

As shown in , the five segmentation models employed in this study demonstrated excellent segmentation accuracy for various fish sizes. Notably, the segmentation of larger fish was facilitated by their more prominent edge features, resulting in significantly higher segmentation accuracy than for smaller fish. Fast-SCNN exhibited poor performance on this segmentation task because it uses a global feature extractor and a downsampling module instead of the deep convolutional layers of a conventional two-branch architecture. This design choice led to fewer network parameters and computations, but also to insufficient network depth for extracting natural fish features. The U-Net model could not achieve high classification accuracy and high localization accuracy simultaneously: when the model used a smaller receptive field to extract features from small fish, the dimensionality reduction of the corresponding pooling layers decreased, reducing localization accuracy. This was the primary reason for the model's poor accuracy in segmenting small fish.

The test results indicated that the lowest flesh yield, 66.8%, was obtained when cutting with the flat knife, while the highest flesh yield, 69.2%, was obtained with the circular knife, followed closely by the toothed circular knife at 69.0%. There was little difference in cutting effect between the circular and toothed circular knives, but the flat knife did not cut as well as either.

As shown in Figure 11, the tail was not completely cut off when the flat knife was used, so the head and tail removal rate with the flat knife was only 90%, whereas the success rate with the circular knife and the toothed circular knife was 100%, with the head and tail completely severed. A possible reason is that the flat knife cut slowly and required a larger cutting force, so its cutting state was unstable. The circular knife and the toothed circular knife, by contrast, were mounted on a motor running at high speed, so their cutting speed was far greater than that of the flat knife and their cutting quality was high and stable. Because of its teeth, the toothed circular knife had a tearing effect when cutting the fish, so its cut surface was not as clean as that produced by the circular knife. Consequently, based on an overall evaluation of flesh yield and cutting effect, this study proposes the use of a circular knife for cutting the head and tail of fish. In conclusion, the deep-learning-based segmentation algorithm proposed in this study has the potential to improve freshwater fish processing systems. By accurately identifying the boundaries of the fish head and tail, the algorithm enhances the precision of the cutting process, thereby meeting the processing requirements. This study used silver carp as the sample for target detection, but the algorithm is also applicable to other freshwater fish such as perch, carp, and bighead carp. Moreover, the vision-guided fish cutting system developed with this algorithm proves more effective than the previous mechanized system.[Citation3]
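The coordinate conversion that maps segmented cut points from image pixels to manipulator coordinates is not detailed in the text. A minimal sketch is given below, assuming a planar affine hand-eye calibration fitted by least squares from a few reference points whose positions are known in both frames; the function names and calibration values are hypothetical:

```python
import numpy as np

def fit_pixel_to_machine(pix_pts, mach_pts):
    """Fit a least-squares affine map from image pixels (u, v) to
    manipulator coordinates (x, y), given >= 3 calibration point pairs."""
    # Homogeneous pixel coordinates: each row is [u, v, 1].
    A = np.hstack([pix_pts, np.ones((len(pix_pts), 1))])
    # Solve A @ M = mach_pts for the 3x2 affine matrix M.
    M, *_ = np.linalg.lstsq(A, mach_pts, rcond=None)
    return M

def pixel_to_machine(M, u, v):
    """Map one detected cut point from pixels to machine coordinates."""
    return np.array([u, v, 1.0]) @ M
```

An affine model absorbs the camera's scale, rotation, and offset relative to the manipulator in one step; it is adequate when the fish lies in a fixed plane under the camera, which matches the conveyor setup described here.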

Conclusion

In this study, U-Net, DeeplabV3, PSPNet, FastSCNN, and ICNet were compared as five image semantic segmentation networks on the fish image data. The differences among U-Net, DeeplabV3, PSPNet, and ICNet in segmentation accuracy were not significant, but the single-image processing time of ICNet was clearly shorter than that of the other three networks. Although FastSCNN reduced the single-image processing time to 13.65 ms compared with ICNet, its MIoU decreased by 4.36%; the comparison showed that ICNet had the best overall segmentation capability. The segmentation accuracy and MIoU of ICNet were 99.01% and 82.50%, respectively, and the processing time for a single image was 15.25 ms, indicating that the network achieves high segmentation accuracy and good real-time performance in the semantic segmentation of fish images. The method used in this study not only successfully segmented the head and tail of freshwater fish but also obtained the coordinates of the cutting points. After coordinate conversion, three cutters were evaluated for cutting the head and tail of freshwater fish. The results suggested that cutting with a circular knife at the cutting point acquired by image processing is better than cutting with a flat knife. When cutting freshwater fish, a slipping phenomenon often occurs in which the knife slides on the surface of the fish and cannot cut effectively; toothed knives grip the fish surface better, reducing slippage and increasing cutting efficiency and speed.
Toothed knives also cut the fish more evenly, reducing tearing and unevenness during cutting, which helps maintain the integrity and appearance of the fish and improves product quality. With advances in automation and robotics, automated robot arms equipped with knives or other cutting tools may eventually accomplish the cutting of fish heads and tails. This study can provide visual guidelines for intelligent freshwater fish cutting robots and technical support for the automated processing of fisheries. In the future, we will optimize the existing algorithms by improving the model structure or by cropping the raw input image to half size and using three branches for joint training, which will improve the real-time performance and speed of the model. While improving the model, we will also upgrade the tool-loading part of the device so that different types of knives can be mounted and changed at any time according to the species of fish to be cut, increasing accuracy. The methodology used in this paper is equally applicable to other fish species, because it is based on recognizing biological features of the samples that remain unchanged across individuals; the method therefore has cross-domain applicability in extracting and recognizing the essential features of fish.

Ethics statement

We certify that this is our original scientific research work, and it has not been submitted or published anywhere. The authors are responsible for all the content in the manuscript.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (51905387), the Hubei Science and Technology Innovation Major Special Project (2019ABA085), and a Scientific Research Project of the Department of Education of Hubei Province (D20211601).

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

We certify that the data used in this article were collected from this study and can only be available from the corresponding author upon reasonable request.

Additional information

Funding

The work was supported by the Scientific Research Project from the Department of Education of Hubei Province [D20211601]; the National Natural Science Foundation of China [51905387]; and the Hubei Science and Technology Innovation Major Special Project [2019ABA085].

References

  • Wang, H.; Wan, P.; Tan, H.; Zong, L. Design and Experiment of On-Line Detection and Classification System for Freshwater Fish Body Quality. J. Huazhong Agric. Univ. 2016, 3, 122–128.
  • Hu, X.; Chen, Q.; Shen, J. Optimization of Slitting Cutters Used for Gutting Trachurus japonicus. Trans. Chin. Soc. Agric. Eng. 2014, 12, 270–277.
  • Li, K.; Wen, Y. Key Technology of Vision-Guided Automatic Head and Tail Removal System for Freshwater Fish. Food Mach. 2014, 30(5), 141–143.
  • Li, L.; Zong, L.; Wang, J.; Cheng, S. Research Status and Development Trend of Massive Freshwater Fish Pretreatment Processing Technology and Equipment. Fish. Modernization. 2010, 37(5), 43–46.
  • Zhang, W.; Chen, Q.; Ouyang, J.; Zhou, C. Research Progress on Pretreatment and Processing Technology of Freshwater Fish in China. J Anhui Agric. Sci. 2018, 46(21), 25–28.
  • Zhao, Y.; Hu, H.; Jiang, G.; Ge, X. Current Status and Development Trend on National Conventional Freshwater Fishery Industry. Chinese Fish. Econ. 2012, 30(5), 91–99.
  • Shen, J. Freshwater Fish Pre-Processing and Equipment in Europe. J. Anhui Agric. Sci. 2010, 38(23), 12491–12495.
  • Yuan, Y.; Yuan, Y. M.; Dai, Y.; He, Y.; Gong, Y. Competitive Advantage of Characteristic Freshwater Fish Industry of China. Chinese Agric. Sci. Bull. 2020, 36(35), 127–133.
  • Kosmowski, M.; Gerlach, K. The New Method of Setting the Small Fishes Backs in the Desired Direction. J. Food Eng. 2007, 83 (1), 99–105. DOI: 10.1016/j.jfoodeng.2007.01.022.
  • Mendes, B.; Fonseca, P.; Campos, A. Weight-Length Relationships for 46 Fish Species of the Portuguese West Coast. J. Appl. Ichthyol. 2004, 20(5), 355–361. DOI: 10.1111/j.1439-0426.2004.00559.x.
  • Sullivan, K. Fish Harvesting Head with Arm Retraction System. Official Gazette of the United States Patent and Trademark Office Patents. Patent US 10034464, 2018.
  • Tang, Y.; Zhou, H.; Wang, H.; Zhang, Y. Fruit Detection and Positioning Technology for a Camellia Oleifera C. Abel Orchard Based on Improved YOLOv4-Tiny Model and Binocular Stereo Vision. Expert Syst. Appl. 2023, 211(57), 174. DOI: 10.1016/j.eswa.2022.118573.
  • Andrzej, D. The Effect of Cutting and Fish-Orientation Systems on the Deheading Yield of Carp. Int. J. Food Sci. Technol. 2018, 43, 1688–1692.
  • Alejandro, B.; Aníbal, M.; María, A.; Aurora, Z. Design and Testing of a Fish Bone Separator Machine. Int. J. Food Eng., 2020, 100(3), 474–479. DOI: 10.1016/j.jfoodeng.2010.04.034.
  • Bhushan, R.; Bibwe, S.; Udaykumar, R.; Nidoni, M.; Anantachar, B. Development of Meat-Bone Separator for Small Scale Fish Processing. J. Food Sci. Technol. 2013, 50(4), 763–769. DOI: 10.1007/s13197-011-0381-5.
  • Dowgiallo, A.; Dutkiewicz, D. Possibilities of Utilizing the Differences of Fish Tissues Stiffness in the Mechanization of Cyprinid Deheading. J. Food Eng. 2007, 83, 111–115. DOI: 10.1016/j.jfoodeng.2007.01.028.
  • Zhang, Y.; Zhang, K.; Hu, Z. Research on Head Orientation Equipment of Freshwater Fish. Proceedings of the 2020 3rd International Conference on E-Business, Information Management and Computer Science, 2020, 12, 477–481.
  • Tang, Y.; Huang, Z.; Chen, Z.; Chen, M.; Zhou, H.; Zhang, H.; Sun, J. Novel Visual Crack Width Measurement Based on Backbone Double-Scale Features for Improved Detection Automation. Eng. Struct. 2023, 274(141), 296. DOI: 10.1016/j.engstruct.2022.115158.
  • Liu, J.; Zhang, F.; Wan, P.; Tan, H. Freshwater Fish Pneumatic Machinery to Head-Cutting Method Research. Food Mach. 2017, 33(1), 87–92.
  • Chen, Q.; Shen, J.; Fu, R.; Tan, J.; Zhang, J. Research on the Mechanical Deheading Method of Typical Small Marine Fish. Fish. Modernization. 2012, 39(5), 38–42.
  • Zhao, Z. Mechanized Processing of Dace. Fish. Modernization. 2005, 337, 36–37.
  • Qin, H.; Li, X.; Liang, J.; Peng, Y.; Zhang, C. Deep Fish: Accurate Underwater Live Fish Recognition with a Deep Architecture. Neurocomputing. 2016, 187, 49–58. DOI: 10.1016/j.neucom.2015.10.122.
  • Sharmin, I.; Islam, N. F.; Jahan, I.; Joye, T. A.; Rahman, M. R.; Habib, M. T. Machine Vision Based Local Fish Recognition. Appl. Sci. 2019, 1(12), 1529. DOI: 10.1007/s42452-019-1568-z.
  • Sung, H.; Park, M. K.; Choi, J. W. Automatic Grader for Flatfishes Using Machine Vision. Int. J. Control Autom. Syst. 2020, 18(12), 3073–3082. DOI: 10.1007/s12555-020-0007-7.
  • Wei, Y.; Zheng, D.; Hu, L.; Zhan, J. Research on Intelligent Bait Casting Method Based on Machine Vision Technology. Adv. Mater. Res. 2015, 8, 1871–1874. DOI: 10.4028/www.scientific.net/AMR.1073-1076.1871.
  • Qian, Z.; Wang, S.; Cheng, X.; Chen, Y. An Effective and Robust Method for Tracking Multiple Fish in Video Image Based on Fish Head Detection. BMC Bioinf. 2016, 17(1), 251. DOI: 10.1186/s12859-016-1138-y.
  • Singh, A.; Gupta, H.; Srivastava, A.; Joshi, R. C.; Dutta, M. K. A Novel Pilot Study on Imaging-Based Identification of Fish Exposed to Heavy Metal (Hg) Contamination. J. Food Process. Preserv. 2021, 45(6), e15571. DOI: 10.1111/jfpp.15571.
  • Sabat, M.; Kotwaliwale, N.; Chakraborty, S. K.; Kumar, M. Long Short-Term Memory Based Real-Time Monitoring of Potato Slice Drying Using Image Chromatic Features. J. Food Process. Preserv. 2022, 10(26), e17232. DOI: 10.1111/jfpp.17232.
  • White, D. J.; Svellingen, C.; Strachan, N. J. C. Automated Measurement of Species and Length of Fish by Computer Vision. Fish. Res. 2006, 80(2), 203–210. DOI: 10.1016/j.fishres.2006.04.009.
  • Dowlati, M.; Mohtasebi, S.; Guardia, M. Application of Machine-Vision Techniques to Fish-Quality Assessment. Trac-Trends Anal. Chem. 2012, 11(40), 168–179. DOI: 10.1016/j.trac.2012.07.011.
  • Hong, H. M.; Yang, X. L.; You, Z. H.; Cheng, F. Visual Quality Detection of Aquatic Products Using Machine Vision. Aquacult. Eng. 2014, 12(63), 62–71. DOI: 10.1016/j.aquaeng.2014.10.003.
  • Zhu, M.; Zhang, Z.; Huang, H.; Chen, Y.; Liu, Y.; Dong, T. Classification of Feeding Status of Bass Based on Lightweight Neural Network MobileNetv3-Small. J. Agric. Eng. 2021, 37(19), 165–172.
  • Fang, S. Research on Deep Learning Based Method for Measuring Fish Phenotype Data. Zhejiang Univ. 2021, (01). DOI: 10.27461/d.cnki.gzjdx.2021.002100.
  • Zhao, S.; Wang, S.; Bai, Y.; Hao, Z.; Tu, S. Real-Time Semantic Segmentation of Sheep Skeleton Images Based on Generative Adversarial Networks and ICNet. J. Agric. Mach. 2021, 52(2), 329–339. DOI: 10.1155/2021/8847984.
  • Guo, J.; Xin, Y.; Xie, Q. Improved DeepLabv3+ for Remote Sensing Image Building Segmentation. Laser J. 2023, 1, 1–10.
  • Ren, W.; Han, X.; Zhong, Y.; He, F. A Photovoltaic Panel Image Segmentation Method Based on Improved U-Net Network. J. Shaanxi Univ. Sci.Technol. 2023, 41(2), 155–161.
  • Zhang, Z.; Qian, Q.; Zhou, Z.; Hu, X. Real-Time Segmentation Algorithm for Crack Images Based on Improved Fast-SCNN. Appl. Opt. 2023, 44(3), 539–547. DOI: 10.5768/JAO202344.0302001.
  • Chen, L. C.; Papandreou, G.; Kokkinos, I. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40(4), 834–848. DOI: 10.1109/TPAMI.2017.2699184.