Research Article

Apollon Mark: bounce mark visualization system for ball sports judgement using prediction-based preceding mirror control

Pages 164-175 | Received 31 Oct 2023, Accepted 22 Apr 2024, Published online: 11 May 2024

Abstract

Video Assistant Referee (VAR) is used for line judging in tennis to improve the fairness of a game. However, currently, popular VAR systems are based on three-dimensional trajectories and do not take actual ball bounce marks into account, which may lead to misjudgment due to misalignment. Therefore, in this paper, we propose Apollon Mark, a bounce mark visualization system with preceding mirror control based on high-speed drop location prediction. Based on high-speed measurement of three-dimensional position and motion prediction, the proposed system enables a high-resolution camera to capture the ground texture both before and after the ball drops on the ground by directing the angle of view of the camera to the predicted drop location of the ball before it drops. The system then enables bounce mark visualization by simple differential processing between the before and after images with a template-matching image processing algorithm. We have quantitatively evaluated the validity of the proposed system through the simulation of drop location prediction for real-scale tennis ball trajectories and validated the bounce mark visualization algorithm under a dynamic angle of view using galvanometer mirrors.

1. Introduction

Tennis and football, which are popular among many people, are sports in which players manipulate a ball within a predefined area. In these ball sports, the positional relationship between the moving ball and the lines indicating the inside and outside of the area is important. The in-out judgment at the lines must be accurate to maintain the fairness of the game, and instantaneous so as not to interfere with its smooth progress. In general, referees make the in-out decision based on their visual judgment. However, in tennis, where the maximum ball speed reaches 260 km/h, the human eye is limited in its ability to discriminate in and out accurately and quickly in real time. It has been reported that 8.2% of line calls involving a ball within 100 mm of the court line are incorrect when made with the naked eye [Citation1]. Eliminating the possibility of such misjudgment with the human eye is very difficult because millimetre-order accuracy is required.

With the recent development of sensing technology, there is a growing demand for the Video Assistant Referee (VAR), an assistant referee technique utilizing cameras and inertial sensors. For VAR in tennis, the Hawk-Eye system [Citation2,Citation3], a camera-based three-dimensional measurement system for the ball trajectory, has been widely introduced. In the rules of tennis, the in-out judgment is based on the presence or absence of contact between the ball and the line on the ground. However, the Hawk-Eye [Citation2,Citation3] makes the line judgment using the three-dimensional trajectory of the ball and does not measure the actual contact area on the ground (i.e. the ball's bounce mark), which may result in misjudgment [Citation4]. Thomas et al. [Citation3] report that the Hawk-Eye produces errors of 2.6 mm on average, attributable to inaccuracies in the detected ball position on the camera images and changes in camera position over time.

For VAR in football, an Inertial Measurement Unit (IMU) embedded in the ball has been used [Citation5] in addition to the Hawk-Eye, notably at the FIFA World Cup 2022. However, such inertial sensors also measure the three-dimensional trajectory of the ball and do not consider the actual contact area on the ground.

In this study, we propose a bounce mark visualization system named "Apollon Mark," which predicts the ball drop location based on high-speed vision, controls rotating mirrors ahead of the ball's bounce to capture high-resolution images of the ground at the predicted drop location, and visualizes the bounce mark with a template-matching-based algorithm. The concept of the proposed system is shown in Figure 1. Compared with our previous publication [Citation6], we have added a practical analysis of drop location prediction through simulation assuming the actual scale and speed of a tennis court, as well as a new visualization algorithm based on template-matching image processing under an oscillatory angle of view with the rotating mirrors.

Figure 1. A concept of the proposed system.


2. Related work

2.1. Line judgment technology

Referee assistance systems are being introduced in many sports events to prevent human errors in line judgment and ensure fair games. In particular, systems using multiple cameras (e.g. Hawk-Eye [Citation2,Citation3,Citation7]), which are utilized in many official tennis matches including Grand Slam events, have been developed and deployed. The Hawk-Eye in tennis competitions achieves an average error of 2.6 mm using multiple cameras running at up to 340 fps. The basic principle of operation of the Hawk-Eye [Citation2] is as follows: (1) it recognizes the white lines on a tennis court, estimates the position of each camera, and tracks the two-dimensional trajectories of the ball; (2) it recovers the three-dimensional trajectory of the ball from the trajectories tracked by the respective cameras and determines impact points (e.g. bounce locations) from the three-dimensional trajectory, using a linear Kalman filter; (3) it visualizes the three-dimensional trajectory of the ball in three-dimensional graphics based on the estimated trajectory and impact point. The line judgment by Hawk-Eye is therefore based on the computed relationship between the three-dimensional trajectory of the ball and the position of the white line.

Similarly, ball position measurement technology with a built-in IMU [Citation5] is also used in football. The operating principle of the IMU is that it measures the change in motion with a MEMS inertial sensor and calculates position information with temporal integral. In principle, it can calculate only relative positions and has a problem of drift due to the accumulation of integral errors, but it can be used for line judgment in conjunction with a camera system (e.g. Hawk-Eye) that can measure absolute positions.

However, these technologies measure the three-dimensional trajectory of the ball, not the actual contact surface between the ball and the ground. Therefore, measurement errors in the three-dimensional trajectory may cause discrepancies between the line judgment result by the system and the actual ball bounce mark left on the ground (especially on clay courts).

2.2. Visualization technology

In general, physical visualization refers to processing an invisible object so that it becomes visible. Specifically, techniques for visualizing physical contact (e.g. fingerprints) and localized wear over time (e.g. cracks in tunnels) have been developed for criminal investigation [Citation8] and infrastructure inspection [Citation9], respectively. Representing a specific image by actively manipulating the surface of an object can also be regarded as a type of visualization; a robot has been developed that draws shaded images by changing the direction of the pile on a carpet [Citation10].

The proposed visualization is significantly different in the following two points from these visualization techniques (inspection and drawing) achieved by irradiation of special light, spraying of chemical agents, and physical contact. (1) Due to the characteristics of tennis, rapid visualization is necessary not to disturb the game and to cope with changes in the ground environment due to the movement of players. Therefore, the proposed system deals with higher dynamics and shorter time scales. (2) The purpose of visualization is not to detect the presence or absence of marks but to determine their location with high accuracy. By measuring the three-dimensional trajectory of the ball in real-time, it is possible to determine the approximate location of the mark and even obtain information on the ground before the mark occurs.

2.3. Real-time feedback control technology

Using the three-dimensional motion of the ball predicted by using the Kalman filter [Citation2], techniques have been developed to utilize visual feedback information based on the predicted information by combining visual equipment such as projectors and head-mounted displays [Citation11,Citation12]. However, these studies utilize low-speed sensing (60–120 fps), and it is difficult to maintain low latency when actuating optics (i.e. rotating galvanometer mirrors) toward the drop location of the fast-moving tennis ball.

As a high-speed feedback control that does not use prediction such as the Kalman filter, Okumura et al. have developed a system that continuously captures high-resolution images of a ball in motion using high-speed image processing at 1000 fps and high-speed rotating mirror control [Citation13]. High-speed visual feedback at 1000 fps with predictive trajectory instruction allows continuous high-resolution imaging of even transonic flying objects [Citation14]. These are examples of ultra-low latency visual feedback control. However, since the angle of view is controlled to match the high-speed motion of the ball and not its drop location, it is difficult to capture the ground at the moment the ball drops with high resolution and clarity due to motion blur caused by the ball's motion.

3. Apollon Mark: bounce mark visualization system

3.1. System overview

Apollon Mark, the system proposed in this paper, visualizes ball bounce marks based on a measurement and control strategy suited to assisting line judgment in ball sports. Figure 2 shows the system configuration and Figure 3 shows the processing flow. An overview of the processing flow in Figure 3 is as follows.

Figure 2. System configuration.


Figure 3. System processing flow.


Section 3.2

Measures the three-dimensional position of the ball with high speed and low latency by two-dimensional ball position measurement and triangulation using multiple high-speed cameras (500 fps or higher).

Section 3.3

Applies a linear Kalman filter [Citation11,Citation12] to the measured three-dimensional position information of the ball to estimate the ball's position, velocity, and acceleration in real-time.

Section 3.4

Predicts the drop time and drop location based on the equation of motion.

Section 3.5

Directs the angle-of-view of a high-resolution camera to the predicted drop location with the angle control of low-latency and high-speed rotating mirrors (i.e. galvanometer mirrors), and captures high-resolution images of the ground at the predicted drop location before and after the ball drops.

Section 3.6

Selects appropriate image frames before and after the drop from the captured high-resolution images using template-matching, calculates the difference in the same region, and visualizes the bounce mark of the ball while suppressing the effect of the oscillation of the angle of view.

The above processing flow enables sufficient visualization not only on clay courts, where the bounce mark of a ball is easily visible even after a certain period of time but also in ground environments such as grass courts, where the mark is not clearly visible.

3.2. Triangulation with high-speed cameras

First, the system measures the three-dimensional position of a ball at high speed. It calculates the two-dimensional position of the ball with high speed and low latency with high-speed cameras (500 fps or more) using contour detection and centre-of-gravity detection [Citation13]. It then estimates the three-dimensional position of the ball from the two-dimensional position information of each camera using triangulation [Citation15]. These processes are basically the same as in Hawk-Eye [Citation2], with some differences in speed.
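As an illustrative sketch of this step (not the authors' implementation), the three-dimensional ball position can be recovered from two camera rays with the standard midpoint method; the camera centres and ray directions are hypothetical inputs that would come from per-camera 2D detection and calibration.

```python
def triangulate(c1, d1, c2, d2):
    """Estimate a 3D point from two camera rays (midpoint method).

    c1, c2: camera centres; d1, d2: ray directions toward the ball.
    Returns the midpoint of the shortest segment between the two rays.
    """
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    w0 = [a - b for a, b in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b  # zero only for parallel rays
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2.0 for u, v in zip(p1, p2)]
```

With noise-free rays the two closest points coincide at the true ball position; with detection noise, the midpoint averages the disagreement between the rays.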

3.3. Motion state estimation by Kalman filter

The system estimates the ball's motion state (position, velocity, and acceleration) from the measured three-dimensional position in order to predict the ball's drop location. With a linear Kalman filter [Citation2,Citation11], the motion of an object, including its internal state, is sequentially estimated from the latest measurement information. Following the previous study [Citation11], the linear Kalman filter is applied to the three-dimensional position of the ball to estimate its position $p_t$, velocity $\dot{p}_t$, and acceleration $\ddot{p}_t$. $p_t$ refers to the value of the X coordinate (horizontal direction), Y coordinate (vertical direction), or Z coordinate (depth direction) at time $t$.

We assume parabolic motion of the ball here and define the state and observation equations as follows:

\[ \mathbf{p}_{t+1} = \begin{pmatrix} 1 & \Delta t & \tfrac{1}{2}\Delta t^2 \\ 0 & 1 & \Delta t \\ 0 & 0 & 1 \end{pmatrix} \mathbf{p}_t + \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} v, \quad (1) \]
\[ q_t = \begin{pmatrix} 1 & 0 & 0 \end{pmatrix} \mathbf{p}_t + w. \quad (2) \]

$\mathbf{p}_t = (p_t, \dot{p}_t, \ddot{p}_t)^T$ is the state vector, $q_t$ is the three-dimensional position measured by the multiple high-speed cameras, $v$ is the system noise, $w$ is the observation noise, and $\Delta t$ is the sampling time.
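A minimal per-axis Kalman filter following Equations (1)–(2) might look like the sketch below; the process and measurement variances `q` and `r` are illustrative placeholders, not the paper's actual tuning.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_T(A):
    return [list(row) for row in zip(*A)]

class AxisKalman:
    """Per-axis Kalman filter with state (position, velocity, acceleration)."""

    def __init__(self, dt, q=1e-4, r=2.6e-3 ** 2):
        self.F = [[1.0, dt, 0.5 * dt * dt], [0.0, 1.0, dt], [0.0, 0.0, 1.0]]
        self.x = [0.0, 0.0, 0.0]  # state estimate
        self.P = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
        self.q, self.r = q, r     # process / measurement variances (placeholders)

    def step(self, z):
        # Predict: x <- F x, P <- F P F^T + Q (process noise on acceleration)
        x = [sum(self.F[i][j] * self.x[j] for j in range(3)) for i in range(3)]
        P = mat_mul(mat_mul(self.F, self.P), mat_T(self.F))
        P[2][2] += self.q
        # Update with scalar observation z = position + noise, i.e. H = [1, 0, 0]
        S = P[0][0] + self.r
        K = [P[i][0] / S for i in range(3)]
        y = z - x[0]
        self.x = [x[i] + K[i] * y for i in range(3)]
        self.P = [[P[i][j] - K[i] * P[0][j] for j in range(3)] for i in range(3)]
        return self.x
```

Fed noiseless positions from a parabolic trajectory at 500 fps, the filter's acceleration estimate converges toward the true value, which is what the drop prediction in Section 3.4 relies on.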

A linear Kalman filter is selected for the proposed system instead of other variants (e.g. an unscented Kalman filter) because computation speed is prioritized over spatial accuracy. The relatively lower accuracy can be absorbed by a margin in the angle of view of the high-resolution camera used for visualization, as described in Sections 3.5 and 3.6.

3.4. Drop location prediction

Among the motion information estimated with the Kalman filter, we use the Y-coordinate (vertical) position $y_t$, velocity $\dot{y}_t$, and acceleration $\ddot{y}_t$ to solve the following quadratic equation and predict the estimated time $t_e$ at which the ball drops:

\[ y_g = y_{t_c} + \dot{y}_{t_c}(t_e - t_c) + \tfrac{1}{2}\ddot{y}_{t_c}(t_e - t_c)^2, \quad (3) \]
\[ t_e = \frac{-\dot{y}_{t_c} \pm \sqrt{\dot{y}_{t_c}^2 - 2\ddot{y}_{t_c}(y_{t_c} - y_g)}}{\ddot{y}_{t_c}} + t_c. \quad (4) \]

$t_c$ represents the current time, and $y_g$ represents the height of the ground (Figure 4). The estimated drop location $(x_e, z_e)$ is then calculated using the estimated drop time $t_e$:

\[ x_e = x_{t_c} + \dot{x}_{t_c}(t_e - t_c) + \tfrac{1}{2}\ddot{x}_{t_c}(t_e - t_c)^2, \quad (5) \]
\[ z_e = z_{t_c} + \dot{z}_{t_c}(t_e - t_c) + \tfrac{1}{2}\ddot{z}_{t_c}(t_e - t_c)^2. \quad (6) \]

The high-speed, low-latency processing of Sections 3.2, 3.3, and 3.4 leads to the preceding rotating mirror control of Section 3.5.
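The drop prediction of Equations (3)–(6) can be sketched as follows, choosing the later (physically meaningful) root of the quadratic; the state values in the test are hypothetical, and a nonzero vertical acceleration (gravity) is assumed.

```python
import math

def predict_drop(x_state, y_state, z_state, t_c, y_g=0.0):
    """Predict drop time t_e and drop location (x_e, z_e).

    Each *_state is (position, velocity, acceleration) at current time t_c.
    Assumes nonzero vertical acceleration (gravity).
    """
    y, vy, ay = y_state
    disc = vy * vy - 2.0 * ay * (y - y_g)  # discriminant of the quadratic (3)
    if disc < 0.0:
        raise ValueError("ball never reaches the ground height")
    taus = [(-vy + s * math.sqrt(disc)) / ay for s in (1.0, -1.0)]
    tau = max(t for t in taus if t > 0.0)  # remaining time until the drop

    def extrapolate(state):
        p, v, a = state
        return p + v * tau + 0.5 * a * tau * tau

    return t_c + tau, extrapolate(x_state), extrapolate(z_state)
```

For a ball at 1 m height with no vertical velocity, the predicted fall time is about 0.45 s, and the horizontal location is extrapolated over that interval.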

Figure 4. Illustration of yg and te.


3.5. Preceding mirror control toward drop location

When capturing an object in motion with a camera, there is a trade-off between the angle of view and the resolution due to the limited number of pixels in the image sensor. A Pan-Tilt camera resolves this trade-off, but moving the camera itself is limited by inertia; rotating a lightweight mirror in front of a fixed camera is therefore advantageous in terms of speed [Citation13].

The proposed system utilizes the high-speed mirror control technology of Okumura et al. [Citation13] and Sueishi et al. [Citation14] to control the line-of-sight direction of the high-resolution camera at high speed toward the ball drop location predicted in Section 3.4. This technique is a developed version of the predictive trajectory instruction based on the constant velocity linear motion assumed for the trajectory of a high-speed flying object used in Sueishi et al. [Citation14]. The high-speed, low-latency prediction and mirror response enable us to capture images of the ground before and after the ball drops, leading to the bounce mark visualization described in Section 3.6.

3.6. Bounce mark visualization

Bounce marks are visualized by selecting image frames before and after the bounce using template-matching image processing and calculating the difference between them in the high-resolution image sequence obtained by the preceding mirror control (Section 3.5). Figure 5 shows an example of a template region used for visualization and image selection. We assume that the physical contact of the ball changes fine textures on the ground, such as turf, in the same way as the patterns drawn on the carpet in Yamamoto et al. [Citation10]. For textures such as turf, it is difficult to extract marks from the post-bounce image alone; this strategy simplifies visualization by preparing a pre-bounce image of similar resolution and calculating the difference between the two.

Figure 5. Template matching (red frames) and image selection algorithm for visualization.


Since the drop location prediction and mirror control make the angle of view of the high-resolution camera dynamic and oscillatory, we cannot apply a simple differencing procedure as in the fixed angle of view evaluated in the previous study [Citation6]. Therefore, assuming sufficient texture on the ground, a template-matching algorithm is applied to keep extracting the same region of the ground, allowing comparison and difference computation across the image sequence. The calculated template similarity drops significantly when the ball enters the angle of view during template-matching, so we take the image with the minimum similarity as the bounce time. To automatically select, for balls of various speeds, images at times far enough from the bounce to avoid the shadow of the ball itself, image frames at times when the time derivative of the similarity is below the threshold λ are selected as the pre- and post-bounce images. The proposed system then realizes the bounce mark visualization by differencing the template-matching regions before and after the bounce and adjusting the brightness values.
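The frame-selection step can be sketched as below: given a per-frame template similarity series, the minimum marks the bounce, and the nearest frames on either side whose similarity derivative falls below λ are taken as the pre- and post-bounce images. This is our illustrative reading of the algorithm; the similarity values in the test are hypothetical.

```python
def select_frames(similarities, lam=0.02):
    """Return (pre_idx, bounce_idx, post_idx) from a template similarity series."""
    bounce = min(range(len(similarities)), key=similarities.__getitem__)
    # Per-frame time derivative of the similarity (frame interval absorbed in lam)
    deriv = lambda i: abs(similarities[i] - similarities[i - 1])
    pre = next((i for i in range(bounce - 1, 0, -1) if deriv(i) < lam), None)
    post = next((i for i in range(bounce + 1, len(similarities)) if deriv(i) < lam), None)
    return pre, bounce, post
```

Scanning outward from the bounce frame skips the frames contaminated by the ball and its shadow, where the similarity is still changing steeply.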

4. Experimental evaluation

We conducted two types of evaluation experiments. One is about the accuracy of high-speed drop location prediction using numerical simulation, and the other is about the performance of visualization of bounce marks by applying a template-matching algorithm.

4.1. Tennis ball simulation trajectory

To apply the actual tennis court scale and velocity in the performance evaluation of the ball drop location prediction, we generated tennis ball trajectories by numerical simulation with reference to the literature [Citation16]. This model [Citation16] considers three effects on the ball: gravity, drag force due to air ($\mathbf{D} = -D_L(v)\,\mathbf{v}/|\mathbf{v}|$, where $\mathbf{v}$ is the ball velocity), and Magnus force due to rotation ($\mathbf{M} = M_L\,(\boldsymbol{\omega}/|\boldsymbol{\omega}|) \times (\mathbf{v}/|\mathbf{v}|)$, where $\boldsymbol{\omega}$ is the angular velocity). For a ball speed $v$ m/s, the magnitudes of the drag force $D_L(v)$ and the Magnus force $M_L(v)$ are expressed as follows:

\[ D_L(v) = C_D \, \frac{1}{2} \, \frac{\pi d^2}{4} \, \rho v^2, \quad (7) \]
\[ M_L(v) = C_M \, \frac{1}{2} \, \frac{\pi d^2}{4} \, \rho v^2, \quad (8) \]
\[ C_D = 0.508 + \left( \frac{1}{22.053 + 4.196\,(v/w)^{5/2}} \right)^{2/5}, \quad (9) \]
\[ C_M = \frac{1}{2.022 + 0.981\,(v/w)}. \quad (10) \]

The coefficients $C_D$ and $C_M$ are reported to depend only on the ratio $v/w$ [Citation16], and the spin speed $w = 20$ m/s is assumed constant because the rotational deceleration of the ball is negligible. We set the tennis ball diameter to d = 68 mm, the air density to ρ = 1.29 kg/m³, the gravitational acceleration to g = 9.81 m/s², and the tennis ball mass to m = 57 g. With these settings, a ball trajectory is reproduced by specifying the ball's ejection position, ejection velocity, and ejection angle.
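As a sketch of this simulation model (Equations (7)–(10)), a simple Euler integration in the vertical plane might look like the following; the topspin Magnus direction (perpendicular to the velocity, rotated toward the ground) and the reuse of the drag coefficient in the no-spin case are our assumptions.

```python
import math

# Model constants from the text
D, RHO, G, M, W = 0.068, 1.29, 9.81, 0.057, 20.0
AREA = math.pi * D * D / 4.0

def c_d(v):  # drag coefficient, Eq. (9)
    return 0.508 + (1.0 / (22.053 + 4.196 * (v / W) ** 2.5)) ** 0.4

def c_m(v):  # Magnus coefficient, Eq. (10)
    return 1.0 / (2.022 + 0.981 * (v / W))

def simulate(v0, angle_deg, h=1.0, spin=False, air=True, dt=1e-4):
    """Integrate the 2D trajectory until the ball reaches the ground (y = 0).

    Returns the horizontal landing distance in metres.
    """
    ang = math.radians(angle_deg)
    x, y = 0.0, h
    vx, vy = v0 * math.cos(ang), v0 * math.sin(ang)
    while y > 0.0:
        v = math.hypot(vx, vy)
        ax, ay = 0.0, -G
        if air:
            drag = c_d(v) * 0.5 * AREA * RHO * v * v / M  # deceleration, Eq. (7)
            ax -= drag * vx / v
            ay -= drag * vy / v
            if spin:  # topspin: Magnus accel. 90 deg clockwise from velocity
                mag = c_m(v) * 0.5 * AREA * RHO * v * v / M  # Eq. (8)
                ax += mag * vy / v
                ay += mag * (-vx) / v
        vx += ax * dt
        vy += ay * dt
        x += vx * dt
        y += vy * dt
    return x
```

Disabling the air terms reproduces the analytic "Vacuum" landing distance of about 23.6 m for the 0.23° launch described in the text.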

In the simulations in this paper, as in the literature [Citation16], we set up three types of trajectories: (1) in a vacuum ("Vacuum"), (2) no spin in air ("Air"), and (3) topspin in air ("Air-Spin"). With the height of the ejection position h = 1 m and the ejection velocity v₀ = 50 m/s (= 180 km/h), we set the ejection angles to (1) 0.23°, (2) 1.58°, and (3) 9.72° so that the ball lands at 23.6 m, which is close to the depth of a tennis court. Figure 6 shows the three types of trajectories. The "Air" and "Air-Spin" trajectories clear 0.914 m, the height of the net at the centre of the tennis court, while the "Vacuum" trajectory does not.

Figure 6. Three types of simulated ball trajectories.


4.2. Drop location prediction for simulated trajectories

The accuracy of the drop location prediction was evaluated by adding noise (standard deviation: 1–5 mm) to each XYZ coordinate value of the three three-dimensional simulated trajectories generated in Section 4.1 at the scale of a real tennis court. We applied a Kalman filter with a standard deviation of 1 m/s for the system noise $v$ and 2.6 mm for the observation noise $w$ over 1000 trials. It has been reported that the response of galvanometer mirrors can be approximated by the second-order delay system shown in the following equation [Citation14]:

\[ H_m(s) = \frac{\omega^2}{s^2 + 2\xi\omega s + \omega^2}, \quad (11) \]

where ξ represents the damping factor and ω the natural frequency. Therefore, in addition to the predicted drop location based on the Kalman filter ("Kalman"), we also evaluated the results of applying two types of second-order delay systems to the predicted drop location, representing the response of the preceding mirror control with galvanometer mirrors ("Mirror 0" and "Mirror 1"). Figure 7 shows the step responses of the two second-order delay systems that mimic the galvanometer mirrors. We set ξ₀ = 0.783 and ω₀ = 717 for the "Mirror 0" condition (the same as the actual galvanometer mirror setting used in Sueishi et al. [Citation14]) and ξ₁ = 1.8 and ω₁ = 717 for the "Mirror 1" condition. "Mirror 0" represents a fast response that allows overshoot, while "Mirror 1" represents a slower response with no overshoot.
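The two mirror responses can be reproduced by discretizing Equation (11); the sketch below uses a simple Euler step (our choice of integrator), with ξ and ω taken from the text.

```python
def step_response(xi, omega, t_end=0.03, dt=1e-5):
    """Euler simulation of H(s) = w^2 / (s^2 + 2*xi*w*s + w^2) for a unit step."""
    y, yd = 0.0, 0.0  # output and its derivative
    out = []
    for _ in range(int(t_end / dt)):
        ydd = omega * omega * (1.0 - y) - 2.0 * xi * omega * yd
        yd += ydd * dt
        y += yd * dt
        out.append(y)
    return out

mirror0 = step_response(0.783, 717.0)  # underdamped: fast, slight overshoot
mirror1 = step_response(1.8, 717.0)    # overdamped: slower, no overshoot
```

Both responses settle near the target well within the 50 ms operable time discussed below, consistent with Figure 7.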

Figure 7. Step responses of galvanometer mirrors.


Figure 8 shows an example of the estimated drop time $t_e$, the estimated drop location in the X coordinate (horizontal, the ball motion direction) $x_e$, and that in the Z coordinate (depth, perpendicular to the ball motion direction) $z_e$ for the "Air-Spin" trajectory. Figure 9 shows the mean and standard deviation, over 1000 trials, of the operable time for mirror response convergence for the three types of trajectories and the three types of responses to the predicted drop location in the X coordinate. We define the operable time for mirror response convergence in this paper as the time during which the error of the estimated drop location relative to the ground truth remains within a certain threshold (34 mm, the radius of a tennis ball, in this case).
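The operable-time metric can be computed from a prediction-error trace as below; this is our illustrative implementation of the definition, and the trace values in the test are hypothetical.

```python
def operable_time(times, errors, t_impact, threshold=0.034):
    """Duration before impact during which the predicted drop location stays
    within `threshold` (m, the ball radius) of the ground truth, counted from
    the last time the error exceeded the threshold."""
    start = None
    for t, e in zip(times, errors):
        if e > threshold:
            start = None  # run broken; wait until the error comes back under
        elif start is None:
            start = t     # a new within-threshold run begins here
    return 0.0 if start is None else t_impact - start
```

Applied to the simulated traces, this yields the roughly 50 ms windows reported in Figure 9.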

Figure 8. Kalman filter and mirror response results for “Air-Spin” trajectory.


Figure 9. Operable time for simulated trajectories and responses.


Figure 8 shows a bias in the estimated drop location in the X direction, the ball motion direction, compared to the Z direction. This is thought to be due to the mismatch between the parabolic motion assumed by the Kalman filter and the change in trajectory caused by the ball's air resistance and spin. No significant change in the predicted drop location due to the response of the galvanometer mirror was observed. Considering that the measurement error of Hawk-Eye is reported to be about 2.6 mm [Citation3], Figure 9 confirms an operable time of about 50 ms or more, except for the "Vacuum" trajectory, which is not normally assumed. As shown in Figure 7, the galvanometer mirror has a sufficient response speed for 50 ms. Therefore, it is expected to be possible to capture a high-resolution image just before the ball drops (i.e. with the drop location within the angle of view) with a reasonable operable time in the preceding mirror control, even for ball speeds as high as 180 km/h. In Figure 9, the "Air" trajectory has the longest operable time. The "Vacuum" trajectory requires a very low ejection angle to drop at the same location, which shortens the time of flight and thus the operable time. For the "Air-Spin" trajectory, the cause is considered to be the mismatch between the parabolic motion assumed by the Kalman filter and the trajectory change due to air resistance and spin, as mentioned above. As for the mirror response, the operable time was slightly smaller for "Mirror 0" with overshoot allowed and slightly larger for "Mirror 1" with no overshoot, but no significant difference was observed.

4.3. Bounce mark visualization with dynamic angle of view

We conducted an evaluation experiment of the visualization image processing based on template-matching for an image sequence before and after a ball drops, as captured under the preceding mirror control. Figure 10 shows the experimental system. We recorded image sequences of a tennis ball (radius 34 mm) free-falling from a fixed height and bouncing on artificial turf with a high-resolution camera through galvanometer mirrors undergoing constant periodic rotational motion. The high-resolution camera was a Basler ace acA2040-120um (100 fps, 2048×1536 px, pixel size 3.45 μm, monochrome, analogue gain 10 dB, camera lens focal length 80 mm). The galvanometer mirrors were Cambridge Technology M3 (optical scanning angle ±30°, input voltage ±3 V), and we used a function generator (Tektronix AFG1022) to control the mirror angles. The artificial turf was Diatex Diamond Turf SN7 (300×300 mm, 7 mm pile), a commercial product for tennis competitions, installed at a distance of 3.5 m from the galvanometer mirrors. We also used two flicker-free lights (Kino Flo Diva Lite 400) to ensure brightness comparable to an outdoor setting.

Figure 10. Experimental environment for visualization.


In this experiment, to not only evaluate visualization performance at a fixed angle of view as in the previous research [Citation6] but also quantitatively verify the dynamic characteristics of the angle of view, we used an angle of view undergoing circular motion, produced by inputting sinusoidal waveforms to the galvanometer pan and tilt mirrors. We captured images under the following four conditions by adjusting the frequency and amplitude of the circular motion and the camera's exposure time.

Condition 1

high frequency, small amplitude, short exposure time

Condition 2

low frequency, large amplitude, short exposure time

Condition 3

high frequency, large amplitude, short exposure time

Condition 4

low frequency, large amplitude, long exposure time

The high frequency was 10 Hz and the low frequency 1 Hz. The large amplitude was set to 56 mV on the function generator, corresponding to the ball radius at the shooting distance of 3.5 m in this experiment; the small amplitude was set to 6 mV, about one-tenth of that. We set the similarity threshold λ to 0.02, the short exposure time to 1000 μs, and the long exposure time to 5000 μs.

Figure 11 shows the template-matching similarity for the four conditions. A steep decrease in the similarity was observed when the ball dropped. We also confirmed motion blur effects in the fluctuations during the high-similarity sections, especially in "Condition 3," with fast angle-of-view motion, and "Condition 4," with long exposure time.

Figure 11. Temporal changes of template-matching similarity.


Figure 12 shows the difference images (i.e. the visualization results) extracted by the template-matching algorithm, with brightness values adjusted for visibility. Bounce marks were clearly visible in "Condition 1." We likewise observed bounce marks in "Condition 2," together with minute noise throughout the image. In "Condition 3," only a minute difference at the drop location could be observed, confirming the significant effect of motion blur. "Condition 4" also showed changes at the drop location, along with overall noise caused by the large motion blur. While this experiment showed that visualization based on the template-matching algorithm is possible, it also confirmed the effect of motion blur caused by the dynamic angle of view.

Figure 12. Selected “before” images and brightness-enhanced difference images. “Before” images except “Condition 4” are corrected for 5× brightness for visibility.


5. Discussion and conclusion

In this paper, we proposed Apollon Mark, a bounce mark visualization system that actively utilizes preceding mirror control based on drop location prediction for high-speed tennis balls. We evaluated the performance of the drop location prediction based on a Kalman filter and a second-order delay model of the rotational mirror response, not only at the laboratory scale as in the previous study [Citation6] but also at the real scale using numerical simulation of tennis ball trajectories, and confirmed that the angle of view of the high-resolution camera has an operable time of more than 50 ms to capture the drop location. We also confirmed that a bounce mark visualization algorithm applying template-matching to a dynamic angle of view works well as long as large motion blur is avoided.

Figure 13 shows an example of a system setup for an actual tennis court. As the optical scanning angle of the galvanometer mirrors used in this study is ±30° and one side of a tennis court is approximately a 12 m square, two galvanometer mirrors can cover one tennis court if the high-resolution camera and galvanometer mirrors are installed at a height of 11 m. This is about three times the 3.5 m distance in Section 4.3, so the camera lens focal length should be set to 240 mm (= 80 mm × 3), a realistically feasible focal length, to visualize the bounce mark with an angle of view equivalent to that in Section 4.3. The operable time of 50 ms in Section 4.2 allows a 60 fps camera (a frame every 17 ms) to capture 2–3 images and a 100 fps camera (a frame every 10 ms) to capture 4–5 images just before the bounce for the visualization in Section 4.3. Therefore, the preceding mirror control of Section 4.2 and the bounce mark visualization of Section 4.3 can be expected to work sequentially in time, demonstrating the validity of the proposed system.
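The deployment arithmetic above can be checked with a short back-of-envelope calculation; the 11 m mounting height, ±30° scan angle, and 50 ms operable time are taken from the text, and the rounding of the distance ratio to 3 is our simplification.

```python
import math

height = 11.0                         # assumed camera/mirror mounting height (m)
scan_half_angle = math.radians(30.0)  # optical scanning angle of the mirrors
covered_width = 2.0 * height * math.tan(scan_half_angle)  # ground span covered (m)

# Scale the 80 mm lab lens by the ~3x increase in shooting distance (11 m vs 3.5 m)
focal = 80.0 * round(height / 3.5)    # -> 240 mm

# Frames available within the ~50 ms operable time before the bounce
frames_60fps = 50 // 17               # 60 fps -> one frame every ~17 ms
frames_100fps = 50 // 10              # 100 fps -> one frame every 10 ms
```

The covered span (about 12.7 m) slightly exceeds the 12 m half-court side, which is why a single mirror unit per side suffices in this arrangement.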

Figure 13. Practical arrangement for actual tennis court.


The proposed system has the following limitations. (1) Sufficient environmental brightness or image sensor sensitivity is required to reduce the effects of motion blur in the visualization image processing. While future improvements in image sensors can be expected to increase sensitivity, sunlight is generally very bright, so the system is expected to operate well outdoors. In fact, the system could visualize ball marks in the experiment using a pair of flicker-free lights whose total nominal brightness is around 5000 lx at 1 m, and the illuminance in cloudy weather easily exceeds 5000 lx. (2) Depending on the arrangement of the system, the distance from the camera to the subject may vary significantly, requiring large focus adjustments, but this can be handled by adding a high-speed variable-focus liquid lens [Citation17]. (3) Depending on the sun and lighting conditions, shadows of the ball and players may appear in the pre- and post-bounce images used for visualization, but this may be handled with brightness-normalized difference image processing. Adding sophisticated image processing as post-processing would also help mitigate the impact of obstacles appearing in the image. (4) The depth of field of the telephoto lens attached to the high-resolution camera is shallow, so the camera should be installed above the court, as shown in Figure 13, because the ball diameter may exceed the depth of field when imaging diagonally from a point close to the ground.

In future studies, we will consider coordination with high-speed focus control using a liquid variable-focus lens to enable various arrangements of the proposed system on a tennis court. We will also develop an algorithm to automate parameter settings such as the similarity threshold λ, and will compare the positional relationship between bounce marks and lines, taking shading into account, for line judgment as a practical Video Assistant Referee. We will then link the respective components into a whole system and develop it at actual scale, leading to the realization of a practical referee assistance system.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Notes on contributors

Himari Tochioka

Himari Tochioka received an M.F.A. degree in intermedia art from Tokyo University of the Arts in 2017 and a B.F.A. degree in oil painting from Kyoto Seika University in 2014. She also studied at the University of Edinburgh through an exchange program. She worked as an intern at Gwangju Biennale. She was an Academic Project Support Staff at the University of Tokyo from 2018 to 2024. She is currently a Project Member at Tokyo University of Science. Her current research interests include visualization projects that cross art and science disciplines, ranging from optical phenomena to interdisciplinary approaches such as intermedia art.

Tomohiro Sueishi

Tomohiro Sueishi received his B.E. degree in information physics and his M.E. and Ph.D. degrees in information science and technology from the University of Tokyo, in 2012, 2014, and 2017, respectively. He was a JSPS Research Fellow (DC1) from 2014 to 2017, a Project Assistant Professor at the University of Tokyo from 2017 to 2020, and a Project Lecturer at the University of Tokyo from 2020 to 2024. He is currently a Junior Associate Professor at Tokyo University of Science. His current research interests include high-speed tracking, high-speed optical control, and dynamic vision systems.

Masatoshi Ishikawa

Masatoshi Ishikawa received the B.E., M.E., and Dr. Eng. degrees in mathematical engineering and information physics from the University of Tokyo, Japan, in 1977, 1979, and 1988, respectively. After working at the Industrial Products Research Institute, Tsukuba, Japan, he moved to the University of Tokyo in 1989. He was a vice-president and an executive vice-president of the University of Tokyo, from 2004 to 2005, and from 2005 to 2006, respectively. He has been the president of Tokyo University of Science since 2022. His current research interests include sensor fusion, high-speed vision, high-speed intelligent robots, visual feedback, and dynamic interaction. He was the president of SICE in 2011 and the president of IMEKO from 2018 to 2021.


References

  • Mather G. Perceptual uncertainty and line-call challenges in professional tennis. Proc Roy Soc B: Biol Sci 2008;275(1643):1645–1651. doi: 10.1098/rspb.2008.0211
  • Owens N, Harris C, Stennett C. Hawk-eye tennis system. In: International Conference on Visual Information Engineering; Guildford, UK; 2003. p. 182–185.
  • Thomas G, Gade R, Moeslund TB, et al. Computer vision for sports: current applications and research topics. Comput Vis Image Underst. 2017;159:3–18. doi: 10.1016/j.cviu.2017.04.011
  • Collins H. The philosophy of umpiring and the introduction of decision-aid technology. J Philos Sport. 2010;37(2):135–146. doi: 10.1080/00948705.2010.9714772
  • Blauberger P, Marzilger R, Lames M. Validation of player and ball tracking with a local positioning system. Sensors. 2021;21(4):1–13.
  • Tochioka H, Sueishi T, Ishikawa M. Bounce mark visualization system for ball sports judgement using high-speed drop location prediction and preceding mirror control. In: SICE Annual Conference 2023; Tsu, Japan; 2023. p. 784–789.
  • Pingali GS, Opalach A, Jean YD, et al. Instantly indexed multimedia databases of real world events. IEEE Trans Multimed. 2002;4(2):269–282. doi: 10.1109/TMM.2002.1017739
  • Hazarika P, Russell DA. Advances in fingerprint analysis. Angew Chem Intern Ed. 2012;51(15):3524–3531. doi: 10.1002/anie.v51.15
  • Hayakawa T, Moko Y, Morishita K, et al. Tunnel lining surface monitoring system deployable at maximum vehicle speed of 100 km/h using view angle compensation based on self-localization using white line recognition. J Robot Mechatron. 2022;34(5):997–1010. doi: 10.20965/jrm.2022.p0997
  • Yamamoto T, Sugiura Y. A robotic system for images on carpet surface. Graph Vis Comput. 2022;6:200045. doi: 10.1016/j.gvc.2022.200045
  • Koike H, Yamaguchi H. LumoSpheres: real-time tracking of flying objects and image projection for a volumetric display. In: Proceedings of the 6th Augmented Human International Conference, Singapore; 2015. p. 93–96.
  • Itoh Y, Orlosky J, Kiyokawa K, et al. Laplacian vision: augmenting motion prediction via optical see-through head-mounted displays. In: Proceedings of the 7th Augmented Human International Conference; Geneva, Switzerland; 2016. Article No. 16, p. 1–8.
  • Okumura K, Yokoyama K, Oku H, et al. 1 ms auto pan-tilt–video shooting technology for objects in motion based on saccade mirror with background subtraction. Adv Robot. 2015;29(7):457–468. doi: 10.1080/01691864.2015.1011299
  • Sueishi T, Ishii M, Ishikawa M. Tracking background-oriented schlieren for observing shock oscillations of transonic flying objects. Appl Opt. 2017 May;56(13):3789–3798. doi: 10.1364/AO.56.003789
  • Hartley R, Zisserman A. Multiple view geometry in computer vision. Cambridge, UK: Cambridge University Press; 2003.
  • Klvaňa F. Trajectory of a spinning tennis ball. Berlin Heidelberg: Springer; 1995.
  • Sueishi T, Ogawa T, Yachida S, et al. Continuous high-resolution observation system using high-speed gaze and focus control with wide-angle triangulation. In: High-Speed Biomedical Imaging and Spectroscopy V; Vol. 11250; San Francisco, CA, USA: SPIE; 2020. p. 50–59.