Research Article

Interval-based KKT framework for support vector machines and beyond

Article: 2334017 | Received 11 Nov 2023, Accepted 19 Mar 2024, Published online: 27 Mar 2024

Abstract

Our article proves inequalities for interval optimization and shows that, in the constrained case, the cones of feasible and descent directions do not intersect. Mainly, we establish new interval inequalities for interval-valued functions by defining the LC-partial order. We use the LC-partial order to study Karush–Kuhn–Tucker (KKT) conditions and extend Gordan's theorems to interval linear inequality systems. By applying Gordan's theorem, we derive optimality conditions for constrained interval optimization problems (IOPs), namely the Fritz John and KKT conditions. The optimality conditions are stated with inclusion relations rather than equality. We apply the KKT conditions to binary classification with interval data and support vector machines (SVMs). We present some examples to illustrate our results.

1. Introduction

Optimization theory is applied in a variety of industries, including engineering. Data collection and quantification are crucial in modelling an optimization problem. The data of an optimization model are typically derived from measurements or observations, and acquired data sets are frequently published with an error percentage or imprecision. Fuzzy numbers or intervals are appropriate representations for such data. As a result, the parameters and coefficients used to formulate the modelled constraint and objective functions from the obtained data are represented as intervals or fuzzy numbers.

Optimization is the process of determining the best available values across a collection of inputs to maximize or minimize an objective function. Whether utilizing supervised or unsupervised learning, all deep learning models rely on some form of optimization.

IOPs have been the subject of various investigations. Many existing strategies optimize the objective function's upper or lower function, or their mean, based on worst- or best-case scenarios; the resulting single-valued optimization problems can then be addressed by conventional optimization methods. The KKT optimality conditions have also been applied to IOPs, a topic studied extensively by Wu [Citation1–3].

Using a generalized derivative, Chalco-Cano et al. [Citation4] proposed KKT optimality conditions for IOPs. The generalized derivatives and the partial ordering of intervals were utilized by Singh et al. [Citation5,Citation6] to define KKT conditions for IOPs combining upper and lower functions. It is important to note that existing research on IOP optimality conditions relies on algebraic manipulations rather than a geometrical analysis of an optimal point to generalize classical optimality conditions.

Mathematical and theoretical modelling are crucial components of optimization problems [Citation7]. In practice, determining the coefficients of an objective function as exact real numbers is generally difficult, since a wide range of real-world problems involve data imprecision due to measurement errors or other unanticipated circumstances. Robust optimization [Citation8] and interval-valued optimization are two methods of deterministic optimization that deal with uncertain data.

Slowinski and Teghem compared two different approaches to multi-objective programming problems under uncertainty [Citation9]. The KKT optimality criteria have been studied for decades and play a significant role in optimization theory. Wu [Citation1] and Singh and Dar [Citation6] developed KKT optimality conditions for optimization problems with interval-valued objective functions and constraints. Chalco-Cano and Lodwick [Citation4] used the generalized derivative to examine the KKT optimality criteria of interval-valued optimization problems.

SVMs are a type of machine learning technique that may be used for data classification and prediction. Training and testing data are required to develop an SVM [Citation10]. The user sorts the training data into the appropriate categories, and an optimization problem is solved to build a model from this data. The model generates a hyperplane that linearly separates the testing data into the correct categories.

SVMs are cutting-edge machine-learning approaches with a foundation in structural risk minimization [Citation11,Citation12]. The structural risk minimization theory of Vapnik [Citation13] states that a function from a function class has a low expected risk on data from an underlying distribution if that function has a low empirical risk on a data set sampled from that distribution and the function class has low complexity. By exploiting the fact that a larger margin corresponds to a smaller fat-shattering dimension for the relevant function classes, the well-known large-margin technique in SVMs [Citation14] effectively limits the complexity of the function class.

The derivative is the most commonly used tool in classical optimization theory. In constrained optimization problems, it is useful for studying optimality criteria and duality theorems. The H-derivative is a well-known notion for interval-valued functions (IVFs), but it has limitations. Stefanini and Bede [Citation15] proposed the gH-derivative in 2009 to address the shortcomings of the H-derivative. These derivative notions for IVFs have been widely used by researchers in optimization; for example, Wu [Citation3] used the H-derivative to examine the KKT conditions for nonlinear IOPs.

It is impossible to overstate the importance of derivatives in nonlinear IOPs. Wu [Citation1–3] explored interval-valued nonlinear programming problems and showed how the H-derivative may be used in interval-valued KKT optimality conditions. Chalco-Cano et al. [Citation4] also employed gH-differentiability to derive interval-valued KKT optimality conditions.

Ghosh et al. [Citation16] extended the KKT conditions for IOPs and applied them to interval-valued SVMs using the LU-partial order. In [Citation17,Citation18], researchers recently discussed generalized interval-valued portfolio optimization problems. It is well known that the set $\mathbb{I}$ of all compact intervals is a partially ordered set. In [Citation19], Younus and Nisar defined several partial orders on $\mathbb{I}$ and obtained relationships between them. They showed that the LC-partial order is not equivalent to the LU-partial order; however, the LC-partial order implies the LU-partial order. For some related results, see also Dastgeer et al. [Citation20]. Building on these studies, we extend the results of [Citation16] to the LC-partial order and identify some variations. Motivated by the gH-differentiability of interval-valued functions and the recent LC-partial order, we discuss all optimality conditions and the SVM problem under gH-differentiability and the LC-partial order, which generalizes many results in the literature.

We would now like to outline the contributions of this work.

Interval optimization inequalities: The article introduces and proves inequalities specifically tailored for IOPs. This suggests a departure from traditional optimization techniques and an exploration of methods suited to handling intervals.

Non-intersecting directions: In constrained scenarios, the article demonstrates that feasible and descent directions do not intersect. This observation likely has implications for optimization algorithms and could lead to more efficient optimization strategies.

LC-partial order: The authors build on the recently defined LC-partial order [Citation19], a mathematical framework that is a key component of their approach. This concept may have applications beyond the specific problems discussed in the article.

Extension of Gordan's theorems: The article extends Gordan's theorems to interval linear inequality systems. This extension could have broader implications in mathematical theory and its application to optimization.

Inclusion-based optimality conditions: Instead of traditional equality-based optimality conditions, the article suggests the use of inclusion relations for constrained IOPs. This shift in perspective may lead to new insights and methods for solving such problems.

Application to binary classification: The article highlights the application of KKT conditions for binary classification with interval data and support vector machines. This application demonstrates the practical relevance of the theoretical developments presented in the article.

Illustrative examples: The authors provide examples to illustrate their results, which can help readers to understand the practical implications and potential applications of their findings.

Section 2 provides the basic concepts, definitions, and notations used in the article. The KKT and Fritz John's criteria for IOPs are derived in Section 3 along with extended Gordan's theorems. For both constrained and unconstrained IOPs, we develop the optimality conditions. In Section 4, we apply the optimality conditions given in Section 3 to solve the SVM classification problem on the interval-valued data set. We provide an example of the generated classifier. We give a graphical representation of the classification problem. We also present the conclusion and future scope in Section 5.

2. Preliminaries

Notations

We used the following notations throughout this article:

  • All the capital and bold letters denote interval-valued functions or intervals.

  • $\mathbb{I}$ represents the set of all bounded and closed intervals in $\mathbb{R}$, and $\mathbb{I}^n$ denotes the set of interval-valued vectors.

  • Sets are represented by ordinary capital letters.

  • $C \ominus_{gH} D$ is the gH-difference of two intervals $C$ and $D$.

  • $C \oplus D$ signifies interval addition.

  • $C \ominus D$ is the subtraction of two intervals $C$ and $D$.

  • $kC$ represents the scalar multiplication of an interval $C$ by a real number $k$.

  • $0_v^n$ denotes the zero interval vector with $n$ components.

  • $|K|$ represents the cardinality of the set $K$.

Definition 2.1

The addition and subtraction of intervals $C = [\underline{c}, \overline{c}]$ and $D = [\underline{d}, \overline{d}]$ are defined by $$C \oplus D = [\underline{c} + \underline{d},\ \overline{c} + \overline{d}], \qquad C \ominus D = [\underline{c} - \overline{d},\ \overline{c} - \underline{d}].$$ Similarly, for scalar multiplication, $$kC = \begin{cases} [k\underline{c},\ k\overline{c}] & \text{if } k \ge 0,\\ [k\overline{c},\ k\underline{c}] & \text{if } k < 0, \end{cases}$$ where $k$ is a real constant. It can be seen that this definition of the interval difference has the following two limitations:

(i) $C \ominus C \neq \{0\}$, and

(ii) for $A = C \ominus D$, the relation $C = D \oplus A$ does not necessarily hold.

Definition 2.2

[Citation21]

Let $C$ and $D$ be two intervals in $\mathbb{I}$. The gH-difference between the two intervals is defined by $$A = C \ominus_{gH} D = [\min\{\underline{c} - \underline{d},\ \overline{c} - \overline{d}\},\ \max\{\underline{c} - \underline{d},\ \overline{c} - \overline{d}\}].$$
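To make these operations concrete, the following minimal Python sketch (our illustration, not code from the paper) implements the arithmetic of Definitions 2.1 and 2.2 with intervals stored as (lower, upper) pairs, and checks the two limitations noted above.

```python
# Interval arithmetic of Definitions 2.1-2.2; intervals are (lo, hi) tuples.

def add(C, D):        # C (+) D
    return (C[0] + D[0], C[1] + D[1])

def sub(C, D):        # C (-) D, the Minkowski difference of Definition 2.1
    return (C[0] - D[1], C[1] - D[0])

def scal(k, C):       # scalar multiplication kC
    return (k * C[0], k * C[1]) if k >= 0 else (k * C[1], k * C[0])

def gh_sub(C, D):     # C (-)_gH D, the gH-difference of Definition 2.2
    a, b = C[0] - D[0], C[1] - D[1]
    return (min(a, b), max(a, b))

C, D = (1.0, 3.0), (0.0, 2.0)
print(sub(C, C))                   # (-2.0, 2.0): limitation (i), C - C != {0}
print(gh_sub(C, C))                # (0.0, 0.0): the gH-difference repairs this
print(add(D, sub(C, D)) == C)      # False: limitation (ii)
print(add(D, gh_sub(C, D)) == C)   # True here, since Len(C) >= Len(D)
```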

Definition 2.3

Let $A \subseteq \mathbb{R}^n$ and let $x^0 = (x_1^0, x_2^0, x_3^0, \ldots, x_n^0)$ be an interior point of $A$, so that there exists $h \in \mathbb{R}^n$ with $x^0 + h \in A$. Let $F: A \to \mathbb{I}$ be a function and define $$\Psi_j(x_j) = F(x_1^0, \ldots, x_{j-1}^0, x_j, x_{j+1}^0, \ldots, x_n^0).$$ If the limit $$\lim_{h_j \to 0} \frac{\Psi_j(x_j^0 + h_j) \ominus_{gH} \Psi_j(x_j^0)}{h_j}$$ exists, then $F$ is said to have the $j$th gH-partial derivative at $x^0$, represented by $D_jF(x^0)$, $j = 1, 2, \ldots, n$. The gH-partial derivatives of $F$ at $x^0$ can be written as $$D_jF(x^0) = \left[\min\left\{\frac{\partial \underline{F}}{\partial x_j}(x^0),\ \frac{\partial \overline{F}}{\partial x_j}(x^0)\right\},\ \max\left\{\frac{\partial \underline{F}}{\partial x_j}(x^0),\ \frac{\partial \overline{F}}{\partial x_j}(x^0)\right\}\right], \quad j = 1, 2, \ldots, n.$$
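As a quick numerical illustration (our sketch, with a hypothetical one-dimensional function), the gH-difference quotient of Definition 2.3 can be approximated by finite differences; for $F(x) = [x, 2x]$ with $x > 0$, the gH-partial derivative at $x^0 = 1$ is $[1, 2]$.

```python
# Finite-difference approximation of the gH-difference quotient
# in Definition 2.3 for F(x) = [x, 2x] (x > 0) at x0 = 1.

def gh_sub(C, D):
    a, b = C[0] - D[0], C[1] - D[1]
    return (min(a, b), max(a, b))

def F(x):
    return (x, 2 * x)     # lower and upper endpoint functions

x0, h = 1.0, 1e-6
num = gh_sub(F(x0 + h), F(x0))      # F(x0 + h) (-)_gH F(x0)
quot = (num[0] / h, num[1] / h)     # divide endpoints by h > 0
print(quot)                          # approximately (1.0, 2.0)
```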

Definition 2.4

Consider an interval-valued function $F$. The gH-gradient of $F$ at any point $x^0 \in A$ is the vector $$\nabla F(x^0) = (D_1F(x^0), D_2F(x^0), D_3F(x^0), \ldots, D_nF(x^0))^T.$$

Definition 2.5

Consider a function $F: A \to \mathbb{I}$. $F$ is called gH-differentiable at $x^0 \in A$ if there exist two interval-valued functions $J(x^0; k)$ and $G_{x^0}(k): \mathbb{R}^n \to \mathbb{I}$ such that $$F(x^0 + k) \ominus_{gH} F(x^0) = G_{x^0}(k) \oplus \|k\| J(x^0; k)$$ for $\|k\| < \delta$ for some $\delta > 0$, where $\lim_{\|k\| \to 0} J(x^0; k) = 0$ and $G_{x^0}$ is a function such that $$G_{x^0}(a x^1 + a x^2) = a G_{x^0}(x^1) \oplus a G_{x^0}(x^2), \quad \forall x^1, x^2 \in A \text{ and } a \in \mathbb{R}.$$

Definition 2.6

[Citation19]

For two intervals $A = [\underline{a}, \overline{a}]$ and $B = [\underline{b}, \overline{b}]$, we define the LC-partial order as: $A \preceq_{LC} B$ if $$\underline{a} \le \underline{b} \quad \text{and} \quad C(A) \le C(B),$$ where $C(A) = \frac{\underline{a} + \overline{a}}{2}$ and $C(B) = \frac{\underline{b} + \overline{b}}{2}$.
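A small sketch (our illustration) of Definition 2.6: the check below encodes $A \preceq_{LC} B$ and also shows a pair of intervals that are LC-incomparable, so the order is genuinely partial.

```python
# The LC-partial order of Definition 2.6 on (lo, hi) interval pairs.

def mid(X):                      # the centre C(X) of an interval
    return (X[0] + X[1]) / 2.0

def lc_leq(A, B):                # A <=_LC B
    return A[0] <= B[0] and mid(A) <= mid(B)

A, B = (1.0, 5.0), (2.0, 4.0)
print(lc_leq(A, B), lc_leq(B, A))   # True False: 1 <= 2 and 3 <= 3
P, Q = (1.0, 2.0), (0.0, 10.0)
print(lc_leq(P, Q), lc_leq(Q, P))   # False False: P and Q are incomparable
```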

Definition 2.7

A vector $y$ which gives the minimum value of the objective function $b^T y$ in an optimization problem over the set of vectors satisfying the constraints $By = d$, $y \ge 0$, is called an optimal solution.

Lemma 2.8

For any $C$ and $D$ in $\mathbb{I}$ such that $C \preceq_{LC} D$, we have $C \ominus_{gH} D \preceq_{LC} 0$.

Proof.

Let $C = [\underline{c}, \overline{c}]$ and $D = [\underline{d}, \overline{d}]$ be two intervals in $\mathbb{I}$. Suppose that $C \preceq_{LC} D$, i.e. $[\underline{c}, \overline{c}] \preceq_{LC} [\underline{d}, \overline{d}]$. It implies that
(1) $\underline{c} \le \underline{d}$
and
(2) $\frac{\underline{c} + \overline{c}}{2} \le \frac{\underline{d} + \overline{d}}{2}$.
As we know, $$C \ominus_{gH} D = [\min\{\underline{c} - \underline{d},\ \overline{c} - \overline{d}\},\ \max\{\underline{c} - \underline{d},\ \overline{c} - \overline{d}\}].$$
Case 1: If $\overline{c} - \overline{d} \le \underline{c} - \underline{d}$, then $C \ominus_{gH} D = [\overline{c} - \overline{d},\ \underline{c} - \underline{d}]$, and from inequality (1) we have $\overline{c} - \overline{d} \le \underline{c} - \underline{d} \le 0$. It implies that
(3) $\overline{c} - \overline{d} \le 0$.
From (2), $\frac{\underline{c} + \overline{c}}{2} - \frac{\underline{d} + \overline{d}}{2} \le 0$, and it follows that
(4) $\frac{(\overline{c} - \overline{d}) + (\underline{c} - \underline{d})}{2} \le 0$.
From (3) and (4), $C \ominus_{gH} D \preceq_{LC} 0$.

Case 2: If $\underline{c} - \underline{d} \le \overline{c} - \overline{d}$, then $C \ominus_{gH} D = [\underline{c} - \underline{d},\ \overline{c} - \overline{d}]$. From (1),
(5) $\underline{c} - \underline{d} \le 0$,
and from (5) and (4), $C \ominus_{gH} D \preceq_{LC} 0$. This completes the proof.

Remark 2.9

Let $C = [\underline{c}, \overline{c}]$ and $D = [\underline{d}, \overline{d}]$, and write $\mathrm{Len}(C) := \overline{c} - \underline{c}$ and $\mathrm{Len}(D) := \overline{d} - \underline{d}$. We know from [Citation22]: $$C \ominus_{gH} D = \begin{cases} [\underline{c} - \underline{d},\ \overline{c} - \overline{d}] & \text{if } \mathrm{Len}(C) \ge \mathrm{Len}(D),\\ [\overline{c} - \overline{d},\ \underline{c} - \underline{d}] & \text{if } \mathrm{Len}(C) < \mathrm{Len}(D). \end{cases}$$ If $\mathrm{Len}(C) \ge \mathrm{Len}(D)$, then $$C \ominus_{gH} D \preceq_{LC} 0 \iff C \preceq_{LC} D,$$ and if $\mathrm{Len}(C) < \mathrm{Len}(D)$, then $$C \preceq_{LC} D \implies C \ominus_{gH} D \preceq_{LC} 0.$$
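The following randomized test (our own sanity check, not from the paper) samples interval pairs and verifies Lemma 2.8 together with the equivalence of Remark 2.9 in the case $\mathrm{Len}(C) \ge \mathrm{Len}(D)$.

```python
import random

def gh_sub(C, D):
    a, b = C[0] - D[0], C[1] - D[1]
    return (min(a, b), max(a, b))

def mid(X): return (X[0] + X[1]) / 2.0
def lc_leq(A, B): return A[0] <= B[0] and mid(A) <= mid(B)

def rand_interval():
    lo, hi = sorted(random.uniform(-5, 5) for _ in range(2))
    return (lo, hi)

ZERO = (0.0, 0.0)
for _ in range(10_000):
    C, D = rand_interval(), rand_interval()
    if lc_leq(C, D):                       # Lemma 2.8
        assert lc_leq(gh_sub(C, D), ZERO)
    if (C[1] - C[0]) >= (D[1] - D[0]) and lc_leq(gh_sub(C, D), ZERO):
        assert lc_leq(C, D)                # Remark 2.9, Len(C) >= Len(D)
print("no counterexamples found")
```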

3. KKT conditions under LC-partial order

Definition 3.1

Let $A$ be a convex subset of $\mathbb{R}^n$. We say that an interval-valued function $F: A \to \mathbb{I}$ is LC-convex on $A$ if, for any $x^1$ and $x^2$ in $A$, $$F(\lambda x^1 + (1 - \lambda)x^2) \preceq_{LC} \lambda F(x^1) \oplus (1 - \lambda)F(x^2), \quad \forall \lambda \in [0, 1].$$

Theorem 3.2

Let $F$ be gH-differentiable at $x^0$. Then $G_{x^0}(k)$ exists for every $k$ in $\mathbb{R}^n$ and $G_{x^0}(k) = k^T \nabla F(x^0)$.

Proof.

See [Citation23].

Theorem 3.3

Let $A$ be a non-empty open convex subset of $\mathbb{R}^n$ and $F: A \to \mathbb{I}$ be gH-differentiable at every $x \in A$. Then $F$ is LC-convex on $A$ if and only if $$(x^2 - x^1)^T \nabla F(x^1) \preceq_{LC} F(x^2) \oplus (-1)F(x^1), \quad \forall x^1, x^2 \in A.$$

Proof.

Let $F$ be LC-convex on $A$, and take any $x^1, x^2 \in A$. Then, for $k = x^2 - x^1$ and $0 < \alpha_0 < 1$, $$F(x^1 + \alpha_0 k) = F(x^1 + \alpha_0(x^2 - x^1)) = F((1 - \alpha_0)x^1 + \alpha_0 x^2) \preceq_{LC} (1 - \alpha_0)F(x^1) \oplus \alpha_0 F(x^1 + k).$$ By Lemma 2.8, we have $$F(x^1 + \alpha_0 k) \ominus_{gH} F(x^1) \preceq_{LC} \left((1 - \alpha_0)F(x^1) \oplus \alpha_0 F(x^1 + k)\right) \ominus_{gH} F(x^1) \preceq_{LC} ((1 - \alpha_0) - 1)F(x^1) \oplus \alpha_0 F(x^1 + k) = (-\alpha_0)F(x^1) \oplus \alpha_0 F(x^1 + k).$$ Hence, the above inequality can be written as
(6) $\frac{1}{\alpha_0}\left(F(x^1 + \alpha_0 k) \ominus_{gH} F(x^1)\right) \preceq_{LC} F(x^2) \oplus (-1)F(x^1)$.
Letting $\alpha_0 \to 0^+$, by Definition 2.5 and Theorem 3.2 we have $$F(x^1 + \alpha_0 k) \ominus_{gH} F(x^1) = G_{x^1}(\alpha_0 k) \oplus \|\alpha_0 k\| J(x^1; \alpha_0 k),$$ where $G_{x^1}(\alpha_0 k) = \alpha_0 k^T \nabla F(x^1)$ and $\lim_{\|k\| \to 0} J(x^1; k) = 0$, so the left-hand side of (6) tends to $k^T \nabla F(x^1)$. With $k = x^2 - x^1$ in (6), we get $$(x^2 - x^1)^T \nabla F(x^1) \preceq_{LC} F(x^2) \oplus (-1)F(x^1),$$ which is the desired result.

Now, for the converse part, suppose that $$(x^2 - x^1)^T \nabla F(x^1) \preceq_{LC} F(x^2) \oplus (-1)F(x^1)$$ holds for any $x^1$ and $x^2$ in $A$.

Then, for any $0 \le \beta \le 1$, denote $x^\beta = \beta x^1 + (1 - \beta)x^2$. Applying the hypothesis to the pairs $(x^\beta, x^1)$ and $(x^\beta, x^2)$, and noting that $x^1 - x^\beta = (1 - \beta)(x^1 - x^2)$ and $x^2 - x^\beta = \beta(x^2 - x^1)$, the following inequalities hold:
(7) $(1 - \beta)\left[(x^1 - x^2)^T \nabla F(x^\beta)\right] \preceq_{LC} F(x^1) \oplus (-1)F(x^\beta)$
and
(8) $\beta\left[(x^2 - x^1)^T \nabla F(x^\beta)\right] \preceq_{LC} F(x^2) \oplus (-1)F(x^\beta)$.
Multiplying inequality (7) by $\beta$ and inequality (8) by $(1 - \beta)$, we get, with $k = x^1 - x^2$,
(9) $\beta(1 - \beta)\left[k^T \nabla F(x^\beta)\right] \preceq_{LC} \beta F(x^1) \oplus (-\beta)F(x^\beta)$
and
(10) $-(1 - \beta)\beta\left[k^T \nabla F(x^\beta)\right] \preceq_{LC} (1 - \beta)F(x^2) \oplus (-(1 - \beta))F(x^\beta)$.
Adding inequalities (9) and (10), we obtain $$0 \preceq_{LC} \left(\beta F(x^1) \oplus (1 - \beta)F(x^2)\right) \oplus (-1)F(x^\beta).$$ By rearranging the above inequality, we obtain $F(x^\beta) \preceq_{LC} \beta F(x^1) \oplus (1 - \beta)F(x^2)$. Substituting the value of $x^\beta$, we have $$F(\beta x^1 + (1 - \beta)x^2) \preceq_{LC} \beta F(x^1) \oplus (1 - \beta)F(x^2).$$ The arbitrariness of $\beta \in [0, 1]$ proves that $F$ is LC-convex on $A$.

Theorem 3.4

Consider an interval-valued function $F: \mathbb{R}^n \to \mathbb{I}$ which is gH-differentiable at $x^0$. If a vector $v \in \mathbb{R}^n$ satisfies $v^T \nabla F(x^0) \prec_{LC} 0$, then there exists $\delta > 0$ such that for each $\beta \in (0, \delta)$, $$F(x^0 + \beta v) \prec_{LC} F(x^0).$$

Proof.

As $F$ is gH-differentiable at $x^0$, from Definition 2.5 and Theorem 3.2 we have $$F(x^0 + k) \ominus_{gH} F(x^0) = k^T \nabla F(x^0) \oplus \|k\| J(x^0; k),$$ where $J(x^0; k) \to 0$ as $\|k\| \to 0$. Replacing $k = \beta v$, for $\beta > 0$, we get $$F(x^0 + \beta v) = F(x^0) \oplus \beta v^T \nabla F(x^0) \oplus |\beta| \|v\| J(x^0; \beta v).$$ Since $v^T \nabla F(x^0) \prec_{LC} 0$ and $J(x^0; \beta v) \to 0$ as $\beta \to 0^+$, we have $$F(x^0 + \beta v) \prec_{LC} F(x^0)$$ for each $\beta \in (0, \delta)$, for some $\delta > 0$.
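A numeric illustration (our sketch, with a hypothetical function) of Theorem 3.4: for $F(x) = [1, 2]x^2$ at $x^0 = 1$ we have $\nabla F(x^0) = [2, 4]$, and the direction $v = -1$ gives $v \nabla F(x^0) = [-4, -2] \prec_{LC} 0$, so small steps along $v$ strictly decrease $F$ in the LC order.

```python
# Descent along v = -1 for F(x) = [1,2] x^2 at x0 = 1 (Theorem 3.4).

def mid(X): return (X[0] + X[1]) / 2.0
def lc_lt(A, B):                 # strict LC order A <_LC B
    return A[0] < B[0] and mid(A) < mid(B)

def F(x):
    return (x * x, 2 * x * x)    # the interval [x^2, 2x^2]

x0, v = 1.0, -1.0
for beta in (0.5, 0.1, 0.01):
    print(beta, lc_lt(F(x0 + beta * v), F(x0)))   # True for each beta
```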

Definition 3.5

Let $F: \mathbb{R}^n \to \mathbb{I}$ be an interval-valued function. If $F$ is gH-differentiable at $x^0$, the set of descent directions at $x^0$ is given by $$\mathcal{F}(x^0) = \{v \in \mathbb{R}^n : v^T \nabla F(x^0) \prec_{LC} 0\}.$$ Since for any $v$ in $\mathcal{F}(x^0)$ we have $\beta v \in \mathcal{F}(x^0)$ for all $\beta > 0$, the set $\mathcal{F}(x^0)$ is said to be the cone of descent directions.

Definition 3.6

For a non-empty set $A \subseteq \mathbb{R}^n$ and $x^0 \in A$, the cone of feasible directions of $A$ at $x^0$ is given by $$\mathcal{R}(x^0) = \{v \in \mathbb{R}^n : v \neq 0,\ x^0 + \beta v \in A\ \forall \beta \in (0, \delta) \text{ for some } \delta > 0\}.$$

Definition 3.7

A feasible solution $x^* \in A$ is said to be an efficient solution of the IOP
(11) $\min_{x \in A \subseteq \mathbb{R}^n} F(x)$
if there does not exist any $x \in A \cap N_\delta(x^*)$ such that $F(x) \prec_{LC} F(x^*)$, where $N_\delta(x^*)$ is a $\delta$-neighbourhood of $x^*$. If a solution $x^*$ is an efficient solution, then we say that $F(x^*)$ is a non-dominated solution of the IOP.

Theorem 3.8

For a non-empty set $A \subseteq \mathbb{R}^n$, let us consider the IOP $$\min_{x \in A \subseteq \mathbb{R}^n} F(x),$$ where $F: \mathbb{R}^n \to \mathbb{I}$. If $F$ is gH-differentiable at $x^0 \in A$ and $x^0$ is a local efficient solution to the IOP (11), then $$\mathcal{F}(x^0) \cap \mathcal{R}(x^0) = \emptyset.$$

Proof.

On the contrary, suppose that $\mathcal{F}(x^0) \cap \mathcal{R}(x^0) \neq \emptyset$ and let $v$ be an element of $\mathcal{F}(x^0) \cap \mathcal{R}(x^0)$. Then, by Theorem 3.4, there exists $\delta_1 > 0$ such that $$F(x^0 + \beta v) \prec_{LC} F(x^0) \quad \text{for each } \beta \in (0, \delta_1).$$ By Definition 3.6, there exists $\delta_2 > 0$ such that $$x^0 + \beta v \in A \quad \text{for each } \beta \in (0, \delta_2).$$ Defining $\delta := \min(\delta_1, \delta_2) > 0$, we note that for all $\beta \in (0, \delta)$, $x^0 + \beta v \in A$ and $F(x^0 + \beta v) \prec_{LC} F(x^0)$. This contradicts $x^0$ being a local efficient solution. Hence, $\mathcal{F}(x^0) \cap \mathcal{R}(x^0) = \emptyset$.

The next example illustrates the necessary condition given in Theorem 3.8.

Example 3.9

Let $A \subseteq \mathbb{R}^2$ be the set $\{(y_1, y_2) : 1 \le y_1 \le 2,\ 1 \le y_2 \le 2\}$. Consider the IOP
(12) $\min_{y \in A} F(y_1, y_2)$,
where $F(y_1, y_2) = [\underline{F}(y_1, y_2), \overline{F}(y_1, y_2)]$ with $$\underline{F}(y_1, y_2) = 2 + 3(y_1 - 1)^2 + 3(y_2 - 2)^2, \qquad \overline{F}(y_1, y_2) = 6 + 6(y_1 - 1)^2 + 6(y_2 - 1)^2.$$

The lower and upper functions are shown in Figure 1. It can be verified that $y^0 = (1, 1.5) \in A$ is an efficient point. The cone of feasible directions at $y^0$ is given by $$\mathcal{R}(y^0) = \{(v_1, v_2) \neq (0, 0) : (1 + \beta v_1,\ 1.5 + \beta v_2) \in A\ \forall \beta \in (0, \delta) \text{ for some } \delta > 0\} = \{(v_1, v_2) \neq (0, 0) : v_1 \ge 0\}.$$ The gH-partial derivatives of $F$ at $y^0$ are $$D_1F(y^0) = [\min\{\underline{F}_{y_1}(y^0), \overline{F}_{y_1}(y^0)\},\ \max\{\underline{F}_{y_1}(y^0), \overline{F}_{y_1}(y^0)\}] = [0, 0]$$ and $$D_2F(y^0) = [\min\{\underline{F}_{y_2}(y^0), \overline{F}_{y_2}(y^0)\},\ \max\{\underline{F}_{y_2}(y^0), \overline{F}_{y_2}(y^0)\}] = [-3, 6].$$ The cone of descent directions at $y^0$ is $$\mathcal{F}(y^0) = \{(v_1, v_2) \in \mathbb{R}^2 : (v_1, v_2)^T \nabla F(y^0) \prec_{LC} 0\} = \{(v_1, v_2) \in \mathbb{R}^2 : v_1 D_1F(y^0) \oplus v_2 D_2F(y^0) \prec_{LC} 0\} = \{(v_1, v_2) \in \mathbb{R}^2 : v_2[-3, 6] \prec_{LC} 0\}.$$ Now $v_2[-3, 6] \prec_{LC} 0$ requires $-3v_2 < 0$ and $\frac{-3v_2 + 6v_2}{2} < 0$, i.e. $v_2 > 0$ and $v_2 < 0$, which is impossible. Hence $\mathcal{F}(y^0) = \emptyset$, and therefore $$\mathcal{R}(y^0) \cap \mathcal{F}(y^0) = \emptyset.$$
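The endpoint computations of Example 3.9 can be cross-checked numerically; the sketch below (ours) evaluates the gradients of $\underline{F}$ and $\overline{F}$ at $y^0 = (1, 1.5)$ and assembles the gH-partial derivatives of Definition 2.3.

```python
# Cross-check of the gH-partial derivatives in Example 3.9 at y0 = (1, 1.5).

def grad_lower(y1, y2):   # gradient of F_lower = 2 + 3(y1-1)^2 + 3(y2-2)^2
    return (6 * (y1 - 1), 6 * (y2 - 2))

def grad_upper(y1, y2):   # gradient of F_upper = 6 + 6(y1-1)^2 + 6(y2-1)^2
    return (12 * (y1 - 1), 12 * (y2 - 1))

y0 = (1.0, 1.5)
lo, up = grad_lower(*y0), grad_upper(*y0)
D1F = (min(lo[0], up[0]), max(lo[0], up[0]))
D2F = (min(lo[1], up[1]), max(lo[1], up[1]))
print(D1F, D2F)           # (0.0, 0.0) and (-3.0, 6.0), as computed above

# Descent cone: v2 * [-3, 6] <_LC 0 needs -3*v2 < 0 and 1.5*v2 < 0,
# i.e. v2 > 0 and v2 < 0 at once, which is impossible: the cone is empty.
```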

Lemma 3.10

Let us consider the set $R = \{x \in A : H_j(x) \preceq_{LC} 0 \text{ for } j = 1, 2, \ldots, k\}$ for interval-valued functions $H_j: \mathbb{R}^n \to \mathbb{I}$, where $A \neq \emptyset$ is an open set in $\mathbb{R}^n$. Let $x^0 \in R$ and $J(x^0) = \{j : H_j(x^0) = 0\}$. Supposing $H_j$ to be gH-differentiable at $x^0$ for all $j \in J(x^0)$ and gH-continuous at $x^0$ for $j \notin J(x^0)$, we define $$\mathcal{H}(x^0) = \{v : v^T \nabla H_j(x^0) \prec_{LC} 0\ \forall j \in J(x^0)\}.$$ Then $$\mathcal{H}(x^0) \subseteq \mathcal{R}(x^0),$$ where $\mathcal{R}(x^0) = \{v \in \mathbb{R}^n : v \neq 0,\ x^0 + \beta v \in R\ \forall \beta \in (0, \delta) \text{ for some } \delta > 0\}$.

Proof.

Consider $v$ to be an element of $\mathcal{H}(x^0)$. As $x^0 \in A$ and $A$ is an open set, there exists $\delta_0 > 0$ such that $$x^0 + \beta v \in A \quad \text{for } \beta \in (0, \delta_0).$$ For each $j \notin J(x^0)$, as $H_j$ is gH-continuous at $x^0$, $$H_j(x^0 + \beta v) = H_j(x^0) \oplus L_j(x^0; \beta v),$$ where $L_j(x^0; \beta v) \to 0$ as $\beta \to 0$. Since $H_j(x^0) \prec_{LC} 0$ for $j \notin J(x^0)$, there exists $\delta_j > 0$ such that $$H_j(x^0 + \beta v) \preceq_{LC} 0 \quad \text{for } \beta \in (0, \delta_j) \text{ and } j \notin J(x^0).$$ As we know that $v \in \mathcal{H}(x^0)$, for each $j \in J(x^0)$ there exists, by Theorem 3.4, $\delta_j > 0$ such that $$H_j(x^0 + \beta v) \prec_{LC} H_j(x^0) = 0, \quad \forall \beta \in (0, \delta_j).$$ Suppose $\delta = \min\{\delta_0, \delta_1, \delta_2, \ldots, \delta_k\}$; it is evident that $\delta > 0$. From the above inequalities, we note that the points of the form $x^0 + \beta v$ belong to $R$ for each $\beta \in (0, \delta)$. Therefore, $v \in \mathcal{R}(x^0)$. Hence, $\mathcal{H}(x^0) \subseteq \mathcal{R}(x^0)$.

Theorem 3.11

Let $A \neq \emptyset$ be an open set in $\mathbb{R}^n$. Let us consider the IOP $$\begin{cases} \min F(x)\\ \text{such that } H_j(x) \preceq_{LC} 0 \text{ for } j = 1, 2, \ldots, k,\\ x \in A, \end{cases}$$ where $F: \mathbb{R}^n \to \mathbb{I}$ and $H_j: \mathbb{R}^n \to \mathbb{I}$ for $j = 1, 2, \ldots, k$. For a feasible point $x^0$, define $J(x^0) = \{j : H_j(x^0) = 0\}$. Suppose that $F$ and $H_j$, $j \in J(x^0)$, are gH-differentiable at $x^0$, and that $H_j$, $j \notin J(x^0)$, are gH-continuous at $x^0$. If $x^0$ is an efficient solution of the IOP, then $$\mathcal{F}(x^0) \cap \mathcal{H}(x^0) = \emptyset,$$ where $\mathcal{F}(x^0) = \{v : v^T \nabla F(x^0) \prec_{LC} 0\}$ and $\mathcal{H}(x^0) = \{v : v^T \nabla H_j(x^0) \prec_{LC} 0 \text{ for each } j \in J(x^0)\}$.

Proof.

By using Theorem 3.8 and Lemma 3.10, we conclude that, if $x^0$ is a local efficient solution, then $$\mathcal{F}(x^0) \cap \mathcal{R}(x^0) = \emptyset \implies \mathcal{F}(x^0) \cap \mathcal{H}(x^0) = \emptyset.$$

Theorem 3.12

For an interval-valued vector $B_v^n = (b_j)_{n \times 1}$ in $\mathbb{I}^n$, exactly one of the following systems has a solution:

  1. $x^T B_v^n \prec_{LC} 0$ for some $x = (x_j)_{n \times 1} \in \mathbb{R}^n$;

  2. $0_v^n \in y B_v^n$ for some $y \in \mathbb{R}$, $y > 0$.

Proof.

Suppose (i) is true; let us prove that (ii) must be false. On the contrary, let (ii) also be true. Since (i) is true, we have $$x_0^T B_v^n \prec_{LC} 0 \quad \text{for some } x^0 \in \mathbb{R}^n;$$ consequently, $y(x_0^T B_v^n) \prec_{LC} 0$ for all $y \in \mathbb{R}$, $y > 0$, which can also be written as
(13) $x_0^T (y B_v^n) \prec_{LC} 0$ for all $y \in \mathbb{R}$, $y > 0$.
As (ii) is also true, $0_v^n \in y^0 B_v^n$ for some $y^0 \in \mathbb{R}$, $y^0 > 0$; we also have
(14) $0 \in x^T (y^0 B_v^n)$ for all $x \in \mathbb{R}^n$.
As (13) and (14) cannot be true together, we have a contradiction. Hence, if (i) is true, (ii) cannot be true.

For the other case, suppose that (i) is false; we prove that (ii) is true. On the contrary, suppose (ii) is also false. Then $$0_v^n \notin y B_v^n\ \forall y \in \mathbb{R},\ y > 0 \implies 0_v^n \notin B_v^n.$$ Consequently,
(15) $\exists\, j \in \{1, 2, \ldots, n\}$ such that $0 \notin b_j$, i.e. $b_j \prec_{LC} 0$ or $0 \prec_{LC} b_j$.
Define the sets $$K = \{k : 0 \in b_k,\ k \in \{1, 2, \ldots, n\}\} \quad \text{and} \quad I = \{i : 0 \notin b_i,\ i \in \{1, 2, \ldots, n\}\}.$$ By (15), $I \neq \emptyset$; also $K \cup I = \{1, 2, \ldots, n\}$ and $K \cap I = \emptyset$. Let us create a vector $x^0 = (x_1^0, x_2^0, \ldots, x_n^0)^T$ such that $$x_j^0 = \begin{cases} 0 & \text{if } j \in K,\\ 1 & \text{if } j \in I \text{ and } b_j \prec_{LC} 0,\\ -1 & \text{if } j \in I \text{ and } 0 \prec_{LC} b_j. \end{cases}$$ For this $x^0 \in \mathbb{R}^n$, we can see that $$\bigoplus_{i \in I} x_i^0 b_i \oplus \bigoplus_{k \in K} x_k^0 b_k \prec_{LC} 0.$$ More generally,
(16) $(x^0)^T B_v^n \prec_{LC} 0$.
As (i) is false, $x^T B_v^n \prec_{LC} 0$ for no $x \in \mathbb{R}^n$, which is a contradiction to (16). So (ii) must be true, which completes the proof.
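The proof of Theorem 3.12 is constructive, and the following sketch (our illustration) turns it into a decision procedure: given an interval vector $B_v^n$, it either certifies system (ii), i.e. $0 \in b_j$ for every component, or builds the vector $x^0$ of the proof and verifies $x_0^T B_v^n \prec_{LC} 0$.

```python
# Decide which alternative of Theorem 3.12 holds for an interval vector B.

def add(C, D): return (C[0] + D[0], C[1] + D[1])
def scal(k, C): return (k * C[0], k * C[1]) if k >= 0 else (k * C[1], k * C[0])
def mid(X): return (X[0] + X[1]) / 2.0
def lc_lt0(X): return X[0] < 0 and mid(X) < 0      # X <_LC 0

def gordan_alternative(B):
    if all(b[0] <= 0 <= b[1] for b in B):          # 0 in every b_j: system (ii)
        return ("(ii)", None)
    # otherwise construct x as in the proof: 0 if 0 in b_j,
    # +1 if b_j is entirely negative, -1 if entirely positive
    x = [0 if b[0] <= 0 <= b[1] else (1 if b[1] < 0 else -1) for b in B]
    total = (0.0, 0.0)
    for xj, bj in zip(x, B):
        total = add(total, scal(xj, bj))
    assert lc_lt0(total)                           # x^T B <_LC 0: system (i)
    return ("(i)", x)

print(gordan_alternative([(-2.0, 1.0), (-1.0, 3.0)]))   # ('(ii)', None)
print(gordan_alternative([(-2.0, 1.0), (2.0, 5.0)]))    # ('(i)', [0, -1])
```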

Theorem 3.13

If $y^0$ is a local efficient solution to the following IOP $$\min_{y \in \mathbb{R}^n} G(y),$$ then $0_v^n \in \nabla G(y^0)$, where $G: \mathbb{R}^n \to \mathbb{I}$ is gH-differentiable at $y^0$.

Proof.

By using Definition 3.5 and Theorem 3.8, if $y^0$ is a local efficient solution, then $\mathcal{F}(y^0) = \emptyset$. Consequently, $$v^T \nabla G(y^0) \prec_{LC} 0 \quad \text{for no } v \in \mathbb{R}^n.$$ From Theorem 3.12 with $B_v^n = \nabla G(y^0)$, there exists a scalar $\lambda \in \mathbb{R}$, $\lambda > 0$, such that $$0_v^n \in \lambda \nabla G(y^0) \implies 0_v^n \in \nabla G(y^0).$$

Remark 3.14

It is very interesting to observe that the optimality condition in the above theorem is not the equality relation $\nabla G(y^0) = 0_v^n$ but the inclusion relation $0_v^n \in \nabla G(y^0)$. Inclusion relations are less restrictive and more appropriate than equality relations. For example, if $\nabla G(y^0) = ([0, 0], [-2, 5])^T$, then $\nabla G(y^0) \neq 0_v^2$ but $0_v^2 \in \nabla G(y^0)$.

Theorem 3.15

For an interval-valued matrix $B = (b_{ij})_{m \times n}$, where $b_{ij} \in \mathbb{I}$, exactly one of the following systems has a solution:

  1. $B^T x \prec_{LC} 0_v^n$ for some $x = (x_i)_{m \times 1} \in \mathbb{R}^m$;

  2. $0_v^m \in B y$ for some $0 \neq y = (y_i)_{n \times 1} \in \mathbb{R}^n$, $y_i \ge 0$.

Proof.

Suppose (i) is true; let us prove that (ii) is false. On the contrary, suppose (ii) is also true. Since (i) is true, we have $$B^T x^0 \prec_{LC} 0_v^n \quad \text{for some } x^0 = (x_1^0, x_2^0, \ldots, x_m^0) \in \mathbb{R}^m.$$ Consequently, $y^T (B^T x^0) \prec_{LC} 0$ for all $y \neq 0$, $y = (y_i)_{n \times 1} \in \mathbb{R}^n$, $y_i \ge 0$, which can be written as
(17) $(By)^T x^0 \prec_{LC} 0$ for all $y \neq 0$, $y = (y_i)_{n \times 1} \in \mathbb{R}^n$, $y_i \ge 0$.
As we assumed (ii) is also true, there is some non-zero $y^0 = (y_i^0)_{n \times 1} \in \mathbb{R}^n$ with $y_i^0 \ge 0$ such that
(18) $0_v^m \in B y^0$.
Consider $z = B y^0 = (z_1, z_2, \ldots, z_m)^T$. It follows that $z \in \mathbb{I}^m$ and $(B y^0)^T x^0 = \bigoplus_{i=1}^m x_i^0 z_i$. From (18), we have $0 \in z_i$ for $i = 1, 2, \ldots, m$, hence $0 \in x_i^0 z_i$ for $i = 1, 2, \ldots, m$. Then it can be written as
(19) $0 \in (B y^0)^T x^0$.
As (17) and (19) cannot be true together, we have a contradiction. Hence, (ii) cannot be true if (i) is true.

Now suppose that (i) is not true; we shall prove that (ii) is true. If (i) is not true, then
(20) $B^T x \prec_{LC} 0_v^n$ for no $x \in \mathbb{R}^m$.
Suppose, on the contrary, that (ii) is also false. Then $0_v^m \notin B y$ for all $y \neq 0$, $y = (y_i)_{n \times 1} \in \mathbb{R}^n$ with $y_i \ge 0$. Consequently, there exists $i \in \{1, 2, \ldots, m\}$ such that $0 \notin z_i$, which implies
(21) $\exists\, i \in \{1, 2, \ldots, m\}$ such that $z_i \prec_{LC} 0$ or $0 \prec_{LC} z_i$,
where $B y = (z_1, z_2, \ldots, z_m)^T$. Define the sets $$K = \{k : 0 \in z_k,\ k \in \{1, 2, \ldots, m\}\} \quad \text{and} \quad J = \{j : 0 \notin z_j,\ j \in \{1, 2, \ldots, m\}\}.$$ By (21), $J \neq \emptyset$; also $K \cup J = \{1, 2, \ldots, m\}$ and $K \cap J = \emptyset$. Let us create a vector $x^0 = (x_1^0, x_2^0, \ldots, x_m^0)^T \in \mathbb{R}^m$ by $$x_i^0 = \begin{cases} 0 & \text{if } i \in K,\\ 1 & \text{if } i \in J \text{ and } z_i \prec_{LC} 0,\\ -1 & \text{if } i \in J \text{ and } 0 \prec_{LC} z_i. \end{cases}$$ For this $x^0 \in \mathbb{R}^m$, we notice that $$\bigoplus_{j \in J} x_j^0 z_j \oplus \bigoplus_{k \in K} x_k^0 z_k \prec_{LC} 0,$$ which is equivalent to
(22) $y^T (B^T x^0) \prec_{LC} 0$ for all $y \neq 0$, $y = (y_i)_{n \times 1} \in \mathbb{R}^n$, $y_i \ge 0$.
The inequality (22) can be true only when $B^T x^0 \prec_{LC} 0_v^n$. The inequalities (20) and (22) are contradictory, so (i) and (ii) cannot fail together. Hence, (ii) must be true. This completes the proof.

Theorem 3.16

Let $A \neq \emptyset$ be a set in $\mathbb{R}^n$, and let $F: \mathbb{R}^n \to \mathbb{I}$ and $J_k: \mathbb{R}^n \to \mathbb{I}$ for $k = 1, 2, \ldots, m$. Consider the IOP
(23) $\begin{cases} \min F(x)\\ \text{such that } J_k(x) \preceq_{LC} 0,\ k = 1, 2, \ldots, m,\\ x \in A. \end{cases}$
Let $x^0$ be a feasible point of the IOP (23), and define $$K(x^0) = \{k : J_k(x^0) = 0\}.$$ Suppose $F$ and $J_k$ are gH-differentiable at $x^0$ for $k \in K(x^0)$ and gH-continuous at $x^0$ for $k \notin K(x^0)$. If $x^0$ is a local efficient point of (23), then there exist constants $w_0$ and $w_k$ for $k \in K(x^0)$ such that $$\begin{cases} 0_v^n \in \left(w_0 \nabla F(x^0) \oplus \bigoplus_{k \in K(x^0)} w_k \nabla J_k(x^0)\right),\\ w_0 \ge 0,\ w_k \ge 0 \text{ for } k \in K(x^0),\\ (w_0, w_K) \neq (0, \mathbf{0}), \end{cases}$$ where $w_K$ is the vector whose components are the $w_k$ for $k \in K(x^0)$. Furthermore, if the $J_k$, $k \notin K(x^0)$, are also gH-differentiable at $x^0$, then there exist constants $w_1, w_2, \ldots, w_m$ such that $$\begin{cases} 0_v^n \in \left(w_0 \nabla F(x^0) \oplus \bigoplus_{k=1}^m w_k \nabla J_k(x^0)\right),\\ w_k J_k(x^0) = 0,\ k = 1, 2, \ldots, m,\\ w_0 \ge 0,\ w_k \ge 0,\ k = 1, 2, \ldots, m,\\ (w_0, w) \neq (0, \mathbf{0}), \end{cases}$$ where $w$ is the vector $(w_1, w_2, \ldots, w_m)$.

Proof.

As $x^0$ is a local efficient point of (23), by Theorem 3.11 we get $$\mathcal{F}(x^0) \cap \mathcal{H}(x^0) = \emptyset,$$ or, equivalently, there is no $v \in \mathbb{R}^n$ such that
(24) $v^T \nabla F(x^0) \prec_{LC} 0$ and $v^T \nabla J_k(x^0) \prec_{LC} 0$, $\forall k \in K(x^0)$.
Let $B$ be the matrix $$B = \left[\nabla F(x^0),\ [\nabla J_k(x^0)]_{k \in K(x^0)}\right]_{n \times (1 + |K(x^0)|)}.$$ By (24) we notice that $$B^T v \prec_{LC} 0_v^{1 + |K(x^0)|} \quad \text{for no } v \in \mathbb{R}^n.$$ Now by Theorem 3.15, there exists $q \neq 0$, $q = (q_k)_{(1 + |K(x^0)|) \times 1} \in \mathbb{R}^{1 + |K(x^0)|}$ with $q_k \ge 0$, such that $0_v^n \in B q$. Consider $q$ of the form
(25) $q = \begin{bmatrix} w_0 \\ [w_k]_{k \in K(x^0)} \end{bmatrix}$.
Substituting (25) into $0_v^n \in B q$, we have $$0_v^n \in \left[\nabla F(x^0),\ [\nabla J_k(x^0)]_{k \in K(x^0)}\right]_{n \times (1 + |K(x^0)|)} \begin{bmatrix} w_0 \\ [w_k]_{k \in K(x^0)} \end{bmatrix},$$ and by simplifying the above expression we have $$\begin{cases} 0_v^n \in \left(w_0 \nabla F(x^0) \oplus \bigoplus_{k \in K(x^0)} w_k \nabla J_k(x^0)\right),\\ w_0 \ge 0,\ w_k \ge 0 \text{ for } k \in K(x^0),\\ (w_0, w_K) \neq (0, 0, \ldots, 0). \end{cases}$$ This proves the first part of the theorem.

For the second part, note that for $k \in K(x^0)$, $J_k(x^0) = 0$, and consequently $w_k J_k(x^0) = 0$. If the $J_k$, $k \notin K(x^0)$, are also gH-differentiable at $x^0$, then by setting $w_k = 0$ for $k \notin K(x^0)$ we have $$\begin{cases} 0_v^n \in \left(w_0 \nabla F(x^0) \oplus \bigoplus_{k=1}^m w_k \nabla J_k(x^0)\right),\\ w_k J_k(x^0) = 0,\ k = 1, 2, \ldots, m,\\ w_0 \ge 0,\ w_k \ge 0,\ k = 1, 2, \ldots, m,\\ (w_0, w) \neq (0, \mathbf{0}), \end{cases}$$ which completes our proof.

Definition 3.17

The collection of $n$ interval vectors $\{(Y_v^k)_1, (Y_v^k)_2, \ldots, (Y_v^k)_n\}$ is called linearly independent if, for $n$ real numbers $a_1, a_2, \ldots, a_n$, $$0_v^k \in a_1 (Y_v^k)_1 \oplus a_2 (Y_v^k)_2 \oplus \cdots \oplus a_n (Y_v^k)_n \quad \text{iff } a_i = 0\ \forall i = 1, 2, \ldots, n;$$ otherwise the collection is linearly dependent.

Example 3.18

Let $(Y_v^2)_1 = ([1, 3], [5, 7])$ and $(Y_v^2)_2 = ([2, 3], [4, 7])$.

Now we have $$a_1 (Y_v^2)_1 \oplus a_2 (Y_v^2)_2 = a_1([1, 3], [5, 7]) \oplus a_2([2, 3], [4, 7])$$ and $$0_v^2 \in a_1 (Y_v^2)_1 \oplus a_2 (Y_v^2)_2 \quad \text{for } a_1 = a_2 = 0.$$ So, it is a linearly independent set of interval vectors.

Theorem 3.19

Let $A \neq \emptyset$ be a subset of $\mathbb{R}^n$, and let $F: \mathbb{R}^n \to \mathbb{I}$ and $J_k: \mathbb{R}^n \to \mathbb{I}$ for $k = 1, 2, \ldots, m$ be IVFs. Let $x^0$ be a feasible point of the IOP $$\begin{cases} \min F(x)\\ \text{such that } J_k(x) \preceq_{LC} 0,\ k = 1, 2, \ldots, m,\\ x \in A, \end{cases}$$ and define $$K(x^0) = \{k : J_k(x^0) = 0\}.$$ Suppose $F$ and $J_k$ are gH-differentiable at $x^0$ for $k \in K(x^0)$ and gH-continuous at $x^0$ for $k \notin K(x^0)$. If the interval vectors $\{\nabla J_k(x^0) : k \in K(x^0)\}$ are linearly independent and $x^0$ is an efficient solution, then there exist scalars $w_k$, $k \in K(x^0)$, such that $$\begin{cases} 0_v^n \in \left(\nabla F(x^0) \oplus \bigoplus_{k \in K(x^0)} w_k \nabla J_k(x^0)\right),\\ w_k \ge 0,\ k \in K(x^0). \end{cases}$$ Furthermore, if the $J_k$, $k \notin K(x^0)$, are also gH-differentiable at $x^0$, then there exist constants $w_1, w_2, \ldots, w_m$ such that $$\begin{cases} 0_v^n \in \left(\nabla F(x^0) \oplus \bigoplus_{k=1}^m w_k \nabla J_k(x^0)\right),\\ w_k J_k(x^0) = 0,\ k = 1, 2, \ldots, m,\\ w_k \ge 0,\ k = 1, 2, \ldots, m. \end{cases}$$

Proof.

By Theorem 3.16, there exist real constants $w_0'$ and $w_k'$, $k \in K(x^0)$, not all zero, such that $$\begin{cases} 0_v^n \in \left(w_0' \nabla F(x^0) \oplus \bigoplus_{k \in K(x^0)} w_k' \nabla J_k(x^0)\right),\\ w_0' \ge 0,\ w_k' \ge 0\ \forall k \in K(x^0). \end{cases}$$ We must have $w_0' > 0$, for otherwise the set $\{\nabla J_k(x^0) : k \in K(x^0)\}$ would be linearly dependent. Defining $w_k = w_k'/w_0'$, we get $w_k \ge 0$ for all $k \in K(x^0)$ and $$0_v^n \in \left(\nabla F(x^0) \oplus \bigoplus_{k \in K(x^0)} w_k \nabla J_k(x^0)\right).$$ For $k \in K(x^0)$, $J_k(x^0) = 0$, and thus $w_k J_k(x^0) = 0$. If the functions $J_k$, $k \notin K(x^0)$, are also gH-differentiable at $x^0$, then by setting $w_k = 0$ for $k \notin K(x^0)$ we have $$0_v^n \in \left(\nabla F(x^0) \oplus \bigoplus_{k=1}^m w_k \nabla J_k(x^0)\right),$$ which proves the second part of our theorem.

To illustrate the necessary conditions given in Theorems 3.16 and 3.19, let us consider a detailed example.

Example 3.20

Consider the IOP with feasible point $y^0 = (0, 2)$: $$\begin{cases} \min F(y_1, y_2) = [-4, 1]y_1^2 \oplus [-1, 0]y_2^3 \oplus [-3, 2]y_2^2 \oplus [0, 1](y_1^2 y_2)\\ \text{such that } J_1(y_1, y_2) = [-3, 2]y_1 \oplus [-3, 2]y_2 \ominus_{gH} [-6, 4] \preceq_{LC} 0,\\ \phantom{\text{such that }} J_2(y_1, y_2) = [0, 1]y_1^2 \oplus [-6, 4]y_2 \ominus_{gH} [-2, 1] \preceq_{LC} 0. \end{cases}$$ In this IOP, the functions $F$, $J_1$, and $J_2$ are gH-differentiable on $\mathbb{R}^2$. We notice that, at $y^0$, $$J_1(y^0) = [0, 0] \preceq_{LC} 0 \quad \text{and} \quad J_2(y^0) = [-10, 7] \preceq_{LC} 0.$$ Now, we find the gH-partial derivatives of $F(y_1, y_2)$ by using Definition 2.3: $$D_1F(y_1, y_2) = [-4, 1]2y_1 \oplus [0, 1](2y_1y_2) \quad \text{and} \quad D_2F(y_1, y_2) = [-1, 0]3y_2^2 \oplus [-3, 2]2y_2 \oplus [0, 1]y_1^2.$$ Thus, $K(y^0) = \{1\}$. We notice that $$\nabla F(y^0) = (D_1F(y^0), D_2F(y^0))^T = ([0, 0], [-24, 8])^T,$$ $$\nabla J_1(y^0) = (D_1J_1(y^0), D_2J_1(y^0))^T = ([-3, 2], [-3, 2])^T,$$ $$\nabla J_2(y^0) = (D_1J_2(y^0), D_2J_2(y^0))^T = ([0, 0], [-6, 4])^T.$$ We do not actually need $\nabla J_2(y^0)$, because $K(y^0) = \{1\}$; $\nabla J_1(y^0)$ alone suffices. Now, the conclusions of Theorem 3.16 hold for $w_0 = 2$, $w_1 = 1$ and $w_2 = 0$, as $$0_v^2 \in \left(2([0, 0], [-24, 8])^T \oplus 1([-3, 2], [-3, 2])^T\right) \implies 0_v^2 \in \left(([0, 0], [-48, 16])^T \oplus ([-3, 2], [-3, 2])^T\right),$$ and those of Theorem 3.19 hold for $w_0 = 1$, $w_1 = 1$ and $w_2 = 0$, as $$0_v^2 \in \left(([0, 0], [-24, 8])^T \oplus 1([-3, 2], [-3, 2])^T\right) \implies 0_v^2 \in \left(([0, 0], [-24, 8])^T \oplus ([-3, 2], [-3, 2])^T\right).$$
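The inclusion claimed in Example 3.20 is easy to verify mechanically; the sketch below (ours) forms $w_0 \nabla F(y^0) \oplus w_1 \nabla J_1(y^0)$ componentwise and checks that each component contains zero.

```python
# Check the Fritz John inclusion of Example 3.20 at y0 = (0, 2).

def add(C, D): return (C[0] + D[0], C[1] + D[1])
def scal(k, C): return (k * C[0], k * C[1]) if k >= 0 else (k * C[1], k * C[0])
def contains0(X): return X[0] <= 0 <= X[1]

grad_F  = [(0.0, 0.0), (-24.0, 8.0)]    # grad F(y0)  = ([0,0], [-24,8])^T
grad_J1 = [(-3.0, 2.0), (-3.0, 2.0)]    # grad J1(y0) = ([-3,2], [-3,2])^T

w0, w1 = 2.0, 1.0                        # multipliers from Theorem 3.16
combo = [add(scal(w0, f), scal(w1, g)) for f, g in zip(grad_F, grad_J1)]
print(combo)                             # [(-3.0, 2.0), (-51.0, 18.0)]
print(all(contains0(X) for X in combo))  # True: 0_v^2 lies in the combination
```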

Theorem 3.21

Let $A \neq \emptyset$ be an open convex set, and let $F: A \subseteq \mathbb{R}^n \to \mathbb{I}$ and $J_k: A \to \mathbb{I}$, $k = 1, 2, \ldots, m$, be gH-differentiable LC-convex functions on $A$. Let $x^0$ be a feasible point of the IOP $$\begin{cases} \min F(x)\\ \text{such that } J_k(x) \preceq_{LC} 0,\ k = 1, 2, \ldots, m,\\ x \in A. \end{cases}$$ If there exist scalars $w_1, w_2, \ldots, w_m$ such that $$\begin{cases} 0_v^n \in \left(\nabla F(x^0) \oplus \bigoplus_{k=1}^m w_k \nabla J_k(x^0)\right),\\ w_k J_k(x^0) = 0,\ k = 1, 2, \ldots, m,\\ w_k \ge 0,\ k = 1, 2, \ldots, m, \end{cases}$$ then $x^0$ is an efficient solution of the IOP.

Proof.

By supposition, for every $x \in A$ satisfying $J_k(x) \preceq_{LC} 0$, $k = 1, 2, \ldots, m$, we have $$0 \in \left(\nabla F(x^0) \oplus \bigoplus_{k=1}^m w_k \nabla J_k(x^0)\right)^T (x - x^0) = \nabla F(x^0)^T (x - x^0) \oplus \bigoplus_{k=1}^m w_k \nabla J_k(x^0)^T (x - x^0).$$ By using Theorem 3.3, $$\nabla F(x^0)^T (x - x^0) \oplus \bigoplus_{k=1}^m w_k \nabla J_k(x^0)^T (x - x^0) \preceq_{LC} \left(F(x) \oplus (-1)F(x^0)\right) \oplus \bigoplus_{k=1}^m w_k \left(J_k(x) \oplus (-1)J_k(x^0)\right) \preceq_{LC} \left(F(x) \oplus (-1)F(x^0)\right).$$ Therefore, for every $x \in A$, either $$0 \in \left(F(x) \oplus (-1)F(x^0)\right) \quad \text{or} \quad 0 \preceq_{LC} \left(F(x) \oplus (-1)F(x^0)\right).$$ In either case, $x^0$ is an efficient solution of the considered IOP. This completes the proof.

4. Applications of KKT conditions in SVM

Let us consider a binary classification problem. For a data set $$S = \{(u_i, v_i) : u_i \in \mathbb{R}^n,\ v_i \in \{-1, 1\},\ i = 1, 2, \ldots, k\},$$ the problem of classifying the data using SVMs is identical to the optimization problem
(26) $\begin{cases} \min F(z, d) = \frac{1}{2}\|z\|^2\\ \text{such that } v_i(z^T u_i + d) \ge 1,\ i = 1, 2, \ldots, k, \end{cases}$
where $z \in \mathbb{R}^n$ is the weight vector and $d \in \mathbb{R}$ is the bias.

The constraints specify that the data points of the two classes must lie on opposite sides of the separating hyperplanes $z^T u_i + d = \pm 1$. In many classification problems there is uncertainty and imprecision in the data set, which might be due to errors in measurement, implementation, and so on.

For example, weather problems involve interval-type data: we cannot measure the weather condition at a single instant of time; observations always refer to some duration and the circumstances during that time. The constraints in (26) do not deal with interval-type data. As a result, the SVM problem is adjusted for the interval-valued data set $$\{(U_i, v_i) : U_i \in \mathbb{I}^n,\ v_i \in \{-1, 1\},\ i = 1, 2, \ldots, k\}$$ as $$\begin{cases} \min F(z, d) = \frac{1}{2}\|z\|^2\\ \text{such that } v_i(z^T U_i \oplus d) \succeq_{LC} [1, 1],\ i = 1, 2, \ldots, k. \end{cases}$$ By adjusting the above problem accordingly, we get
(27) $\begin{cases} \min F(z, d) = \frac{1}{2}\|z\|^2\\ \text{such that } H_i(z, d) = [1, 1] \ominus_{gH} v_i(z^T U_i \oplus d) \preceq_{LC} 0,\ i = 1, 2, \ldots, k. \end{cases}$
We can see that $F$ and the $H_i$ are both gH-differentiable and LC-convex functions. Their gH-gradients are $$\nabla F(z, d) = (D_1F(z, d), D_2F(z, d))^T = (z, 0)^T$$ and $$\nabla H_i(z, d) = (D_1H_i(z, d), D_2H_i(z, d))^T = (-v_iU_i, -v_i)^T,$$ where $D_1$ and $D_2$ are the gH-partial derivatives corresponding to $z$ and $d$, respectively.

By Theorem 3.19, for an efficient point $(z, d)$ of (27) there exist non-negative constants $w_1, w_2, \ldots, w_k$ such that
(28) $0_v^{n+1} \in \left((z, 0)^T \oplus \bigoplus_{i=1}^k w_i(-v_iU_i, -v_i)^T\right)$
and
(29) $0 = w_iH_i(z, d)$, $i = 1, 2, \ldots, k$.
The condition (28) can be written as $$0_v^n \in \left([z, z] \oplus \bigoplus_{i=1}^k (-1)(w_iv_i)U_i\right) \quad \text{and} \quad \sum_{i=1}^k w_iv_i = 0.$$ The data points $U_i$ for which $w_i \neq 0$ are called support vectors. From (29), we notice that corresponding to any $w_i > 0$ we have $H_i(z, d) = 0$. Therefore, given $z$, the bias $d$ is a value such that $H_i(z, d) = 0$ for every $i \in \{1, 2, \ldots, k\}$ with $w_i > 0$.

Since the functions $F(z, d)$ and $H_i(z, d)$ are gH-differentiable and LC-convex, by Theorems 3.19 and 3.21 the collection of conditions that we must solve in order to find efficient solutions of the SVM IOP (27) is
(30) $\begin{cases} 0_v^n \in \left([z, z] \oplus \bigoplus_{i=1}^k (-1)(w_iv_i)U_i\right),\\ \sum_{i=1}^k w_iv_i = 0, \quad \text{and}\\ 0 = w_iH_i(z, d),\ i = 1, 2, \ldots, k. \end{cases}$
For any value of $z$ that fulfils the conditions in (30), we define the set of possible values of the bias as
(31) $\bigcap_{i : w_i > 0} \{d : H_i(z, d) = 0\}$.
By using any solution $\tilde{z}$ and $\tilde{d}$ of (30) and (31), a classifying hyperplane and the SVM classifier function are given respectively by $$\tilde{z}^T U \oplus \tilde{d} = 0 \quad \text{and} \quad p(U) = \mathrm{sign}(\tilde{z}^T U \oplus \tilde{d}).$$
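A short sketch (our illustration) of the resulting classifier $p(U)$: the sign of the interval $\tilde{z}^T U \oplus \tilde{d}$ is taken to be well defined whenever the interval lies strictly on one side of zero, which holds for correctly separated data; returning 0 for a straddling interval is our own convention.

```python
# Interval SVM classifier p(U) = sign(z^T U (+) d) on (lo, hi) interval data.

def add(C, D): return (C[0] + D[0], C[1] + D[1])
def scal(k, C): return (k * C[0], k * C[1]) if k >= 0 else (k * C[1], k * C[0])

def classify(z, d, U):
    acc = (d, d)                       # start from the degenerate interval [d, d]
    for zi, Ui in zip(z, U):
        acc = add(acc, scal(zi, Ui))   # accumulate z^T U (+) d
    if acc[1] < 0:
        return -1
    if acc[0] > 0:
        return +1
    return 0                           # interval straddles the hyperplane

z, d = (1.0, -1.0), 0.0                # the classifier found in Example 4.1 below
print(classify(z, d, [(3.0, 4.0), (1.0, 2.0)]))   # +1 (this is U1)
print(classify(z, d, [(0.0, 2.0), (3.0, 4.0)]))   # -1 (this is U6)
```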

Example 4.1

Let us consider the interval data set $$\begin{aligned} U_1 &= ([3, 4], [1, 2]), & v_1 &= 1, & U_2 &= ([4, 5], [2, 3]), & v_2 &= 1, & U_3 &= ([5, 6], [1, 2]), & v_3 &= 1,\\ U_4 &= ([0, 1], [1, 2]), & v_4 &= -1, & U_5 &= ([1, 2], [2, 3]), & v_5 &= -1, & U_6 &= ([0, 2], [3, 4]), & v_6 &= -1. \end{aligned}$$ Let us find a classifying hyperplane for this data set.

To find a classifying hyperplane, we require a solution $(z, d)$ of (30) together with the corresponding $w_i$'s.

For the choice $(w_1, w_2, w_3, w_4, w_5, w_6) = (1, 0, 0, 0, 0, 1)$, we notice that $$\sum_{i=1}^6 w_iv_i = 0.$$ The values $U_i$ for which $w_i \neq 0$ are $U_1$ and $U_6$. The first condition in (30) becomes
(32) $0_v^n \in \left([z, z] \oplus (-1)(1 \times 1)U_1 \oplus (-1)(1 \times (-1))U_6\right) \implies 0_v^n \in \left([z, z] \oplus (-1)U_1 \oplus U_6\right)$, or $[z, z] \in (-1)\left((-1)U_1 \oplus U_6\right)$,
where $$(-1)U_1 = (-1)([3, 4], [1, 2]) = ([-4, -3], [-2, -1]),$$ $$(-1)U_1 \oplus U_6 = ([-4, -3], [-2, -1]) \oplus ([0, 2], [3, 4]) = ([-4, -1], [1, 3]),$$ $$(-1)\left((-1)U_1 \oplus U_6\right) = (-1)([-4, -1], [1, 3]) = ([1, 4], [-3, -1]).$$ Substituting into (32), we have $$[z, z] \in ([1, 4], [-3, -1]).$$ As $z \in \mathbb{R}^n$ and we are working in 2D, $n = 2$ and $z = (z_1, z_2)$. The above condition becomes $$1 \le z_1 \le 4 \quad \text{and} \quad -3 \le z_2 \le -1.$$ We choose $z_1 = 1$ and $z_2 = -1$. Corresponding to this $z = (z_1, z_2) = (1, -1)$, from (31) and the third condition in (30), the set of possible values for the bias $d$ is $$\bigcap_{i = 1, 6} \{d \in \mathbb{R} : H_i(z, d) = 0\} = \{d \in \mathbb{R} : H_1(z, d) = 0\} \cap \{d \in \mathbb{R} : H_6(z, d) = 0\}.$$ Recall that $H_i(z, d) = [1, 1] \ominus_{gH} v_i(z^T U_i \oplus d) \preceq_{LC} 0$, $i = 1, 2, \ldots, k$. Then, for $i = 1$, we have $$H_1(z, d) = [1, 1] \ominus_{gH} 1\left((1, -1)([3, 4], [1, 2])^T \oplus d\right);$$ upon simplification and setting $H_1(z, d) = 0$, we get $$[\min(0 - d,\ -2 - d),\ \max(0 - d,\ -2 - d)] = 0.$$ By the definition of the gH-difference there are two cases, $-2 - d \le -d$ or $-d \le -2 - d$, but this does not matter because of the equality with 0: the endpoints vanish at $$0 - d = 0 \quad \text{and} \quad -2 - d = 0, \quad \text{i.e. } d = 0 \text{ and } d = -2.$$ For $H_1(z, d) = 0$, we therefore have
(33) $d \in [-2, 0]$.
Similarly, for $i = 6$, $$H_6(z, d) = [1, 1] \ominus_{gH} (-1)\left((1, -1)([0, 2], [3, 4])^T \oplus d\right);$$ upon simplification, we get $$H_6(z, d) = [1, 1] \ominus_{gH} [1 - d,\ 4 - d].$$ For $H_6(z, d) = 0$, we have $$[\min(-3 + d,\ 0 + d),\ \max(-3 + d,\ 0 + d)] = 0 \implies -3 + d = 0 \text{ and } 0 + d = 0 \implies d = 3 \text{ and } d = 0.$$ This implies that
(34) $d \in [0, 3]$.
Now, substituting (33) and (34) into $$\bigcap_{i = 1, 6} \{d \in \mathbb{R} : H_i(z, d) = 0\} = \{d \in \mathbb{R} : d \in [-2, 0]\} \cap \{d \in \mathbb{R} : d \in [0, 3]\},$$ we get $d = 0$. Therefore, corresponding to $z_1 = 1$ and $z_2 = -1$, the expression for the classifying hyperplane (see Figure 2) is $$z_1u_1 + z_2u_2 + d = 0,\ d = 0 \implies u_1 - u_2 = 0.$$ Similarly, we can find equations for other hyperplanes.
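The box constraint on $z$ in (32) can be reproduced mechanically; the sketch below (ours) computes $(-1)((-1)U_1 \oplus U_6)$ componentwise and recovers $1 \le z_1 \le 4$, $-3 \le z_2 \le -1$.

```python
# Reproduce the admissible box for z in Example 4.1 from condition (30).

def add(C, D): return (C[0] + D[0], C[1] + D[1])
def neg(C): return (-C[1], -C[0])      # (-1) * C

U1 = [(3.0, 4.0), (1.0, 2.0)]          # support vector with v1 = +1, w1 = 1
U6 = [(0.0, 2.0), (3.0, 4.0)]          # support vector with v6 = -1, w6 = 1

# [z, z] must lie in (-1)((-1)U1 (+) U6), componentwise
box = [neg(add(neg(a), b)) for a, b in zip(U1, U6)]
print(box)   # [(1.0, 4.0), (-3.0, -1.0)]: 1 <= z1 <= 4 and -3 <= z2 <= -1

# Choosing z = (1, -1), the bias sets d in [-2, 0] (from U1) and d in [0, 3]
# (from U6) intersect in d = 0, giving the hyperplane u1 - u2 = 0.
```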

Figure 1. The objective function $F(y_1, y_2)$ of Example 3.9.

Figure 2. The hyperplane represents $u_1 - u_2 = 0$.

Now, we give a graphical representation of this hyperplane in Figure 2.

By translating the above equation, we can obtain further classifying hyperplanes, as shown in Figure 3. The blocks in Figure 3 show the interval data set considered in the example. The red hyperplane is the best classifying hyperplane, representing the equation $2u_1 - 2u_2 + 1 = 0$, and the blue dotted lines are the margin lines, representing the equations $u_1 - u_2 = 0$ and $u_1 - u_2 + 1 = 0$. The strip formed by the equations $u_1 - u_2 = 0$ and $u_1 - u_2 + 1 = 0$ contains all the hyperplanes that classify the given data.

Figure 3. An interval-valued classifying hyperplane.

5. Conclusion

In this article, we considered constrained IOPs and characterized their efficient solutions. We generalized Gordan's theorems to obtain the Fritz John conditions for IOPs, and we derived an extension of the KKT necessary optimality conditions for the LC-partial order. The derived conditions formed the basis of our formulation for SVMs, and we demonstrated how SVMs classify data using the KKT conditions. Obtaining a classifying hyperplane for interval-valued data is not a simple task: in Example 4.1 we only consider a very small data set, and for a large data set the computation can be far more complex. Moreover, in this article we only tackle non-overlapping data sets; the method will not work for overlapping data sets. Therefore, in future work, SVM regression analysis might be applied to the SVM classifier, overlapping data sets could be discussed, and this idea could be generalized to multi-classification problems.

Authors' contributions

All authors contributed equally to this article. They have read and approved the final manuscript.

Disclosure statement

The authors confirm that there are no known conflicts of interest associated with this publication.

References

  • Wu HC. The Karush–Kuhn–Tucker optimality conditions in multiobjective programming problems with interval-valued objective functions. Eur J Oper Res. 2009;196:49–60. doi: 10.1016/j.ejor.2008.03.012
  • Wu HC. On interval-valued nonlinear programming problems. J Math Anal Appl. 2008;338:299–316. doi: 10.1016/j.jmaa.2007.05.023
  • Wu HC. The Karush–Kuhn–Tucker optimality conditions in an optimization problem with interval-valued objective function. Eur J Oper Res. 2007;176:46–59. doi: 10.1016/j.ejor.2005.09.007
  • Chalco-Cano Y, Lodwick WA, Rufian-Lizana A. Optimality conditions of type KKT for optimization problem with interval-valued objective function via generalized derivative. Fuzzy Optim Decis Mak. 2013;12:305–322. doi: 10.1007/s10700-013-9156-y
  • Singh D, Dar B, Kim DS. KKT optimality conditions in interval valued multiobjective programming with generalized differentiable functions. Eur J Oper Res. 2016;254:29–39. doi: 10.1016/j.ejor.2016.03.042
  • Singh D, Dar BA, Goyal A. KKT optimality conditions for interval valued optimization problems. J Nonlinear Anal Optim. 2014;5:91–103.
  • Slyn'ko VI, Tunç C. Instability of set differential equations. J Math Anal Appl. 2018;467:935–947. doi: 10.1016/j.jmaa.2018.07.048
  • Ben-Tal A, El Ghaoui L, Nemirovski A. Robust optimization. Princeton (NJ): Princeton University Press; 2009. (Princeton Series in Applied Mathematics).
  • Slowinski R, Teghem J, editors. Stochastic versus fuzzy approaches to multiobjective mathematical programming under uncertainty. Dordrecht: Kluwer Academic Publishers; 1990. (Theory and Decision Library, Series D: System Theory, Knowledge Engineering and Problem Solving; 6).
  • Snyder C. The optimization behind support vector machines and an application in handwriting recognition; 2016. p. 1–15.
  • Shawe-Taylor J, Bartlett PL, Williamson RC, Anthony M. Structural risk minimization over data-dependent hierarchies. IEEE Trans Inform Theory. 1998;44:1926–1940. doi: 10.1109/18.705570
  • Vapnik V. Estimation of dependences based on empirical data (translated from the Russian by Samuel Kotz). New York–Berlin: Springer-Verlag; 1982. (Springer Series in Statistics).
  • Vapnik VN. The nature of statistical learning theory. New York: Springer-Verlag; 1995.
  • Boser BE, Guyon IM, Vapnik VN. A training algorithm for optimal margin classifiers. Proceedings of the fifth annual workshop on Computational learning theory. 1992; p. 144–152.
  • Stefanini L, Bede B. Generalized Hukuhara differentiability of interval-valued functions and interval differential equations. Nonlinear Anal. 2009;71:1311–1328. doi: 10.1016/j.na.2008.12.005
  • Ghosh D, Singh A, Shukla KK, et al. Extended Karush–Kuhn–Tucker condition for constrained interval optimization problems and its application in support vector machines. Inform Sci. 2019;504:276–292. doi: 10.1016/j.ins.2019.07.017
  • Debnath AK, Ghosh D. Generalized-Hukuhara penalty method for optimization problem with interval-valued functions and its application in interval-valued portfolio optimization problems. Oper Res Lett. 2022;50:602–609. doi: 10.1016/j.orl.2022.08.010
  • Kumar K, Ghosh D, Kumar G. Weak sharp minima for interval-valued functions and its primal-dual characterizations using generalized Hukuhara subdifferentiability. Soft Comput. 2022;26:10253–10273. doi: 10.1007/s00500-022-07332-0
  • Younus A, Nisar O. Convex optimization of interval valued functions on mixed domains. Filomat. 2019;33:1715–1725. doi: 10.2298/FIL1906715Y
  • Dastgeer Z, Younus A, Tunç C. Distinguishability of the descriptor systems with regular pencil. Linear Algebra Appl. 2022;652:82–96. doi: 10.1016/j.laa.2022.07.004
  • Chalco-Cano Y, Rufián-Lizana A, Román-Flores H, et al. Calculus for interval-valued functions using generalized Hukuhara derivative and applications. Fuzzy Sets Syst. 2013;219:49–67. doi: 10.1016/j.fss.2012.12.004
  • Tao J, Zhang ZH. Properties of interval vector-valued arithmetic based on gH-difference. Math Comput. 2015;4:7–12.
  • Ghosh D. Newton method to obtain efficient solutions of the optimization problems with interval-valued objective functions. J Appl Math Comput. 2017;53:709–731. doi: 10.1007/s12190-016-0990-2