Misplaced Pages

Minimum relevant variables in linear system

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.

Minimum relevant variables in linear system (Min-RVLS) is a problem in mathematical optimization. Given a linear program, it is required to find a feasible solution in which the number of non-zero variables is as small as possible.

The problem is known to be NP-hard and even hard to approximate.

Definition

A Min-RVLS problem is defined by:

  • A binary relation R, which is one of {=, ≥, >, ≠};
  • An m-by-n matrix A (where m is the number of constraints and n the number of variables);
  • An m-by-1 vector b.

The linear system is given by: A x R b. It is assumed to be feasible (i.e., satisfied by at least one x). Depending on R, there are four different variants of this system: A x = b, A x ≥ b, A x > b, A x ≠ b.

The goal is to find an n-by-1 vector x that satisfies the system A x R b, and subject to that, contains as few as possible nonzero elements.

Special case

The problem Min-RVLS was presented by Garey and Johnson, who called it "minimum weight solution to linear equations". They proved it was NP-hard, but did not consider approximations.

Applications

The Min-RVLS problem is important in machine learning and linear discriminant analysis. Given a set of positive and negative examples, it is required to minimize the number of features that are required to correctly classify them. The problem is known as the minimum feature set problem. An algorithm that approximates Min-RVLS within a factor of O ( log ( m ) ) {\displaystyle O(\log(m))} could substantially reduce the number of training samples required to attain a given accuracy level.

The shortest codeword problem in coding theory is the same problem as Min-RVLS when the coefficients are in GF(2).

Related problems

In minimum unsatisfied linear relations (Min-ULR), we are given a binary relation R and a linear system A x R b, which is now assumed to be infeasible. The goal is to find a vector x that violates as few relations as possible, while satisfying all the others.

Min-ULR is trivially solvable, since any system with real variables and a finite number of inequality constraints is feasible. As for the other three variants:

  • Min-ULR are NP-hard even with homogeneous systems and bipolar coefficients (coefficients in {1,-1}).
  • The NP-complete problem Minimum feedback arc set reduces to Min-ULR, with exactly one 1 and one -1 in each constraint, and all right-hand sides equal to 1.
  • Min-ULR are polynomial if the number of variables n is constant: they can be solved polynomially using an algorithm of Greer in time O ( n m n / 2 n 1 ) {\displaystyle O(n\cdot m^{n}/2^{n-1})} .
  • Min-ULR are linear if the number of constraints m is constant, since all subsystems can be checked in time O(n).
  • Min-ULR is polynomial in some special case.
  • Min-ULR can be approximated within n + 1 in polynomial time.
  • Min-ULR are minimum-dominating-set-hard, even with homogeneous systems and ternary coefficients (in {−1,0,1}).
  • Min-ULR cannot be approximated within a factor of 2 log 1 ε n {\displaystyle 2^{\log ^{1-\varepsilon }n}} , for any ε > 0 {\displaystyle \varepsilon >0} , unless NP is contained in DTIME( n polylog ( n ) {\displaystyle n^{\operatorname {polylog} (n)}} ).

In the complementary problem maximum feasible linear subsystem (Max-FLS), the goal is to find a maximum subset of the constraints that can be satisfied simultaneously.

  • Max-FLS can be solved in polynomial time.
  • Max-FLS is NP-hard even with homogeneous systems and bipolar coefficients.
  • . With integer coefficients, it is hard to approximate within m ε {\displaystyle m^{\varepsilon }} . With coefficients over GF, it is q-approximable.
  • Max-FLS and Max-FLS are NP-hard even with homogeneous systems and bipolar coefficients. They are 2-approximable, but they cannot be approximated within any smaller constant factor.

Hardness of approximation

All four variants of Min-RVLS are hard to approximate. In particular all four variants cannot be approximated within a factor of 2 log 1 ε n {\displaystyle 2^{\log ^{1-\varepsilon }n}} , for any ε > 0 {\displaystyle \varepsilon >0} , unless NP is contained in DTIME( n polylog ( n ) {\displaystyle n^{\operatorname {polylog} (n)}} ). The hardness is proved by reductions:

  • There is a reduction from Min-ULR to Min-RVLS. It also applies to Min-RVLS and Min-RVLS, since each equation can be replaced by two complementary inequalities.
  • There is a reduction from minimum-dominating-set to Min-RVLS.

On the other hand, there is a reduction from Min-RVLS to Min-ULR. It also applies to Min-ULR and Min-ULR, since each equation can be replaced by two complementary inequalities.

Therefore, when R is in {=,>,≥}, Min-ULR and Min-RVLS are equivalent in terms of approximation hardness.

References

  1. ^ Amaldi, Edoardo; Kann, Viggo (December 1998). "On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems". Theoretical Computer Science. 209 (1–2): 237–260. doi:10.1016/S0304-3975(97)00115-1.
  2. Garey, Michael R.; Johnson, David S. (1979). Computers and Intractability: A Guide to the Theory of NP-completeness. W. H. Freeman. ISBN 978-0-7167-1044-8.
  3. Koehler, Gary J. (November 1991). "Linear Discriminant Functions Determined by Genetic Search". ORSA Journal on Computing. 3 (4): 345–357. doi:10.1287/ijoc.3.4.345.
  4. Van Horn, Kevin S.; Martinez, Tony R. (January 1994). "The minimum feature set problem". Neural Networks. 7 (3): 491–494. doi:10.1016/0893-6080(94)90082-5.
  5. ^ Amaldi, Edoardo; Kann, Viggo (August 1995). "The complexity and approximability of finding maximum feasible subsystems of linear relations". Theoretical Computer Science. 147 (1–2): 181–210. doi:10.1016/0304-3975(94)00254-G.
  6. ^ Sankaran, Jayaram K (February 1993). "A note on resolving infeasibility in linear programs by constraint relaxation". Operations Research Letters. 13 (1): 19–20. doi:10.1016/0167-6377(93)90079-V.
  7. Greer, R. (2011). Trees and Hills: Methodology for Maximizing Functions of Systems of Linear Relations. Elsevier. ISBN 978-0-08-087207-0.
  8. Arora, Sanjeev; Babai, László; Stern, Jacques; Sweedyk, Z (April 1997). "The Hardness of Approximate Optima in Lattices, Codes, and Systems of Linear Equations". Journal of Computer and System Sciences. 54 (2): 317–331. doi:10.1006/jcss.1997.1472.
Categories: