Posted by: samuelmarkreid | March 17, 2012

Convex Geometry I: Hyperplanes

This entry is the first post in a series on convex geometry, and in particular convex bodies, that aims to give a summary of the introductory ideas in discrete geometry necessary to begin research on convex bodies and other topics in convex geometry.

\mathbb{R}^d is the set of all d-tuples x=(\alpha_{1},...,\alpha_{d}) of real numbers \alpha_{1},...,\alpha_{d}. The elements of \mathbb{R}^d are called vectors, and the zero vector is denoted 0. That is to say, each element x of \mathbb{R}^d is a point in d-space whose i-th coordinate is the real number \alpha_{i}, 1 \leq i \leq d. In this way, each element of \mathbb{R}^d is identified with a unique point in d-space. By exploring the ways in which certain subsets of \mathbb{R}^d interact with each other, the wonderful structures studied in convex geometry emerge. Toward this end, the background terminology necessary to define a hyperplane in \mathbb{R}^d is now presented.

Generally, a hyperplane is a certain affine subspace of \mathbb{R}^d (a notion defined below) with the property that it separates \mathbb{R}^d into two parts (called halfspaces) such that the line segment connecting any two points lying on opposite sides of the hyperplane must intersect the hyperplane. For d=2 and d=3, a hyperplane is a line and a plane, respectively. Defining this property formally requires extending linear concepts, such as linear combinations, linear subspaces, and linear functions, to their affine versions. Before discussing these extensions, the definitions and basic results from linear algebra that underlie the linear concepts are presented.

For scalars \lambda_{1},...,\lambda_{n} \in \mathbb{R} and vectors x_{1},...,x_{n} \in \mathbb{R}^d, a vector \alpha \in \mathbb{R}^d is a linear combination of x_{1},...,x_{n} when \alpha = \lambda_{1}x_{1} + ... + \lambda_{n}x_{n}. So, a linear combination of vectors from \mathbb{R}^d is a vector formed by summing scalar multiples of other vectors in \mathbb{R}^d. Linear combinations immediately apply to defining a linear subspace of \mathbb{R}^d; that is to say, a non-empty set L \subseteq \mathbb{R}^d such that \lambda_{1}x_{1} + \lambda_{2}x_{2} \in L, for all x_{1},x_{2} \in L and \lambda_{1},\lambda_{2} \in \mathbb{R}. Intuitively, a linear subspace can be thought of as a set which, given the basic operations of vector addition and multiplication by scalars from the field (in this case \mathbb{R}), keeps to itself; that is, any linear combination of vectors from L stays in L.

Considering that a linear subspace is a subset of \mathbb{R}^d, it will inherit certain global properties, such as the vector operations, but how is it possible to characterize the difference between a linear subspace and the entire space? While there are many answers to this question, the general approach is to identify the “important vectors” from which the linear subspace can be generated by linear combinations. In order to single out such vectors, the notion of linear independence, which states that no vector in a given set is a linear combination of the other vectors in that set, is necessary. In general, a set of vectors (x_{1},...,x_{n}) is linearly independent if and only if \lambda_{1}x_{1} + ... + \lambda_{n}x_{n} = 0 implies \lambda_{1}=...=\lambda_{n}=0, for all \lambda_{i} \in \mathbb{R}, 1 \leq i \leq n. With this notion of linear independence, the vectors necessary to characterize a linear subspace are narrowed down to a set of linearly independent vectors. Yet, these linearly independent vectors still need to “capture” the entire linear subspace, and so every vector in the linear subspace must be expressible as a linear combination of them. To make this precise, for a set of vectors x_{1},...,x_{n} \in \mathbb{R}^d, the linear hull, \text{span}\{x_{1},...,x_{n}\}, is the set of all linear combinations of x_{1},...,x_{n}. Therefore, the “important vectors” referred to earlier, the basis vectors, are a set of linearly independent vectors which span the linear subspace. Formally, a linear basis for a linear subspace L \subseteq \mathbb{R}^d is a linearly independent set of vectors (x_{1},...,x_{n}) from L such that L = \text{span}\{x_{1},...,x_{n}\}. As a result, a linear subspace is characterized by its dimension, written dim(L)=n, which is the number of vectors in a linear basis for L (every linear basis for L has the same number of vectors).
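The definitions of linear independence and dimension above can be checked computationally. The following is only a minimal sketch: it assumes vectors are given as tuples of integers or rationals, and the helper names `rank` and `linearly_independent` are hypothetical names chosen for this illustration. It relies on the standard fact that a set of vectors is linearly independent exactly when the rank of the matrix having those vectors as rows equals the number of vectors.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors (taken as rows), by Gaussian elimination
    over exact rationals. This equals the dimension of their span."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate this column from every row below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def linearly_independent(vectors):
    """True when only the trivial linear combination of the vectors gives 0."""
    return rank(vectors) == len(vectors)

print(linearly_independent([(1, 0, 0), (1, 1, 0)]))  # True
print(linearly_independent([(1, 2, 3), (2, 4, 6)]))  # False
print(rank([(1, 0, 0), (1, 1, 0), (2, 1, 0)]))       # 2
```

In the last call the third vector is the sum of the first two, so the three vectors span only a 2-dimensional linear subspace of \mathbb{R}^3.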

To motivate the generalization of linear concepts to affine concepts, there must be a coherent picture of the distinction. What, in general, is the difference between a linear subspace and an affine subspace of \mathbb{R}^d? Well, a linear subspace L was defined to have the property that \lambda_{1}x_{1} + \lambda_{2}x_{2} \in L, for all x_{1},x_{2} \in L and \lambda_{1},\lambda_{2} \in \mathbb{R}; this means that for \lambda_{1}=0, \lambda_{2}=0, the linear combination 0x_{1} + 0x_{2} = 0 \in L. So, any linear subspace must contain the zero vector, whereas for affine subspaces this is not necessarily the case. An affine subspace A \subseteq \mathbb{R}^d can be thought of as a translation of a linear subspace, A = L + x, where x \in \mathbb{R}^d. To capture this property, let A \subseteq \mathbb{R}^d be an affine subspace if and only if \lambda_{1}x_{1} + \lambda_{2}x_{2} \in A, for all x_{1},x_{2} \in A and \lambda_{1},\lambda_{2} \in \mathbb{R} such that \lambda_{1} + \lambda_{2} = 1. This gives rise to the definition of an affine combination, which is a linear combination \lambda_{1}x_{1} + ... + \lambda_{n}x_{n} with the additional condition that \lambda_{1}+...+\lambda_{n}=1. The requirement that the coefficients sum to one may seem strange, but it is necessary in order to discuss combinations of points independently of the choice of origin. To reify this necessity in \mathbb{R}^3, let x_{1}=(1,3,2), x_{2}=(0,1,-3), x_{3}=(1,0,2) and let the coefficients of the affine combination be \lambda_{1}=\frac{1}{2}, \lambda_{2}=\frac{1}{4}, \lambda_{3}=\frac{1}{4}. For the origin at (0,0,0), \lambda_{1}x_{1} + \lambda_{2}x_{2} + \lambda_{3}x_{3} = (\frac{3}{4},\frac{7}{4},\frac{3}{4}). For an origin at o=(1,0,-4), the combination is formed from the displacements x_{i}-o and then translated back by o, and it is still the case that o + \lambda_{1}(x_{1}-o) + \lambda_{2}(x_{2}-o) + \lambda_{3}(x_{3}-o) = (\frac{3}{4},\frac{7}{4},\frac{3}{4}), because the terms involving o cancel exactly when the coefficients sum to one. Affine combinations are truly independent of the origin!
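The origin-independence in the example above can be verified directly. This is a small sketch under one stated assumption about what "an origin at o" means: each point x_{i} is replaced by its displacement x_{i}-o, the combination is formed, and the result is translated back by o. The function name `combine` is hypothetical.

```python
from fractions import Fraction

def combine(coeffs, points, origin):
    """Form origin + sum(lambda_i * (x_i - origin)), coordinate by coordinate."""
    return tuple(
        origin[k] + sum(lam * (x[k] - origin[k]) for lam, x in zip(coeffs, points))
        for k in range(len(origin))
    )

points = [(1, 3, 2), (0, 1, -3), (1, 0, 2)]
affine = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # coefficients sum to 1
not_affine = [Fraction(1), Fraction(1), Fraction(1)]       # coefficients sum to 3

expected = (Fraction(3, 4), Fraction(7, 4), Fraction(3, 4))

# The affine combination gives the same point for both origins.
print(combine(affine, points, (0, 0, 0)) == expected)   # True
print(combine(affine, points, (1, 0, -4)) == expected)  # True

# A general linear combination depends on the chosen origin.
print(combine(not_affine, points, (0, 0, 0)) == combine(not_affine, points, (1, 0, -4)))  # False
```

The last line shows why the constraint matters: with coefficients summing to 3, the two origins give different points.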

As in the linear case, the affine hull of a set of points (x_{1},...,x_{n}) is the set of all affine combinations of those points, denoted \text{aff}\{x_{1},...,x_{n}\}; it is the smallest affine subspace which contains the points. Analogously to linear independence, affine independence is defined in terms of linear combinations of points equal to zero, with the additional constraint that the coefficients sum to zero. A set of points (x_{1},...,x_{n}) is affinely independent if and only if \lambda_{1}x_{1} + ... + \lambda_{n}x_{n} = 0 together with \lambda_{1}+...+\lambda_{n}=0 implies that \lambda_{1}=...=\lambda_{n}=0, for all \lambda_{i} \in \mathbb{R}, 1 \leq i \leq n. Hence, an affine basis of an affine space A is an affinely independent set of points (x_{1},...,x_{n}) from A such that A=\text{aff}\{x_{1},...,x_{n}\}. Since affine combinations can be thought of as linear combinations without reference to the origin, the affine independence of n points is equivalent to the linear independence of the n-1 vectors obtained by fixing one point as the origin and taking the differences x_{2}-x_{1},...,x_{n}-x_{1}. It is then no surprise that the dimension of a non-empty affine space A is the dimension of the linear subspace L such that A=L+x. That is, dim(A)=n-1 if and only if n is the largest non-negative integer such that there is an affinely independent set of n points from A. While the jump from linear concepts to affine concepts may be overwhelming at first, the intuitive notion of a hyperplane reifies much of what has been discussed into a simple geometric picture.
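The reduction of affine independence to linear independence described above (fix one point as the origin and test the difference vectors) can be sketched as follows. The names `rank`, `affine_dimension`, and `affinely_independent` are hypothetical helpers for this illustration, and points are assumed to be tuples of integers or rationals.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of row vectors by exact Gaussian elimination."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def affine_dimension(points):
    """Dimension of the affine hull of a non-empty list of points:
    the rank of the differences x_i - x_1."""
    base = points[0]
    diffs = [[a - b for a, b in zip(p, base)] for p in points[1:]]
    return rank(diffs)

def affinely_independent(points):
    """n points are affinely independent when their hull has dimension n - 1."""
    return affine_dimension(points) == len(points) - 1

print(affinely_independent([(0, 0), (1, 0), (0, 1)]))  # True  (a triangle)
print(affinely_independent([(1, 1), (2, 2), (3, 3)]))  # False (collinear points)
print(affinely_independent([(1, 1), (2, 2)]))          # True
print(affine_dimension([(1, 1), (2, 2), (3, 3)]))      # 1
```

The third call is worth a second look: (1,1) and (2,2) are linearly dependent as vectors, yet affinely independent as points, since two distinct points always determine a line.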

At long last, for an affine space A with dim(A)=n, a hyperplane, H \subset A, is an (n-1)-dimensional affine subspace of A. As desired, this definition fits with our understanding of a line in \mathbb{R}^2 and a plane in \mathbb{R}^3. The fundamental property of a hyperplane is that it splits the space it is contained in into two halfspaces, which are called closed or open depending on whether or not they contain the hyperplane. Explicitly, a hyperplane in \mathbb{R}^d is a set of points and can be represented as such by H(y,\alpha)=\{x \in \mathbb{R}^d | \langle x,y\rangle = \alpha\}, where y \in \mathbb{R}^d \setminus \{0\}, \alpha \in \mathbb{R}, and \langle x,y \rangle denotes the standard inner product or dot product between x and y. Thus, the two closed halfspaces in \mathbb{R}^d bounded by the hyperplane H(y,\alpha) are K(y,\alpha)=\{x \in \mathbb{R}^d | \langle x,y\rangle \leq \alpha \} and K(-y,-\alpha)=\{x \in \mathbb{R}^d | \langle x,y\rangle \geq \alpha \}, with the open halfspaces given by the corresponding strict inequalities. Topologically, the boundary of a closed halfspace is a hyperplane, the interior of a closed halfspace is the closed halfspace without the hyperplane, and the closure of the interior of a closed halfspace is the closed halfspace. The idea of a hyperplane is foundational in the study of convex geometry as it illustrates powerful ideas, such as duality and polarity, concretely, while proving useful in functional analysis and numerous other fields of mathematics.
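As a concrete illustration of the representation H(y,\alpha), the sketch below classifies points by the sign of \langle x,y\rangle - \alpha and finds where a segment joining two points on opposite sides meets the hyperplane. The function names `dot`, `side`, and `segment_crossing` are hypothetical, and the example hyperplane (the line with y=(1,1), \alpha=2 in \mathbb{R}^2) is chosen only for illustration.

```python
def dot(x, y):
    """Standard inner product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(x, y))

def side(x, y, alpha):
    """0 if x lies on H(y, alpha); -1 if <x,y> < alpha; +1 if <x,y> > alpha."""
    value = dot(x, y) - alpha
    return (value > 0) - (value < 0)

def segment_crossing(p, q, y, alpha):
    """Point where the segment from p to q meets H(y, alpha),
    assuming p and q lie strictly on opposite sides of the hyperplane."""
    fp, fq = dot(p, y) - alpha, dot(q, y) - alpha
    if fp * fq >= 0:
        raise ValueError("p and q must lie strictly on opposite sides")
    # Along p + t(q - p), the value <x,y> - alpha is fp + t(fq - fp); solve for zero.
    t = fp / (fp - fq)
    return tuple(a + t * (b - a) for a, b in zip(p, q))

y, alpha = (1, 1), 2          # the line x_1 + x_2 = 2 in the plane
p, q = (0, 0), (3, 1)

print(side(p, y, alpha))      # -1
print(side(q, y, alpha))      # 1
print(side((1, 1), y, alpha)) # 0

crossing = segment_crossing(p, q, y, alpha)
print(crossing)               # (1.5, 0.5)
print(side(crossing, y, alpha))  # 0
```

Since \langle x,y\rangle - \alpha changes sign between p and q and varies linearly along the segment, it must vanish at exactly one point of the segment, which is the separation property used informally at the start of the post.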