http://cogs.csustan.edu/~tom/linearalgebra
Santa Fe Institute
Complex Systems Summer School
Our general topics:
(ex): exercises.
Why linear algebra
Beyond that, linear algebra courses often mark the transition from lower-division mathematics courses such as calculus, probability/statistics, and elementary differential equations, which typically focus on specific problem-solving techniques, to the more theoretical, axiomatic, and proof-oriented upper-division mathematics courses.
I am going to stay with a generally abstract, axiomatic presentation of the basics of linear algebra. (But I'll also try to provide some practical advice along the way ... :)




For u, v, w ∈ V, and a, b ∈ F, these operations satisfy the properties:


Exercises: Vector spaces
Examples of vector spaces


When we multiply by a scalar a ∈ R, we get




C^{0}(R) thus becomes a real vector space, where each continuous function is a vector in the space.
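As a concrete (if informal) illustration, here is a small Python sketch of the pointwise operations that make continuous functions into vectors; the helper names add and scale are just for this example:

```python
import math

# Pointwise operations make real-valued functions into a vector space:
# (f + g)(x) = f(x) + g(x), and (a * f)(x) = a * f(x).

def add(f, g):
    """Vector addition of functions: pointwise sum."""
    return lambda x: f(x) + g(x)

def scale(a, f):
    """Scalar multiplication: pointwise scaling."""
    return lambda x: a * f(x)

f = math.sin
g = math.cos
h = add(scale(2.0, f), g)      # the "vector" 2*sin + cos

x = 0.5
print(h(x), 2 * math.sin(x) + math.cos(x))  # same value
```

The zero vector is the constant function 0, and the additive inverse of f is (-1) * f, just as the axioms require.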
Exercises: Examples of vector spaces


Subspaces
Thus, when we inherit the operations from V, we will have that





Exercises: Subspaces


From here on, we'll assume that U and V are vector spaces over a field F.

This motivates a general definition: a set S of vectors in a vector space V is called linearly dependent if, for some n > 0, and distinct v_{1}, v_{2}, …, v_{n} ∈ S, there exist a_{1}, a_{2}, …, a_{n} ∈ F, not all 0, with


A useful way to think about this is that a set S of vectors is linearly independent if no one of the vectors is a linear combination of finitely many of the others.
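In F^{n}, a practical way to test a finite set of vectors for linear dependence is to row-reduce and compare the rank with the number of vectors. A small pure-Python sketch (the rank and independent helpers are just for illustration):

```python
def rank(rows, tol=1e-12):
    """Rank of a list of row vectors, via Gaussian elimination."""
    m = [list(map(float, r)) for r in rows]
    r = 0
    cols = len(m[0]) if m else 0
    for c in range(cols):
        # find a pivot row for column c
        piv = next((i for i in range(r, len(m)) if abs(m[i][c]) > tol), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and abs(m[i][c]) > tol:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def independent(vectors):
    """A finite set is linearly independent iff its rank equals its size."""
    return rank(vectors) == len(vectors)

print(independent([[1, 0, 0], [0, 1, 0], [1, 1, 0]]))  # False: v3 = v1 + v2
print(independent([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))  # True
```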
One of the hardest parts of doing mathematics is developing your mathematical intuition. It is tempting to imagine that intuition is what you have before you know anything, but that is nonsense. Intuition is just the automatic part of your knowledge, derived from your past experience. Becoming better at mathematics involves learning new mathematics, and then integrating that new knowledge into your intuition. Doing that takes care, precision, and lots of practice!
Exercises: Linear dependence and independence


Then we call S a basis for the vector space V.


We define dim({0}) = 0.
Note that if a vector space V has a finite basis of size n, then every basis for V contains n vectors, and thus the definition makes sense.
For example, dim(F^{n}) = n for any field F and n > 0.
We also have that dim(F^{n}[x]) = n + 1.
Each b_{i} is either the nonzero coefficient corresponding to the ith element of S from the unique representation described above, or 0 if that basis element does not appear there. In the infinite case, only finitely many of the b_{i} are nonzero.
We represent 0 by (0, 0, …, 0) in the finite case, and by (0, 0, …) in the infinite case.


Find a basis for U.

An equivalent pair of conditions is that T(u_{1} + u_{2}) = T(u_{1}) + T(u_{2}), and T(au) = aT(u).
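These two conditions can be spot-checked numerically for a map given by a matrix. A small Python sketch (the matrix A here is an arbitrary example):

```python
import random

def matvec(A, u):
    """Apply the linear map represented by matrix A to the vector u."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

A = [[2.0, -1.0], [0.0, 3.0]]
T = lambda u: matvec(A, u)

random.seed(0)
u1 = [random.uniform(-1, 1) for _ in range(2)]
u2 = [random.uniform(-1, 1) for _ in range(2)]
a = random.uniform(-1, 1)

# First condition: T(u1 + u2) == T(u1) + T(u2)
lhs = T([x + y for x, y in zip(u1, u2)])
rhs = [x + y for x, y in zip(T(u1), T(u2))]
assert all(abs(p - q) < 1e-12 for p, q in zip(lhs, rhs))

# Second condition: T(a*u) == a*T(u)
lhs = T([a * x for x in u1])
rhs = [a * x for x in T(u1)]
assert all(abs(p - q) < 1e-12 for p, q in zip(lhs, rhs))
print("linearity checks passed")
```

Of course, passing such checks on samples does not prove linearity; it only illustrates the two conditions.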











Recall that (S ∘ T)(u) = S(T(u)).
Given particular bases for U, V, and W, the matrix representation of S ∘ T is the matrix product [S ∘ T] = [S][T]. We usually abbreviate S ∘ T as ST.
It is worth noting that unless W ⊆ U, it doesn't even make sense to talk about T ∘ S.
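A quick Python sketch of this fact, checking S(T(u)) against [S][T]u for a pair of small matrices (chosen arbitrarily for illustration):

```python
def matvec(A, u):
    """Apply the matrix A to the vector u."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def matmul(S, T):
    """Matrix product [S][T], the matrix of the composition S following T."""
    return [[sum(S[i][k] * T[k][j] for k in range(len(T)))
             for j in range(len(T[0]))] for i in range(len(S))]

S = [[1, 2], [3, 4]]
T = [[0, 1], [1, 0]]   # a coordinate swap

u = [5, 7]
via_composition = matvec(S, matvec(T, u))   # S(T(u))
via_product = matvec(matmul(S, T), u)       # [S][T] u
print(via_composition, via_product)         # the same vector both ways
```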






In general, for a vector space V, a linear transformation T : V → F is called a linear functional. The study of these transformations is called functional analysis.





Monomorphisms are nice because the subspace im(T) ⊂ V looks just like U.
Epimorphisms are nice because the algebraic properties of V will be reflected back in U.
A bijective morphism whose inverse also preserves algebraic structure is called an isomorphism. In linear algebra, we have the nice property that if a linear transformation is bijective, then its inverse is also linear, and thus it is an isomorphism.
Pf.: Suppose T is a monomorphism. We know that for every linear transformation, T(0) = 0. Then, since T is a monomorphism, we know that if T(u) = 0 = T(0), it must be that u = 0. Thus ker(T) = {0}.
On the other hand, suppose that ker(T) = {0}. Then, if T(u_{1}) = T(u_{2}), we will have 0 = T(u_{1}) − T(u_{2}) = T(u_{1} − u_{2}). But this means that u_{1} − u_{2} ∈ ker(T) and hence, since we are assuming that ker(T) = {0}, we must have u_{1} − u_{2} = 0, or u_{1} = u_{2}. By the contrapositive, this means that if u_{1} ≠ u_{2}, then T(u_{1}) ≠ T(u_{2}). Q.E.D.
(I had to do at least one proof,
didn't I? :)

(Big) hint for proof: Let (u_{1}, u_{2}, …, u_{n}) and (v_{1}, v_{2}, …, v_{n}) be bases for U and V respectively. Define T : U → V by T(u_{i}) = v_{i} for 1 ≤ i ≤ n, and extend by linearity. Make sense of the phrase "extend by linearity," and then show that T is an isomorphism.
This means, for example, that such a T is onto if and only if ker(T) = {0}.

This transformation is an isomorphism. (ex)
Exercises: Morphisms (mono, epi, and iso)









Thus, if we denote by O(n), U(n), and Sp(n) the distance preserving linear operators on R^{n}, C^{n}, and H^{n} respectively (called the orthogonal, unitary, and symplectic groups), then we have the monomorphisms:


L(V) has the algebraic structure of a ring with identity. A ring is similar to a field (as defined above), except without the requirements that multiplication be commutative and that there be multiplicative inverses for nonzero elements. The identity element is I_{V}. L(V) is a noncommutative ring, since in general ST ≠ TS. This is reflected in the fact that matrix multiplication is noncommutative. Only in very special cases is it true that [S][T] = [T][S] (for example, if both [S] and [T] are diagonal matrices, with [S]_{ij} = 0 for i ≠ j, and [T]_{ij} = 0 for i ≠ j).
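A minimal Python illustration of the noncommutativity of matrix multiplication, and of the special diagonal case (the particular matrices are arbitrary examples):

```python
def matmul(A, B):
    """2x2 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

S = [[0, 1], [0, 0]]
T = [[0, 0], [1, 0]]

ST = matmul(S, T)   # [[1, 0], [0, 0]]
TS = matmul(T, S)   # [[0, 0], [0, 1]]
print(ST != TS)     # True: ST and TS differ

# Diagonal matrices, by contrast, do commute:
D1 = [[2, 0], [0, 3]]
D2 = [[5, 0], [0, 7]]
print(matmul(D1, D2) == matmul(D2, D1))  # True
```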








If we let A : F^{n} → F^{m} be the linear transformation with [A]_{ij} = a_{ij}, x be the vector (x_{1}, x_{2}, …, x_{n})^{t}, and b the vector (b_{1}, b_{2}, …, b_{m})^{t}, then we can rewrite the equation in the form:


On the other hand, if b ∈ im(A) there is at least one solution x_{0} with Ax_{0} = b.
Note, though, that we also know A is not a monomorphism, and hence dim(ker(A)) ≥ 1. Then, if z ∈ ker(A), we have A(x_{0} + z) = Ax_{0} + Az = Ax_{0} + 0 = Ax_{0} = b, and so x_{0} + z is another solution. Furthermore, if y is another solution with Ay = b, then A(x_{0} − y) = Ax_{0} − Ay = b − b = 0. This means x_{0} − y ∈ ker(A), and so y = x_{0} + z for some z ∈ ker(A).
Thus, if x_{0} is a particular solution, then every solution is of the form x_{0} + z for some z ∈ ker(A). The space of solutions is then a translation of the kernel of A, of the form x_{0} + ker(A). We then only need to find one particular solution.
In this case, we have broken the problem down into two parts: first, we solve Ax = 0 (called the homogeneous equation), then we find a single solution Ax_{0} = b. For F = R or C, there will be infinitely many solutions.
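A tiny worked instance in Python, assuming the single equation x + y = 2 (so A = [1 1], an example not from the notes): every solution is one particular solution plus something in ker(A):

```python
def matvec(A, u):
    """Apply the matrix A to the vector u."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

# One equation in two unknowns: x + y = 2, i.e. A = [1 1], b = [2].
A = [[1.0, 1.0]]
b = [2.0]

x0 = [2.0, 0.0]      # a particular solution
z = [1.0, -1.0]      # spans ker(A): A z = 0

print(matvec(A, x0))                      # [2.0]
for t in (-1.0, 0.5, 3.0):
    x = [p + t * q for p, q in zip(x0, z)]
    print(matvec(A, x))                   # always [2.0]
```

Here ker(A) is one-dimensional, so (over R) there are indeed infinitely many solutions, one for each value of t.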

Note that ker(D) = span({1}), the one-dimensional space consisting of all constant functions. If we collapse ker(D) down to nothing (in technical terms, form the quotient space ...) then we can think of D as an isomorphism (on the quotient space). D has an inverse,





We first note that if the function f_{0} is a solution to this equation, and z = z(x) ∈ ker(P(D)), then f_{0} + z is also a solution, and if f_{1} is another solution, then P(D)(f_{0} − f_{1}) = P(D)(f_{0}) − P(D)(f_{1}) = g − g = 0. Thus, all solutions are of the form f_{0} + z where f_{0} is some particular solution, and z ∈ ker(P(D)).
We thus separate the problem into two parts. First we solve the associated homogeneous equation:

In general, we have that dim(ker(P(D))) = n, the degree of P. This is not an entirely obvious fact, but it is not counterintuitive ...
Hence what we need to do is find n functions which form a basis for ker(P(D)). What we need, then, are n linearly independent functions each of which is a solution to the homogeneous equation.
In theory (:) this is not too hard.
We note first that for the first-order case, we have the solution:


We also have that the set of functions A = {x^{j}e^{r_{i}x} | 0 ≤ j ≤ k, 1 ≤ i ≤ m}
is linearly independent. From this we see how to solve equations of the form:

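A numerical spot-check of this idea, for the sample polynomial P(D) = D^2 − 5D + 6 = (D − 2)(D − 3) (chosen for illustration, not from the notes), using central finite differences to apply P(D):

```python
import math

# For P(D) = D^2 - 5D + 6 = (D - 2)(D - 3), the roots r = 2, 3 of the
# characteristic polynomial give solutions e^{rx}, since
# P(D)(e^{rx}) = P(r) e^{rx}.

def apply_P(f, x, h=1e-4):
    """Apply f'' - 5 f' + 6 f at x, using central finite differences."""
    f1 = (f(x + h) - f(x - h)) / (2 * h)
    f2 = (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)
    return f2 - 5 * f1 + 6 * f(x)

for r in (2.0, 3.0):
    f = lambda x, r=r: math.exp(r * x)
    print(abs(apply_P(f, 1.0)))   # near 0, up to discretization error

g = lambda x: math.exp(4.0 * x)   # r = 4 is not a root: P(4) = 2
print(abs(apply_P(g, 1.0)))       # clearly nonzero
```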
Now, consider the operator

We have that



We have that the set of functions


We can now put together all the pieces to solve the homogeneous equation P(D)(f) = 0. We use the fact that any polynomial over R can be completely factored as

To solve the inhomogeneous equation, we need only to find one particular solution of P(D)(f) = g.
This is just the bare beginnings of techniques for solving differential equations, but it gives the flavor of some relatively powerful methods, and the role that linear algebra plays. I haven't even mentioned the issues of initial values/boundary conditions. For much more on these topics, look at a book such as Elementary Differential Equations with Linear Algebra, by Finney and Ostberg.

We would then have D((a_{n})) = (a_{n+1} − a_{n}). Our example difference equation would be

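A short Python sketch of this difference operator and its linearity (the sample sequences are arbitrary):

```python
def diff(seq):
    """The difference operator on sequences: D((a_n)) = (a_{n+1} - a_n)."""
    return [seq[n + 1] - seq[n] for n in range(len(seq) - 1)]

a = [n * n for n in range(8)]       # a_n = n^2
print(diff(a))                      # [1, 3, 5, 7, 9, 11, 13]: a_{n+1} - a_n = 2n + 1

# D is linear: D(a + b) = D(a) + D(b)
b = [3 * n for n in range(8)]
lhs = diff([x + y for x, y in zip(a, b)])
rhs = [x + y for x, y in zip(diff(a), diff(b))]
print(lhs == rhs)  # True
```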
















A space with such an associated function is called a normed linear space.
I won't go into this much here beyond a few examples, but good places to look are books on Hilbert Spaces and/or functional analysis. There are a few books indicated in the references.

We can also think of this in terms of the inner product given by
⟨(a_{1}, a_{2}, …, a_{n}), (b_{1}, b_{2}, …, b_{n})⟩ = a_{1}b̄_{1} + a_{2}b̄_{2} + ... + a_{n}b̄_{n}.
We then have ‖v‖_{2} = ⟨v, v⟩^{1/2}.
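A minimal Python sketch of this inner product and the norm it induces, using Python's built-in complex numbers (the helper names inner and norm2 are just for this example):

```python
def inner(u, v):
    """Hermitian inner product: sum of a_i times the conjugate of b_i."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm2(v):
    """Euclidean norm: the square root of <v, v>."""
    return inner(v, v).real ** 0.5

v = [3 + 4j, 0j]
print(norm2(v))            # 5.0: |3 + 4i| = 5
print(inner(v, v))         # (25+0j): <v, v> is real and nonnegative
```

Conjugating the second argument is what makes ⟨v, v⟩ real and nonnegative over C, so the square root makes sense.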


A fun little exercise is to draw the circle of radius 1 in R^{2} for each of these norms:

One of these constitutes a "proof" that a square is a circle :)










Hint: Show that every f ∈ V^{*} corresponds with a function of the form


Eigenvalues and eigenvectors can thus give us a very simple representation for T.



We know that we can factor the polynomial P_{u}(T) over C into a product of linear factors



The differential operator D^{2} also has uncountably many eigenvalues and eigenvectors since for a > 0,


The general solution of this operator equation is


Is the preceding true if we replace C[x] with C^{∞}[x], power series? If so, show it. If not, give a counterexample.
If V is finite dimensional, with two bases S_{1} = (u_{1}, …, u_{n}) and S_{2} = (v_{1}, …, v_{n}), we can consider the matrices of T with respect to the mixed bases. We can indicate this by the various symbols [T]^{S_{1}}, [T]^{S_{2}}, [T]^{S_{1}S_{2}}, and [T]^{S_{2}S_{1}}, where, for example



Another way to say this is:


We can also define the trace of an n × n real or complex matrix A by trace_{m}(A) = ∑_{i = 1}^{n} A_{ii}, the sum of the diagonal elements of A.
These definitions are consistent, in the sense that trace(T) = trace_{m}([T]), where [T] is the matrix of T with respect to some basis for V. An exercise will be to show that it doesn't matter what basis we use (they all come out the same). Since these definitions are consistent, we will ordinarily write them both the same way, as trace().
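That basis-independence can be spot-checked numerically: the trace of P^{-1}[T]P equals the trace of [T] for an invertible change-of-basis matrix P. A small Python sketch with an arbitrary 2 × 2 example:

```python
def matmul(A, B):
    """Square matrix product."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace_m(A):
    """trace_m(A): the sum of the diagonal entries."""
    return sum(A[i][i] for i in range(len(A)))

def inv2(P):
    """Inverse of a 2x2 matrix, via the adjugate formula."""
    a, b = P[0]
    c, d = P[1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1.0, 2.0], [3.0, 4.0]]
P = [[2.0, 1.0], [1.0, 1.0]]   # an invertible change-of-basis matrix

similar = matmul(inv2(P), matmul(A, P))   # the same operator in the new basis
print(trace_m(A), trace_m(similar))       # both 5.0
```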

If we define the commutator of S and T by [S, T] = ST − TS, then we always have [S, T] ≠ I, the identity operator.
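One way to see this in finite dimensions is via the trace: trace(ST) = trace(TS), so every commutator has trace 0, while trace(I) = n > 0. A small Python illustration (the matrices S and T are arbitrary examples):

```python
def matmul(A, B):
    """Square matrix product."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(S, T):
    """[S, T] = ST - TS."""
    ST, TS = matmul(S, T), matmul(T, S)
    n = len(S)
    return [[ST[i][j] - TS[i][j] for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

S = [[0, 1], [0, 0]]
T = [[0, 0], [1, 0]]
C = commutator(S, T)
print(C)          # [[1, 0], [0, -1]]
print(trace(C))   # 0, so [S, T] cannot be the identity
```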


We then have:

Showing that these two definitions are consistent is a fair amount of work (just looking at the formula for det_{m}() should give you some idea). You can look it up if you are interested.

We then have det_{m}(A) = det_{v}(A_{1}, …, A_{n}), where A_{k} is the kth column of the matrix A, considered as a vector.
We have that det(T) = det_{m}([T]) = det_{v}([T]_{1}, …, [T]_{n}). The fact that there are three different versions of the same thing suggests that many people have worked on this topic, and that this topic occurs in a variety of contexts ...



Suppose W ⊂ R^{n} and s : W → R^{n}. We will think of s as a (local) coordinate system on W, or as a change of variables.
The derivative of s at x is the unique operator T (if it exists) satisfying:

If s is differentiable at x, then the matrix of s′(x) is given by:


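The matrix of s′(x) is the matrix of partial derivatives, and it can be made concrete by approximating each partial with central differences. A Python sketch, using polar coordinates as the change of variables (an illustrative example):

```python
import math

def jacobian(s, x, h=1e-6):
    """Matrix of s'(x): entry (i, j) approximates the partial derivative
    of the ith component of s with respect to the jth variable,
    via central differences."""
    n = len(x)
    m = len(s(x))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp = list(x); xp[j] += h
        xm = list(x); xm[j] -= h
        sp, sm = s(xp), s(xm)
        for i in range(m):
            J[i][j] = (sp[i] - sm[i]) / (2 * h)
    return J

# Polar coordinates as a change of variables: s(r, t) = (r cos t, r sin t)
s = lambda x: [x[0] * math.cos(x[1]), x[0] * math.sin(x[1])]

J = jacobian(s, [2.0, 0.0])
print(J)   # approximately [[1, 0], [0, 2]], i.e. [[cos t, -r sin t], [sin t, r cos t]] at (2, 0)
```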