First we recall the definition of the Albert algebra (or exceptional Jordan algebra) over the field \(F\) (which we shall take to be either the real numbers \(\mathbb R\) or the finite field \(\mathbb F_q\) of order \(q\), where \(q\) is odd). It consists of \(3\times 3\) Hermitian matrices over the relevant version of the octonions (in the real case we may take either the split or the compact form). We write \[(a,b,c\mid A,B,C) = \pmatrix{a&C&\overline{B}\cr \overline{C}&b&A\cr B&\overline{A}&c}.\] The Jordan product \(X\circ Y\) of two such matrices is \(\frac12(XY+YX)\), in terms of the ordinary matrix product \(XY\). It can be readily checked that the algebra is closed under this multiplication.
The group \(F_4(F)\) (or \(F_4(q)\) in the finite case) is defined as the automorphism group of the algebra, or equivalently the group of \(F\)-linear maps which preserve the trace and the determinant: \[\det(a,b,c\mid A,B,C) = abc-aA\overline{A}-bB\overline{B}-cC\overline{C} +\Re(ABC)+\Re(\overline{C}\,\overline{B}\,\overline{A}).\] (The last two terms are equal, since conjugation does not change the real part, so together they contribute \(2\Re(ABC)\).) (Note that in the real case the two different forms of the octonions give two non-isomorphic groups \(F_4(\mathbb R)\).) Similarly, the (simply-connected) group of type \(E_6\), which I shall denote \(SE_6(F)\), is the group of \(F\)-linear maps which preserve the determinant.
In [DM] Dray and Manogue show that for certain matrices \(M\) written over (any) complex subalgebra of the (real) octonions, the operation \(X\mapsto \overline{M}^\top X M\) makes sense, and preserves the Albert algebra. Notice that, because \(M\) is defined over a copy of the complex numbers, each entry in \(\overline{M}^\top X M\) is a sum of terms of the form \(m_1xm_2\), where \(m_1\) and \(m_2\) lie in a copy of the complex numbers. Hence \(m_1(xm_2)=(m_1x)m_2\), so the operation is well-defined.
Note that, since \(M\) is defined over the complex numbers, \(\det(M)\) is well-defined. It turns out that such a complex matrix \(M\) preserves the determinant if and only if \(\det(M)=\pm1\). The authors of [DM] told me they proved this via a brute force computation with Mathematica. In the sequel I shall not use this computation, instead verifying for individual matrices \(M\) that they preserve the determinant where necessary.
Now in order for \(M\) to preserve the identity element of the algebra, and hence to lie in \(F_4(F)\), it is necessary and sufficient to have the extra condition \(\overline{M}^\top M = I\). (Note: if the determinant is preserved, and the identity matrix is fixed, then using the trilinear form \(T\) obtained by polarizing the determinant, the trace of \(X\) is \(T(I,I,X)\), so is also preserved. Concretely, the coefficient of \(t\) in \(\det(I+tX)\) is \(a+b+c\).)
We start with groups of type \(E_6\), and first write down enough complex matrices of determinant 1 to generate the group.
First we take diagonal matrices \[M=\mathrm{diag}(u,\overline{u},1)=\pmatrix{u&0&0\cr 0&\overline{u}&0\cr 0&0&1},\] where \(u\overline{u}=1\). These act as \[(a,b,c\mid A,B,C)\mapsto(a,b,c\mid uA, Bu, \overline{u}C\overline{u}),\] so they generate the spin group \(2^2.P\Omega_8^+(q)\) (for \(q\) odd), as noted on page 150 of [FSG]. The duality and triality automorphisms are given by permutation matrices. Moreover, all these matrices preserve the determinant: the hardest part of this easy calculation is to show that the real part of \((uA)(Bu)(\overline{u}C\overline{u})\) is equal to the real part of \(ABC\). But by the Moufang law, \((uA)(Bu)=u(AB)u\), and then the real part of \((u(AB)u)(\overline{u}C\overline{u})\) is the real inner product of \(u(AB)u\) with \(u\overline{C}u\). Since \(x\mapsto uxu\) preserves the norm, this is equal to the inner product of \(AB\) with \(\overline{C}\), that is the real part of \(ABC\).
One way to extend to \(SE_6(q)\) (that is, \(3\cdot E_6(q)\) if \(q\equiv 1\bmod 3\), or \(E_6(q)\) otherwise) is to adjoin a matrix such as \[M=\pmatrix{1&1&0\cr 0&1&0\cr 0&0&1}.\] It is a straightforward exercise to show that this maps \((a,b,c\mid A,B,C)\) to \[(a,a+b+C+\overline{C},c\mid A+\overline{B},B,a+C).\] In particular, we see that the trace is not in general preserved. However, using the formula given above it is straightforward to check that the determinant is preserved. Indeed, taking the two real-part terms to be \((AB)C\) and its conjugate \(\overline{C}(\overline{B}.\overline{A})\), and reading each octonion expression below up to its real part, the individual terms of the determinant are as follows: \begin{eqnarray*} abc&\mapsto& abc + a^2c + ac(C+\overline{C})\cr -aA\overline{A}&\mapsto& -aA\overline{A}-aAB-a\overline{B}.\overline{A}-a\overline{B}B\cr -bB\overline{B}&\mapsto&-bB\overline{B}-aB\overline{B}-(C+\overline{C})B\overline{B}\cr -cC\overline{C}&\mapsto&-cC\overline{C}-ac(C+\overline{C})-a^2c\cr (AB)C&\mapsto& (AB)C+(\overline{B}B)C+a\overline{B}B +aAB\cr \overline{C}(\overline{B}.\overline{A})&\mapsto& \overline{C}(\overline{B}.\overline{A})+a\overline{B}B+(B\overline{B})\overline{C} +a\overline{B}.\overline{A} \end{eqnarray*} and it is easy to see that all the terms except those in \(\det(a,b,c\mid A,B,C)\) cancel out.
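The determinant calculation above can also be spot-checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the compact real octonions with integer coordinates, built by the Cayley-Dickson doubling \((a,b)(c,d)=(ac-\overline{d}b,\,da+b\overline{c})\), and it takes the two real-part terms of the determinant together as \(2\Re((AB)C)\). All function names are hypothetical. It tests the unitriangular generator in the form stated above, and the diagonal generators \(\mathrm{diag}(u,\overline{u},1)\) for \(u\) one of the eight standard basis octonions.

```python
import random

def conj(x):
    """Cayley-Dickson conjugate of a tuple of length 1, 2, 4 or 8."""
    if len(x) == 1:
        return x
    h = len(x) // 2
    return conj(x[:h]) + tuple(-t for t in x[h:])

def add(x, y):
    return tuple(s + t for s, t in zip(x, y))

def sub(x, y):
    return tuple(s - t for s, t in zip(x, y))

def mul(x, y):
    """Doubling rule (a,b)(c,d) = (ac - conj(d) b, d a + b conj(c))."""
    if len(x) == 1:
        return (x[0] * y[0],)
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    return sub(mul(a, c), mul(conj(d), b)) + add(mul(d, a), mul(b, conj(c)))

def norm(x):
    return sum(t * t for t in x)

def scalar(r):
    return (r,) + (0,) * 7

def det(X):
    a, b, c, A, B, C = X
    real_part_sum = 2 * mul(mul(A, B), C)[0]  # assumption: 2 Re((AB)C)
    return a * b * c - a * norm(A) - b * norm(B) - c * norm(C) + real_part_sum

def act_unitriangular(X):
    """Image of (a,b,c|A,B,C) under M = [[1,1,0],[0,1,0],[0,0,1]], as stated in the text."""
    a, b, c, A, B, C = X
    return (a, a + b + 2 * C[0], c, add(A, conj(B)), B, add(scalar(a), C))

def act_diagonal(u, X):
    """Image under diag(u, conj(u), 1) for an octonion u of norm 1."""
    a, b, c, A, B, C = X
    ub = conj(u)
    return (a, b, c, mul(u, A), mul(B, u), mul(mul(ub, C), ub))

def random_element(rng):
    octo = lambda: tuple(rng.randint(-5, 5) for _ in range(8))
    return (rng.randint(-5, 5), rng.randint(-5, 5), rng.randint(-5, 5),
            octo(), octo(), octo())

def check(trials=200, seed=0):
    """True if both kinds of generator preserve det on random integer inputs."""
    rng = random.Random(seed)
    units = [tuple(int(i == j) for j in range(8)) for i in range(8)]
    for _ in range(trials):
        X = random_element(rng)
        if det(act_unitriangular(X)) != det(X):
            return False
        if any(det(act_diagonal(u, X)) != det(X) for u in units):
            return False
    return True
```

Calling `check()` is only a sanity test on random inputs in one particular form of the octonions; it is no substitute for the term-by-term cancellation, which works over any field.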
There are other nice elements such as the diagonal matrices \(\mathrm{diag}(a,b,c)\) with \(a,b,c\in\mathbb F_q\) and \(abc=1\). Now if two matrices \(M\) and \(N\) both lie in \(SE_6(q)\), and are both written over the same 2-dimensional subalgebra of the octonions, then there is sufficient associativity to show that the action of \(M\) followed by the action of \(N\) is the same as the action of \(MN\). In other words, we can multiply together the generators of \(SE_6(q)\) as long as the entries stay within the same 2-dimensional subalgebra. In particular, we can make the commutator of the new generator given above with suitable diagonal elements \(\mathrm{diag}(u^{-1},1,u)\) to make some more root elements, with \(1-u\) in the off-diagonal position. For example we may take \(u=1+x_t\). In this way we obtain 48 root groups by putting \(\lambda x_t\) (for arbitrary \(\lambda\in F\) and fixed \(t\)) in one of the six off-diagonal positions. These correspond to the 48 roots which are outside \(D_4\). (In characteristic 2 this calculation does not directly give the root elements with \(t=4\) or 5, since \(1+x_4\) and \(1+x_5\) are not invertible in characteristic 2, but by doing the calculations in characteristic 0 we see that these root elements preserve the characteristic 0 determinant, and so this remains true after reduction modulo 2.)
In the second case we may have all three of \(A,B,C\) being non-zero, in which case there are \((q^4-1)(q^3+1)=q^7+q^4-q^3-1\) choices for \(A\), and then the condition \(AB=0\) leaves \(q^4-1\) choices for \(B\). The conditions \(BC=0\) and \(CA=0\) leave \(q^3-1\) choices for \(C\), making \((q^4-1)^2(q^6-1)\) in total. Similarly, if just two of \(A,B,C\) are non-zero, there are \(3(q^4-1)^2(q^3+1)\) choices; and if just one is non-zero, there are \(3(q^4-1)(q^3+1)\) choices. Adding together these six expressions gives the total \((q^9-1)(q^8+q^4+1)\) as claimed on page 168 of [FSG].
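The basic count used here, \((q^4-1)(q^3+1)\) non-zero octonions of norm zero, can be confirmed by brute force for small \(q\). The sketch below assumes that \(q\) is prime and that the norm form on the split octonions is the hyperbolic form pairing the coordinate of \(x_t\) with that of \(x_{9-t}\); both are assumptions of the sketch, and the function names are hypothetical. It also records the three second-case counts exactly as stated above.

```python
from itertools import product

def count_nonzero_isotropic(q):
    """Count nonzero v in F_q^8 with v1*v8 + v2*v7 + v3*v6 + v4*v5 = 0 (q prime)."""
    count = 0
    for v in product(range(q), repeat=8):
        if any(v) and sum(v[i] * v[7 - i] for i in range(4)) % q == 0:
            count += 1
    return count

def second_case_counts(q):
    """The three counts quoted in the text for the second case."""
    iso = (q**4 - 1) * (q**3 + 1)  # choices for a nonzero isotropic octonion
    return {
        "three_nonzero": iso * (q**4 - 1) * (q**3 - 1),
        "two_nonzero": 3 * iso * (q**4 - 1),
        "one_nonzero": 3 * iso,
    }
```

For example, `count_nonzero_isotropic(3)` enumerates all \(3^8\) vectors and can be compared with \((3^4-1)(3^3+1)\).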
It is now a straightforward exercise to show that the group generated by the given matrices acts transitively on the set of white vectors, or on the white points, that is the 1-spaces spanned by white vectors. On the other hand, we have not shown that the group of linear transformations which preserve the determinant also preserves this set. To prove this, we show that the sum of any two white vectors has determinant 0, and that the given set is maximal with respect to this property.
First consider the stabilizer of the white point spanned by \((1,0,0\mid 0,0,0)\). As this is invariant under a group of order \(q^{16}\) generated by elements of the shape \[\pmatrix{1&0&0\cr x&1&0\cr 0&0&1}\mbox{ and } \pmatrix{1&0&0\cr 0&1&0\cr y&0&1},\] we easily see a group of shape \(q^{16}{:}C_{q-1}.\mathrm{PO}_{10}^+(q)\) fixing this white point. Moreover, it is straightforward to show that this group has two orbits on the remaining white points, one consisting of the points spanned by \((a,0,0\mid 0,B,C)\) with \(\overline{B}B=\overline{C}C=0\) and \(BC=0\). This orbit contains \(2q(q^4-1)(q^3+1)\) vectors with exactly one of \(B\) and \(C\) equal to zero, and \(q(q^4-1)(q^3+1)(q^4-1)\) vectors with \(B\) and \(C\) both non-zero (there being \(q\) choices for \(a\) in each case), making \(q(q^3+1)(q^8-1)\) in all, that is \(q(q^3+1)(q^8-1)/(q-1)\) points. Each point in this orbit visibly has the property that every point on the line spanned by it and the fixed point is a white point, while the points in the other orbit do not have this property. Let us call two white points adjacent if they have this property.
Now we may pick an arbitrary point in the other orbit, say that spanned by \((0,1,0\mid0,0,0)\), and pick an arbitrary vector in the space spanned by this point and the fixed point, say the vector \((1,1,0\mid0,0,0)\). Since this has determinant zero, it follows that the sum of any two white vectors has determinant zero. We now show that our set of white vectors is maximal with respect to this property. Suppose that \((a,b,c\mid A,B,C)\) is any other matrix with zero determinant. Now the action of the orthogonal group on pairs of triality-related 8-spaces is well understood, and thus by applying diagonal elements we may assume that (up to scalar multiplication) \((A,B)=(1,1)\), \((1,x_4)\), \((x_4,1)\), \((x_4,x_4)\) or \((x_4,x_5)\), or else one of \(A\) and \(B\) is zero. If \(B=1\) or \(B=A\) we can apply a root element to get \(A=0\). Similarly, if \(A=1\) we get \(B=0\). If \((A,B)=(x_4,x_5)\) a suitable root element gives \((A,B)=(x_5,x_5)\) and we are in a previous case. So without loss of generality we may assume \(A=0\).
If \(bc\ne0\), then adding \((1,0,0\mid0,0,0)\) gives a matrix with non-zero determinant. So we may assume \(c= 0\). Similarly if \(ca-B\overline{B}\ne 0\) we may add \((0,1,0\mid0,0,0)\), so we may assume that \(B\overline{B}=ca=0\). Similarly, we may assume \(C\overline{C}=ab\). Applying elements of the diagonal group again to \(B\) we may now assume that (up to scalars) \(B=x_4\) or \(B=0\). Finally, we have \(b\ne0\), for otherwise this is a white point, and therefore adding the white vector \((0,0,0\mid 0,-x_4,0)\) (if \(B=x_4\)) or \((0,0,0\mid0,1,0)\) (if \(B=0\)) gives a matrix with non-zero determinant.
This concludes the proof of the fact that the group of linear maps which preserve the determinant also fixes the set of \((q^8+q^4+1)(q^9-1)/(q-1)\) white points enumerated above.
Next we show that the stabilizer of the white point spanned by \((1,0,0\mid0,0,0)\) is no bigger than the group exhibited above. First note that this stabilizer fixes the 17-space of matrices of the form \((a,0,0\mid 0,B,C)\). Hence it acts on the 10-dimensional quotient space. Now the trilinear form obtained by polarizing the determinant induces a bilinear form on this quotient, by substituting the original white vector as the first variable. This bilinear form is invariant up to scalar multiplication, and therefore the action of the point stabilizer on the 10-dimensional quotient can be no bigger than already given. In particular, any element of the kernel of this action maps \((0,1,0\mid 0,0,0)\) to a matrix of the form \((0,1,0\mid0,0,C)\), and maps \((0,0,1\mid0,0,0)\) to \((0,0,1\mid0,B,0)\). But we already have a group of order \(q^{16}\) permuting these pairs of matrices regularly, so we may assume that the two white points spanned by \((0,1,0\mid0,0,0)\) and \((0,0,1\mid0,0,0)\) are fixed. Now the white points which are adjacent to both of these span the 8-space \((0,0,0\mid A,0,0)\), so this 8-space is fixed. Similarly the 8-spaces \((0,0,0\mid 0,B,0)\) and \((0,0,0\mid0,0,C)\) are fixed. As the white points are just the isotropic vectors in these 8-spaces, the action on any one of them can be no more than the orthogonal group already exhibited. Hence we may assume that our element of the kernel acts trivially on one of them, say on the 8-space \((0,0,0\mid A,0,0)\). Now we have a large number of pairs of non-adjacent white points which are fixed, and for every one of these pairs, the 8-space of white points which are adjacent to both is also fixed. This is enough to show that the kernel of the action is no bigger than the group already exhibited.
As a consequence, we have now proved the formula for the order of \(SE_6(q)\).
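The order formula itself is not written out here, so the following sketch makes its own assumptions explicit: it takes the standard order \(q^{36}(q^{12}-1)(q^9-1)(q^8-1)(q^6-1)(q^5-1)(q^2-1)\) for \(SE_6(q)\), the standard order of the simply connected group of type \(D_5\), and a point stabilizer of order \(q^{16}(q-1)\) times the latter. None of these three formulas is quoted from the text above, and the function names are hypothetical. Under those assumptions, the number of white points times the order of a point stabilizer should equal the group order.

```python
def white_points(q):
    """Number of white points as enumerated in the text."""
    return (q**8 + q**4 + 1) * (q**9 - 1) // (q - 1)

def spin10_plus_order(q):
    """Assumed: standard order of the simply connected group of type D_5."""
    return q**20 * (q**5 - 1) * (q**2 - 1) * (q**4 - 1) * (q**6 - 1) * (q**8 - 1)

def stabilizer_order(q):
    """Assumed: unipotent radical q^16, a torus of order q-1, and the D_5 part."""
    return q**16 * (q - 1) * spin10_plus_order(q)

def se6_order(q):
    """Assumed: standard order formula for the simply connected group of type E_6."""
    return (q**36 * (q**12 - 1) * (q**9 - 1) * (q**8 - 1)
            * (q**6 - 1) * (q**5 - 1) * (q**2 - 1))

def orbit_stabilizer_holds(q):
    return white_points(q) * stabilizer_order(q) == se6_order(q)
```

The identity behind this is \((q^8+q^4+1)(q^4-1)=q^{12}-1\), so the check is really just arithmetic, but it guards against slips in the exponents.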
We have already shown that there is a unique orbit of the group on pure white 1-spaces and 2-spaces, so we may take the latter to be spanned by \((1,0,0\mid0,0,0)\) and \((0,0,0\mid 0,x_1,0)\). Our space contains white vectors of shape \((0,0,0\mid 0,B,C)\), and \(B\) lies in some totally isotropic subspace of the octonions, which may be taken to be one of \(\langle x_1,x_2\rangle\), \(\langle x_1,x_2,x_3\rangle\), \(\langle x_1,x_2,x_3,x_4\rangle\) or \(\langle x_1,x_2,x_3,x_5\rangle\). Then \(C\) lies in the corresponding annihilator \(\langle x_1,x_2\rangle\) (in the first case) or \(\langle x_1\rangle\) (in the second and last cases) or 0 (in the third case). All of these contain at least a 4-space in common with the above 6-space, and by transitivity on these 4-spaces, we see that there is just one more orbit on maximal white subspaces, with representative the 5-space spanned by \((1,0,0\mid0,0,0)\) and \((0,0,0\mid0,B,0)\) with \(B\in\langle x_1,x_2,x_3,x_4\rangle\).
Since any pure white 4-space is contained in a unique pure white 6-space, this leaves us with five different white spaces whose stabilizers we should investigate. Let us call these \(W_1\), \(W_2\), \(W_3\), \(W_5\) and \(W_6\), with the subscript denoting the dimension, as in (4.137) of [FSG]. Our representatives are slightly different from [FSG], however: we take \((1,0,0\mid0,0,0)\) to span \(W_1\), extend to \(W_2\) by adjoining \((0,0,0\mid0,x_1,0)\) and then to \(W_3\) by adjoining \((0,0,0\mid0,x_2,0)\). To obtain \(W_5\) from \(W_3\), adjoin \((0,0,0\mid0,x_3,0)\) and \((0,0,0\mid0,x_4,0)\), and to obtain \(W_6\) from \(W_3\) adjoin \((0,0,0\mid0,x_3,0)\), \((0,0,0\mid0,x_5,0)\) and \((0,0,0\mid0,0,x_1)\).
By adjoining appropriate root groups to the subgroup of \(2^2.\mathrm{P}\Omega_8^+(q).S_3\) which fixes \(W_i\), it is easy to obtain generators for the stabilizers. These turn out to be five of the six maximal parabolic subgroups. The other maximal parabolic subgroup fixes the 10-space \(W_{10}\) defined by adjoining to \(W_5\) the \(5\)-space spanned by \((0,1,0\mid0,B,0)\) with \(B\in\langle x_5,x_6,x_7,x_8\rangle\).
As noted in [FSG], the normalizer of a maximal torus can be found inside \(2^2.\mathrm{P}\Omega_8^+(q).S_3\). If we take the diagonal elements \(\mathrm{diag}(u,\overline{u},1)\) and \(\mathrm{diag}(1,u,\overline{u})\) with \(u=\lambda x_1+\lambda^{-1}x_8\), \(\lambda x_2+\lambda^{-1} x_7\), \(\lambda x_3+\lambda^{-1} x_6\) and \(\lambda x_4+\lambda^{-1} x_5\), and adjoin the coordinate permutations, then we obtain the normalizer of a maximal split torus.
The long root elements also lie in \(2^2.\mathrm{P}\Omega_8^+(q)\). For example we may take the product of the three group elements given by $$\mathrm{diag}(1+x_1,1-x_1,1), \mathrm{diag}(1-\lambda x_2,1+\lambda x_2,1), \mathrm{diag}(1-x_1+\lambda x_2,1+x_1-\lambda x_2,1)$$ to give the long root element displayed in (4.104) of [FSG].
In characteristic 2, the usual definitions are more complicated, and are not given in [FSG]. In particular, the Jordan multiplication is doubled to \(X\circ Y=XY+YX\), except for the product \(X\circ X=XX\) which remains but is denoted \(X^2\) to avoid confusion. Similarly, the inner product \(\mathrm{Tr}(X\circ Y)\) now becomes the associated symmetric bilinear form of a quadratic form \(\mathrm{Tr}(X^2)\). However, the given matrices and their actions still make sense in characteristic 2, so an alternative is to define \(F_4(q)\) as the group generated by the actions of these matrices.
The classification of primitive idempotents (in odd characteristic) given on page 158 of [FSG] shows that, up to coordinate permutation, they are all of the form \(c(d,c^{-1}-1-d,1\mid\overline{y},x,\overline{x}y)\), with \(x\overline{x}=d\) and \(y\overline{y}=c^{-1}-1-d\). This may be written as \(c\pmatrix{\overline{x}\cr\overline{y}\cr1}\pmatrix{x&y&1}\).
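As an illustration of this rank-one form, the sketch below checks that \(Y=\overline{v}^\top v\) with \(v=(x,y,1)\) satisfies \(Y^2=nY\) and has trace \(n\), where \(n=x\overline{x}+y\overline{y}+1\) plays the role of \(c^{-1}\); hence \(c\,Y\) is an idempotent of trace 1 whenever \(n\) is invertible. It assumes the compact real octonions with integer coordinates (via Cayley-Dickson doubling) rather than the finite-field case, it does not test primitivity, and its function names are hypothetical.

```python
import random

def conj(x):
    """Cayley-Dickson conjugate of a tuple of length 1, 2, 4 or 8."""
    if len(x) == 1:
        return x
    h = len(x) // 2
    return conj(x[:h]) + tuple(-t for t in x[h:])

def add(x, y):
    return tuple(s + t for s, t in zip(x, y))

def sub(x, y):
    return tuple(s - t for s, t in zip(x, y))

def mul(x, y):
    """Doubling rule (a,b)(c,d) = (ac - conj(d) b, d a + b conj(c))."""
    if len(x) == 1:
        return (x[0] * y[0],)
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    return sub(mul(a, c), mul(conj(d), b)) + add(mul(d, a), mul(b, conj(c)))

ONE = (1,) + (0,) * 7
ZERO = (0,) * 8

def rank_one(x, y):
    """The matrix conj(v)^T v for the row vector v = (x, y, 1)."""
    v = (x, y, ONE)
    return [[mul(conj(v[i]), v[j]) for j in range(3)] for i in range(3)]

def mat_square(Y):
    """Ordinary matrix square, with octonion entries multiplied in the given order."""
    Z = []
    for i in range(3):
        row = []
        for j in range(3):
            s = ZERO
            for k in range(3):
                s = add(s, mul(Y[i][k], Y[k][j]))
            row.append(s)
        Z.append(row)
    return Z

def is_scaled_idempotent(x, y):
    """Check Y^2 = n*Y and trace(Y) = n, where n = |x|^2 + |y|^2 + 1."""
    n = sum(t * t for t in x) + sum(t * t for t in y) + 1
    Y = rank_one(x, y)
    trace = Y[0][0][0] + Y[1][1][0] + Y[2][2][0]
    scaled = [[tuple(n * t for t in entry) for entry in row] for row in Y]
    return trace == n and mat_square(Y) == scaled
```

The check succeeds despite non-associativity because every entry of \(Y\) lies in the subalgebra generated by \(x\) and \(y\), which is associative.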
In characteristic 2, again, the classification of primitive idempotents given in [FSG] does not work. But we can now define the primitive idempotents to be the matrices \(c\overline{v}^\top v\), where \(v\) and \(c\) are as listed above, and then everything works in characteristic 2 also. However, it is now not obvious that \(F_4(q)\) preserves the set of primitive idempotents, so this has to be checked separately. Invariance under the elements \(\mathrm{diag}(u,\overline{u},1)\) follows immediately from the Moufang laws, and the set is by definition invariant under coordinate permutations. It suffices therefore to check one more generator. Applying the short root element given above to the given primitive idempotent, we need to check that the result (let us call it \(c(p,q,r\mid P,Q,R)\)) satisfies the defining conditions \(PQR=pq\), \(Q\overline{Q}=p\), \(P\overline{P}=q\) and \(c(p+q+r)=1\). We leave this as an exercise for the reader. Similarly, one needs to check the primitive idempotents obtained by permuting the coordinates: however, this calculation can be much simplified by noting that we may assume they are of the form \((a,b,0\mid A,B,C)\). (An alternative is to use the result, proved above, that \(E_6(q)\) acts on the set of `white points', and that the primitive idempotents are just the white vectors with trace 1.)
It is now clear that \(F_4(q)\) acts transitively on the primitive idempotents. To calculate the group order we need only calculate the order of the stabilizer of one of the primitive idempotents. We already know that, for \(q\) odd, the stabilizer of \((1,0,0\mid0,0,0)\) contains \(2.\mathrm{P}\Omega_9(q).2\), acting as the orthogonal group on the 9-space of matrices \((0,b,-b\mid A,0,0)\). Moreover, the stabilizer fixes this 9-space, and it is clear that the action cannot be any bigger than the orthogonal group. It remains therefore to show that the kernel of the action has order 2, generated by \((a,b,c\mid A,B,C)\mapsto (a,b,c\mid A,-B,-C)\). Now every primitive idempotent of the form \((1,b,-b\mid \overline{y},x,\overline{x}y)\) is fixed, so its eigenspaces are invariant. The intersections of these eigenspaces with the space of \((0,0,0\mid0,B,C)\) show that any element of the kernel acts on this 16-space over \(F\) as a scalar, and therefore as \(\pm1\).
Essentially the same argument works in characteristic 2. In particular the formula for the group order is the same in characteristic 2 as in odd characteristic, namely $$|F_4(q)|=q^{24}(q^{12}-1)(q^8-1)(q^6-1)(q^2-1).$$
Then adjoin to \(F_4(q)\) the matrix $$M=\pmatrix{x&0&0\cr 0&x^q&0\cr 0&0&1},$$ where \(x\in \mathbb F_{q^2}\setminus \mathbb F_q\) satisfies \(x^{1+q}=1\). More generally, take matrices $$M=\pmatrix{a&b&0\cr -b^q&a^q&0\cr 0&0&1},$$ where \(a^{1+q}+b^{1+q}=1\).
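For a concrete feel for these matrices, the following sketch enumerates them for \(q=3\). It assumes the model \(\mathbb F_9=\mathbb F_3[i]/(i^2+1)\) and reads the bar as the Frobenius map \(x\mapsto x^q\); both choices are assumptions of the sketch rather than statements from the text, and the function names are hypothetical. It checks that every matrix with \(a^{1+q}+b^{1+q}=1\) satisfies \(\overline{M}^\top M=I\).

```python
Q = 3  # F_9 is modelled as pairs (s, t) meaning s + t*i with i^2 = -1

def f_add(x, y):
    return ((x[0] + y[0]) % Q, (x[1] + y[1]) % Q)

def f_neg(x):
    return ((-x[0]) % Q, (-x[1]) % Q)

def f_mul(x, y):
    return ((x[0] * y[0] - x[1] * y[1]) % Q, (x[0] * y[1] + x[1] * y[0]) % Q)

def frob(x):
    """The map x -> x^3 on F_9; it sends i to -i."""
    return (x[0], (-x[1]) % Q)

ZERO, ONE = (0, 0), (1, 0)
FIELD = [(s, t) for s in range(Q) for t in range(Q)]
IDENTITY = [[ONE if i == j else ZERO for j in range(3)] for i in range(3)]

def matrix(a, b):
    """The matrix [[a, b, 0], [-b^q, a^q, 0], [0, 0, 1]]."""
    return [[a, b, ZERO], [f_neg(frob(b)), frob(a), ZERO], [ZERO, ZERO, ONE]]

def conj_transpose(M):
    return [[frob(M[j][i]) for j in range(3)] for i in range(3)]

def mat_mul(M, N):
    out = []
    for i in range(3):
        row = []
        for j in range(3):
            s = ZERO
            for k in range(3):
                s = f_add(s, f_mul(M[i][k], N[k][j]))
            row.append(s)
        out.append(row)
    return out

def admissible_pairs():
    """All (a, b) in F_9 x F_9 with a^(1+q) + b^(1+q) = 1."""
    return [(a, b) for a in FIELD for b in FIELD
            if f_add(f_mul(a, frob(a)), f_mul(b, frob(b))) == ONE]

def all_unitary():
    return all(mat_mul(conj_transpose(matrix(a, b)), matrix(a, b)) == IDENTITY
               for a, b in admissible_pairs())
```

The diagonal matrix \(\mathrm{diag}(x,x^q,1)\) is the special case \(b=0\), \(a=x\).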
The extra root elements are given by matrices like $$\pmatrix{1&\lambda x_1&0\cr -\lambda^q x_1&1&0\cr 0&0&1}.$$
With a certain amount of calculation it is now possible to show that this group has exactly three orbits on the white points for \(E_6(q^2)\). The lengths of these orbits are as follows:
[FSG] R. A. Wilson, The finite simple groups, Springer GTM 251, 2009.