PROBABILISTIC ALGORITHMS FOR TREES by Bruce E. Sagan Department of Mathematics Michigan State University East Lansing, MI 48824-1027 and Yeong-Nan Yeh Depatement de Mathematiques Universite du Quebec Montreal, P.Q. H3C 3P8 Keywords: Rooted trees, algorithm, probability, hooklength, FibonacciAMS Subject Classification (1980): 05C05, 60C05, 68C05 Research supported in part by a Michigan State All University Research Initiation Grant Proposed running head: ALGORITHMS FOR TREES Send proofs to: Bruce E. Sagan Department of Mathematics Michigan State University East Lansing, MI 48824-1027 ABSTRACT Natural labelings of a poset whose Hasse diagram is a rooted tree, t, display many of the enumerative properties of Young tableaux. For example, there is a hook formula for the number,f , of such labelings: f = n!/ P hwhere h is the number of nodes above and including vet. If onerestricts attention to the Fibonacci trees [9] of order n, then S f = n!We provide algorithms for choosing trees at random so that the probability distributions involved give proofs of these two formulae. 1. INTRODUCTION AND DEFINITIONS A roooted tree, t, is a partially ordered set whose Hasse diagram is a tree (in the graph-theoretic sense of the term) havinga unique minimal element called the root, see Figure 1a. If \|t| = n,a natural labeling of t is a bijection T: t L {1,2,...,n} such that v < w in t implies T(v) < T(w). One such labeling is given in Figure 1b. In this case we say T has shape t. We let f represent the number of natural labelings of t. The hook of a node vet is H = { wet | w > v } with corresponding hooklength h = |H |. The hooklengths of our example tree are displayed in Figure 1c. The well known [3] hook FIGURE 1 ABOUT HEREformula for the number of natural labelings states that f = n!/ P h . (1.1)Thus in our example f = 7!/(7)(3)(2)(1) = 120. In Section 2 we will give a simple probabilistic proof of (1.1) inspired by an algorithm of Greene, Nijenhuis, and Wilf [1] for standard Young tableaux. The tree version has previously appeared in [5], but is included here for completeness. An algorithmic derivation of the hook generating function for reverse tree partitions (which specializes to (1.1) as the variable approaches 1) can be found in [6]. A Fibonacci tree [9] is a finite lower order ideal of the infinite poset in Figure 2a. The name derives from the easily FIGURE 2 ABOUT HERE proved fact that the number of Fibonacci trees with n nodes is the n Fibonacci number. For example, Figure 2b shows the 5 Fibonacci trees with 4 nodes. Let F be the set of all Fibonacci trees with n nodes, then S f = n! (1.2) Formula (1.2) has a bijective proof due to Bender (reported in [9]). In Sections 3 and 4 below we will give two constructions that build a labeled tree teF with probability f /n!, thus proving (1.2) twice. The first algorithm constructs the tree "from without" as done for tableaux in another paper of Greene et al. [2]. The second builds the tree "from within" and is based on work of Pittel [4]. 2. CHOOSING A LABELING UNIFORMLY Let t be a fixed shape with n nodes. The following algorithm can be used to choose a labeling of t. GNW1. Pick a node vet uniformly at random, i.e., with probability 1/n. GNW2. If v is maximal (a leaf) then let T(v) = n and return to GNW1 with t and n replaced by t - {v} and n-1 respectively (unless there are no nodes left, in which case the algorithm halts). GNW3. If v is not maximal then choose a different node weH uniformly at random, i.e., with probability 1/(h -1), and return to GNW2 with w in the role of v. A sequence of nodes generated in the process of finding a vertex to be labeled (in this case by the loop between GNW2 and GNW3) is called a trial. An example of a typical trial is given in Figure 3. FIGURE 3 ABOUT HERE Theorem_1. If t is a fixed rooted tree with n nodes then GNW1-3produces all labelings of t uniformly at random. In fact the probabilty of any given labeling is P h / n! Proof. Let w be any maximal element of t and let W be the set of vertices on the unique path from w to the root of t (excluding w itself). Note that these are the only vertices whose hooklengths are changed if w is removed from t during GNW2. Therefore by induction it suffices to show that the probability that w gets label n is P(w) = (1/n) P h /(h -1) = (1/n) P (1 + ). But 1/n is the probability of choosing an initial node and each term in the expansion of the product corresponds to the probability of a unique trial ending in w. p As an immediate corollary we have Corollary_2. The number of labelings of a given tree t with n nodes is f = n!/ P h . p 3. FIBONACCI TREES GROWN FROM WITHOUT It will be convenient to introduce coordinates for the infinite tree of Figure 2a. Let the nodes of the "spine" be (i,0)for i = 0,1,2,... while the leaves are denoted by (i,1) for the same range of i. Now any Fibonacci tree can be specified by its coordinates as is done in Figure 4a. FIGURE 4 ABOUT HERE Given any vertex v = (i,j) then v has associate v' = (i,1-j).If t is a Fibonacci tree with spine of length s then the associated tree is t' = { v = (i,j) | v'e t or i = s+1},see Figure 4b where the associated tree's nodes are the open circles. Note that t' is "upside down" with root r = (s+1,1). Now suppose we wish to build a labeled Fibonacci tree, T.Assume that the first m-1 vertices of T have already been constructed and given the labels 1,..,m-1. Let t be the current shape of T with associate t' whose root is r. To add a node labeled m to T we proceed as follows. WNG1. Chooose a v e t' - {r} uniformly at random. WNG2. If v m t then add v to t with label m amd halt. WNG3. If v e t, say v = (i,j), then return to WNG1 with t' replaced by t' - [{ (i',j') | i' < i }. Figure 5 presents an example of a trial generated by WGN1-3. FIGURE 5 ABOUT HERE If this procedure is used iteratively for m = 1,2,...,n to produce a labeled Fibonacci tree then let P(T) be the probability that labeling T is created. Thus the total probability of producing a given shape t is P(t) = S P(T) where the sum is over all labelings T of t. Theorem_3. If t is a Fibonacci shape with n nodes then iteration of WNG1-3 produces all labelings of t with total probability P(t) = f / n! Note. It is not true that WNG1-3 produces each labeling of t with probability P(T) = f / n! Proof. Let t have leaves w , w , ..., w and define the subtrees t = t - {w } for all i. Let P(w |t ) denote the probability that w gets labeled n after the algorithm constructs some labeling of t . Hence by the definitions above and induction P(t) = S P(t ) P(w |t ) = S (f / (n-1)!) P(w |t ). (3.1) Let the w be arranged in order of increasing first coordinate, i.e., w = (a ,1),..., w = (a ,1), w = (a ,j) where a < ... < a and j may be 0 or 1. We need a couple of lemmas to help compute the quantities in (3.1). Lemmma_4. Let t and the w be as above, then k-1 f = P (n - a - i). Proof_(of_Lemma_4). Using the hook formula (Corollary 2) we see that every term in the n! is canceled by a hook of t except those in the product above. p Lemma_5. Let t and the w be as above, then i-1 P(w |t ) = (1/n) P (1 + ). Proof_(of_Lemma_5). Initially we can pick any one of the n nodes in t' - {r}. Any trial ending at w can only pass through those w with j < i and their associates w'. Landing on either of these two reduces the number of available nodes in t' - {r} to n-a -j-1, accounting for the second term of the binomial above. p For notational convenience let b = n - a - i. Hence by Lemma 4: f = b b ...band f = (b -1)...(b -1)b ...b .Also from Lemma 5: P(w |t ) = (1/n)(1 + )...(1 + ).Thus f P(w |t )/(n-1)! = (1/n!) { P (b -1) } { P b }.Plugging this expression into (3.1), we see that the sum of products telescopes (from the right hand end) so that P(t) = b ...b /n! = f /n!as desired. p The obvious corollary is Corollary_6. S f = n! p We should also note that this algorithm has a "zone effect" similar to the original one for Young tableaux. Specifically if v = (a,1) and w = (b,1) with a < a,b < a then by Lemma 5 we have P(v|t) = P(w|t). This observation will be useful in the next section. 4. FIBONNACI TREES GROWN FROM WITHIN Given v e t then v is a singleton if v' m t and a doubleton otherwise. In Figure 6a the singletons are (0,0),(3,0),(4,0) and (6,0) with the rest of the vertices being doubletons. If FIGURE 6 ABOUT HERE t has a spine of length s then the corresponding extended tree is t" = t u {v'|vet is a singleton} u {(s+1,0)},see Figure 6b. The elements of t" - t are organized into zones, which are maximal strings of vertices with consecutive first coordinates. Zones are numbered from the bottom up starting with zone 0, e. g., in Figure 6 Z = {(0,1)}, Z = {(3,1),(4,1)} andZ = {(6,1),(7,0)}. In the same way, the doubletons of t are grouped into bands with band i directly below zone i. In our example the bands are B = o, B = {(1,0),(1,1),(2,0),(2,1)} and B = {(5,0),(5,1)}. Finally it will be convenient to have a total order on the vertices. If v = (i,j) and w = (x,y) then we will write v < w if i < x or j < y. Now given a labeled Fibonacci tree, T of shape t, on m-1 nodeswe find a node of w e t"-t to label m by constucting a trial as follows. As usual " := " is the Pascal assignment symbol. P1. Let v := (0,0) with probability 1. Let the set of predecessors of v be P := o. P2. Set P := P u {v}. P3. Pick w uniformly at random from among the set, D, of possible direct successors of v = (i,j) defined by : (a) if v is a doubleton then D = { w e t"-P | w > v }. (b) if v is a singleton then let B be the band of largest index containing an element of P and let b be the maximum node of B (with respect to < ). In this case D = { w e t"-P | w > b } - { w a singleton | w < v }. If B does not exist, i.e., P consisits only of singletons up to this point, then we take b = (0,0). P4. If w e t"-t then halt, else return to P2 with w := v. Note that the trials generated by P1-4 do not necessarily respect the partial order in t and the sequence of D's computed in P3 is not ordered by containment. For example if a trial in the tree of Figure 6 has begun (0,0), (4,0) then the next node could be any one in t" except the two initial nodes and (3,0). If the trial continues to (1,1) then any non-trial vertex (i,j) with i>1 is available for the next choice, including (3,0). However if the trial begins (0,0), (1,1), (4,0) then the only possible successors are vertices (3,1), (4,1), (5,0), (5,1) and (6,0). Nevertheless, these rules do provide the desired distribution. Theorem_7. If t is a Fibonacci shape with n nodes then iteration of P1-4 produces all labelings of t with total probability P(t) = f / n! Proof. It suffices to show that Lemma 5 is still true when using P1-4. It will be convenient to reformulate the Lemma slightly for this setting. Let s be a Fibonacci tree with n-1 nodes and leaves w , w , ... with first coordinates a < a < ..., Lemma_8. With s as above and w e s"-s in the k zone thenthe probability of terminating a P1-4 trial at w is P(w) = (1/n) p (1 + ). Proof_(of_Lemmma_8). Induct on k. We will provide an explicit proof of the induction step, the anchor step being similar. The trials t: v = (0,0), v ,..., w are of two types: those that pass through an element of B and those which do not. The latter are in bijective probability preserving correspondence with trials v , v , ..., w' where w' e B . In the former case, if v e B is the first such node then v , ..., v , w' is a legal trail having the same probability as the initial segment of t. We will show below that the sum of the probailities, P, of all possible final segments v , v , ..., w is independent of both the particular node of B and the initial history of t! Thus by induction it suffices to demonstrate that 1 + P |B | = p (1 + ). But the right side above telescopes to (s + |B |) / s, where s is the denominator corresponding to the largest leaf in B which has coordinates (a,1) say. It is easy to see that if we consider the subtree s = {(i,j) e t | i > a} then s = |s| + 1. Hence to finish the proof of the theorem we need only show: Lemma_9. Let t, v = v , P, and s be as above. Then P is independent of the set of nodes on t prior to v and of v itself (as long as v e B ). In fact P = 1/(|s| + 1). Proof_(of_Lemma_9). Let {v=u < u < ... < u } be the set of all possible vertices that could appear on t from v up to (but not including) w, i.e., the set of all elements above v which are either elements of B or singletons not previously on t. Because of these restrictions, the set of direct successors, D(u ), does not depend on the previous u chossen and in fact we have D(u ) = {u | j > i} u {v e s" | v is not a singleton in s} = D(u ) - {u }.Thus |D(u )| = |{v e s" | v is not a singleton in s}| = |s| + 1and |D(u )| = |D(u )| - 1.Hence P = (1 + ) ...(1 + ) = as desired. p Of course Theorem 7 gives another proof of Corollary 6. 5. REMARKS AND OPEN QUESTIONS Another point of similarity between Fibonacci trees and standard tableaux is the formula S f = I (5.1)where I is the number of involutions in the symmetric group S . The correspondence of Bender [9] mentioned in the introduction also proves (5.1). Is there a probabilistic way to demonstrate this, either for trees or tableaux? A third family of posets that displays behavior similar to that of standard tableaux and roooted trees are the shifted standard tableaux [3]. The shifted analog of the hook formula (1.1) has been proved probabilistically by one of us [7]. It would be interesting to find an aleatory proof of the "sum of squares" equation in the shifted case (see [8] for the exact formula). Finally, tableaux and shifted tableaux are intimately connected with representations of S . Ordinary tableaux give the degrees of ordinary irreducible representations (using matrices in GL ) while their shifted cousins are related to projective ones (those using PGL , the projective linear group). In this setting the analog of (1.2) expresses the fact that the sum of the squares of the irreducible degrees equals the order of the group. Can (1.2) itself be recast in this light? Specifically, is there a group of matrices G such that the degrees of the irreducible representations r:S L G are given by the f ? REFERENCES 1. C. Greene, A. Nijenhuis, and H.S. Wilf, A probabilistic proof of a formula for the number of Young tableaux of a given shape, Adv. in Math 31 (1979) 104-109.2. C. Greene, A. Nijenhuis, and H.S. Wilf, Another probabilistic method in the theory of Young tableaux, J. Combin. Theory Ser. A 37 (1984) 127-135.3. D.E. Knuth, "The Art of Computer Programming: Sorting and Searching," Vol. 3, Addison-Wesley, Reading, Mass., 1973.4. B. Pittel, On growing a random Young tableau, J. Combin. Theory Ser. A 41 (1986) 278-285.5. B. Sagan, "Partially ordered sets with hooklenghts : an algorithmic approach," Ph.D. thesis, M.I.T. (1979).6. B. Sagan, Enumeration of partitions with hooklengths, European J. of Combin 3 (1982) 85-94.7. B. Sagan, On selecting a random shifted Young tableau, J. Algorithms 1 (1980) 213-234.8. B. Sagan, Shifted tableaux, Schur Q-functions and a conjecture of R. Stanley, J. Combin Theory Ser. A, to appear.9. R.P. Stanley, The Fibonacci Lattice, Fibonacci Quart. 13 (1975) 215-232. 7 5 6 1 1 1 3 2 4 3 1 2 1 7 (a) (b) (c) FIGURE 1 (a) (b) Fibonacci trees FIGURE 2 vet:Prob(v): 1/11 1/6 1/2 A GNW trial FIGURE 3 (2,1) (3,0) (2,0) (0,1) (1,0) (0,0) (a) (b) Coordinates and the associated tree FIGURE 4 vet:Prob(v): 1/7 1/4 1/2 A WNG trial FIGURE 5 (a) (b) A tree and the extended tree FIGURE 6