# A Calculus for Network Delay, Part II: Network Analysis Rene L. Cruz, Senior Member, IEEE Abstract—In Part I of this paper, several network elements were defined that can be used to model a wide variety of communication networks. A method to analyze the flow of data in a network consisting of the interconnection of these network elements is now presented. Assuming the data that enters the network satisfies burstiness constraints, burstiness constraints are derived for traffic flowing between network elements. These derived constraints imply bounds on network delay and buffering requirements. By example, it is shown that the use of regulator elements (defined in Part I) within the network can reduce maximum network delay. It is also found that such a use of regulator elements can enlarge the throughput region where finite bounds for delay are found. Finally, it is shown how regulator elements connected in series can be used to enforce general burstiness constraints. Index Terms — Queueing networks, burstiness, flow control, packet switching, high speed networks. #### I. Introduction COMMUNICATION NETWORK MODELS consisting of the interconnection of network elements defined and analyzed in isolation in Part I [7] are now considered in this paper. Networks operating under a fixed routing strategy are considered. A method is presented for calculating burstiness constraints (as studied in Part I [7]) for traffic flowing between network elements, assuming that the traffic which enters the network from external sources satisfies burstiness constraints. Roughly speaking, if the traffic entering the network is not too bursty, then the traffic flowing inside the network is also not too bursty. These derived burstiness constraints imply bounds on network delay and buffering requirements, using the results of Part I. Examples are given to illustrate the method. We consider the use of regulator elements within the network and find that such a use allows us to find much smaller bounds for network delay than we can otherwise obtain. In some cases such a use allows us to derive a larger throughput region where our bounds for network delay are finite. We show by an example that such internal regulation can indeed significantly reduce maximum network delay. The remainder of this paper is organized as follows. All notation used in this paper is identical to that of Part I. In Section II we present a method for analyzing an arbitrary Manuscript received November 30, 1987; revised November 24, 1989. This work was supported by the National Science Foundation under Contracts NSF NCR-8804029 and NSF ECS 83 52030 (with matching funds provided by AT&T), and by a National Science Foundation Graduate Fellowship. This work was presented in part at IEEE INFO-COM '88, New Orleans, LA, March 1988. This work is based in part on a Ph.D. thesis at the University of Illinois at Urbana-Champaign [1]. The author is with the Department of Electrical and Computer Engineering, University of California at San Diego, Mail Code R-007, LaJolla, CA 92093. IEEE Log Number 9039286. interconnection of network elements. In Section III we consider two contrived but illustrative examples of the method presented in Section II. In Section IV we apply the analysis to a class of multistage networks and investigate the effect of employing regulation mechanisms inside the network. In Section V we consider a more elaborate regulation mechanism which results from the series interconnection of the regulation elements studied in Part I of this paper. Finally, we conclude in Section VI with some closing remarks. ### II. Analysis of Networks with Arbitrary Topologies We consider a model of a communication network operating in a packet switched mode under a fixed routing strategy. The model consists of the interconnection of network elements. Specifically, the network elements can be delay lines, receive buffers, demultiplexers, multiplexers, $(\sigma, \rho)$ regulators, or FIFO queues. We assume the elements are connected via error-free point-to-point communication links. We assume that there are S "sessions" labeled by the integers 1 through S. There are M network elements, labeled 1 through M. Each switching node in the network is modeled by one or more network elements. Each session consists of the data traffic which originates at some given node, exits at some other given node, and travels along some fixed route between those nodes. We assume that traffic from session k passes through $H_k$ network elements before exiting the network. Let $$H = \max_{k = 1, 2, \dots, S} H_k. \tag{2.1}$$ Let P(s,h) be the label of the hth network element along the route for session s. Define P(s,h) = 0 if $h > H_s$ . The rate of input traffic to the network for session s is represented by $R_s$ . We assume throughout this section that $$R_s \sim (\sigma_s, \rho_s)$$ (2.2) and that $$R_s(t) = 0 \tag{2.3}$$ for any session s and for all t < 0. For $0 < h \le H_s$ , let $R_s^h$ represent the rate of traffic for session s as it exits network element P(s,h). Define $R_s^0 = R_s$ and $$R_s^h = R_s^{H_s}$$ for $H \ge h > H_c$ . Fix T > 0. For $s = 1, 2, \dots, S$ and $h = 1, 2, \dots, H$ , define $$\sigma_s^h = \max_{0 \le u \le T} W_{\rho_s}(R_s^h)(u). \tag{2.4}$$ We write $R_{\widetilde{T}}(\sigma, \rho)$ if for all s, t such that $s \le t \le T$ there holds $$\int_{s}^{t} R(u) du \le \sigma + \rho(t - s). \tag{2.5}$$ It follows that $$R_{c\tau}^h(\sigma_c^h,\rho_s) \tag{2.6}$$ for all s and h. Note that as long as $T < \infty$ and (2.2) and (2.3) hold, $\sigma_s^h$ is finite. Our objective is to find an upper bound for $\sigma_s^h$ for each value of s and h that is independent of T. These upper bounds then imply that $R_s^h \sim (x_s^h, \rho_s)$ for all s and h, where $x_s^h$ is the upper bound found for $\sigma_s^h$ . Thus we can then use the results of Part I of this paper to find bounds on delay and buffering requirements at every network element and hence bounds on total network delay. We proceed now with obtaining upper bounds for $\sigma_s^h$ that are independent of T. For all sessions s and $0 < h \le H$ , we define the following: INPUTS $$(s,h) = \{(x,y): P(x,y+1) = P(s,h),$$ $$1 \le x \le S, \ 0 \le y \le H - 1\}.$$ (2.7) In words, INPUTS(s,h) is the set of (x,y) such that the traffic from session x enters network element P(s,h) immediately after passing through y network elements. Thus, $R_x^y$ represents the rate of a traffic stream which is an input to the network element P(s,h) if $(x,y) \in \text{INPUTS}(s,h)$ . Given s, h with $P(s, h) \neq 0$ we can use the results of Part I of this paper to analyze the network element P(s, h) in order to find burstiness constraints for $R_s^h$ that are valid before time T, using the burstiness constraints of the input traffic to the element that are valid before time T. Specifically, we can use the analysis of Part I of the paper to find $\hat{\sigma}_s^h$ , which is a function of $\sigma_s^y$ for $(x, y) \in \text{INPUTS}(s, h)$ , such that $$R_{s7}^h(\hat{\sigma}_s^h, \rho_s).$$ (2.8) It follows from (2.8) and (2.4) that $$\sigma_s^h \le \hat{\sigma}_s^h, \qquad 1 \le s \le S, \qquad 1 \le h \le H.$$ (2.9) Since $\hat{\sigma}_s^h$ is a function of $\sigma_x^y$ for $(x, y) \in \text{INPUTS}(s, h)$ , (2.9) describes a feasible region for $\sigma$ , where $\sigma$ is defined as the SH dimensional column vector with components $\sigma_s^h$ for $1 \le s \le S$ and $1 \le h \le H$ . If this region is bounded, we can obtain finite upper bounds, independent of T, for all of the $\sigma_s^h$ as desired. By using the results of the previous paper we can always find $\hat{\sigma}_s^h$ , which satisfies (2.8) such that $\hat{\sigma}_s^h$ is an affine function of $\sigma_x^y$ for $(x,y) \in \text{INPUTS}(s,h)$ . The coefficient of $\sigma_x^y$ in $\hat{\sigma}_s^h$ depends on $\rho_w$ for w such that session w passes through network element P(s,h). (Although the expression for $\hat{\sigma}_s^h$ is simple for any given case, it is complicated to write down an expression for it that holds in general, so we will not.) We can then rewrite (2.9) as $$\sigma \le A\sigma + c, \tag{2.10}$$ where the elements of the matrix A are nonnegative and depend only on $\rho_s$ for all sessions s, and c is a constant vector, independent of $\sigma$ . Now (2.10) can be rewritten as $$(I-A)\sigma \le c. \tag{2.11}$$ If the eigenvalues of A are strictly inside the unit circle, then Fig. 1. Network example 1. $(I-A)^{-1}$ exists and $$(I-A)^{-1} = I + A + A^2 + A^3 + \cdots$$ (2.12) Hence $(I - A)^{-1}$ has nonnegative elements in this case. Thus, we can premultiply (2.11) by $(I - A)^{-1}$ , and the inequality is preserved. Doing this, we obtain $$\sigma \le (I - A)^{-1}c. \tag{2.13}$$ Thus the boundedness of the region described in (2.9) is related to the eigenvalues of the matrix A, which are in turn related to the throughput allocations of the sessions using the network. It can easily be shown that if the throughput allocations are sufficiently small, then all eigenvalues of A are inside the unit circle. For example, if $\rho_s$ is replaced by $\lambda \rho_s$ for all s, all eigenvalues of A are inside the unit circle for $\lambda$ sufficiently small. In many cases we can tighten the above results. Specifically, we can attempt to find more general constraints of the form $R_s^h \sim b_s^h$ for all s, h. Such derived constraints may imply smaller upper bounds on delay and buffering requirements than the simple approach previously taken. Since we want to keep things simple, we do not elaborate further on this. #### III. ILLUSTRATIVE EXAMPLES OF NETWORK ANALYSIS In this section we analyze two examples of specific networks. The examples are contrived, but yet hopefully illustrate the concepts discussed in the previous section. Additional examples appear in Section IV. If there are no "loops" in the traffic pattern generated by the routes of the sessions, then network analysis is quite simple. Indeed, in this case all of the eigenvalues of the matrix A defined in the previous subsection are zero. The analysis in the previous section for general networks need not be explicitly used; instead, the judicious application of the results of Part I can be employed. To illustrate this, consider the network illustrated in Fig. 1. The network has two multiplexers labeled 1 and 2, and one demultiplexer. Both multiplexers are assumed to be work conserving. The output of multiplexer 1 feeds one input of multiplexer 2. All links have transmission rate 1. There are three sessions; the rate of input traffic for session k is represented by $R_k = R_k^0$ . Traffic from sessions 1 and 2 are input to multiplexer 1 on different input links, and pass through both multiplexers before being demultiplexed and exiting the network. Traffic from session 3 enters the network through an input link of multiplexer 2. Since no receive buffers are used, the model for this network example is similar to the "cut-through" model described by Kermani and Kleinrock [2]. Fig. 2. Network example 2. We assume that for each k we have $$R_k \sim (\sigma_k, \rho_k). \tag{3.1}$$ Using Remark 4 after Theorem 4.5 of Part I we have $$R_1^1 + R_2^1 \sim (\sigma_1 + \sigma_2, \rho_1 + \rho_2).$$ (3.2) By Theorem 4.2 (in particular (4.13)) of Part I, $D_1$ is an upper bound to the delay in multiplexer 1, where $$D_1 = \frac{\sigma_1 + \sigma_2}{1 - \rho_1 - \rho_2}. (3.3)$$ Now we have burstiness constraints for traffic at all points in the network, specifically for traffic at the input to multiplexer 2, and hence can again apply Theorem 4.2 of Part I to derive an upper bound $D_2$ for the delay in multiplexer 2: $$D_2 = \frac{\sigma_1 + \sigma_2 + \sigma_3}{1 - \rho_1 - \rho_2 - \rho_3}.$$ (3.4) Thus the total network delay for sessions 1 and 2 is upper bounded by $D_1 + D_2$ , and the total network delay for session 3 is upper bounded by $D_2$ . Upper bounds on buffering requirements at the two multiplexers are also easily derived, since we have derived burstiness constraints for the traffic at the input to all network elements. This style of analysis breaks down when there are loops in the traffic pattern. In this case we can apply the theory developed in the previous section. As an example, consider the network illustrated in Fig. 2. The network consists of the interconnection of four nodes, each of which has two input links and two output links. Each node is modeled by two work-conserving multiplexers and two demultiplexers, as illustrated in Fig. 3. The nodes are labeled 0 through 3, as illustrated in Fig. 2. One output link of node k is fed to one input link of node $k+1 \pmod{4}$ . All links have transmission rate 1. There are 4 sessions. The rate of input traffic for session k is repre- Fig. 3. Model of network node in network example 2. sented by $R_k = R_k^0$ . We assume that $$R_k \sim (\sigma, \rho) \tag{3.5}$$ for all k. In addition, we assume for simplicity that "L=0," i.e., a fluid traffic model. By definition, session k enters the network on an input link to node k and exits the network on an output link of node $k+3 \pmod 4$ after traveling through a total of 4 nodes. Thus there is only one multiplexer in each node that has traffic entering both inputs; we label this multiplexer k if it is contained in node k. For additional simplicity, we assume that data from session k gets the lowest priority at multiplexer k. Thus, if $R_k^j$ represents the rate of traffic for session k after passing through j multiplexers, then for each k we have $$R_k^1 = R_k^2 = R_k^3 = R_k^4. (3.6)$$ This is because packets from each session only have the potential for experiencing a nonzero delay at the first multiplexer encountered, where the packets from the session have the lowest priority. At all other multiplexers encountered, the packets have zero delay since they have priority and since "L=0." We assume $R_k(t) = 0$ for each session k and t < 0. Fix T > 0 and let $$\sigma_k^j = \max_{0 \le u \le T} W_\rho(R_k^j)(u). \tag{3.7}$$ Thus, $$R_{k\tilde{\tau}}^{j}(\sigma_{k}^{j},\rho).$$ (3.8) From (3.6) we obviously have $\sigma_k^1 = \sigma_k^2 = \sigma_k^3 = \sigma_k^4$ for all sessions k. Next we analyze multiplexer k. One input link carries traffic whose rate is represented by $R_k$ , the other input link carries traffic whose rate is represented by $R_{k+2}^1 + R_{k+3}^1$ (using (3.6)), where subscripts are interpreted modulo 4. From (3.8) we have $R_{k+2}^1 + R_{k+37}^1$ ( $\sigma_{k+2}^1 + \sigma_{k+3}^1$ , $2\rho$ ). Thus, from Theorem 4.5 (in particular (4.51)) of Part I, we have $$R_{k7}^1(\hat{\sigma}_k^1,\rho),$$ (3.9) where $$\hat{\sigma}_{k}^{1} = \sigma + \rho \left( \frac{\sigma_{k+2}^{1} + \sigma_{k+3}^{1}}{1 - 2\rho} \right)$$ (3.10) and all subscripts are modulo 4. From (3.7) and (3.9) we have $$\sigma_k^1 \le \hat{\sigma}_k^1. \tag{3.11}$$ Combining (3.10) and (3.11) we obtain $$\sigma \le A\sigma + c, \tag{3.12}$$ where $$\mathbf{\sigma} = \left(\sigma_0^1, \sigma_1^1, \sigma_2^1, \sigma_3^1\right)^T, \tag{3.13}$$ $$\mathbf{c} = (\sigma, \sigma, \sigma, \sigma)^T, \tag{3.14}$$ $$A = \begin{bmatrix} 0 & 0 & \alpha & \alpha \\ \alpha & 0 & 0 & \alpha \\ \alpha & \alpha & 0 & 0 \\ 0 & \alpha & \alpha & 0 \end{bmatrix}, \tag{3.15}$$ and $$\alpha = \frac{\rho}{1 - 2\rho} \,. \tag{3.16}$$ Now all eigenvalues of A are strictly inside the unit circle if and only if $\rho < \frac{1}{4}$ . Thus, if $\rho < \frac{1}{4}$ we have $$\sigma \le (I - A)^{-1} c < \infty. \tag{3.17}$$ Note our bound on $\sigma$ blows up at $\rho = \frac{1}{4}$ . Since three sessions share each link, this corresponds to a maximum steady state rate of $\frac{3}{4}$ on each link, well below the transmission capacity of 1 for each link. By considering a simple generalization of this example, where K nodes in a ring are connected rather than 4, we can find bounds which blow up when the maximum utilization on each link is about $\frac{1}{2}$ when K is large. This raises the following interesting question: For such examples, what is the largest $\rho$ such that network delays are bounded? We might expect that the answer is 1/K-1; since K-1 sessions share each link, this corresponds to a maximum utilization of 1. This question remains open for large K. It is interesting to note that if the switching nodes employ regulators internally, as described in Section IV-B, then we can derive finite bounds on delay as long as the utilization is less than 1. This should become clear to the reader after reading Section IV-B. ## IV. UPPER BOUNDS FOR MAXIMUM DELAY IN BUFFERED MULTISTAGE NETWORKS In this section we investigate the effect of using regulator elements defined in Part I inside switching nodes. The network model we study is the class of buffered multistage networks. An example of a multistage network is the omega network [3], [4] illustrated in Fig. 4. We will describe omega networks precisely in Section IV-C. A multistage network consists of stages of switching nodes. Data packets enter the switching nodes in the first stage and pass through a fixed number of stages to reach their destinations. We obtain bounds for delay in such multistage networks assuming buffers exist inside the network and the input traffic to the network satisfies burstiness constraints. For ease of exposition we consider a rather simple model. Furthermore, the bounds we obtain are not the best that one can derive using the techniques presented, but are simple to derive. In Section IV-A we study networks that do not employ any flow control mechanisms *internally*. We find that the bounds for network delay grow exponentially in the number of stages. In contrast, Section IV-B considers networks with switching nodes that employ the regulator elements as de- Fig. 4. Omega network. fined in Part I to control the flow of traffic between switching nodes. This allows us to find much smaller upper bounds for delay than that obtained in Section IV-A. Specifically, the upper bounds for network delay are proportional to the number of stages. In both Sections IV-A and IV-B, we assume that multiplexers inside each switching node operate with an FCFS service discipline. In Section IV-C, we obtain analogous results assuming only that the multiplexers inside switching nodes are LFCFS and work conserving. In this case, the bounds for network delay behave identically to those obtained in Sections IV-A and IV-B, depending on whether or not regulator elements are used within switching nodes. We describe a specific type of LFCFS work-conserving multiplexer, the exhaustive service multiplexer, for which the bounds apply. We give a specific example using exhaustive service multiplexers, where actual maximum delay can indeed grow exponentially in the number of stages if regulator elements are not used within switching nodes. Thus, we demonstrate that the use of regulation mechanisms inside a network can in fact reduce maximum network delay. #### A. Delay in Unregulated Multistage Networks We consider networks with $N=2^n$ inputs and N outputs which use $2\times 2$ switching nodes. An example for n=3 is the omega network in Fig. 4. Each stage has N/2 switching nodes. We assume that there are H stages. The stages are numbered from 1 to H. The first stage accepts input traffic. Outputs from the switches in stage s are linked to inputs of the switches in stage s are linked to inputs of the switches in stage s are the outputs of the network. We assume that all links in the network have transmission rate 1. The model for traffic is as follows. We assume that there are S sessions that are set up in advance. The rate of input traffic of the ith session is represented by $R_i^0$ . For all i we assume $R_i^0 \sim (\sigma_0, \rho)$ and that no packet is longer than L bits, where without loss of generality (see (2.3) in Part I) $$L \leq \frac{\sigma_0}{1-\rho} \, .$$ For any switching node we assume that the number of sessions that enter a given input link of the switch and exit a given output link of the switch is not greater than k. Thus the number of sessions sharing any link between successive stages is at most 2k. We require that $2k\rho < 1$ . Let $R_i^s$ be the rate function for the *i*th session as it exits stage s. Fig. 5. Model for each $2 \times 2$ switching node. Fig. 6. Illustration of fact. Fact: delay through multiplexer $\leq (m - 1)^{-1}$ 1)[ $L\rho^* + \sigma^*(1-\rho^*)^{-1}$ ]. Maximum packet length = L bits. The switches do not employ receive buffers, that is, the network operates in a cut-through mode. The model for each switch, in terms of the network elements defined in Part I, is illustrated in Fig. 5. Consider the FCFS multiplexer illustrated in Fig. 6, where m streams on m input links are multiplexed onto a single output link; all links have transmission rate 1. Packets on the output link are transmitted in FCFS order. Suppose that for all i the ith input stream has rate function $\tilde{R}_i$ and that $\tilde{R}_i \sim (\sigma^*, \rho^*)$ , where $m\rho^* < 1$ . We also assume that no packet is longer than L bits and that $$L \leq \frac{\sigma^*}{1-\rho^*}$$ . We use the following fact, which can be proved using arguments identical to the proof of Theorem 4.1 of Part I. Fact: It follows that the delay of any packet through the multiplexer illustrated in Fig. 6 is upper bounded by $$(m-1)\left[L\rho^*+\frac{\sigma^*}{1-\rho^*}\right].$$ Theorem 4.1: For the model for a multistage network, we $$R_i^s \sim (\sigma_s, \rho)$$ , for all $i = 1, 2, \dots, S$ and $s = 1, 2, \dots, H$ , where $$\sigma_s = (1 - k\rho)^{-s} [\rho L(1 - k\rho) + \sigma_0] - \rho L(1 - k\rho).$$ (4.2) Furthermore, the delay of any packet through a switch in stage s is no more than $D_s$ , where $$D_s = k(1 - k\rho)^{-s} [\rho L(1 - k\rho) + \sigma_0]. \tag{4.3}$$ Thus the total delay of each session is no more than $$\overline{D} = \sum_{s=1}^{H} D_s$$ $$= \left[ L(1 - k\rho) + \sigma_0 \rho^{-1} \right] \left[ (1 - k\rho)^{-H} - 1 \right]. \quad (4.5)$$ $$= \left[ L(1-k\rho) + \sigma_0 \rho^{-1} \right] \left[ (1-k\rho)^{-H} - 1 \right]. \quad (4.5)$$ Proof: We proceed by induction on s. The induction hypothesis is that $R_i^{s-1} \sim (\sigma_{s-1}, \rho)$ . The base case s=1 holds by hypothesis. Fix $1 \le s \le H$ . Fix any switch in stage s. Label the input links to the switch as 0 and 1; label the output links to the switch as 0 and 1. Let $A_{x,y}$ be the set of all i such that session i enters input link x and exits output y of the given switch. The two input streams for the multiplexer which feeds output link y have rate functions $\sum_{i \in A_{0,y}} R_i^{s-1}$ and $$\sum_{i \in A_{1,y}} R_i^{s-1}.$$ By the induction hypothesis, it follows that $$\sum_{i \in A_{x,y}} R_i^{s-1} \sim (k\sigma_{s-1}, k\rho)$$ (4.6) for all x, y since $|A_{x,y}| \le k$ . Thus by the previous fact using $m=2, \ \sigma^*=k\sigma_{s-1}$ and $\rho^*=k\rho$ , it follows that the delay of any packet through the switch is upper bounded by $D_s$ , where $$D_s = \frac{k\sigma_{s-1}}{1 - k\rho} + k\rho L. \tag{4.7}$$ Therefore, since the switch in stage s is arbitrary, it follows by the induction hypothesis and Theorem 2.1 of Part I that $R_i^s \sim (\sigma_s, \rho)$ for all i, where $$\sigma_s = \sigma_{s-1} + \rho D_s. \tag{4.8}$$ Solving the recursion for all s given by (4.8) and (4.7), we obtain (4.2). Equation (4.3) then follows from (4.7) and (4.2). Note that the bounds for the total delay grow exponentially in H. It would be interesting to determine whether actual delay can grow exponentially for this model. In Section IV-C we give an example, for a different operation of the multiplexers, where actual delay can grow exponentially. In the next section we consider a different model of operation for the switching nodes that utilizes regulator elements, and obtain bounds for the total delay that are proportional to H. #### B. Delay in Regulated Multistage Networks In this section we assume the same model of a buffered multistage network defined in Section IV-A, but we assume that each switching node operates in a manner illustrated by the schematic in Fig. 7. As the illustration suggests, we assume that the sessions are all demultiplexed and then sent through a $(\sigma, \rho)$ regulator before being multiplexed on the appropriate output link. Note that $\sigma$ is a parameter of the network. Fig. 7. Model for each $2\times 2$ (regulated) switching node. In this section we prove the following theorem. For convenience, we assume that $$\sigma_0 \le \sigma \Big[ 1 + (2k - 1)\rho (1 - \rho)^{-1} \Big] + L \Big[ (2k - 1)\rho (1 + \rho) + 1 - \rho \Big]. \quad (4.9)$$ If this is not the case, then the results and the proof of the theorem can be easily modified. We make this assumption so that we can obtain a bound on delay in the first stage, which is the same as the bounds for delay in all other stages. Theorem 4.2: For the previous model, it follows that $$R_i^s \sim (\bar{\sigma}, \rho) \tag{4.10}$$ for all $i = 1, 2, \dots, S$ and $s = 1, 2, \dots, H$ , where $$\bar{\sigma} = \sigma \Big[ 1 + (2k - 1)\rho (1 - \rho)^{-1} \Big] + L \Big[ (2k - 1)\rho (1 + \rho) + 1 - \rho \Big]. \quad (4.11)$$ Furthermore, the delay of any packet through a switch in stage s is no more than $\tilde{D_s}$ , where $$\tilde{D}_s = L(\rho^{-1} - 1) + (4k - 2) \left[ (1 + \rho)L + \frac{\sigma}{1 - \rho} \right]. \quad (4.12)$$ Thus the total delay of any session is upper bounded by $$\tilde{D} = \sum_{s=1}^{H} \tilde{D_s} \tag{4.13}$$ $$= H\left\{L(\rho^{-1}-1) + (4k-2)\left[(1+\rho)L + \frac{\sigma}{1+\rho}\right]\right\}. \quad (4.14)$$ **Proof:** Let $\tilde{R}_i^{s-1}$ represent the rate for traffic from session i as it exits the $(\sigma, \rho)$ regulator in the appropriate switch in stage s. From the results of Part I we can conclude that $\tilde{R}_i^{s-1} \sim (\sigma + (1-\rho)L, \rho)$ for all i and s. Thus, using the fact in Section IV-A with m = 2k, $\sigma^* = \sigma + (1-\rho)L$ , and $\rho^* = \rho$ , we can conclude that the delay through any multiplexer in any switch is upper bounded $D_2^*$ , where $$D_2^* = (2k - 1) \left[ (1 + \rho) L + \frac{\sigma}{1 - \rho} \right]. \tag{4.15}$$ Hence by Theorem 2.1 of Part I, we have $$R_i^s \sim (\bar{\sigma}, \rho) \tag{4.16}$$ for all i and $s \ge 1$ , where $\overline{\sigma} = \sigma + (1 - \rho)L + \rho D_2^*$ . In fact, (4.16) also holds for s = 0 by our assumption (4.9) about $\sigma_0$ . This proves (4.10). By (4.10) and our results for the delay in regulators, it follows that the delay through *any* regulator in *any* switch is upper bounded by $D_1^*$ , where $$D_1^* = \rho^{-1}(\bar{\sigma} - \sigma) \tag{4.17}$$ $$= (2k-1)\left[ (1+\rho)L + \frac{\sigma}{1-\rho} \right] + (\rho^{-1}-1)L. \quad (4.18)$$ Thus the delay through any switch in stage s is upper bounded by $$D_1^* + D_2^* = \tilde{D}_s. {4.19}$$ C. Delay in Multistage Networks with LFCFS Service Disciplines In the preceding two sections we analyzed multistage networks which used FCFS multiplexers. We can easily obtain bounds on network delay when we require only that the multiplexers are LFCFS and work conserving. Such bounds are contained in Theorems 4.3 and 4.4. Since the analysis is almost identical to that of the preceding sections, we omit the proofs. Theorem 4.3: Consider the unregulated multistage network of Theorem 4.1, where we now assume that the multiplexers are merely LFCFS and work conserving. All other assumptions and notation are the same. It follows that $$R_i^s \sim (\tilde{\sigma}_s, \rho),$$ for all $s = 1, 2, \dots, H$ and $i = 1, 2, \dots, S,$ $$(4.20)$$ where $$\tilde{\sigma}_{s} = (1 - (2k - 1)\rho)^{-s} \sigma_{0}. \tag{4.21}$$ Furthermore, the delay of any packet through a switching node in stage s is upper bounded by $\hat{D}_s$ , where $$\hat{D}_{s} = \frac{2k\,\tilde{\sigma}_{s-1}}{1 - (2k-1)\rho} \tag{4.22}$$ $$=2k(1-(2k-1)\rho)^{-s}\sigma_0. \tag{4.23}$$ Finally, the total delay of each session is no more than $$\hat{D} = \sum_{s=1}^{H} \hat{D}_s \tag{4.24}$$ $$= \rho^{-1} \sigma_0 \left[ \left( 1 - (2k - 1)\rho \right)^{-H} - 1 \right] \left[ \frac{2k}{2k - 1} \right]. \quad (4.25)$$ We will give an example to show that delay can indeed grow exponentially in the number of stages. First we state a much smaller upper bound that holds when regulator elements are used inside switching nodes. Theorem 4.4: Consider the regulated multistage network of Theorem 4.2 where we now assume that the multiplexers inside each switching node are merely LFCFS and work conserving. All notation and other assumptions are the same except that we assume $$\sigma_0 \le \frac{\sigma + (1 - \rho)L}{1 - (2k - 1)\rho}$$ (4.26) Fig. 8. Example of port labeling scheme for single stage. It follows that $$R_i^s \sim (\bar{\sigma}', \rho),$$ for all $i = 1, 2, \dots, S$ and $s = 1, 2, \dots, H$ , $$(4.27)$$ where $$\bar{\sigma}' = \frac{\sigma + (1 - \rho)L}{1 - (2k - 1)\rho}.$$ (4.28) The total delay of each session is upper bounded by $$\hat{D}' = H \left[ \frac{(\bar{\sigma}' - \sigma)}{\rho} + \frac{2k(\sigma + (1 - \rho)L)}{1 - (2k - 1)\rho} \right]. \quad (4.29)$$ The remainder of this section is devoted to an example of an unregulated multistage network where the maximum delay can indeed grow exponentially in the number of stages. We formally define the binary omega network with $2^n$ inputs as follows. As the name suggests, the network has $N=2^n$ inputs and N outputs. The network consists of nN/2 2×2 switching nodes. The nN/2 switching nodes are partitioned into n groups of N/2 switches. Each group is referred to as a stage. Each $2\times 2$ switch has 2 input ports and 2 output ports. We number the input ports of any given stage 0 through N-1, inclusive. The output ports of each stage are labeled similarly. The numbering is done such that the labels of the input ports belonging to the same switching node are consecutive and that the output ports of any switching node have the same labels as the input ports to that same switch. It is convenient to represent the label for any input or output port by its binary expansion. An example of the port labeling scheme is shown in Fig. 8 for the case n=3. The stages are numbered 1 through n. The interconnection of switching nodes is such that all of the output ports of stage i, $1 \le i \le n-1$ , are connected to all of the input ports of stage i+1 as follows. The output port $(a_{n-1}, a_{n-2}, \cdots, a_1, a_0)$ at stage i is connected via a directed link to the input port $(a_0, a_{n-1}, a_{n-2}, \cdots, a_2, a_1)$ at stage i+1. This interconnection between stages is commonly referred to as a "reverse shuffle." The input ports to stage 1 serve as the N "inputs" to the network; the N output ports of stage n serve as the "outputs" of the network. For convenience, we label each link connecting switching nodes as follows: for $1 \le s \le n$ , $(s; a_{n-1} \cdots a_1 \ a_0)$ is the label for the link between output $(a_{n-2} \ a_{n-3} \cdots a_1 \ a_0 \ a_{n-1})$ in stage s and input $(a_{n-1} \ a_{n-2} \cdots a_0)$ in stage s+1. Sources are connected to the inputs of the network in stage 1, and destinations are connected to the outputs of the network in stage n. We also label the sources and destinations by binary sequences of length n. The link connecting source a to input a in stage 1 is labeled (0; a). By convention, the destinations are said to be in stage n+1. Thus $(n; a_{n-1} \cdots a_1 \ a_0)$ is the label for the link between output $(a_{n-2} \ a_{n-3} \cdots a_1 \ a_0 \ a_{n-1})$ in stage n and destination $(a_{n-1} \ a_{n-2} \cdots a_0)$ . It is easily verified that there exists a unique path between any source and any destination. Consider a binary omega network with $N = 2^n$ inputs, as previously described. Each switching node is modeled as in Theorem 4.3 (see Fig. 5) and the multiplexers use an "exhaustive" discipline (we also assume that the multiplexers are LFCFS and work conserving). This means that once a multiplexer starts transmitting data from an input stream, it continues to transmit data at maximum rate from that stream as long as possible (i.e., without violating the work conserving assumption) before transmitting data from another input stream. As an example, consider an exhaustive multiplexer with two input links. Suppose the input and output links have transmission rate 1 and that the "L = 0" model is used. Let the rates of input streams 1 and 2 be represented by $R_1$ and $R_2$ , respectively, and let $R_1^{\text{out}}$ and $R_2^{\text{out}}$ represent the rates of streams 1 and 2 as they exit the multiplexer. Thus $R_1^{\text{out}} + R_2^{\text{out}}$ represents the rate of all the traffic flowing on the output link of the multiplexer. For $x \ge 0$ define the unit pulse of length $x, P_x$ , as $$P_{x}(t) = \begin{cases} 0, & t < 0, \\ 1, & 0 \le t < x, \\ 0, & t \ge x, \end{cases}$$ (4.30) for all t. For $x \ge 0$ and integers $k \ge 0$ define the function $h_{x,k}$ as $$h_{x,k}(t) = P_x(t) + \rho P_{\alpha x(\alpha - 1)^{-1}(\alpha^k - 1)}(t - x)$$ = $P_x(t) + \rho P_{x(\Sigma_{t-1}^k, \alpha^t)}(t - x)$ (4.31) for all t, where $\rho$ is a given constant satisfying $0<\rho<1/2$ and $\alpha=(1-\rho)^{-1}$ . Suppose that for all t $$R_1(t) = h_{x,k}(t) (4.32)$$ and $$R_2(t) = h_{x,0}(t) = P_x(t),$$ (4.33) where x > 0 and k > 0. At time zero, data begins to arrive from both streams at rate 1. Suppose the multiplexer begins to transmit data from stream 2 at time zero. As a consequence, the multiplexer must continue to transmit data from stream 2 until time x. At time x, the multiplexer must begin to transmit data from stream 1 at rate 1. It can do so for $x\alpha$ units of time, since x bits from stream 1 are queued at time x and these bits are depleted at rate $1 - \rho$ . Thus we have $$R_1^{\text{out}}(t) = h_{\alpha x, k-1}(t-x)$$ (4.34) and $$R_2^{\text{out}}(t) = R_2(t) = h_{x,0}(t),$$ (4.35) for all t. In the following, the basic idea is to recursively utilize this example ((4.32)–(4.35)) for the operation of the exhaustive multiplexer. Given any binary sequence $\mathbf{a} = (a_{n-1} \ a_{n-2} \ \cdots \ a_1 \ a_0)$ of length n, define the operators $(\leftarrow_0)$ , $(\leftarrow_1)$ , and $f_m$ as $$(\leftarrow_0) \mathbf{a} = (a_{n-2} \ a_{n-3} \ \cdots \ a_0 \ 0),$$ $(\leftarrow_1) \mathbf{a} = (a_{n-2} \ a_{n-3} \ \cdots \ a_0 \ 1),$ $f_m \mathbf{a} = (a_{n-1} \ \cdots \ a_{m+1} \ \overline{a}_m \ a_{m-1} \ \cdots \ a_0),$ where $\bar{a}$ denotes the complement of a. Also define $\phi a$ to be the number of trailing (least significant positions) zeros of a. Thus, for example, if the least significant bit of a is one, then for $0 \le m \le n$ there holds $\phi (\leftarrow_0)^m a = m$ . Finally, define the operator $\theta_m$ for $0 \le m \le n$ by $\theta_m a = \phi[(\leftarrow_0)^m a] - m$ for all $a \in \{0,1\}^n$ . We suppose that there are N sessions, each entering the network at a different input: the session a that enters at input a is routed to destination $(\leftarrow_0)a$ . Let $R_a^0$ represent the rate of traffic from session a as it enters the network, and let $R_a^m$ represent the rate of traffic from session a as it exits stage m. We assume that $$R_a^0 \sim (\sigma_0, \rho) \tag{4.36}$$ for all $\mathbf{a} \in \{0,1\}^n$ , where $\sigma_0$ and $\rho$ are given constants with $0 < \rho < \frac{1}{2}$ . We also assume the "L = 0" model of traffic. Let $g_m \mathbf{a}$ be the label of the link between stage m and stage m+1 in the unique path from input $\mathbf{a}$ to destination ( $\leftarrow_0$ ) $\mathbf{a}$ . Thus for example, $$g_{0}\mathbf{a} = (0; a_{n-1} \cdots a_{1} a_{0}),$$ $$g_{1}\mathbf{a} = (1; 0 \ a_{n-1} \cdots a_{2} a_{1}),$$ $$g_{2}\mathbf{a} = (2; a_{0} \ 0 \ a_{n-1} \cdots a_{3} a_{2}),$$ $$\cdots$$ $$g_{n-1}\mathbf{a} = (n-1; a_{n-3} \cdots a_{1} a_{0} \ 0 a_{n-1}),$$ $$g_{n}\mathbf{a} = (n; a_{n-2} \cdots a_{1} a_{0} \ 0).$$ A moment's look at this reveals that there are at most two sessions that flow through any link. For example, only sessions $\boldsymbol{a}$ and $f_m\boldsymbol{a}$ pass through link $g_{m+1}\boldsymbol{a}$ , and half of the links between any two stages are unused. Thus Theorem 4.3 applies with k=1. We now give a specific example for $R_a^0$ , $a \in \{0, 1\}^n$ , satisfying (4.36), which leads to possible network delay growing exponentially in n. Let $$R_a^0 = h_{w,\phi a}$$ for all $a \in \{0,1\}^n$ , (4.37) where $$w = \sigma_0 (1 - \rho)^{-1}.$$ We show by induction on m that the following holds for all $a \in \{0,1\}^n$ , t, and all $m = 0,1,2,\cdots,n$ : $$R_{(\leftarrow_{i_1})^m a}^m(t) = h_{\alpha^m w, \theta_m a} \left( t - \left( \sum_{j=1}^m \alpha^{j-1} \right) w \right)$$ $$= h_{\alpha^m w, \theta_m a} \left( t - w \frac{\alpha^m - 1}{\alpha - 1} \right). \tag{4.38}$$ To prove (4.38), we assume that whenever a multiplexer begins to receive data from both streams at the same time at full rate, the stream from the input link which has "1" as a least significant bit in its label is transmitted first. The base case m = 0 holds by hypothesis. Suppose then that (4.38) holds for some fixed value of m. We now show that it also holds when m is replaced by m + 1. Fix a. Now the stream whose rate is represented by $$R_{t\leftarrow \lambda^{m+1}}^{m+1}$$ is transmitted on link $g_{m+1}((\leftarrow_0)^{m+1}a)$ , multiplexed together with the stream whose rate is represented by $$R_{f_m(\leftarrow_0)^{m+1}a}^{m+1}$$ . Thus the rates of the input streams to the multiplexer that feed link $g_{m+1}((\leftarrow_0)^{m+1}a)$ are represented by $$R^m_{(\leftarrow, \gamma^{m+1})}$$ a nd $$R_{f_m(\leftarrow_0)^{m+1}a}^m$$ . Now $$(\leftarrow_0)^{m+1} \mathbf{a} = (\leftarrow_0)^m (\leftarrow_0) \mathbf{a}$$ and $$f_m(\leftarrow_0)^{m+1}\boldsymbol{a} = (\leftarrow_0)^m(\leftarrow_1)\boldsymbol{a}.$$ Thus, since $$\theta_m(\leftarrow_0) \boldsymbol{a} = \theta_{m+1} \boldsymbol{a} + 1 \text{ and } \theta_m(\leftarrow_1) \boldsymbol{a} = 0,$$ it follows from the induction hypothesis that $$R_{f_{m}(\leftarrow_{0})^{m+1}a}^{m}(t) = h_{\alpha^{m}w,0} \left( t - \left( \sum_{j=1}^{m} \alpha^{j-1} \right) w \right) \quad (4.39)$$ and $$R_{(\leftarrow_{0})^{m+1}a}^{m}(t) = h_{\alpha^{m}w,\theta_{m+1}a+1} \left( t - \left( \sum_{j=1}^{m} \alpha^{j-1} \right) w \right), \text{ for all } t.$$ (4.40) Since session $f_m(\leftarrow_0)^{m+1}a$ enters an input of stage m+1 whose least significant bit is 1, traffic from this session commences transmission on link $g_{m+1}((\leftarrow_0)^{m+1}a)$ at time $$\sum_{j=1}^{m} \alpha^{j-1} w$$ and continues for $\alpha^m w$ units of time. Thus $$R_{(\leftarrow_0)^{m+1}a}^{m+1}(t) = h_{\alpha^{m+1}w,\theta_{m+1}a} \left( t - \left( \sum_{j=1}^{m+1} \alpha^{j-1} \right) w \right). \quad (4.41)$$ Since a is arbitrary, this completes the proof of (4.38). Now note that (4.38) with m=n implies that no traffic from session (0 0 ··· 0) reaches its destination until time $w(\alpha^n - 1)(\alpha - 1)^{-1}$ . Hence delay can indeed grow exponentially In summary, we have seen in this section that the use of regulator elements inside switching nodes enables us to obtain smaller upper bounds to delay than we can otherwise obtain. We have also demonstrated that such a use reduces actual maximum delay when an exhaustive service discipline is used in the multiplexer. It would be interesting to determine if a similar statement can be made for the FCFS service discipline. Another interesting problem is to investigate the effect of such use of regulator elements on average delay. ### V. REGULATOR ELEMENTS IN SERIES In this section we investigate the effect of connecting regulator elements in series. Lemma 5.1 of Part I will play a key role in this section. Consider a system as illustrated in Fig. 9. For simplicity at first we assume that all of the Fig. 9. Illustration of $(\sigma, \rho)$ regulators connected in series. regulator elements are $(\sigma, \rho)$ regulators and all links have the common transmission capacity C: later we will study what happens when one of the regulator elements in the chain is a FIFO queue. $R_k$ represents the rate of the traffic output of the kth regulator and $R_0$ represents the rate of the traffic input to this system. Let $d_j^{(k)}$ be the difference in the time that the jth packet begins to exit the kth regulator and the time it begins to arrive to the system. Thus for all kwe have $$R_k(t) = C \sum_{j=1}^{\infty} I_{\{s_j + d_j^{(k)} \le t < s_j + d_j^{(k)} + L_j / C\}},$$ where for all j, $d_j^{(0)} = 0$ , $s_j$ is the system arrival time for packet j, $L_i$ is the length in bits of packet j, and $s_i + L_i/C$ $\leq s_{i+1}$ . Theorem 5.1: For all $k = 1, 2, \dots, n$ there holds: a) $$W_{\rho_k}(R_n)(s_j+d_j^{(n)}) \le \sigma_k, \quad j \ge 1,$$ b) $R_n \sim (\sigma_k + (1-\rho_k/C)L, \rho_k),$ b) $$R_n^{-k} \sim (\hat{\sigma}_k + (1 - \rho_k / C)L, \rho_k),$$ c) $$d_j^{(k)} = \max_{m=1,2,\cdots,k} \left[ \frac{1}{\rho_m} (W_{\rho_m}(R_0)(s_j) - \sigma_m)^+ \right], \quad j \ge 1.$$ Remarks: 1) Note that b) implies that $$\int_{x}^{y} R_{n}(u) du$$ $$\leq \min_{k=1,2,\dots,n} \left( \sigma_{k} + (1 - \rho_{k}/C) L + \rho_{k}(y - x) \right)$$ for all $y \ge x$ . Thus, the output traffic of the entire system satisfies this constraint and the entire system can be thought of as a regulator. More generally, if b is a concave and increasing function defined on the nonnegative real line we can lower bound b: $$b(t) \ge \min_{k=1,2,\dots,n} (\bar{\sigma}_k + \bar{\rho}_k t)$$ for all t and appropriate choice for the $\bar{\sigma}_k$ 's and the $\bar{\rho}_k$ 's. We can make the inequality as tight as we wish by choosing n large. If we wish to regulate a traffic stream $R_0$ to produce an output stream $R_n$ , which approximately satisfies $$R \sim h$$ (the approximation gets better as L approaches 0) we can simply use this type of system to perform the regulation. - 2) Note that by c), $d_i^{(n)}$ is the maximum of n terms. The kth term is the delay that would be incurred by the jth packet if the input stream were fed to a single $(\sigma_k, \rho_k)$ regulator alone. - 3) Note that $d_i^{(n)}$ does not depend on the order in which the regulators are connected. Thus, the input-output behavior of the system is invariant with respect to reordering of the regulators. - 4) The implementation of the entire system need not consist of n "separate" regulators. We could simply delay the jth packet by $d_i^{(n)}$ as given in c). 5) In view of c) and the definition of the $(\sigma, \rho)$ regulator, the entire system gives the smallest possible delay for each packet among all systems that retransmit packets in FCFS order and produce an output stream $R_n$ satisfying a) for all k. Proof of Theorem 5.1: To prove a), we apply Lemma 5.1 of Part I *n* times: in the *m*th application we use $R_{m-1}$ , $R_m$ , $s_j + d_j^{(m-1)}$ , $s_j + d_j^{(m)}$ , $d_j^{(m)} - d_j^{(m-1)}$ , $\rho_m$ in place of $R_0$ , $R_1$ , $s_j$ , $t_j$ , $d_j$ , $\rho$ , respectively. This implies $$W_{\bar{p}}(R_m)(s_j + d_j^{(m)}) = \left[W_{\bar{p}}(R_0)(s_j) - \bar{p}d_j^{(m)}\right]^+,$$ for all $\bar{p}$ and $m = 1, 2, \dots, n$ . (5.1) In particular, it follows that $$W_{o_i}(R_m)(s_i+d_i^{(m)})$$ is nonincreasing in m for each fixed j and k. On the other hand, by virtue of the regulation in the kth regulator we $$W_{\rho_k}(R_k)(s_j+d_j^{(k)})\leq \sigma_k.$$ Thus a) follows. By the remarks about the $(\sigma, \rho)$ regulator in Part I, b) follows easily from a). We prove c) by induction. It holds for k = 1 by (5.3) of Part I. Suppose that it holds for some fixed k; we now show it still holds when k is replaced by k + 1. By (5.3) of Part I and (5.1) in this paper we have $$d_j^{(k+1)} - d_j^{(k)} = \frac{1}{\rho_{k+1}} \left( \left[ W_{\rho_{k+1}}(R_0)(s_j) - \rho_{k+1} d_j^{(k)} \right]^+ - \sigma_{k+1} \right)^+.$$ (5.2) Now if $d_i^{(k+1)} - d_i^{(k)} > 0$ then all plus sign superscripts in (5.2) can be removed, yielding $$d_j^{(k+1)} = (\rho_{k+1})^{-1} (W_{\rho_{k+1}}(R_0)(s_j) - \sigma_{k+1})^+ > d_j^{(k)}.$$ On the other hand, if $d_i^{(k+1)} - d_i^{(k)} = 0$ , it follows from (5.2) $$(\rho_{k+1})^{-1} \big( W_{\rho_{k+1}}(R_0)(s_j) - \sigma_{k+1} \big) \le d_j^{(k)}$$ and hence $$(\rho_{k+1})^{-1} (W_{\rho_{k+1}}(R_0)(s_i) - \sigma_{k+1})^+ \le d_i^{(k)},$$ since $d_i^{(k)} \ge 0$ . It follows that $$d_i^{(k+1)} = \max \left( d_i^{(k)}, (\rho_{k+1})^{-1} \left( W_{\rho_{k+1}}(R_0)(s_i) - \sigma_{k+1} \right)^+ \right).$$ This completes the induction step. If we wish to construct a regulator, in the general sense of Remark 1), which produces an output stream on a link with transmission rate smaller than that of the input link to the regulator, we could simply feed the output of a system such as in Fig. 9 into a FIFO with the appropriate input and output transmission rates. For example, consider the system Fig. 10. (a) Illustration of general regulator. (b) Another implementation. (c) Another implementation. in Fig. 10(a). The total delay for the *j*th packet in the input stream $R_0$ , using the usual notation to define $R_0$ , is given by $$d_{j}^{(n+1)} = \max_{m=1,2,\dots,n+1} \left[ \frac{1}{\rho_{m}} (W_{\rho_{m}}(R_{0})(s_{j}) - \sigma_{m})^{+} \right], \qquad j \geq 1,$$ where $\sigma_{n+1} = 0$ and $\rho_{n+1} = C_{\text{out}}$ . The output traffic $R_{n+1}$ of this entire system satisfies $R_{n+1} \sim (\sigma_k + (1 - \rho_k / C_{\text{out}})L, \rho_k)$ for all $k = 1, 2, \dots, n+1$ . Using the same reasoning as in Theorem 5.1, we can analyze the systems in Figs. 10(b) and 10(c). We then see that the input-output behavior of all three systems in Figs. 10(a), 10(b), and 10(c) are identical. The remarks following Theorem 5.1 apply to these three systems as well. In fact, the FIFO in Fig. 10(c) can be inserted between any of the $(\sigma,\rho)$ regulators, and the input-output behavior of the entire system remains unchanged. Of course, all links preceding the FIFO should have transmission rate $C_{\rm in}$ and all links following the FIFO should have transmission rate $C_{\rm out}$ . #### VI. CONCLUSION In this paper, we have presented a fairly general model for communication networks. Our model differs from most in our assumptions about the traffic entering the network, which is assumed to be unknown rather than random. Our model allows us to easily obtain bounds for network delay, buffering requirements, and throughput. This contrasts with other models, where it is often difficult, if not impossible, to perform an exact analysis. The overall methodology of analyzing network elements, first in isolation and then relating the results, gives a powerful way of generating bounds. To obtain tight bounds, however, it may be necessary to analyze the network as a whole at the outset. An example of such a study in the context of dynamic routing with our traffic model is given in [5]. As another example, Sasaki [6] effectively considers the strong coupling between receive buffers with attached multiplexers. We believe the open problems we have encountered in the analysis for our model are of considerable theoretical interest. Among the most interesting open problems we have encountered is the identification of throughput regions where network delay is bounded. So far, we have only obtained bounds on such regions that may well be somewhat loose. We have found that the use of regulation elements inside switching nodes allows us to obtain larger throughput regions than we have otherwise obtained, and we have given an example where actual maximum delay is substantially reduced by such use. An interesting problem is to determine the effect of such regulation on average delay. A worst case analysis, which is in a sense what we have done, may be appropriate in many types of parallel and/or distributed computing applications. However, such analyses can be overly pessimistic and lead to "over-designing" in other applications. Although the traffic model we have considered is quite general, it only partially characterizes traffic. This leads to the interesting topic of synthesizing new traffic models that further characterize traffic in a useful way. We feel that the model presented here can be combined with statistical models to obtain useful and interesting results. #### ACKNOWLEDGMENT The author would like to thank Prof. Bruce Hajek for many helpful discussions. Thanks also to the Associate Editor, the anonymous referees, M. C. Chuah, and D. Roberts, who have all made comments that have been very helpful in revising the manuscripts. #### REFERENCES - R. Cruz, "A calculus for network delay and a note on topologies of interconnection networks," Technical report UILU-ENG-87-2246, Coordinated Science Laboratory, Univ. of Illinois at Urbana-Champaign, 1987. - [2] P. Kermani and L. Kleinrock, "Virtual cut-through: A new computer communication switching technique," Computer Networks, vol. 3, pp. 267–286, 1979. - [3] D. Lawrie, "Access and alignment of data in an array processor," *IEEE Trans. Comput.*, vol. C-24, no. 12, pp. 1145–1155, Dec. 1975 - [4] D. Mitra and R. Cieslak, "Randomized parallel communications on an extension of the omega network," J. ACM, vol. 34, no. 4, pp. 802–824, Oct. 1987. - [5] R. Cruz and M. C. Chuah, "Worst case analysis of a simple routing problem," Proc. 28th IEEE CDC, pp. 2560–2566, Dec. 1989. Also submitted to IEEE Trans. Automatic Contr. - [6] G. Sasaki, "Input buffer requirements for round robin polling systems," Proc. 27th Ann. Allerton Conf. Commun., Contr., and Computing, Monticello, IL, Sept. 27-29, 1989. - [7] R. L. Cruz, "A calculus for network delay, Part I: Network elements in isolation," *IEEE Trans. Inform. Theory*, vol. 37, no. 1, pp. 114–131, Jan. 1991.