Math 606, Spring 2024
University of Massachusetts Amherst
2024-04-21
We turn now to continuous Markov processes X_t where t \in [0,\infty). The simplest such example of such process is the ubiquitous Poisson process.
Counting processes N_t are continuous time stochastic process which counts the number of events occuring up to time t: N_t = \textrm{total number of events occuring up to time } t e.g. the number of accidents reported to an insurance company, the number of earthquakes in California, and so on.
The Poisson process is the simplest counting process, intuitvely characterized by the following properties:
The number of events occuring in disjoint time intervals are independent.
The rate at which event occur is constant over time.
The events occur one at a time.
More formally a Poisson process N_t with rate paratmeter \lambda is a continuous time stochastic process such that
Independent increments: Given times s_1\le t_1 \le s_2 \le t_2 \cdots \le s_n \le t_n the random variables N_{t_i}-N_{s_i} (that is the number of events occurring in the disjoint time intervals [s_i,t_i]) are independent.
The rate at which events occur is denoted by \lambda>0 and for any (small) time interval [t, t + \Delta t] we have
\begin{aligned}
P \{ N_{t + \Delta t } = N_t \} &= 1- \lambda \Delta t +o(\Delta t) & \\
P\{N_{t + \Delta t}= N_t +1\} &= \lambda \Delta t +o(\Delta t) &
\textrm{ with } \lim_{\Delta t \to 0}\frac{o(\Delta t)}{\Delta t}=0 \\
P\{N_{t + \Delta t}\ge N_t +2\} & = o(\Delta t) &
\end{aligned}
\tag{1.1}
Theorem 1.1 (Distribution of the Poisson process) The Poisson process N_t with N_0=0 has the distribution P\{N_t=k\} = e^{-\lambda t}\frac{(\lambda t)^k}{k!} i.e. N_t has Poisson distribution with paramter \lambda t. Morever for s, M_t=N_{t+s}-N_{s} is a Poisson process.
Proof (version 1 using Poisson limit). Pick a large number n and divide the interval [0,t] into n intervals of size \frac{t}{n}. Write
N_t = \sum_{j=1}^n\left( N_{j \frac{t}{n}}- N_{(j-1) \frac{t}{n}} \right)
as a sum of n independent random variables. If n is large the probability that any of these random variables is at least 2 is small. Indeed, by a union bound, we have
\begin{aligned}
P\left\{ N_{j \frac{t}{n}}- N_{(j-1) \frac{t}{n}} \ge 2 \textrm { for some } j \right\} &\le \sum_{j=1}^n P\left\{ N_{j \frac{t}{n}}- N_{(j-1) \frac{t}{n}} \ge 2 \right\}
\le n P \left\{ N_{\frac{t}{n}} \ge 2\right\} = t \frac{o( \frac{t}{n})}{\frac{t}{n}}
\end{aligned}
which goes to 0 as n \to \infty.
Therefore N_t is, approximately, a binomial random variables with success probability \lambda \frac{t}{n}: P(N_t=k) \approx {n \choose k} \left(\frac{\lambda t}{n} \right)^k \left(1-\frac{\lambda t}{n} \right)^{n-k} and as n \to \infty this converges to a Poisson distribution with parameter \lambda t. \quad \blacksquare.
Proof (version 2 using ODEs). Let us derive a system of ODEs for the family P_t(k)=P\left\{ N_t=k \right\}. We have \frac{d}{dt}P_t(k) = \lim_{\Delta t \to 0} \frac{P\left\{ N_{t+\Delta t}=k \right\} - P\left\{ N_t=k \right\} }{\Delta t} Conditioning we find \begin{aligned} P\left\{ N_{t+\Delta t}=k \right\}=& P\left\{ N_{t+\Delta t}=k | N_t=k \right\} P\left\{ X_{t}=k \right\} \\ & + P\left\{ N_{t+\Delta t}=k | N_t=k-1 \right\}P\left\{ X_{t}={k-1} \right\} \\ & + P\left\{ N_{t+\Delta t}=k | N_t\le k-2\right\}P\left\{X_t \le k-2\right\} \\ =& P_t(k) (1 - \lambda \Delta t) + P_t(k-1) \Delta t + o(\Delta t) \end{aligned} and this gives the system of equations \begin{aligned} \frac{d}{dt}P_t(0)&= - \lambda P_t(0) & \quad P_0(0)=1 \\ \frac{d}{dt}P_t(k)&= \lambda P_t(k-1) - \lambda P_t(k) &\quad P_0(k)=0 \end{aligned}
We find P_0(t)= e^{-\lambda t} for k=0. We use an integrating factor and set f_t(k)=e^{\lambda t} P_t(k). We have then f_t(0)=1 and for k >0
\frac{d}{dt}f_t(k)=\lambda f_t(k-1), \quad f_0(k)=0
which we can solve iteratively to find
f_t(k) = \frac{(\lambda t)^k}{k!}
and thus N_t has distribution
P_t(k)= e^{-\lambda t}\frac{(\lambda t)^k}{k!}\,.
\blacksquare.
Poisson processes and exponential (and gamma) random variables are intimately related. Given a Poisson process we consider the interarrival times T_1, T_2, \cdots: T_1 is time of the occurence of the first event, T_2 is the time elapsed between the first and second event, and so on.
Theorem 1.2 If N_t is a Poisson process with parameter \lambda the interarrival times are independent exponential random variables with paramter \lambda.
Proof. If T_1 >t it means no event has occured up time t and so N_t=0. Therefore P\left\{ T_1 >t \right\} = P\left\{ N_t =0\right\} = e^{-\lambda t} and thus T_1 has an exponential distribution. For T_2 we condition on T_1 P\left\{ T_2 >t \right\} = \int P(T_2>t |T_1 =s) f_{T_1}(s) ds and, using the independence of the increments, P\left\{ T_2>t |T_1 =s\right\} = P\left\{ 0 \textrm{ events in } (s, s+t] | T_1=s \right\} = P\left\{ 0 \textrm{ events in } (s, s+t] \right\} = e^{-\lambda t} from which we conclude that T_2 has exponential distribution and is independent of T_1. This argument can be repeated for T_3 by conditioning on the time of the second event, T_1+T_2, and so on. \quad \blacksquare
Another set of closely related quantities are the arrival times of the n^{th} event S_1, S_2, \cdots which are related to the interarrival times by
S_1=T_1, \quad S_2=T_1+T_2, \quad S_3=T_1+T_2+T_3, \cdots
By Theorem 1.2 S_n is the sum of n independent exponential RVs and thus S_n a Gamma RV with parameter (n,\lambda) with density
f_{S_n}(t) = \lambda e^{-\lambda t}\frac{(\lambda t)^{n-1}}{(n-1)!}\, \quad \textrm{ for }t \ge 0\,.
We can actually prove this fact using the Poisson process by noting that, by definition,
N_t \ge n \iff S_n \le t \,,
that is if n or more events have occured by time t if and only the n^{th} event has occured prior to or at time t.
So the CDF of S_n is F_{S_n}(t)=P\{ S_n \le t\} = P \{ N_t \ge n\} = \sum_{j=n}^\infty e^{-\lambda t} \frac{(\lambda t)^{j}}{j!} and upon differentiating we find f_{S_n}(t)= - \lambda e^{-\lambda t} \sum_{j=n}^\infty \frac{(\lambda t)^{j}}{j!} + \lambda e^{-\lambda t}\sum_{j=n}^\infty \frac{(\lambda t)^{j-1}}{(j-1)!} = \lambda e^{-\lambda t} \frac{(\lambda t)^{n-1}}{(n-1)!}. \quad \blacksquare
Let us start with a special case and assume assume that N_t=1, that is exactly one event has occured in [0,t]. Since the Poisson process has independent increments it seems reasonable the event may have occur with equal probability at any time on [0,t]. Indeed for s\le t we have, using the independence of increments. \begin{aligned} P\{T_1 \le s | N_t=1\} & = \frac{P\{T_1 \le s , N_t=1\}}{P\{N_t=1\}} = \frac{P\{\textrm{1 event in }(0,s] \textrm{ and no event in (s,t]} \}}{P\{N_t=1\}} \\ &= \frac{ \lambda s e^{-\lambda s} e^{-\lambda(t-s)} }{\lambda t e^{-\lambda t}} = \frac{s}{t} \end{aligned} and thus the density of the arrival time T_1 conditioned on N_t=1 is uniform on [0,t]
We study further the properties of the arrival times S_1< S_2 <\cdots of a Poisson process. The following result tells us that they follow a uniform distribution on [0,t].
Theorem 1.3 Given the event \{N_t=n\}, the n arrival times S_1, S_2, \cdots have the same distribution as the order statistics for n independent random variables uniformly distributed on [0,t].
Proof. The conditional density of (S_1, \cdots, S_n) given that N_t=n can be obtained as follows. If S_1=s_1, S_2=s_2, \cdots, S_n=s_n and N_t=n then the intearrival times must satisfy T_1=s_1, T_2=s_2-s_1, \cdots, T_n= s_n-s_{n-1}, T_{n+1}> t-s_n.
By the independence of the interarrival times proved in Theorem 1.2 the conditional density is given by \begin{aligned} f(s_1, s_2, \cdots, s_n|n) &= \frac{f(s_1, s_2, \cdots, s_n, n)}{P\{N_t=n\}} \\ &= \frac{\lambda e^{-\lambda s_1} \lambda e^{-\lambda(s_2-s_2)} \cdots \lambda e^{-\lambda(s_n-s_{n-1})} e^{-\lambda(t-s_n)} }{e^{-\lambda t} \frac{(\lambda t)^n}{n!} } = \frac{n!}{t^n} \end{aligned} which is the joint density of the order statistic of n uniform. \blacksquare
Recall if X_1, \cdots, X_n are IID random variable with joint density f(x_1)\cdots f(x_n) and X^{(1)} \le X^{(2)} \le \cdots X^{(n)} the order statistics, then the joint pdf of X^{(1)}, \cdots ,X^{(n)} is given by g(x_1,\cdots, x_n)) = \left\{ \begin{array}{cl} n! f(x_1) \cdots f(x_n) &\textrm{ if } x_1 \le x_2 \le \cdots x_n \\ 0 & \textrm{ else } \end{array} \right.
The characterization of the Poisson process in terms of exponential random variables suggest immediately a very simple algorithm to simulate N_t.
Simulate independent exponential RVs T_1, T_2, \cdots with parameter \lambda and set N_t=0 for 0\le t <T_1, N_t=1 for T_1\le t <T_1+T_2, and so on.
It is not easy to simulate directly a Poisson random variable X from its pdf/cdf but we can do it elegantly using its relation with exponential random variable. To do this generate independent exponential random variable until they sum up to 1 (so as to generate X=N_1) and use the relation between exponential and uniform.
\begin{aligned}
X =n & \iff \sum_{k=1}^n T_k \le 1 < \sum_{k=1}^{n+1} T_k
\iff \sum_{k=1}^n - \frac{1}{\lambda} \ln(U_k) \le 1 < \sum_{k=1}^{n+1} - \frac{1}{\lambda} \ln(U_k) \\
& \iff \ln( \prod_{k=1}^n U_k) \ge - \lambda > \ln( \prod_{k=1}^{n+1} U_k)
\iff \prod_{k=1}^n U_k \ge e^{-\lambda} > \prod_{k=1}^{n+1} U_k
\end{aligned}
Algorithm to simulate a Poisson random variable with parameter \lambda: Generate random numbers until their product is smaller than e^{-\lambda}.
Generate random number U_1, U_2, \cdots.
Set X=n if n+1 = \inf \left\{ j \,:\, \prod_{k=1}^{j} U_j < e^{-\lambda} \right\}
We investigate the behavior of N_t for large time. We prove a CLT type result, namely that
\lim_{t \to \infty} \frac{N_t-\lambda t}{\sqrt{\lambda t}} = Z \textrm{ in distribution}
where Z is a standard normal RV.
Recall that the characteristic function of a Poisson RV Y with parameter \mu is E[e^{i\alpha Y}]= e^ {\mu (e^{i\alpha} -1)}. Therefore E\left[ e^{i \alpha \frac{N_t - \lambda t}{\sqrt{\lambda t}}} \right] = e^{\lambda t \left( e^{i \frac{\alpha}{\sqrt{\lambda t}}} -i \frac{\alpha}{\sqrt{\lambda t}} -1\right)} Expanding the exponential we have \lambda t \left(e^{i \frac{\alpha}{\sqrt{\lambda t}}} -i \frac{\alpha}{\sqrt{\lambda t}} -1\right) = - \frac{\alpha^2}{2} + O( \frac{1}{\sqrt{\lambda t}}) and thus \lim_{t \to \infty} E\left[ e^{i \alpha \frac{N_t - \lambda t}{\sqrt{\lambda t}}} \right]=e^{-\frac{\alpha^2}{2}}.
The same computation shows also that, for any fixed t, \lim_{\lambda \to \infty} \frac{N_t-\lambda t}{\sqrt{\lambda t}} = Z \textrm{ in distribution} since rescaling the parameter is equivalent to rescaling time.
We can sample or split a Poisson process. Suppose that every event of a Poisson process (independently of the other events) comes into two different types, say type 1 with probability p and type 2 with probability q=1-p.
Theorem 1.4 Suppose N_t is a Poisson process with paramter \lambda and that every event (independently) is either of type 1 with probability 1 or type 2 with probability q=1-p. Then N^{(1)}_t, the number of events of type 1 up to time t, and N^{(2)}_t, the number of events of type 2 up to time t, are independent Poisson process with rate \lambda p and \lambda(1-p).
Proof. We check that N^{(1)}_t satisfy the definition of a Poisson process and the use Theorem 1.1
N^{(1)}_0=0 and N^{(1)}_t has independent increments since N_t has independent increments and events are classified of type 1 and 2 independently of each other.
We have
\begin{aligned}
P \left\{ N^{(1)}_{t + \Delta t}= N^{(1)}_t +1 \right\} & =
P\{ N_{t + \Delta t}= N_t +1 \textrm{ and the event is of type } 1 \} \\
& + P\{ N_{t + \Delta t} \ge N_t +2 \textrm{ and exactly one event is of type } 1 \} \\
&= \lambda \Delta t \times p + o(\Delta t)
\end{aligned}
\begin{aligned} P \left\{ N^{(1)}_{t + \Delta t}= N^{(1)}_t \right\} & = P\{ N_{t + \Delta t}= N_t \} + P\{ N_{t + \Delta t} = N_t + 1 \textrm{ and the event is of type 2} \} \\ & + P\{ N_{t + \Delta t} \ge N_t + 2 \textrm{ and no event of type $1$} \} \\ &= (1- \lambda \Delta t) + \lambda \Delta t (1-p) + o(\Delta t) =1-\lambda \Delta t p + o(\Delta t) \end{aligned} P \left\{ N^{(1)}_{t + \Delta t}\ge N^{(1)}_t +2 \right\} =o(\Delta t)
Finally to show that N^{(1)}_t and N^{(2)}_t are independent we compute their joint PDF by conditioning on the value of N_t and find \begin{aligned} P\left\{N^{(1)}_t =n , N^{(2)}_t =m \right\} &= P\left\{N^{(1)}_t =n , N^{(2)}_t =m | N_t={n+m}\right\} P\left\{N_t={n+m}\right\} \\ &= {n+m \choose n} p^n (1-p)^m e^{-\lambda t} \frac{(\lambda t)^{n+m}}{(n+m)!} \\ &= e^{-\lambda p t} \frac{ (\lambda p t )^n}{n!} e^{-\lambda (1-p) t} \frac{(\lambda(1-p) t)^{m}}{m!} \end{aligned} \blacksquare
We revisit the couplon collector but we relax the assumption that all the toys are equally probable. We assume that any box contains toy i with probability p_i. How do we compute now the expected number of boxes needed to collect all the M toys? The argument used earlier does not generalize easily.
We use the following trick or radomizing the time between boxes. Instead of collecting boxes at fixed time interval, we collect them at times which are exponentially distributed with paramter 1. Then the number of boxes collected up to time t a Poisson process N_t with rate \lambda=1 (on average it takes the same time to get a new box). We have now M types of events (getting a box with toy i) and we split the poisson process accordingly. Then by Theorem 1.4 the number of toys of type i collected up to time t, N^{(i)}_t is a Poisson process with rate \lambda p_i=p_i and the Poisson processes N^{(i)}_t are independent.
We now consider the times T^{(i)} = \textrm{ time of the first event for the process } N^{(i)}_t that is the time where the first toy of type i is collected. The times T^{(i)} are independent since the underlying Poisson processes are independent, and are exponential with paramter p_i. Furthermore S = \max_{i} T^{(i)} = \textrm{ time until one toy of each type has been collected}.
By independence we have P\left\{ S \le t \right\} = P\left\{ \max_{i} T^{(i)} \le t\right\} = \prod_{i=1}^M P\left\{ T^{(i)} \le t \right\} = \prod_{i=1}^M (1- e^{-p_it}) Thus E[S]= \int_0^\infty P\{S \ge t\} dt = \int_0^\infty ( 1- \prod_{i=1}^M (1- e^{-p_it}) )\, dt
Finally we relate S to the original question. If X is the number of box needed to collect all the toys then we have S = \sum_{k=1}^X S_k where S_k aree IID exponential with parameter 1. But conditioning it is easy to see that E[S]= E[N]E[S_1] = E[N] and we are done.
We can generalize the Poisson process by making the rate \lambda(t) at which event occur depend on time: a nonhomogeneous Poisson process N_t with rate paratmeter \lambda(t) is a continuous time stochastic process such that
Independent increments: Given times s_1\le t_1 \le s_2 \le t_2 \cdots \le s_n \le t_n the random variables N_{t_i}-N_{s_i} (that is the number of events occurring in the disjoint time intervals [s_i,t_i]) are independent.
We have \begin{aligned} P \{ N_{t + \Delta t } = N_t \} &= 1- \lambda(t) \Delta t +o(\Delta t) & \\ P\{N_{t + \Delta t}= N_t +1\} &= \lambda(t) \Delta t +o(\Delta t) & \textrm{ with } \lim_{\Delta t \to 0}\frac{o(\Delta t)}{\Delta t}=0 \\ P\{N_{t + \Delta t}\ge N_t +2\} & = o(\Delta t) & \end{aligned} \tag{1.2}
One way to construct a nonhomgeneous Poisson process is by sampling it in a time-dependent manner. Suppose \lambda(t) is bounded (locally in t), then we pick \lambda > \lambda(t). We consider a Poisson process M_t with constant rate \lambda, and if an event occurs at time t then we decide to keep this event with probability p(t)=\frac{\lambda(t)}{\lambda} and we discard the event with probability 1-p(t). By the same argument we used in the section Sampling a Poisson process we see the number of kept events satisfies the definition of a non-homogeneous Poisson process in Equation 1.2
Let us consider an event for the process M_t which occured in the interval [0,t]. By our analysis of arrival time we know that this event occured a time which is uniformly distributed on the interval [0,t]. Therefore the probability that this event was accepted and contribute to N_t is therefore p_t = \frac{1}{t}\int_0^t \frac{\lambda(s)}{\lambda} \, ds
By repeating then the second part of the arguement in the section Sampling a Poisson process we see that M_t has a Poisson distribution with parameter \lambda t p_t = \int_0^t \lambda(s) \, ds and in particular E[N_t] = \int_0^t \lambda(s) \, ds
Assume that the the flow of customers entering an onine store follows a Poisson process N_t with rate \lambda. The time S spent in the store for a single customer (browsing around, checking out, etc..) is given by tis CDF G(t)=P\{ S \le t\} and we assume that the customers are independent of each other.
To figure out how to allocate ressources one wants to figure out what is number of customers, M_t, which are still in the sytem at time t.
To find the distribution of M_t let us consider one of the customer by time t. If he arrived at time s\le t then he will have left the system at time t with probability G(t-s) and will still be in the system by time t with probability 1-G(t-s). Since the arrival time of that customer is uniform on [0,t] the distribution of M_t is Poisson with mean
E[M_t] = \int_0^t (1- G(t-s))ds = \lambda \int_0^t (1 - G(s))ds,.
For large t, we see that E[M_t] \approx \lambda E[S].
Example: Suppose that the number of claims receieved by an insurance follows a Poisson process. The size of each claim will be different and it natural to assume that claims are independent from each other. If we look at the total claims incurred by the insurance company this leads to a stochastic process called a compound Poisson process.
A stochastic process X_t is called a compound Poisson process if it has the form X_t = \sum_{k=1}^{N_t} Y_k where N_t is a Poisson process and Y_1, Y_2,\cdots are IID random variables which are also independent of N_t.
The process X_t has stationary independent increments. Using that N_t-N_s is a Poisson process X_t-X_s = \sum_{k=N_s}^{N_t} Y_k \textrm{ has the same distribution as } X_{t-s}= \sum_{0}^{N_{t-s}} Y_k
We can compute the MGF of X_t (or its charactersitic function) by conditining on N_t. Suppose m_Y(\alpha)=E[e^{\alpha Y}] is the moment generating function of Y and using the MGF for the Poisson RV we find \begin{aligned} m_{X_t}(\alpha)& =E\left[ e^{\alpha X_t}\right] = E\left[ e^{\alpha \sum_{k=1}^{N_t} Y_k} \right] = \sum_{n=0}^\infty E\left[ e^{\alpha \sum_{k=1}^{N_t} Y_k} | N_t =n \right] P\{n_t=n\} \\ & = \sum_{n=0}^\infty m(\alpha)^n P\{n_t=n\} = e^{\lambda t ( m(\alpha) -1 )} \end{aligned}
We can compute then the mean and variance \begin{aligned} m_{X_t}'(\alpha)&= e^{\lambda t ( m(\alpha) -1 )} \lambda t m'(\alpha) \\ m_{X_t}''(\alpha)&= e^{\lambda t ( m(\alpha) -1 )} (\lambda t m''(\alpha) + \lambda t)^2() m'(\alpha)^2 ) \end{aligned} and thus E[X_t]= \lambda t E[Y] \quad \textrm{ and } Var[X_t]= \lambda t ({\rm Var}(Y) + E[Y]^2]) With a bit more work we could prove a central limit theorem.
Exercise 1.1 Let N_t be a Poisson process with rate \lambda and let 0 < s < t. Compute
P(N_t =n+k|N_s =k)
P(N_s =k|N_t =n+k)
E[N_tN_s]
Exercise 1.2 Robins and Blackbirds make independent visit to my birdfeeder and they are described by independent Poisson processes R_t and B_t with rate \rho and \beta (per minute) respectively.
What is the probability I see four birds within the 5th and the 10th minutes.
What is the expected number of Robbins I will see between the third and fifth minutes given that I saw 3 Robbins in the first two minutes.
What is the probability that the first two birds I see are Robins?
I have seen ten birds in the last hour. What is the probbaility that three of them were balckbirds?
What is the probability that I see exactly three Robins while I am waiting for to see my first blackbird?
Let T denotes the arrival time of the first blackbird. Find the distribution of R_T (i.e. compute P(R_T=k))
Exercise 1.3 (Estimating the number of asymptomatic using Poisson process with variable rates) Suppose people get infected with a disease at a certain rate, a process described by a Poisson process I_t with rate \lambda which is unknown but constant.
Upon being infected there is an incubation time I until the infected individual exhibits symptoms and we have P( I \le t) =G(t) for some known distribution function G(t).
Suppose S_t is the total number of infected individual exhibiting symptoms by time t and A_t is number of infected individual which do not exhibit symptoms. What are the rates for the processes S_t and A_t?
If t is reasonably large one can argue that a poisson processes N_t with variable rate \lambda(t) satisfies N_t \approx E[N_t]= \int_0^t \lambda(s) ds with high probability. (This follows from the fact that a Poisson RV with large parameter concentrates around its mean, see the CLT argument).
Use this fact to estimate the E[A_t] even if the infection rate \lambda is unknown.
Suppose P(I \ge t)=e^{-t/\beta} with \beta=10, and that after 16 years 220 thousand people are infected. What is the estimate for the number of asymptomatic individuals?
In this section that we build a continuous time Markov process X_t with t\ge 0. The Markov property can be expressed as P\{ X_t=j| \{X_{r}\}, 0\le r \le s \} = P\{X_t=j| X_s \} \,. for any 0 < s < t.
To construct a Markov we will need to use exponential random variables. Recall that an exponential random variable T with parameter \lambda has the pdf f_T(t)=\lambda e^{-\lambda t}, for t \ge 0, the cdf F_T(t) = 1- e^{-\lambda t} and mean E[T]=\frac{1}{\lambda}.
A simple and important fact is the memoryless property of exponential random variables.
P((T > t+s | T> s) =P(T>t)
If you think of T as a waiting time then the memoryless property tells you that if you have waited a time s then the probability that you have to wait an extra time t is exactly the same as waiting for a time t at the beginning. In that sense the process of waiting starts anew at anytime, and so you have forgotten the past. This property is the key to construct Markov process in continuous time.
For general Markov process we will need exponential random variables with various paramters and we will use the following simple fact repeatedly.
Proposition 2.1 (Properties of exponential randomm variables) Let T_1, T_2, T_3, \cdots be independent exponential random variables with parameter \lambda_1, \lambda_2, \cdots. Then
T= \min\{ T_1, \cdots, T_n\} is an exponential random variables with paramter \lambda_1+ \cdots + \lambda_n. Note that n=\infty is allowed if we assume that \sum_{n}\lambda_n is finite.
\displaystyle P\{ T_i = \min \{ T_1, \cdots, T_n\} \} = \frac{\lambda_i}{\lambda_1+ \cdots + \lambda_n}
Proof. For 1. we have, using independence,
P\left\{ T > t \right\} = P\left\{ T_1 > t, \cdots, T_n > t \right\} = P\left\{ T_1 > t \right\} \cdots P\left\{ T_m > t \right\}
= e^{-(\lambda_1+ \cdots \lambda_n)t}
and thus T is an exponential random variable.
For 2. we have, by conditioning,
P\left\{ T_1 = T \right\} = \int_0^\infty P\{ T_2>t, \cdots, T_n > t \} f_{T_1}(t) \, dt = \int_0^\infty e^{-(\lambda_2 + \cdots+ \lambda_n)t} \lambda_1 e^{-\lambda_1 t}\, dt = \frac{\lambda_1}{\lambda_1+ \cdots \lambda_n}
\blacksquare
As for the Poisson process we will give two equivalent definition of the process, the first one describe infinitesimal rates of change of the probability disttribution and leads to a system of ODEs describing the evolution of the pdf of X_t which are called the Kolmogorov equation. The second definition use exponential random variables and waiting times and will lead naturally to an algorithm to simulate a continuous time Markov chain, often called the stochastic simulation algorithm.
To define a Markov process on the state space S we assign a number \alpha(i,j) for any pair of states i,j with i\not=j. You should think these numbers \alpha(i,j) = \textrm{ rate at which the chain changes from state } i \textrm{ to state } j \,. We denote \alpha(i) = \sum_{j\not= i} \alpha(i,j) = \textrm{ rate at which the chain changes from state } i \,.
Formally a continuous Markov chain with rates \alpha(i,j) is a stochastic process X_t such that \begin{aligned} P\{ X_{t+\Delta t}=i| X_t=i\} &= 1 - \alpha(i)\Delta t + o(\Delta t) \\ P\{ X_{t+\Delta t}=j| X_t=i\} &= \alpha(i,j)\Delta t + o(\Delta t) \end{aligned}
Proceeding as for the Poisson process we can derive a differential equation for p_t(i)=P\{X_t =i\}. By conditioning we have P\{ X_{t+\Delta t}=i\} = (1 - \alpha(i)\Delta t )P\{ X_t=i\} + \sum_{j \not =i} \alpha(j,i)\Delta t P\{ X_t=j\} + o(\Delta t) which leads to the system of linear ODE’s \frac{d}{dt}p_t(i) = -\alpha(i) p_t(i) + \sum_{j \not= i} \alpha(j,i) p_t(j) \tag{2.1} called the Kolmogorov backward equations.
The infinitesimal generator of a continuous-time Markov chain is given by the matrix A = \begin{matrix} 1 \\ 2 \\ 3 \\ \vdots \end{matrix} \begin{pmatrix} -\alpha(1) & \alpha(1,2) & \alpha(1,3) & \cdots \\ \alpha(2,1) & -\alpha(2) & \alpha(2,3) & \cdots\\ \alpha(3,1) & \alpha(3,2) & -\alpha(3) & \cdots\\ \vdots & \vdots & \vdots & \end{pmatrix} and the entries of A satifies A(i,j)\ge 0 \textrm{ for } i \not= j \quad \textrm{ and } \sum_{i} A(i,i) = 0
If we use a row vector p_t = (p_t(1), p_t(2), \cdots) then we can rewrite Equation 2.1 as the system \frac{d}{dt}p_t = p_t A \tag{2.2}
If S is finite then one can write the solution in terms of the matrix exponential e^{tA} := \sum_{k=0}^\infty \frac{t^k A^k}{k!} p_t = p_0 e^{tA} \,. where p_0(i)=P\{X_0=i\} is the initial distribution.
We can also write equation for the transition probabilities (take X_0=i) P_t(i,j) = P \{ X_t =j| X_0=i\} and we obtain the matrix equation \frac{d}{dt}P_t = P_t A, \quad \textrm{ with } P_0 =I \quad \implies \quad P_t = e^{tA} which we can solve using linear algebra techniques.
A = \begin{matrix} 1 \\ 2 \\ 3 \\ 4 \end{matrix} \begin{pmatrix} -1 & 1 & 0 & 0 \\ 1 & -3 & 1 & 1 \\ 0 & 1 & -2 & 1 \\ 0 & 1 & 1 & -2 \end{pmatrix} The (right) eigenvalues of A are 0, -1, -3, -4 and we can diagonalize A and find D=\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -3 & 0 \\ 0 & 0 & 0 & -4 \\ \end{pmatrix} = Q^{-1} A Q where the columns of Q are the corresponding eiegenvectors: Q=\begin{pmatrix} 1& 1 & 0 & -\frac13 \\ 1& 0 & 0 & 1 \\ 1& -\frac12 & -\frac12 & -\frac13 \\ 1& -\frac12 & \frac12 & \frac13 \\ \end{pmatrix} \quad \quad Q^{-1}=\begin{pmatrix} \frac14 & \frac14 & \frac14 & \frac14 \\ \frac23& 0 & -\frac13 & -\frac13 \\ 0& 0 & -1 & -1 \\ -\frac14& \frac34 & -\frac14 & -\frac14 \\ \end{pmatrix} Then we find P_t = e^{tA} = Q e^{tD} Q^{-1} =
We can also construct the Markov chain X_t using exponential random variable.
If X_t=i consider (independent) random variable T(i,j) with parameter \alpha(i,j) for j \not =i. Think of those as exponential clocks and when the first of the clocks rings, say clock j, then the Markov chain moves to state j.
Using Proposition 2.1 we see that the chain moves from the state i to j if the clocks associated to the pair (i,j) rings first and this occur with probability Q(i,j) = \frac{\alpha(i,j)}{\sum_{j \not= i}\alpha(i,j)} = \frac{\alpha(i,j)}{\alpha(i)}
The matrix Q(i,j) is the transition matrix for a discrete time Markov chain Y_n which has the same junmps probabilities as X_t but the transition occur at each unit time.
Then a new set of clocks with rates \alpha(j,k) is produced and the process starts anew until the Markov chain moves to a new state.
By Proposition 2.1 we can express this slightly differently. If in state i we use a single clock with rate \alpha(i) and when this clock rings then we move to state j according to the discrete time Markov chain Q(i,j).
The Markov property follows form the memoryless property for an exponential distribution T: P(T\ge t +s| T \ge s) = P(T>t).
If X_t=i then by construction the position after the next jump after time t clearly depends only i and not the states that the Markov chain visited before time t. Moreover if we consider the time of the last jump before time t which occured, say at time s<t, then the memoryless property of the exponential random variable implies that the time at which the jump occur after time t does not depend on s at all. Putting these together this implies the Markov property P\{X_{t+u} =j | X_{s}, 0 \le s \le t\} = P\{X_{t+u} =j | X_t \}
To connect this to the previous description we note that by conditioning on the first jump
P_t(i,j) = P(X_t=j|X_0=i) = \delta(i,j) e^{-\alpha(i) t} + \int_0^t \alpha(i) e^{-\alpha(i)s} \sum_{k} {Q(i,k)} P_{t-s}(k,j) ds
Iterating this equation and setting t=\Delta t we find
\begin{aligned}
P(X_{\Delta t}=j|X_0=i) &= \delta(i,j) e^{-\alpha(i) \Delta t} + \int_0^{\Delta t} \alpha(i) e^{-\alpha(i)s} \sum_{k} \frac{\alpha(i,k)}{\alpha(i)} \delta(k,j) e^{-\alpha(k)(\Delta t-s)} ds + \cdots \\
&= \delta(i,j) e^{-\alpha(i) \Delta t} + \alpha(i,j) \Delta t \times \underbrace{ e^{-\alpha(j) \Delta t}
\frac{1}{\Delta t} \int_0^{\Delta t} e^{-(\alpha(i)-\alpha(j)) s} ds
}_{\to 1 \textrm{ as }\Delta t \to 0} + \cdots \\
&= \delta(i,j) (1 - \Delta t \alpha(i)) + \alpha(i,j) \Delta t + o(\Delta t)
\end{aligned}
The higher order terms are negligible since the probability to have two jumps in the time interval [0,\Delta t] is o(\Delta t).
Consider a Markov chain Y_n with transition matrix Q and we assume that Q(i,i)=0. Then we pick rate \alpha(i)=1 for all states i. The times at which the Markov chain has transition is thus a sum of IID exponential, that at time t is then described by a Poisson process N_t. In other terms we have X_t = Y_{N_t} where N_t is a Poisson process with rate 1.
In this case we can compute the transition matrix quite explicitly: \begin{aligned} P\{X_t=j|X_0=i\}& = P\{ Y_{N_t}=j| X_0=i\} \\ &= \sum_{n=0}^\infty P\{ Y_{N_t}=j, N_t=n| X_0=i\} \\ &= \sum_{n=0}^\infty P\{ Y_n=j| X_0=i\} P\{N_t=n\} \\ &= \sum_{n=0}^\infty \frac{e^{- t} t^n}{n!} Q^n(i,j) = e^{t(Q-I)} \end{aligned}
and the generator is given by A = (Q-I).
Exercise 2.1 Consider a two-server system in which a customer is first served by server 1, then by server 2 and then departs. The service times at server i are exponential random variables with parameter \mu_i with i = 1, 2. When you enter the system you find server 1 free and two customers at server 2, customer A in service and customer B waiting in line.
Find the probability P_A that A is still in service when you move over to server 2.
Find the probability P_B that B is still in service when you move over to server 2.
Compute E[T], where T is the total time you spend in the system. Hint: Write T = S_1 + S_2 + W_A + W_b where S_i is your service time at server i, W_A os the amount of time you wait in queue when while A is being served, and W_B the amount of time you wait in queue when while B is being served.
Exercise 2.2 (Hyperexponential and hypoexponential random variables) Suppose T_1 and T_2 are independent exponential random variables with parameters \lambda_1 and \lambda_2.
Suppose N is a RV with P(N=1)=p(1) and P(N=2)=p(2)=1-p(1). The random variable T_N is called an hyperexponential RV. It describe the service time of an agent sent to one of two service station with suitable probabilities. What is the probability distribution function of T_N.
The random variable T_1+T_2 is called an hypoexponential RV and describe the service time of an agent going through 2 successive service station. What is the probability distribution of X_1+X_2?
A probability vector \pi is a stationary distribution for the Markov chain with generator A if
\pi P_t = \pi \quad \textrm{ for all } t =0
which, by Komlogorov equation, is equivalent to
\pi A = 0
In terms of the rate \alpha(i,j) we see that \pi is stationary if ansd only if \sum_{i \not= j} \pi(i) \alpha(i,j) = \pi(j)\alpha(j) = \sum_{i \not=j} \pi(j) \alpha(j,i) which we can interpret as a balance equation: \pi(i)\alpha(i,j) is the rate at which the chain in a state \pi changes from i to j.
As for discrete time we say that a Markov chain satisfies detailed balance if \pi(i) \alpha(i,j) = \pi(j) \alpha(j,i) \textrm{ for all } i \not= j and detailed balance obviously implies stationarity.
The Markov chain with generator A is irreducible if for any pair of states i,j we can find states i_1, \cdots, i_{N-1} such that \alpha(i,i_1) \alpha(i_2, i_3) \cdots \alpha(i_{N-1},j) >0 If there exists a stationary distribution for an irreducible chain then \pi(i)>0 for all i. Indeed if \pi(i)>0 and \alpha(i,j)>0 then \pi A(j) =0 implies that \pi(j)\alpha(j) = \sum_{k}\pi(k)\alpha(k,j) \ge \pi(i)\alpha(i,j)>0 and thus \alpha(j)>0.
The issue of periodicity cannot occur for continuous time Markov chains
Lemma 3.1 For an irreducible Markov chain with generator A, P_t(i,j)>0 for all i,j and all t>0.
Proof. Using the Markov property
\begin{aligned}
P\{X_t=j|X_0=i\} &\ge P \{ X_{t/N}=i_i, X_{2t/N}=i_2, \cdots X_{t}=j |X_0=i\} \\
&= P \{ X_{t/N}=i_1| X_0=i\} P\{ X_{2t/N}=i_2| X_{t/N}=i_i\} \cdots P\{ X_{t}=j |X_{t\frac{N-1}{N}}=i_{n-1}\}
\end{aligned}
and for example
P \{ X_{t/N}=i_1| X_0=i\} \ge \int_0^{t/N} \alpha(i) e^{-\alpha(i)s} Q(i,i_1)
e^{-\alpha(i_1)(t-s)}>0
\blacksquare
If the state space is finite and the chain is irreducible then we have convergence to equilibrium.
Theorem 3.1 If the state space S is finite and the Markov chain with generator A is irreducible then for any initial distribution \mu we have \lim_{t \to \infty}\mu P_t = \pi
Proof. We use a spectral argument and the result for discrete time Markov chain. Pick a number a such that a is strictly greater than all entries of A (this is possible since A is finite). Consider the matrix R= \frac{1}{a}A + I. Then R is a stochastic matrix and the Markov chain Y_n with transition matrix R is irreducible and aperiodic (because R(i,i)>0). Then from ?@thm-convMC we know (details in the exercises) that R has a simple eigenvalue 1 and all other eigenvalues \lambda satisfy |\lambda|<1. Now Rf = \lambda f \iff Af =a(\lambda -1) f and so 0 is a simple eigenvalue for A and the other eiegenvalue are of the form a(\textrm{Re}(\lambda) -1) + i a \textrm{ Im }(\lambda) and so the real part is strictly negative.
The vector 1=(1,1, \cdots,1 )^T is a right eigenvector for A and \pi is a left eigenvector for A. If we define the matrix \Pi as the matrix whose rows are equal to \pi, we have then
(P^t - \Pi) \begin{pmatrix} 1\\ \vdots \\1 \end{pmatrix}= 0 \textrm{ and } \pi(P^t - \Pi) =0\,.
Moreover if f the right eigenvector and g the left eigenvector for A for the eigenvalue \mu \not=0 then we have
\pi (Af) = \mu \pi f = (\pi A) f =0 \quad \textrm{ and } (g A) 1 = g (A1) = \mu g 1 =0
and thus we must have \pi f=0 and g 1 =0. Therefore
P^t - \Pi)f = P^t f = e^{\mu t} f \textrm{ and } g(P^t - \Pi) = gP^t = e^{\mu t} g.
This implies that P^t - \Pi has the simple eigenvalue 0 and the same eigenvalues e^{\mu t} as P^t and \mu has strictly negative real part. Therefore P^t - \Pi converges to 0 as t \to \infty, or
\lim_{t \to \infty} P_t(i,j)= \pi(j) .
To study the transient behavior in continuous time we can use similar ideas as in discrete time.
Absorption probabilities: the absorption probabilities do not depend on the time spent in every state so they can be computed using the transition matrix Q(i,j) for the embedded chain Y_n and the formula in Section [Absorption probabilities]
Expected hittiing time: For example we have the following result
Theorem 3.2 Supose X_t is an irreducible Markov chain with generator A and for j let \Sigma(j) = \inf\{t \ge 0 ; X_t=j\} be the first hitting time to state j. Let \tilde{A} the matrix obtained by deleting the j^{th} row and j^{th} column from the generator A. Then we have, for i \not= j, E[\Sigma(j)|X_0=i] = \sum_{l} B(i,l) \quad \textrm{ where } B= \tilde{A}^{-1} The matrix \tilde{A} has rowsums which are non-positive and at least one of the row must be strictly negative.
Proof. By conditioning on the first jump which happens at time T we have E[\Sigma(j)|X_0=i] = \underbrace{E[T|X_0=i]}_{\textrm{expected time until the first jump }} + \sum_{k\in S, k\not=j} P\{ X_T = k|X_0=i\} \underbrace{E[ \Sigma(j)|X_0=k]}_{\textrm{expected hitting time} \atop \textrm{ from the state after the first jump}} If we set b(i)=E[\Sigma(j)|X_0=i] (for i \not= j) we find the equation b(i)= \frac{1}{\alpha(i)} + \sum_{k \not= j} \frac{\alpha(i,k)}{\alpha(i)} b(k) \implies 1 = \alpha(i) b(i) - \sum_{k\not= j}\alpha(i,k) b(k) which reads, in matrix form as 1 = -\tilde{A} b \implies b= (-\tilde{A})^{-1} 1. To show that -\tilde{A} is invertible we consider the matrix R= \frac{1}{a}\tilde{A} + I where a is chosen larger than all the entries. Then the entries of R are non-negative and the rowsums do not exceed one with at least one strictly less than 1. By the results for discrete time Markov chains we know that I-R=(-\frac{1}{a} A) is invertible.
For a continuous time Markov chain X_t let us consider the time of the successive jumps S_1=T_1, S_2=T_1+T_2, S_3=T_1+T_2+T_3, \cdots Here the T_i are independent exponential but, in general, no identically distributed with paramters \alpha_i which depends on the state being visited. We have then E[S_n]= \sum_{i=1}^n \frac{1}{\lambda_i}
Explosion: To see what can happen consider a Markov chain with rate \alpha(i,i+1)=(n+1)^2 and all other rates equal to 0. Then the Markov chain moves up by 1 at every jump like a Poisson process but at accelerated pace. We have then E[S_\infty]= \sum_{n=1}\frac{1}{n^2} < \infty so S_\infty < \infty with probability 1. So there are infinitely many jumps in finite time and X_t=+\infty after a finite time. This is called explosion.
This is an issue familiar in ODE: the equation \frac{d}{dt} x_t = x_t^2 has solution has solution x_t= \frac{x_0}{x_0-t} which blows up at time t=x_0.
It is not easy to determine if an explosion really occurs. Indeed for no explosion to occur we must have, with probabilty 1, \sum_{n} \frac{1}{\alpha(Y_n)} = \infty where Y_n is the embedded chain.
A sufficient condition for non-explosion is a suitable upper bound on the rates \alpha(i), say \alpha(i) \le \alpha which is true for finite state spaces but this is by no means necessary.
The recurrence and transience for a continuous time Markov chain can be defined as for discrete time. A an irreducbile Markov chain X_t is recurrent if the chain starting from a state i returns to i with probability 1 and transient if X_t never return to i with some non-zero probability.
Transience/recurrence has nothing to do with the actual time spent in every state and so we have immediately X_t \textrm { with rates } \alpha(i,j) \textrm{ is } \begin{array}{l} \textrm{transient} \\ \textrm{recurrent}\end{array} \iff Y_n \textrm{ with transition } Q(i,j)=\frac{\alpha(i,j)}{\alpha(i)} \textrm{ is } \begin{array}{l} \textrm{transient} \\ \textrm{recurrent}\end{array}
For positive recurrence we need to define the first return to a state i. This is most easily defined in terms of the return for the embeded Markov chain Y_n and its first return time \tau(i). The time until X_0 returns to i will the sum of the time spent in each state. If S_n denotes the time of the jumps and \tau(i)=n after visiting the state j_0, j_1, \cdots j_{n-1}, j_n=i then the return time is \sum_{k=0}^{n-1}(S_{k+1}- S_k) where S_{k+1}-S_k is exponential with paramter \alpha(j_k). The first return time to state i for the Markov chain X_t is given \Sigma(i) = \sum_{k=0}^{\tau(i)-1}(S_{k+1}- S_k)
Theorem 3.3 If the Markov chain with rates \alpha(i,j) is irreducible and recurrent, then there exists a unique solution (up to a multiplicative constant) \eta=(\eta(1), \eta(2), \cdots) to the equation \eta A=0 with 0 < \eta(i) < \infty \,. If it holds that \sum_{i}\eta(i) < \infty then \eta can be normalized to a stationary distribution and X_t is positive recurrent.
Proof. The equation \eta A(k)=0 can be written as \sum_{j \not= k} \eta(j) \alpha(j,k) = \alpha(k) \eta(k) \iff \sum_{j \not= k} \eta(j)\alpha(j) Q(j,k) = \eta(j)\alpha(j) That is the row vector \mu with entries \mu(k)=\alpha(k) \eta(k) must satisfy \mu Q = \mu.
If X_t is recurrent then the embedded Markov chain Y_n with transition Q is recurrent and so by ?@thm-stationarydistribution there exists a solution to \mu Q=\mu and therefore we found a solution for \eta A=0.
Moreover, from ?@thm-stationarydistribution we have a the representation \mu(j)=\alpha(j) \eta(j) = E\left[ \sum_{k=0}^{\tau(i)-1} {\bf 1}_{\{Y_k=j\}}\right] where i is some fixed but arbitrary reference state (this counts the number of visits to the state j between two consecutive visits to the reference state i)
If we denote by S_n the time of the n^{th} jump for X_t we have \begin{aligned} \eta(j) &= \sum_{k=0}^\infty E\left[ \frac{1}{\alpha(j)} {\bf 1}_{\{Y_k=j\}} {\bf 1}_{\{\tau(i) >k\}} \right] = \sum_{k=0}^\infty E\left[ (S_{k+1} -S_k) {\bf 1}_{\{Y_k=j\}} {\bf 1}_{\{\tau(i) >k\}} \right] = E\left[ \sum_{k=0}^{\tau(i)-1} (S_{k+1} -S_k) {\bf 1}_{\{Y_k=j\}} \right] \end{aligned} which is nothing but the time spent (by X_t) in the state j between successive visits to i.
If \sum_{j} \eta(j) <\infty then we have E[\Sigma(j)]=E\left[ \sum_{k=0}^{\tau(i)-1} (S_{k+1} -S_k)\right] <\infty which is the expected return time to state i. That is the chain X_t is positive recurrent.
We have the following theorem which is the exact counterpart of the discrete time case (and is proved very similarly so we will omit the proof).
Theorem 3.4 Suppose X_t is irreducible and positive recurrent. Then X_t has a unique stationary distribution, and with probability 1, the time spent in state j, converges to \pi(j) \lim_{t\to \infty}\int_0^t \mathbf{1}_{\{X_s=j\}} \, ds = \pi(j). Moreover we have Kac’s formula: \pi(j) is also equal to the average time between consecutive visits to state j: \pi(j) = \frac{1}{E[\Sigma(j)|X_0=j]}. Conversely if X_t has a stationary distribution then X_t is positive recurrent.
A general birth and death process is a continuous time Markov chain with state space \{0,1,2,\cdots\} and whose only non-zero transition rates are \begin{aligned} &\lambda(n) = \alpha(n,n+1) & = \textrm{ birth rate for a population of size } n & (\textrm{ for } n\ge 0) \\ &\mu(n)=\alpha(n,n-1) & = \textrm{ death rate for a population of size } n & (\textrm{ for } n\ge 1) \end{aligned}
The Kolmogorov equations for the distribution of X_t are, for n\ge 1 \frac{d}{dt}p_t(n) = \underbrace{ \mu(n+1) p_t(n+1)}_{\textrm{ increase due do death} \atop \textrm{in a population of size } n+1} + \underbrace{\lambda_{n-1} p_t(n-1)}_{\textrm{ increase due do birth} \atop \textrm{in a population of size } n-1} - \underbrace{(\lambda(n) + \mu(n))p_t(n)}_{\textrm{ decrease due do birth/death} \atop \textrm{in a population of size} n}. For n=0, the eqation reads \displaystyle \frac{d}{dt}p_t(0)= \mu(1) p_t(1) - \lambda(0) p_t(0).
The generator has the form \begin{matrix} 0 \\ 1 \\ 2\\ \vdots \end{matrix} \begin{pmatrix} -\lambda(0) & \lambda(0) & 0 & 0 & \dots \\ \mu(1) &-\lambda(1) -\mu(1) & \lambda(1) & 0 & \dots \\ 0 & \mu(2) &-\lambda(2) -\mu(2) & \lambda(2) & \dots \\ \vdots & \ddots & \ddots & \ddots & \\ \end{pmatrix}
The transition matrix for the embedded process Y_n is \begin{matrix} 0 \\ 1 \\ 2\\ \vdots \end{matrix} \begin{pmatrix} 0 & 1 & 0 & 0 & \dots \\ \frac{\mu(1)}{\mu(1)+ \lambda(1)} & 0 & \frac{\lambda(1)}{\mu(1)+ \lambda(1)} & 0 & \dots \\ 0 & \frac{\mu(2)}{\mu(2)+ \lambda(2)} & 0 & \frac{\lambda(2)}{\mu(2)+ \lambda(2)} & \dots \\ \vdots & \ddots & \ddots & \ddots & \\ \end{pmatrix} which is the transition matrix of a general random walk on the non-negative integers.
Poisson process: There is no death and the birth rate is constant \mu(n)=0 \quad \textrm{ and } \quad \lambda(n)=\lambda
Queueing models: Imagine a Poisson process (with rate \lambda) describing the arrival of customers. In the systems there are k service station at which the service time is exponential with rate \mu. If one service station is empty, upon arrival a customer immediately enter a service station and stay in the system until being served. If all stations are currently busy serving other customers, the customer wait until one stay becomes free. These models are called M/M/k queues where M statnds for Markovian and k is the number of queues, the first M describing the arrival process and the second M the service time (which is exponential).
M/M/1 queue: If there is only one service station \lambda(n)=\lambda \quad \textrm{ and } \quad \mu(n) =\mu
M/M/k queue: If there is k service station then the rates are \lambda(n)=\lambda \quad \textrm{ and } \quad \mu(n) = \left\{ \begin{array}{cl} n \mu & n \le k \\ k \mu & n > k \end{array} \right.
M/M/\infty queue: If there is k service station then the rates are \lambda(n)=\lambda \quad \textrm{ and } \quad \mu(n) =n\mu
Population models: If X_t describes the size of a population birth rate will be naturally proportional to the size of the population if we assume that all individuals give birth or die with a certina rate.
Pure birth model: no death occur and so \lambda(n)=n \lambda \quad \textrm{ and } \quad \mu(n) = 0
Population model: the rates \lambda(n)=n \lambda \quad \textrm{ and } \quad \mu(n) = n \mu
Population model with immigration: if immigrants arrive acccording to a Poisson process with rate \nu the rates are \lambda(n)=n \lambda +\nu \quad \textrm{ and } \quad \mu(n) = n \mu
To study transience we use the embedded Markov chain and ?@thm-criteriontransience. Choosing the reference state 0 we look for a solution a(n) with a(0)=1 and 0 < a(n)< 1 for n\ge 1 of the system of equations \lambda(n) a(n+1) + \nu(n) a(n-1) = (\lambda(n) + \mu(n)) a(n) This leads to a(n+1) - a(n) = \frac{\mu(n)}{\lambda(n)} (a(n) - a(n-1)) = \cdots = \prod_{j=1}^n\frac{\mu(j)}{\lambda(j)} (a(1) - 1) and thus, by telescoping, a(n+1)= 1 + \sum_{k=0}^{n} (a(k+1) -a(k)) = 1 + \left[1+ \sum_{k=1}^{n} \prod_{j=1}^k\frac{\mu(j)}{\lambda(j)}\right] (a(1) - 1) Taking n \to \infty we must have a(n)\to 0 and \sum_{k=1}^\infty \prod_{j=1}^k\frac{\mu(j)}{\lambda(j)} < \infty and we find the solution a(n)=\frac{\sum_{k=n}^\infty \prod_{j=1}^k\frac{\mu(j)}{\lambda(j)}}{1+ \sum_{k=1}^\infty \prod_{j=1}^k\frac{\mu(j)}{\lambda(j)}}
To study postive recurrence we simply solve for the stationary distribution \pi A=0. Since the embedded Markov chain satisfies detailed balance it is natural tro try to solve the detailed balance equations which amounts \pi(n)\lambda(n) = \pi(n+1)\mu(n+1) which is eaisly solved to find \pi(n) = \prod_{j=1}^n \frac{\lambda(j-1)}{\mu(j)} \pi(0) and thus we have a stationary distribution if and only if \sum_{k=1}^\infty \prod_{j=1}^k \frac{\lambda(j-1)}{\mu(j)} <\infty\,.
The Poisson process are bure birth models are not irreducible and converge to +\infty almost surely.
M/M/1-queue: \prod_{j=1}^n \frac{\mu(j)}{\lambda(j)}=\left(\frac{\lambda}{\mu}\right)^n
Transient if \mu < \lambda.
Recurrent if \mu =\lambda
Positive recurrent if \lambda < \mu and with (geometric) stationary distribution: \pi(n)= \left(\frac{\lambda}{\mu}\right)^n\left(1 -\frac{\lambda}{\mu} \right)
M/M/k-queue: \prod_{j=1}^n \frac{\mu(j)}{\lambda(j)} = \left\{ \begin{array}{cl} n! \left( \frac{\mu}{\lambda} \right)^n & n < k \\ \frac{k!}{k^k} \left(\frac{k\mu}{\lambda}\right)^n & n \ge k \end{array} \right. .
Transient if k \mu < \lambda.
Recurrent if k \mu =\lambda
Positive recurrent if \lambda < k \mu, the stationary \pi is a bit messy to write.
M/M/\infty-queue: \prod_{j=1}^n \frac{\mu(j)}{\lambda(j)}= n! \left( \frac{\mu}{\lambda} \right)^n.
Always positive recurrent with Poisson stationary distribution \pi(n) = e^{- \frac{\lambda}{\mu}} \frac{ \left(\frac{\lambda}{\mu}\right)^n}{n!}
We can analyze the M/M/\infty queue in more details by the following argument. Suppose X_0=n is the size of the queue at time o. Then at time t the people in queue are of two different types: either they were in queue at time 0 and are still in queue at time t or they arrive after time between times 0 and time t and are still in queue.
The probability that a person in queue at time o is still in queue at time t is e^{-\mu t}.
Since the arrival of a Poisson process condiioned on N_t are uniformly distributed on [0,t], the probability that someone not present at time 0 is still present at time t is q(t)=\frac{1}{t}\int_0^t (1- e^{-\mu(t-s)})= \frac{1- e^{-\mu t}}{\mu t} ,. By combining this we see that X_t = N_t + Y_t where N_t \text{ has a binomial distribution with parameters } n \text{ and } e^{-\mu t} Y_t \text{ has a Poisson distribution with parameters } \lambda q(t)t = \frac{\lambda}{\mu}(1 - e^{-\mu t})
Clearly we see, again, from this computation that the distribution of X_t converges to a Poisson distribution with parameter \frac{\lambda}{\mu}.
We can also use it to control the speed of convergence since the previous calculation suggest a coupling. Indeed
Exercise 4.1 (Yule process) The Yule process is is a pure birth process describing the the growth of a population: if there are n individual in the population then each individual will give birth with a rate \lambda and so the birth rate is \lambda(n)=n \lambda. The goal here is to compute explicitly the transition probbaility P(X_t=n|X_0=k).
Assume first X(0)=1 and let T_i be the time it takes for the population to go from size i to size i+1, that is T_i is exponential with rate \lambda i.
Show that we have the relation T_1 + T_2 + \cdots T_n = \max( S_1, S_2, \cdots S_n) where the S_i are exponential with rate 1.
Using 1. compute P(T_1 + T_2 + \cdots T_n < t) and show that P(X_t=n|X_0=1) = e^{-\lambda t}(1- e^{-\lambda t})^{n-1} which is a geometric distribution with success probability p=e^{-\lambda t}.
Using 2. show that P(X_t=n|X_0=k) = {n-1 \choose k-1} e^{-\lambda t k}(1- e^{-\lambda t})^{n-k} which is negative binomial distribution.
Exercise 4.2 (Geometric sum of exponential) For use in Exercise 4.3 prove the following fact. If Q is geometric that P(Q=k)=(1-q)^{k-1}q for k=1,2,3,\cdots and T_i are IID exponential with parameter \lambda then T_1 + T_2 + \cdots T_Q is exponential with parameter \lambda q.
Hint: You can use either the MGF (like in the compound Poisson process or like for branching processes) or the CDF and the PDF..
Exercise 4.3 (More on the M/M/1 queue) Consider an M/M/1 queue with arrival rate \lambda and service rate \mu. Recall that when \lambda < \mu its stationary distribution is geometric with parameter \lambda/\mu.
Show that if the queue is stationary then rate at which customer leaves must be equal to \lambda (and is independent of \mu!)
Suppose again that the queue is stationary, compute the distribution and the expectation of the time W a customer spends in the queue until they reach service station (this is often an important quantity when designing a queueing model!).
Hint: Note the distribution of W as a continuous and discrete part. Use the result in Exercise 4.2.
Suppose that upon entering the system the customers look at the length of the queue and may decide to leave the system depending on the length of the queue. For example assume that if there are n customers in the system, enetering customers will stay with probability p(n)=\frac{1}{n+1}. Find the stationary distribution in this case.
Suppose that agents are actually difficult customers: upon exiting the service station with probability q they exit the system for good but with probaility 1-q they re-enter the sytem and go back in line Show that this process is equivalent to another M/M/1 queue with new rates.
Hint: Use Exercise 4.2 again.
Exercise 4.4 (M/M/1 queues in series) Consider the following queueing system. Agents arrive into the system according to a Poisson process with rate \lambda and then pass successviely through a sequence of K service stations where the service time in station i is exponential with parameters \mu_i.
We describe the number of customers by the vector {\bf X_t} =(X_{1t}, \cdots X_{Kt}) where X_{it} is the number of customers at the i^{th} station and which which takes values in the state space S = \left\{ {\bf n} = \left(n_1, \cdots, n_K\right) \,:\, n_i \ge 0 \text{ integer} \right\}\,.
Make a list of all possible transitions and compute the corresponding rate.
From Exercise 4.3 (part 1.) we learned that for a single M/M/1 queue in equilibrium the rate at which customers enter and leave the system is equal to \lambda. Since the customers are then fed into the next queue this suggests that for the queue in series the rate of customer entering and exiting all the queues will all be equal to \lambda and that can be achieved by the product of geometric distributions
\pi({\bf n})= \pi\left(n_1, \cdots, n_K\right) = \prod_{i=1}^K \left( 1- \frac{\lambda}{\mu_i} \right) \left( \frac{\lambda}{\mu_i} \right)^{n_i}
Prove that \pi is indeed stationary.
Hint: it is easiest to prove this using detailed balance.