Estimating the number of concurrent users is an important step before performance testing and capacity planning, because it relates directly to the consumption of system resources. Before load testing begins, we need to determine the peak (maximum) concurrent user load in order to design a workload model. People often estimate the number of concurrent users by intuition or guesswork with little justification, which leads to improper performance testing and capacity planning. In this article we share a reliable method proposed by Eric Man Wong for calculating the number of concurrent users from estimated and justified parameters.
The method estimates the peak user load from the average number of concurrent users, which is in turn calculated from the total number of user sessions and the average length of a user session.
1. Estimating the average number of concurrent users
To calculate the average concurrent user load, we need the following parameters:
- Period of concern (T): The time duration over which we count the total number of user sessions.
- Total number of user sessions (n): The number of user sessions during the period of concern.
- Average length of user sessions (L): The length of a user session is the amount of time a user takes to complete their activity, during which they consume a certain amount of system resources. The average length is simply the mean of the session lengths of all users, that is, the sum of all session lengths divided by n, the total number of user sessions. It can be estimated by observing how a sample of users uses the system.
A user session is a time interval defined by a start time and an end time. Within a session, we assume that the user is active, meaning that the user is consuming a certain percentage of the total system memory; between the start time and the end time, one or more system resources are being held. The number of concurrent users at any particular time is defined as the number of user sessions into which that time instant falls. This is illustrated in the following example.
Each horizontal line segment represents a user session. Since the vertical line at time t0 intersects three user sessions, the number of concurrent users at time t0 is equal to three. Let us focus on the time interval from 0 to an arbitrary time instant T. The following result can be mathematically proven:
$$\text{Average number of concurrent users} = \frac{\text{sum of the lengths of all user sessions}}{T}$$
Alternatively, if the total number of user sessions from time 0 to T equals n, and the average length of a user session equals L, then
$$C = \frac{nL}{T}$$
where C denotes the average number of concurrent users.
[NOTE: In the above diagram, t0 represents a particular instant of time, whereas the T used in the formulae is a duration, that is, the period between two instants of time, say t1 and t2.]
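The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original method description: the session list is hypothetical sample data, and the helper names `concurrent_at` and `average_concurrent` are made up for this example. It counts the sessions covering a given instant and checks that the sum of session lengths divided by T matches n × L / T.

```python
from typing import List, Tuple

Session = Tuple[float, float]  # (start, end), in the same time unit as T


def concurrent_at(sessions: List[Session], t: float) -> int:
    """Count the sessions whose interval contains time t."""
    return sum(1 for start, end in sessions if start <= t <= end)


def average_concurrent(sessions: List[Session], period: float) -> float:
    """Average concurrency over [0, period]: sum of session lengths / T."""
    total_length = sum(end - start for start, end in sessions)
    return total_length / period


# Hypothetical sample: four sessions (in minutes) inside a 60-minute period.
sessions = [(0, 30), (10, 40), (20, 50), (45, 60)]
T = 60

n = len(sessions)
L = sum(end - start for start, end in sessions) / n  # average session length

print(concurrent_at(sessions, 25))      # 3
print(average_concurrent(sessions, T))  # 1.75
print(n * L / T)                        # 1.75, the same value via C = n * L / T
```

With this sample, three sessions cover minute 25, and both formulas give an average of 1.75 concurrent users.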
2. Estimating the peak number of concurrent users
To determine the peak user load, we use some basic results from probability theory, as follows.
We determine the probability of X concurrent users occupying the system at a particular time using the Poisson distribution. We then use the normal approximation to determine the peak user load.
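We assume that the arrival of user sessions follows the Poisson distribution.
Under this assumption, it can be proven that the number of concurrent users at any time instant also has a Poisson distribution,
$$P(X = x) = \frac{C^{x} e^{-C}}{x!}, \quad x = 0, 1, 2, \ldots$$
where X is the number of concurrent users and C is the average number of concurrent users, found using the formula
$$C = \frac{nL}{T}$$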
It is well known that the Poisson distribution with mean C can be approximated by the normal distribution with mean C and standard deviation √C. We denote the number of concurrent users by X.
This implies that (X − C)/√C has approximately the standard normal distribution, with mean 0 and standard deviation 1. Looking up the statistical table for the normal distribution, we have the following result:
$$P\left(X \le C + 3\sqrt{C}\right) \approx 99.87\%$$
The above equation means that the probability of the number of concurrent users being smaller than C + 3√C is 99.87%. This probability is large enough for most purposes that we can approximate the peak number of concurrent users by C + 3√C.
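As a rough sketch of the whole calculation, the Python snippet below computes C from n, L and T and then the peak as C + 3√C, rounded up to a whole user. The workload figures and function names are assumptions chosen for illustration only. The small `poisson_cdf` helper is an extra check, added here, of how closely the exact Poisson probability matches the 99.87% obtained from the normal approximation.

```python
import math


def average_concurrent_users(n: int, avg_session_length: float, period: float) -> float:
    """C = n * L / T, with L and T expressed in the same time unit."""
    return n * avg_session_length / period


def peak_concurrent_users(c: float, z: float = 3.0) -> int:
    """Peak estimate C + z * sqrt(C), rounded up to a whole user."""
    return math.ceil(c + z * math.sqrt(c))


def poisson_cdf(k: int, mean: float) -> float:
    """P(X <= k) for a Poisson variable, summed term by term."""
    term = math.exp(-mean)  # P(X = 0)
    total = term
    for x in range(1, k + 1):
        term *= mean / x    # P(X = x) derived from P(X = x - 1)
        total += term
    return total


# Hypothetical workload: 3,600 sessions in an 8-hour day,
# each lasting 10 minutes on average (all times in minutes).
C = average_concurrent_users(n=3600, avg_session_length=10, period=8 * 60)
peak = peak_concurrent_users(C)

print(C)     # 75.0
print(peak)  # 101
print(poisson_cdf(peak, C))  # exact Poisson probability of staying at or below the peak
```

Under these assumed figures, an average of 75 concurrent users leads to a peak estimate of 101, which would be the load level to target in the workload model.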
The method is simple and efficient: once the average number of concurrent users is known, the peak follows directly from it. This makes the Eric Man Wong method a reliable way to build a realistic and sensible workload model for performance testing.