A **probability distribution** is the collection of all probabilities in the exhaustive set of events for a random variable. The distributions we cover are uniform, binomial, normal and lognormal. There are two types of random variables, **discrete** and **continuous**. Discrete random variables have a countable number of possible outcomes, while continuous random variables have an infinite set of outcomes in the exhaustive set of events.

We can describe probability distributions using **probability functions**, **probability density functions**, and **cumulative distribution functions**. Probability functions simply state the probability that a discrete random variable takes on a value x.

**Probability Function:** P(X = x), where X is a discrete random variable and x is a possible outcome.

To describe the distribution of **continuous random variables**, we use probability density functions. A probability density function is non-negative everywhere, and the total area under the curve is one; the probability that the variable falls in a given range is the area under the curve over that range.

**Probability Density Function:** f(x)

A probability function is essentially an equation for a single point, like P(X = 3), whereas a probability density function describes a curve, like f(x) = y. This makes intuitive sense because the number of points on a curve is infinite, just like the number of possible outcomes for a continuous random variable is infinite.

The cumulative distribution function describes the probability that a random variable takes a value less than or equal to x. To find this probability, you sum the included points of a probability function, or find the area under the relevant range of a probability density function.

**Cumulative distribution function:** F(x) = P(X ≤ x)
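The two ways of accumulating probability described above can be sketched in a few lines. The functions below are illustrative helpers (not from the reading): a fair six-sided die for the discrete case, and a continuous uniform variable on an assumed range [0, 10] for the continuous case.

```python
def die_cdf(x):
    """P(X <= x) for a fair die: sum the probability function over outcomes."""
    return sum(1 / 6 for outcome in range(1, 7) if outcome <= x)

def uniform_cdf(x, a=0.0, b=10.0):
    """P(X <= x) for a continuous uniform on [a, b]: area under the pdf."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

print(die_cdf(3))        # summing three points of 1/6 each, about 0.5
print(uniform_cdf(4.0))  # area of a rectangle of width 4 and height 1/10
```

The discrete version sums point probabilities; the continuous version measures area, exactly as the text distinguishes.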

The first two distributions we look at in the chapter are the **discrete uniform distribution** and the **binomial distribution**. A discrete uniform distribution assigns an equal probability to each outcome.

**Discrete uniform distribution:** P(X = x) = 1/n, where n is the number of possible outcomes.

Binomial distributions describe the range of probabilities for a **binomial random variable**, whose value is the sum of n **Bernoulli random variables**. A Bernoulli random variable is a binary variable that can take on only one of two mutually exclusive outcomes, often viewed as success or failure.

**Bernoulli random variable:** P(1) = p, P(0) = 1 – p

**Binomial random variable:** B(n, p), where n is the total number of trials and p is the probability of success.

The above reads: the binomial random variable counts the number of successes in n Bernoulli trials, each with probability of success p. The expected value of a binomial random variable is n*p. There is a formula to calculate the probability of x successes given B(n, p).

**Probability function for a binomial variable:** P(x) = [n!/((n-x)!x!)]p^{x}(1-p)^{n-x}

**Mean of a binomial random variable:** n*p

**Variance of a binomial random variable:** np(1-p)
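The binomial formula above translates directly into code. This is a minimal sketch using the standard library's `math.comb` for the n-choose-x term; the parameters n = 10 and p = 0.5 are an arbitrary example, not from the reading.

```python
import math

def binomial_pmf(x, n, p):
    """P(x) = C(n, x) * p^x * (1-p)^(n-x)."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.5
print(binomial_pmf(5, n, p))  # probability of exactly 5 successes in 10 trials
print(n * p)                  # mean: 5.0
print(n * p * (1 - p))        # variance: 2.5
```

Summing the pmf over all x from 0 to n returns 1, confirming it is a valid probability function.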

The probability distribution is described visually by a binomial tree.

The CFA curriculum presents two distributions to describe continuous random variables: the **continuous uniform distribution** and the **normal distribution**. The continuous uniform distribution can be understood as a flat line over the variable's range, where the probability of an outcome in any interval equals the area under the line over that interval.

The normal distribution is one of the most common distributions in statistics, and the reason for this is the **central limit theorem**, which states that the mean of a large number of independent random variables is approximately normally distributed. In other words, given a large enough sample size, sample means cluster around the population mean, forming the bell-curve shape of the distribution. The variance determines the width of the shape, but normal distributions are always symmetric and mesokurtic.

**Normal distribution:** X ~ N(µ, σ^{2})
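The clustering described by the central limit theorem can be demonstrated with a small simulation. This sketch draws means of 50 uniform variables (a distribution that is itself flat, not bell-shaped) and shows they concentrate around the population mean of 0.5; the sample sizes and seed are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# 2,000 sample means, each the average of 50 independent uniform draws.
sample_means = [statistics.mean(rng.random() for _ in range(50))
                for _ in range(2000)]

print(round(statistics.mean(sample_means), 2))   # near the population mean 0.5
print(round(statistics.stdev(sample_means), 3))  # near sigma/sqrt(50), about 0.041
```

The standard deviation of the sample means shrinks as the sample size grows, which is why larger samples hug the mean more tightly.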

The above notation describes a **univariate distribution**, which maps only one random variable. To describe the probability distribution of a **multivariate distribution** involving n random variables, we also need to take into account the n(n-1)/2 correlations between the variables.
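The n(n-1)/2 count is just the number of distinct pairs of variables, which a one-line helper (illustrative, not from the reading) makes concrete:

```python
def pairwise_correlations(n):
    """Number of distinct correlations among n random variables: n(n-1)/2."""
    return n * (n - 1) // 2

print(pairwise_correlations(2))  # 1 correlation between two variables
print(pairwise_correlations(5))  # 10 correlations among five variables
```

A two-asset portfolio needs one correlation; a five-asset portfolio already needs ten, which is why multivariate normal models grow complex quickly.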

The normal distribution is useful because we use the following probability statements, called **confidence intervals**, to describe the distribution:

- Approximately 50% of all observations fall in the interval µ +/- (2/3)σ
- Approximately 68% of all observations fall in the interval µ +/- σ
- Approximately 95% of all observations fall in the interval µ +/- 2σ
- Approximately 99.7% of all observations fall in the interval µ +/- 3σ
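These interval probabilities can be verified numerically. The sketch below builds the normal cdf from the standard library's error function (a standard identity, Φ(z) = (1 + erf(z/√2))/2) and evaluates each interval from the list above:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Normal cdf via the error function identity."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def interval_probability(k, mu=0.0, sigma=1.0):
    """P(mu - k*sigma <= X <= mu + k*sigma)."""
    return (normal_cdf(mu + k * sigma, mu, sigma)
            - normal_cdf(mu - k * sigma, mu, sigma))

for k in (2 / 3, 1, 2, 3):
    print(round(interval_probability(k), 4))  # ~0.495, 0.6827, 0.9545, 0.9973
```

The results match the rounded figures in the list, regardless of the particular µ and σ, since the intervals are expressed in standard deviations.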

Any normal distribution can be standardized to N(0,1). Once we have standardized the outcome we are looking for to the N(0,1) distribution, we can choose the desired confidence interval and determine the probability of the outcome using the standard normal distribution cdf chart.

**Standardizing X to N(0,1):** Z = (X – µ)/σ
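As a worked sketch of the standardization step, suppose a hypothetical portfolio return is distributed N(8%, 10%²) and we want the probability of a negative return. Standardize zero, then read the standard normal cdf (here computed with `math.erf` rather than a table):

```python
import math

def standardize(x, mu, sigma):
    """Z = (X - mu) / sigma."""
    return (x - mu) / sigma

def standard_normal_cdf(z):
    """Phi(z) for the standard normal N(0, 1)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical inputs: mean return 8%, standard deviation 10%.
z = standardize(0.0, 0.08, 0.10)   # z = -0.8
print(round(standard_normal_cdf(z), 4))  # P(return < 0), about 0.2119
```

The z-score of -0.8 means a zero return lies 0.8 standard deviations below the mean, and the cdf converts that distance into a probability.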

There are two applications introduced in this section: the **safety-first optimal portfolio** and **Value at Risk (VaR)**. The safety-first ratio has the same form as the Sharpe ratio, but instead of measuring excess return over the risk-free rate per unit of standard deviation, it uses a minimum target return (the **shortfall level**) in place of the risk-free rate. Shortfall risk is the risk that a portfolio falls below this minimum target, and the safety-first optimal portfolio maximizes the excess return over the target per unit of standard deviation.

**Safety-first ratio:** (E(R_{p}) – R_{L})/σ_{p}
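A quick sketch of how the ratio is used: the portfolio figures below are hypothetical, with a 2% minimum target return. The portfolio with the higher safety-first ratio has the lower shortfall risk and is preferred.

```python
def safety_first_ratio(expected_return, threshold, sigma):
    """(E(Rp) - RL) / sigma_p."""
    return (expected_return - threshold) / sigma

# Hypothetical portfolios, both measured against a 2% minimum target.
a = safety_first_ratio(0.10, 0.02, 0.15)  # about 0.53
b = safety_first_ratio(0.08, 0.02, 0.10)  # 0.60
print(a, b)
print("prefer B" if b > a else "prefer A")
```

Note that portfolio B wins despite its lower expected return, because its tighter dispersion makes falling below the 2% target less likely.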

Value at Risk describes the minimum loss expected with a given probability over a time period t. A 95% one-day VaR of $500,000 means there is a 5% chance of losing at least $500,000 over the day.
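One common way to compute VaR (an assumption here, not specified in the reading) is the parametric approach, which treats daily returns as normal. The portfolio value, volatility, and mean below are hypothetical, and 1.65 is the rounded standard normal quantile for the 5% tail:

```python
# Hypothetical inputs for a parametric (normal-returns) one-day VaR.
portfolio_value = 10_000_000   # dollars
mu_daily = 0.0                 # assumed mean daily return
sigma_daily = 0.02             # assumed daily return volatility
z_95 = 1.65                    # ~5% left-tail quantile of N(0, 1)

# Minimum loss exceeded with 5% probability on any given day.
var_95 = portfolio_value * (z_95 * sigma_daily - mu_daily)
print(var_95)  # dollar amount we expect to lose or exceed 5% of the time
```

With these inputs the one-day 95% VaR is $330,000: on 1 day in 20 we would expect to lose at least that amount.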

The last distribution we look at is the **lognormal distribution**, whose natural log is normally distributed. Because this distribution is bounded on the left by 0 and tends to be skewed to the right (positive skew), it is frequently used to model asset prices.

**Continuously compounded return:** ln(1 + HPR)
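As a worked example of the conversion, take a hypothetical 5% holding period return; the continuously compounded equivalent is slightly smaller, and exponentiating recovers the original HPR:

```python
import math

hpr = 0.05                 # hypothetical one-period holding period return
cc = math.log(1 + hpr)     # continuously compounded equivalent, about 4.88%
print(round(cc, 5))
print(math.exp(cc) - 1)    # inverts the transform, recovering the 5% HPR
```

Continuously compounded returns are additive across periods, which is one reason they pair naturally with the lognormal price model above.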

In the final section of this reading we are introduced to **Monte Carlo simulation**, which uses computer algorithms to model probability distributions by generating huge numbers of simulated trials. It is frequently used to price complicated securities that have no analytical pricing model.
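A minimal sketch of the idea, under assumptions beyond the reading: simulate many lognormal terminal prices for a stock (the standard risk-neutral setup), average the discounted payoffs of a European call, and the average converges to the option's price. All parameters below are hypothetical.

```python
import math
import random

def monte_carlo_call_price(s0, k, r, sigma, t, n_trials, seed=0):
    """Average discounted call payoff over simulated lognormal terminal prices."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    total = 0.0
    for _ in range(n_trials):
        z = rng.gauss(0.0, 1.0)
        # Terminal price under the lognormal (geometric Brownian motion) model.
        st = s0 * math.exp((r - 0.5 * sigma**2) * t + sigma * math.sqrt(t) * z)
        total += max(st - k, 0.0)
    return math.exp(-r * t) * total / n_trials

price = monte_carlo_call_price(s0=100, k=100, r=0.05, sigma=0.2,
                               t=1.0, n_trials=50_000)
print(round(price, 2))  # close to the Black-Scholes value of about 10.45
```

With 50,000 trials the simulated price lands near the analytical answer; the estimate's error shrinks in proportion to one over the square root of the number of trials.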