- Probability Density Function (PDF): Defines the probability distribution of a continuous random variable.
- The probability of the variable lying within a range \([a, b]\) is given by:
  $$ P(a \leq X \leq b) = \int_{a}^{b} f(x)\,dx $$
- The total area under the PDF curve is 1.
- Example: If \(X\) follows an exponential distribution with rate \(\lambda\), its PDF is:
  $$ f(x) = \lambda e^{-\lambda x}, \quad x \ge 0 $$
- Cumulative Distribution Function (CDF): Gives the probability that the variable takes a value \(\leq x\):
  $$ F(x) = P(X \leq x) $$
- For continuous random variables:
  $$ F(x) = \int_{-\infty}^{x} f(t)\,dt $$
- Example: CDF of an exponential distribution:
  $$ F(x) = 1 - e^{-\lambda x}, \quad x \ge 0 $$
- Continuous random variables: Can take any real value within an interval.
- Defined using a PDF.
- Expected value (mean):
  $$ E[X] = \int_{-\infty}^{\infty} x\,f(x)\,dx $$
- Variance:
  $$ \operatorname{Var}(X) = E[X^2] - (E[X])^2 $$
- Calculating probabilities: Use integration of the PDF.
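As a quick check of these definitions, here is a minimal sketch (with a hypothetical rate \(\lambda = 2\) and interval \([0.5, 1.5]\)) that computes \(P(a \le X \le b)\) for the exponential example by numerically integrating the PDF and compares it with the closed-form CDF difference:

```python
# Minimal sketch: P(a <= X <= b) for Exp(lambda) by integrating the PDF.
# The rate lam = 2 and the interval [0.5, 1.5] are made-up values.
import math
from scipy.integrate import quad

lam, a, b = 2.0, 0.5, 1.5

def pdf(x):
    return lam * math.exp(-lam * x)  # f(x) = lambda * e^(-lambda x)

prob_numeric, _ = quad(pdf, a, b)                     # integral of f over [a, b]
prob_exact = math.exp(-lam * a) - math.exp(-lam * b)  # F(b) - F(a)
print(prob_numeric, prob_exact)                       # both ~ 0.3181
```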
- Binomial distribution: Describes the number of successes in \( n \) independent Bernoulli trials.
- Parameters:
  - \( n \): number of trials,
  - \( p \): success probability per trial.
- Probability Mass Function (PMF):
  $$ P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \dots, n $$
- Expected Value: \( E[X] = np \)
- Variance: \( \operatorname{Var}(X) = np(1-p) \)
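To make the formulas concrete, here is a minimal sketch (with hypothetical values \(n = 10\), \(p = 0.3\)) that evaluates the PMF and checks that the probabilities sum to 1 and that the mean and variance match \(np\) and \(np(1-p)\):

```python
# Minimal sketch: binomial PMF, total probability, mean, and variance.
# n = 10 and p = 0.3 are made-up parameters.
from math import comb

n, p = 10, 0.3

def binom_pmf(k):
    return comb(n, k) * p**k * (1 - p)**(n - k)  # C(n,k) p^k (1-p)^(n-k)

print(binom_pmf(3))                              # P(X = 3) ~ 0.2668
print(sum(binom_pmf(k) for k in range(n + 1)))   # sums to 1
print(n * p, n * p * (1 - p))                    # E[X] = 3.0, Var(X) = 2.1
```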
- The variance measures how spread out the data is from the mean:
  $$ \operatorname{Var}(X) = E[(X - \mu)^2] $$
- The standard deviation is the square root of the variance:
  $$ \sigma = \sqrt{\operatorname{Var}(X)} $$
- Covariance measures how two variables \(X\) and \(Y\) change together:
  $$ \operatorname{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] $$
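A minimal sketch on made-up data showing how these three quantities are computed in practice with NumPy:

```python
# Minimal sketch: sample variance, standard deviation, and covariance.
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # made-up data
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 6.0, 8.0])

var_x = x.var(ddof=1)                 # sample variance (n - 1 denominator)
std_x = np.sqrt(var_x)                # standard deviation = sqrt(variance)
cov_xy = np.cov(x, y, ddof=1)[0, 1]   # Cov(X, Y): off-diagonal entry
print(var_x, std_x, cov_xy)
```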
- Poisson distribution: Models rare events occurring in a fixed interval of time or space.
- Parameter: \( \lambda \) (expected number of events per interval).
- PMF:
  $$ P(X=k) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \dots $$
- Expected Value: \( E[X] = \lambda \)
- Variance: \( \operatorname{Var}(X) = \lambda \)
- Hypergeometric distribution parameters:
  - \(N\): total population size
  - \(K\): number of successes in the population
  - \(n\): sample size
- Random Variable: \(X\), the number of observed successes in the sample (without replacement).
- PMF:
  $$ P(X=k) = \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}} $$
- Geometric distribution: Models the number of trials until the first success.
- PMF:
  $$ P(X=k) = (1-p)^{k-1} p, \quad k = 1, 2, 3, \dots $$
- Expected value: \( E[X] = \frac{1}{p} \)
- Law of Large Numbers: As the number of trials increases, the sample mean converges to the expected value.
- Types:
- Weak Law: Convergence in probability.
- Strong Law: Almost sure convergence.
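A minimal simulation sketch of the law in action: running means of fair-die rolls settle toward the expected value 3.5 as the number of trials grows. The die example and seed are arbitrary choices:

```python
# Minimal sketch: running sample mean of die rolls converging to E[X] = 3.5.
import numpy as np

rng = np.random.default_rng(0)                  # fixed seed for repeatability
rolls = rng.integers(1, 7, size=100_000)        # fair six-sided die
running_mean = rolls.cumsum() / np.arange(1, rolls.size + 1)

for n in (10, 1_000, 100_000):
    print(n, running_mean[n - 1])               # drifts toward 3.5
```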
- When the population variance is unknown, Student's t-distribution is used.
- William Sealy Gosset (publishing as "Student") introduced the t-distribution for small sample sizes when the population variance is unknown.
- t-statistic: $$ t = \frac{\bar{X} - \mu}{s/\sqrt{n}} $$
- Widely used in hypothesis testing and confidence intervals.
- Slutsky's theorem: If a sequence of random variables converges in probability to a constant \( c \), and another sequence has a limiting distribution, then the product converges in distribution to \( c \) times the limiting distribution of the other sequence.
- Important in asymptotic analysis and regression theory.
- PDF/CDF: Fundamental for continuous variables.
- Discrete Distributions: Includes Binomial, Poisson, Geometric, and Hypergeometric.
- Law of Large Numbers: Ensures convergence of sample mean.
- Central Limit Theorem: Explains normality of sample means.
- Reproductive Theorem: Maintains distribution consistency under linear transformations.
- Gosset's t-distribution: Crucial for small sample inference.
- Slutsky's theorem: Aids in asymptotic analysis.
- Counting Principles and Sampling Distributions: Key in inferential statistics.
- Sample Space (\(\Omega\)): Set of all possible outcomes.
- Event: A subset of the sample space.
- \(0 \le P(A) \le 1\) for any event \(A\).
- \(P(\Omega) = 1\).
- For mutually exclusive events: $$ P(A \cup B) = P(A) + P(B) $$
- Permutations (order matters): $$ P(n, r) = \frac{n!}{(n-r)!} $$
- Combinations (order doesn't matter): $$ C(n, r) = \frac{n!}{r!(n-r)!} $$
- Conditional Probability: $$ P(A|B) = \frac{P(A \cap B)}{P(B)} $$
- Independence: \(A\) and \(B\) are independent if: $$ P(A \cap B) = P(A) \cdot P(B) $$
- Law of Total Probability: $$ P(A) = \sum_i P(A|B_i)P(B_i) $$
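A minimal sketch tying a few of these formulas together: the counting functions from Python's standard library, plus a made-up two-urn example of the law of total probability:

```python
# Minimal sketch: permutations, combinations, and total probability.
from math import perm, comb

print(perm(5, 3))   # P(5, 3) = 5!/2! = 60 ordered arrangements
print(comb(5, 3))   # C(5, 3) = 10 unordered selections

# P(A) = sum_i P(A|B_i) P(B_i): probability of drawing a red ball after
# choosing one of two urns at random (urn contents are made up).
p_urn = [0.5, 0.5]             # P(B_1), P(B_2)
p_red_given_urn = [0.3, 0.8]   # P(A|B_1), P(A|B_2)
print(sum(pa * pb for pa, pb in zip(p_red_given_urn, p_urn)))  # 0.55
```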
- Bayes' Theorem: $$ P(A|B) = \frac{P(B|A) \times P(A)}{P(B)} $$
- Components:
  - \(P(A)\): Prior probability.
  - \(P(B|A)\): Likelihood.
  - \(P(A|B)\): Posterior probability.
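A minimal numeric sketch of the update, using a made-up diagnostic-test scenario (prevalence 1%, sensitivity 95%, false-positive rate 5%):

```python
# Minimal sketch: Bayes' theorem for a hypothetical diagnostic test.
p_disease = 0.01                 # prior P(A)
p_pos_given_disease = 0.95       # likelihood P(B|A)
p_pos_given_healthy = 0.05       # false-positive rate P(B|not A)

# P(B) via the law of total probability
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

posterior = p_pos_given_disease * p_disease / p_pos
print(posterior)  # P(A|B) ~ 0.161: modest even after a positive test
```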
- Random Variable: A function that maps outcomes to real numbers.
- Types:
  - Discrete: Takes countable values.
  - Continuous: Takes uncountable values.
Expectation and Moments
Expected Value (Mean)
- Discrete: $$ E[X] = \sum x\,P(X=x) $$
- Continuous: $$ E[X] = \int_{-\infty}^{\infty} x\,f(x)\,dx $$
- Properties: $$ E[aX + b] = a\,E[X] + b $$ $$ E[X + Y] = E[X] + E[Y] $$
- Variance: $$ \operatorname{Var}(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2 $$
- Standard Deviation: $$ \sigma = \sqrt{\operatorname{Var}(X)} $$
- Properties: $$ \operatorname{Var}(aX + b) = a^2\,\operatorname{Var}(X) $$
- For independent \(X\) and \(Y\): $$ \operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) $$
- Covariance: $$ \operatorname{Cov}(X, Y) = E[(X-\mu_X)(Y-\mu_Y)] = E[XY] - E[X]E[Y] $$
- Correlation Coefficient: $$ \rho = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y} $$
- \(-1 \le \rho \le 1\)
- \(\rho = \pm 1\) indicates a perfect linear relationship.
- \(\rho = 0\) indicates no linear relationship.
- Moment Generating Function (MGF): $$ M_X(t) = E[e^{tX}] $$
- Properties:
  - Uniquely determines the distribution.
  - Moments can be derived as: $$ E[X^n] = \frac{d^n}{dt^n}M_X(t)\Big|_{t=0} $$
- For independent variables: $$ M_{X+Y}(t) = M_X(t) \cdot M_Y(t) $$
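A minimal symbolic sketch of the moment property, using the exponential distribution's MGF \(M(t) = \lambda/(\lambda - t)\) (valid for \(t < \lambda\)) as the worked example:

```python
# Minimal sketch: recover moments by differentiating an MGF with SymPy.
import sympy as sp

t, lam = sp.symbols('t lambda', positive=True)
M = lam / (lam - t)                  # MGF of Exp(lambda), for t < lambda

EX = sp.diff(M, t, 1).subs(t, 0)     # first moment:  1/lambda
EX2 = sp.diff(M, t, 2).subs(t, 0)    # second moment: 2/lambda**2
print(EX, EX2, sp.simplify(EX2 - EX**2))  # variance: 1/lambda**2
```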
- Bernoulli distribution: Models a single trial with success probability \(p\).
- PMF: $$ P(X=1)=p,\quad P(X=0)=1-p $$
- Expected Value: \( E[X]=p \)
- Variance: \( \operatorname{Var}(X)=p(1-p) \)
- Continuous uniform distribution on \([a, b]\):
- PDF: $$ f(x)=\frac{1}{b-a},\quad a \le x \le b $$
- Expected Value: $$ E[X]=\frac{a+b}{2} $$
- Variance: $$ \operatorname{Var}(X)=\frac{(b-a)^2}{12} $$
- Normal distribution PDF: $$ f(x)=\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) $$
- Standard Normal: \( Z \sim N(0,1) \)
- Transformation: \( Z = \frac{X-\mu}{\sigma} \)
- 68-95-99.7 Rule: Approximately \(68\%\), \(95\%\), and \(99.7\%\) of the data lie within 1, 2, and 3 standard deviations of the mean, respectively.
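A minimal sketch verifying the 68-95-99.7 rule from the standard normal CDF:

```python
# Minimal sketch: P(-k <= Z <= k) for k = 1, 2, 3 under N(0, 1).
from scipy.stats import norm

for k in (1, 2, 3):
    prob = norm.cdf(k) - norm.cdf(-k)
    print(k, round(prob, 4))   # ~0.6827, 0.9545, 0.9973
```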
- Exponential distribution PDF: $$ f(x)=\lambda e^{-\lambda x}, \quad x \ge 0 $$
- Expected Value: $$ E[X]=\frac{1}{\lambda} $$
- Variance: $$ \operatorname{Var}(X)=\frac{1}{\lambda^2} $$
- Memoryless Property: $$ P(X>s+t \mid X>s)=P(X>t) $$
- Gamma distribution parameters: \( \alpha \) (shape) and \( \beta \) (scale).
- PDF: $$ f(x)=\frac{x^{\alpha-1}e^{-x/\beta}}{\beta^\alpha\Gamma(\alpha)}, \quad x > 0 $$
- Expected Value: $$ E[X]=\alpha\beta $$
- Variance: $$ \operatorname{Var}(X)=\alpha\beta^2 $$
- Beta distribution models: Probabilities or proportions.
- PDF: $$ f(x)=\frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)},\quad 0\le x\le 1 $$
- Expected Value: $$ E[X]=\frac{\alpha}{\alpha+\beta} $$
- Variance: $$ \operatorname{Var}(X)=\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)} $$
- Student's t-distribution: Used when estimating the mean with unknown population variance.
- Parameter: \( \nu \) (degrees of freedom).
- Approaches the normal distribution as \( \nu \) increases.
- Widely used in hypothesis testing and confidence intervals.
- Chi-square distribution: The sum of squared standard normal variables.
- Parameter: \( k \) (degrees of freedom).
- Expected Value: \( E[X]=k \)
- Variance: \( \operatorname{Var}(X)=2k \)
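A minimal simulation sketch of the definition: sums of \(k\) squared standard normals should show mean \(k\) and variance \(2k\) (here with an arbitrary \(k = 4\)):

```python
# Minimal sketch: chi-square(k) as a sum of k squared N(0, 1) draws.
import numpy as np

rng = np.random.default_rng(1)
k, reps = 4, 200_000                      # k and reps are arbitrary
z = rng.standard_normal((reps, k))
chi2 = (z**2).sum(axis=1)                 # one chi-square draw per row
print(chi2.mean(), chi2.var())            # ~ k = 4 and 2k = 8
```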
- F-distribution: The ratio of two chi-square distributed variables, each divided by its degrees of freedom.
- Parameters: \( d_1 \) and \( d_2 \) (degrees of freedom).
- Commonly used in ANOVA and variance testing.

---
- Joint distributions: Describe the distribution of two or more random variables.
- Joint CDF: $$ F(x,y)=P(X\le x,\; Y\le y) $$
- Joint PMF (discrete): $$ p(x,y)=P(X=x,\; Y=y) $$
- Joint PDF (continuous): $$ \iint f(x,y)\,dx\,dy=1 $$
- Marginal distributions: Derived by summing or integrating out other variables.
- Discrete: $$ P_X(x)=\sum_y P(X=x, Y=y) $$
- Continuous: $$ f_X(x)=\int f(x,y)\,dy $$
- Conditional distributions:
- Discrete: $$ P(X=x|Y=y)=\frac{P(X=x,Y=y)}{P(Y=y)} $$
- Continuous: $$ f(x|y)=\frac{f(x,y)}{f_Y(y)} $$
- \(X\) and \(Y\) are independent if the joint distribution factorizes into the product of the marginals: $$ f(x,y)=f_X(x)\,f_Y(y) $$
- For independent variables:
  - \(E[XY]=E[X]E[Y]\)
  - \(\operatorname{Var}(X+Y)=\operatorname{Var}(X)+\operatorname{Var}(Y)\)

---
- Sampling distribution: The distribution of a statistic (e.g., sample mean \(\bar{X}\), sample variance \(S^2\), or sample proportion \(\hat{p}\)) computed from random samples.
- Law of Large Numbers: As the number of trials increases, the sample mean converges to the expected value.
- Types:
- Weak Law: Convergence in probability.
- Strong Law: Almost sure convergence.
- Central Limit Theorem: For a large sample size, the sampling distribution of the sample mean is approximately normal.
- If the population has mean \( \mu \) and standard deviation \( \sigma \), then: $$ \bar{X}\sim N\Bigl(\mu,\frac{\sigma^2}{n}\Bigr) $$
- When the population variance is unknown, the Student's t-distribution is used.
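A minimal simulation sketch: means of \(n = 50\) draws from a skewed exponential population are approximately \(N(\mu, \sigma^2/n)\); the parameters below are made up:

```python
# Minimal sketch: sample means of Exp(2) data look normal with the
# predicted center and spread.
import numpy as np

rng = np.random.default_rng(2)
lam, n, reps = 2.0, 50, 100_000
means = rng.exponential(1 / lam, size=(reps, n)).mean(axis=1)

print(means.mean())   # ~ mu = 1/lambda = 0.5
print(means.std())    # ~ sigma/sqrt(n) = 0.5/sqrt(50) ~ 0.0707
```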
- Reproductive Theorem: If independent random variables belong to a particular distribution family, then their sum or any linear combination also follows that distribution.
- Example: If \( X_1, X_2, \dots, X_n \) are independent normal variables, then $$ aX_1 + bX_2 + \cdots + cX_n $$ is normally distributed.
- Slutsky's Theorem: If a sequence \(X_n\) converges in probability to a constant \(c\), and another sequence \(Y_n\) converges in distribution to a limit \(Y\), then the product \(X_n Y_n\) converges in distribution to \(c \cdot Y\).
- This theorem is important in asymptotic analysis and regression theory.

---
- Point estimation: A method for estimating a population parameter with a single value.
- Properties of good estimators:
  - Unbiasedness: \(E[\hat{\theta}]=\theta\)
  - Consistency: \(\hat{\theta} \to \theta\) as \(n\to\infty\)
  - Efficiency: Minimum variance among unbiased estimators.
  - Sufficiency: Contains all information about the parameter.
- Maximum Likelihood Estimation (MLE): Find the parameter \( \theta \) that maximizes the likelihood \( L(\theta|x) \).
- Log-likelihood: $$ \ell(\theta|x)=\log\bigl(L(\theta|x)\bigr) $$
- Properties: Consistency, asymptotic normality, and efficiency.
- Note: Closely related to the Kullback-Leibler divergence in information theory.
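A minimal sketch of the method for an exponential sample, where the log-likelihood is \(\ell(\lambda) = n\log\lambda - \lambda\sum x_i\); the numerical maximizer should agree with the closed-form MLE \(\hat{\lambda} = 1/\bar{x}\). The simulated data and optimizer bounds are arbitrary:

```python
# Minimal sketch: numerical MLE for an exponential rate parameter.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
x = rng.exponential(scale=0.5, size=1_000)    # simulated data, true lambda = 2

def neg_log_lik(lam):
    return -(x.size * np.log(lam) - lam * x.sum())  # -l(lambda)

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 100), method='bounded')
print(res.x, 1 / x.mean())                    # both ~ 2
```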
- Method of Moments: Estimate parameters by equating sample moments to population moments.
- Simpler but often less efficient than MLE.
- A statistic is sufficient if it captures all information in the sample about the parameter.
- The Fisher-Neyman factorization theorem provides a method to determine sufficient statistics.
- Closely related to exponential family distributions.
- An interval that, with a specified confidence level \((1-\alpha)\), is likely to contain the true parameter.
- Construction methods vary based on the parameter and distribution.
- Setup: Compare a null hypothesis (\(H_0\)) against an alternative hypothesis (\(H_1\)).
- Test Statistic: A measure to evaluate the evidence against \(H_0\).
- Errors:
  - Type I Error: Rejecting \(H_0\) when it is true (\(\alpha\)).
  - Type II Error: Failing to reject \(H_0\) when it is false (\(\beta\)).
- Power of the Test: \(1-\beta\) (the probability of correctly rejecting a false \(H_0\)).
- p-value: The probability of observing a test statistic as extreme or more extreme than the one observed, assuming \(H_0\) is true.
- Common Tests:
  - z-test: When the population variance is known.
  - t-test: When the population variance is unknown (one-sample, two-sample, or paired).
  - F-test: For comparing variances.
  - Chi-square test: For goodness-of-fit and testing independence.
  - ANOVA: Analysis of variance for comparing multiple means.
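A minimal sketch of a one-sample t-test on made-up data, testing \(H_0: \mu = 5\) against a two-sided alternative:

```python
# Minimal sketch: one-sample t-test with SciPy on simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
sample = rng.normal(loc=5.3, scale=1.0, size=30)   # true mean is 5.3

t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)
print(t_stat, p_value)   # reject H0 at level alpha if p_value < alpha
```

---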
- Markov chain: A stochastic process with the Markov (memoryless) property.
- Transition Probabilities: $$ P(X_{n+1}=j \mid X_n=i)=p_{ij} $$
- Transition Matrix: \( P=[p_{ij}] \)
- States: Classified as transient, recurrent, or absorbing.
- Stationary Distribution: Describes the long-run behavior of the chain.
- Applications: Queueing theory, genetics, economics.
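A minimal sketch finding the stationary distribution of a hypothetical two-state chain as the left eigenvector of the transition matrix for eigenvalue 1:

```python
# Minimal sketch: stationary distribution of a made-up 2-state chain.
import numpy as np

P = np.array([[0.9, 0.1],     # row-stochastic transition matrix
              [0.5, 0.5]])

eigvals, eigvecs = np.linalg.eig(P.T)          # left eigenvectors of P
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi = pi / pi.sum()                             # normalize to probabilities
print(pi)                                      # ~ [0.8333, 0.1667]
```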
- Poisson process: A counting process for random events over time or space.
- Properties:
  - Independent and stationary increments.
  - \( N(t) \sim \text{Poisson}(\lambda t) \)
  - Interarrival times are exponentially distributed with rate \(\lambda\).
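A minimal simulation sketch built directly on the last property: arrival times are cumulative sums of \(\mathrm{Exp}(\lambda)\) interarrival gaps, and the observed event rate should approach \(\lambda\). The rate and horizon are arbitrary:

```python
# Minimal sketch: Poisson process on [0, T] from exponential gaps.
import numpy as np

rng = np.random.default_rng(6)
lam, T = 3.0, 1000.0
gaps = rng.exponential(1 / lam, size=int(2 * lam * T))  # oversampled gaps
arrivals = gaps.cumsum()
arrivals = arrivals[arrivals <= T]
print(arrivals.size / T)    # empirical rate ~ lambda = 3
```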
- Brownian motion: A continuous-time stochastic process with continuous paths.
- Properties:
  - \( B(0)=0 \).
  - Independent increments.
  - Normal increments: $$ B(t)-B(s) \sim N(0,\,t-s) $$
- Order statistics: Arranging sample values in ascending order: $$ X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)} $$
- Applications: Reliability theory, extreme value analysis.
- Bayesian inference: Uses Bayes' theorem to update probabilities based on new evidence.
- Process: Prior \(\to\) Likelihood \(\to\) Posterior.
- Advantages: Incorporates prior knowledge and provides direct probability statements about parameters.
- Bayesian Estimation: Yields credible intervals.
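A minimal sketch of conjugate updating, assuming a Beta(2, 2) prior on a coin's success probability and an observation of 7 successes in 10 trials (all numbers made up):

```python
# Minimal sketch: beta-binomial conjugate update and credible interval.
from scipy import stats

alpha_prior, beta_prior = 2, 2      # hypothetical Beta prior
heads, flips = 7, 10                # hypothetical data

alpha_post = alpha_prior + heads            # Beta posterior parameters
beta_post = beta_prior + (flips - heads)

posterior = stats.beta(alpha_post, beta_post)
print(posterior.mean())             # posterior mean ~ 0.643
print(posterior.interval(0.95))     # 95% credible interval
```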
- Linear Regression: \( Y = \beta_0 + \beta_1 X + \epsilon \)
- Multiple Regression: $$ Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \epsilon $$
- Estimation: Typically via least squares.
- Assumptions: Linearity, independence of errors, homoscedasticity, and normality of errors.
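A minimal sketch of least-squares estimation on synthetic data, recovering the intercept and slope of the simple linear model above:

```python
# Minimal sketch: ordinary least squares via numpy.linalg.lstsq.
import numpy as np

rng = np.random.default_rng(5)
x = np.linspace(0, 10, 50)
y = 1.5 + 2.0 * x + rng.normal(scale=1.0, size=x.size)  # true beta = (1.5, 2)

X = np.column_stack([np.ones_like(x), x])   # design matrix [1, x]
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)                                 # ~ [1.5, 2.0]
```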
- Time Series: Data collected sequentially over time.
- Components: Trend, seasonality, cyclical, and irregular.
- Models: AR, MA, ARIMA, GARCH.
- Applications: Forecasting and analyzing temporal patterns.

---