Union of events

The union of two events, \(A \cup B\), is the event that at least one of them occurs. It underlies the addition rule of probability and appears whenever you want the probability that one thing or another happens.

Definition

Let \(A\) and \(B\) be two events defined on the same sample space \(\Omega\). The union \(A \cup B\) is the event that at least one of \(A\) or \(B\) occurs:

\[A \cup B = \{\omega \in \Omega : \omega \in A \text{ or } \omega \in B\}\]

“Or” in probability is always inclusive: \(A \cup B\) includes outcomes in \(A\) only, in \(B\) only, and in both.
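As a minimal illustration, events can be modelled as Python sets of outcomes. The die roll and the two events below are assumptions chosen only for this sketch; they do not come from the examples later in the article.

```python
# Sample space for one roll of a six-sided die (an assumed toy example).
omega = {1, 2, 3, 4, 5, 6}

A = {2, 4, 6}  # event "the roll is even"
B = {4, 5, 6}  # event "the roll is at least 4"

# The union holds every outcome that lies in A, in B, or in both.
union = A | B

only_a = A - B  # outcomes in A only
only_b = B - A  # outcomes in B only
both = A & B    # outcomes in both events

print(sorted(union))                    # [2, 4, 5, 6]
# The three disjoint pieces together rebuild the union ("or" is inclusive).
print(union == only_a | only_b | both)  # True
```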

Venn diagram showing the union of events A and B

Probability of the union

The probability of \(A \cup B\) is given by the addition rule:

\[P(A \cup B) = P(A) + P(B) - P(A \cap B)\]

The subtraction of \(P(A \cap B)\) corrects for double-counting: outcomes in both \(A\) and \(B\) are included in \(P(A)\) and again in \(P(B)\), so they must be subtracted once.
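The double-counting can be checked numerically. The sketch below assumes a fair six-sided die with equally likely outcomes, and the helper `prob` is a hypothetical name introduced for illustration. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Assumed toy sample space: one roll of a fair die.
omega = set(range(1, 7))
A = {2, 4, 6}  # "the roll is even"
B = {4, 5, 6}  # "the roll is at least 4"

def prob(event):
    """Probability of an event when every outcome in omega is equally likely."""
    return Fraction(len(event), len(omega))

direct = prob(A | B)                        # count the union directly
by_rule = prob(A) + prob(B) - prob(A & B)   # addition rule
naive = prob(A) + prob(B)                   # counts outcomes 4 and 6 twice

print(direct, by_rule, naive)  # 2/3 2/3 1
```

The naive sum exceeds the true value by exactly \(P(A \cap B) = 1/3\), which is the amount the addition rule removes.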

⚠️ The most common mistake: forgetting to subtract the intersection

Writing \(P(A \cup B) = P(A) + P(B)\) without subtracting \(P(A \cap B)\) is only valid when \(A\) and \(B\) are mutually exclusive. In general, this overestimates the probability. If a customer can both open an email AND click a link, you cannot add those two probabilities directly: some customers do both, and you would count them twice.

Special case: mutually exclusive events

When \(A\) and \(B\) cannot both occur (\(A \cap B = \emptyset\), so \(P(A \cap B) = 0\)):

\[P(A \cup B) = P(A) + P(B)\]

This simplification is only valid when the events truly cannot overlap.

Union of three or more events

For three events, the inclusion-exclusion principle gives:

\[P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)\]

The pattern alternates: add individual probabilities, subtract pairwise intersections, add triple intersections, and so on.

For \(n\) events the general formula is:

\[P\!\left(\bigcup_{i=1}^n A_i\right) = \sum_i P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n+1} P(A_1 \cap \cdots \cap A_n)\]
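The general formula can be sketched as a short loop over all groups of events. This is one possible implementation, assuming events are finite sets of equally likely outcomes. The function name `union_probability` and the three events on the numbers 1 to 12 are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import combinations

def union_probability(events, omega):
    """P(union of events) by inclusion-exclusion, assuming equally likely outcomes."""
    total = Fraction(0)
    for k in range(1, len(events) + 1):
        # Odd-sized groups are added, even-sized groups are subtracted.
        sign = 1 if k % 2 == 1 else -1
        for group in combinations(events, k):
            common = set.intersection(*group)
            total += sign * Fraction(len(common), len(omega))
    return total

# Assumed toy sample space: the integers 1..12, each equally likely.
omega = set(range(1, 13))
events = [
    {n for n in omega if n % 2 == 0},  # even numbers
    {n for n in omega if n % 3 == 0},  # multiples of 3
    {n for n in omega if n > 9},       # numbers above 9
]

result = union_probability(events, omega)
direct = Fraction(len(set().union(*events)), len(omega))
print(result, direct)  # 3/4 3/4
```

The loop visits every non-empty group of events, so its cost doubles with each added event. It is useful as a check on small cases rather than as a method for many events.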

Step-by-step examples

Example 1: email marketing campaign

In an email campaign sent to 1,000 subscribers:

  • 35% opened the email: \(P(O) = 0.35\)
  • 20% clicked a link: \(P(C) = 0.20\)
  • 12% both opened and clicked: \(P(O \cap C) = 0.12\)

What is the probability that a subscriber opened the email or clicked a link (or both)?

\[P(O \cup C) = 0.35 + 0.20 - 0.12 = 0.43\]

43% of subscribers engaged in at least one action.

Checking with counts

Out of 1,000 subscribers:

  • 350 opened, 200 clicked, 120 did both.
  • Opened only: \(350 - 120 = 230\).
  • Clicked only: \(200 - 120 = 80\).
  • Either or both: \(230 + 80 + 120 = 430\), which is \(430/1000 = 0.43\). ✓

Drawing a table or thinking in counts is always a good way to verify a union calculation.
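The count check for the email campaign can be reproduced in a few lines. The sketch below uses only the figures from Example 1; the variable names and the printed table layout are illustrative choices.

```python
# Counts from the email campaign example.
total = 1000
opened, clicked, both = 350, 200, 120

opened_only = opened - both
clicked_only = clicked - both
either = opened_only + clicked_only + both
neither = total - either

# A 2x2 table of counts: rows are opened / not opened, columns are clicked / no click.
print(f"{'':>12}{'clicked':>10}{'no click':>10}")
print(f"{'opened':>12}{both:>10}{opened_only:>10}")
print(f"{'not opened':>12}{clicked_only:>10}{neither:>10}")

# The disjoint pieces agree with the addition rule applied to counts.
assert either == opened + clicked - both
print(either / total)  # 0.43
```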


Example 2: system reliability

A backup system has two independent components. Component A fails with probability 0.05 and component B fails with probability 0.08. The system fails if at least one component fails.

Since the components are independent: \(P(A \cap B) = 0.05 \times 0.08 = 0.004\)

\[P(\text{system fails}) = P(A \cup B) = 0.05 + 0.08 - 0.004 = 0.126\]

The system has a 12.6% chance of failing. Note that if we had incorrectly omitted the intersection, we would have estimated 13%, a small but systematic overestimate.

Example 3: three events with inclusion-exclusion

A software audit finds that among 200 projects:

  • 80 have security issues (\(A\)): \(P(A) = 0.40\)
  • 70 have performance issues (\(B\)): \(P(B) = 0.35\)
  • 50 have documentation issues (\(C\)): \(P(C) = 0.25\)
  • 30 have both \(A\) and \(B\): \(P(A \cap B) = 0.15\)
  • 20 have both \(A\) and \(C\): \(P(A \cap C) = 0.10\)
  • 15 have both \(B\) and \(C\): \(P(B \cap C) = 0.075\)
  • 10 have all three: \(P(A \cap B \cap C) = 0.05\)

\[P(A \cup B \cup C) = 0.40 + 0.35 + 0.25 - 0.15 - 0.10 - 0.075 + 0.05 = 0.725\]

72.5% of projects have at least one type of issue.
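Working in counts gives the same answer. The sketch below applies inclusion-exclusion to the audit figures directly; the dictionary layout and variable names are illustrative, and only the numbers come from the example.

```python
from fractions import Fraction

n_projects = 200

# Counts from the audit example.
singles = {"A": 80, "B": 70, "C": 50}     # one issue type each
pairs = {"AB": 30, "AC": 20, "BC": 15}    # pairwise overlaps
triple = 10                               # all three issue types

# Inclusion-exclusion on counts: add singles, subtract pairs, add the triple.
at_least_one = sum(singles.values()) - sum(pairs.values()) + triple
p = Fraction(at_least_one, n_projects)

print(at_least_one, float(p))  # 145 0.725
```

So 145 of the 200 projects have at least one issue, and the remaining 55 have none.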

💡 The complement is often easier for 'at least one' problems

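Computing \(P(A \cup B \cup \cdots)\) directly by inclusion-exclusion requires tracking every intersection, and the number of terms grows exponentially (\(2^n - 1\) for \(n\) events). The complement is usually much simpler, especially for independent events, where \(P(\text{none})\) is just a product: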

\[P(\text{at least one}) = 1 - P(\text{none})\]

For the reliability example: \(P(\text{at least one fails}) = 1 - P(A^c \cap B^c) = 1 - 0.95 \times 0.92 = 1 - 0.874 = 0.126\).

Same answer, one step.
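The complement shortcut is easy to express in code. This is a minimal sketch that assumes the events are independent, so that the probability of none occurring is a product. The function name `p_at_least_one` is hypothetical.

```python
import math

def p_at_least_one(probs):
    """P(at least one event occurs), assuming the events are independent."""
    # Under independence, P(none) is the product of the complements.
    p_none = math.prod(1 - p for p in probs)
    return 1 - p_none

# Failure probabilities of components A and B from the reliability example.
system_failure = p_at_least_one([0.05, 0.08])
print(round(system_failure, 6))  # 0.126
```

The same function accepts any number of independent events, and its cost grows linearly with that number rather than exponentially.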