## 1. What the Findings Actually Tell Us At a high level, these observations demonstrate that **password selection is highly structured, culturally influenced, and statistically concentrated**. Key takeaways: - Passwords are **shorter than policy expectations** - Users favor **memorability over randomness** - Behavior converges into **predictable families** - Small differences across regions **shift probability mass**, not randomness - Reuse dominates outcomes ## 2. Length, Vocabulary, and Structural Compression ### Observed - Average password length: **7–9 characters** - Average English word length: **~5 characters** - Known vocabulary size: **50k–144k words** - Most passwords map to **1–2 words + minor transforms** ### Impact This means that most passwords are **not random strings**, but **compositions**: `[word] + [minor variation] [word][digit] [word][year] [word][symbol]` The effective search space collapses dramatically: - Dictionaries cover a **large percentage of base material** - Transform rules account for the rest - Length alone does not create unpredictability --- ## 3. Character Usage and Guess Ordering ### Observed - Vowels appear in ~61% of passwords - Capital letters are usually: - At the beginning - Followed by a vowel - Numbers are usually: - `1`, `2`, or a year - Sequential - At the end or beginning - Symbols cluster tightly (`! @ # $ % & ?`) ### Impact This creates **guess-order certainty**. Attackers don’t need to guess _everything_, only what is **most likely first**. This leads to: - High early success rates - Rapid credential harvesting - Fast escalation via reuse **Probability mass is concentrated**, which means: - First guesses matter far more than later ones - “Strong-looking” passwords still fail early --- ## 4. Reuse as the Dominant Failure Mode ### Observed - **80% of users reuse 1–3 passwords** - Variants differ only cosmetically - Reuse spans: - Work systems - Cloud services - VPNs - Email - Personal accounts ### Impact This is not a password problem, it is a **systems problem**. One compromised password becomes: - Multiple authentications - Multiple trust boundaries crossed - Multiple attack paths unlocked **The weakest system defines the security ceiling.** --- ## 5. Cultural and Organizational Clustering ### Observed - Western datasets: - Lowercase-heavy - Word-based - Eastern datasets: - Digit-heavy - Numeric sequences - Organizational themes cluster tightly - Cultural references amplify reuse ### Impact This enables **targeted optimization**: - Attackers tune guesses per organization - Regional behavior shifts mask design - “Rare” passwords become common within a group --- ## 6. The 20–60–20 Rule - Why Breaches Escalate ### Breakdown - **20%** fall almost immediately (dictionary/common) - **60%** fall via slight variation - **20%** require disproportionate effort ### Impact Attackers rarely need the last 20%. Once the first 80% are compromised: - Lateral movement begins - Privilege escalation compounds - The remaining 20% become irrelevant --- ## 7. Attack Path Entry Points These findings map cleanly to **attack path analysis**: ### Passwords are not endpoints They are **edges** in a graph. A cracked or reused password: - Creates new edges - Unlocks new nodes - Shortens distance to high-value targets ### Structural predictability creates deterministic paths - Reuse collapses isolation - Shared credentials connect systems - Privileged accounts amplify blast radius From an attack-path perspective: - **Authentication is the graph fabric** - Password behavior defines graph density - Weak patterns create **shortest paths** This is why: - Tier Zero isolation matters - Identity systems fail first - Password issues are graph problems, not string problems --- ## 8. Defensive Implications ### What This Invalidates - Complexity rules as a primary defense - Entropy-only scoring - Per-system password thinking - Isolated credential assessments ### What This Reinforces - Length over complexity - Reuse detection across systems - Known-compromised password blocking - Identity graph analysis - Attack-path-driven prioritization --- ## 9. Final Impact Summary These findings demonstrate that: - Password compromise is **predictable** - Guess success is **front-loaded** - Reuse turns local failure into systemic failure - Scale removes uncertainty - Identity behavior defines security outcomes **Breaches do not happen because passwords are broken.** They happen because **human behavior + scale + reuse create certainty**. #research [[Password Pattern Analysis|Analysis]] [[Home]]