# Chapter X
## Structural and Text-Analytic Analysis of Password Composition
### Abstract
Passwords encode far more than authentication secrets; they embed behavioral, cultural, and contextual signals derived from their creators. By decomposing passwords into observable structural components, defenders can better understand failure modes, population-level risk, and the predictable patterns exploited by attackers. This chapter formalizes a **text-analytic framework for password analysis**, categorizing patterns into **Basic**, **Macro**, and **Micro** levels, and correlates these findings with regional trends and password manager behaviors.
---
## 1. Introduction
Passwords are not random artifacts. They are **human-generated strings**, shaped by memory constraints, policy requirements, language, culture, and habit. As such, password analysis can be treated as a specialized subdomain of **text analytics**, where structure, frequency, and context reveal meaningful information.
This chapter presents a structured methodology for password analysis based on three pattern categories:
- **Basic Patterns** – visually obvious components
- **Macro-Patterns** – statistical and structural properties
- **Micro-Patterns** – subtle, contextual, and behavioral signals
Together, these layers enable systematic interpretation of password datasets and explain why compromise scales rapidly once guessing begins.
---
## 2. Text Analytics Framework for Password Analysis
### 2.1 Overview of Pattern Categories
Password structure can be analyzed across three increasingly granular layers:
**Figure 1. Pattern Analysis Layers**
```
+---------------------+
| Micro-Patterns | Context, habits, themes
+---------------------+
| Macro-Patterns | Length, charset, masks
+---------------------+
| Basic Patterns | Base words, digits, symbols
+---------------------+
```
Each layer provides distinct analytical value and informs different attack and defense strategies.
---
## 3. Basic Pattern Analysis
### 3.1 Definition
Basic patterns are **visually obvious components** identifiable through direct inspection and simple grouping. These typically include:
- Language and base words
- Common substitutions (L33T)
- Numeric suffixes or prefixes
- Predictable special characters
### 3.2 Examples
```
R0b3rt2017!
Jennifer1981!
```
**Observed characteristics:**
- Use of personal names (`Robert`, `Jennifer`)
- L33T-speak substitutions in `R0b3rt` (`o → 0`, `e → 3`)
- Four-digit numeric suffix resembling a year
- Trailing exclamation mark
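The decomposition described above can be sketched in a few lines. This is a minimal illustration, not a reference tool: the `decompose` function, the `LEET_REVERSE` table, and the regular expression are assumptions introduced here, and the substitution table is deliberately tiny.

```python
import re

# Assumed reverse-substitution table; real rule sets are far larger.
LEET_REVERSE = str.maketrans({"0": "o", "3": "e", "1": "i", "4": "a",
                              "5": "s", "7": "t", "@": "a", "$": "s"})

# Shortest possible base, then the trailing digit run, then trailing symbols.
PATTERN = re.compile(r"(?P<base>.*?)(?P<digits>[0-9]*)(?P<symbols>[^A-Za-z0-9]*)",
                     re.DOTALL)


def decompose(password: str) -> dict:
    """Split a password into base word, numeric suffix, and trailing symbols."""
    match = PATTERN.fullmatch(password)
    base = match.group("base")
    digits = match.group("digits")
    return {
        "base": base,
        # Undo L33T substitutions inside the base only, then lowercase it.
        "normalized": base.translate(LEET_REVERSE).lower(),
        "digits": digits,
        "symbols": match.group("symbols"),
        # Crude year check: four digits starting with 19 or 20.
        "year_like": bool(re.fullmatch(r"(19|20)[0-9]{2}", digits)),
    }


for pw in ["R0b3rt2017!", "Jennifer1981!"]:
    print(pw, decompose(pw))
```

Both examples reduce to a lowercase name, a year-like suffix, and a single `!`, which is the grouping a reviewer would arrive at by eye.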
### 3.3 Analytical Implications
These patterns lend themselves to:
- Dictionary-based attacks
- L33T substitution rules
- Hybrid attacks combining dictionaries with date suffixes
**Table 1. Basic Pattern Indicators**
|Indicator|Observation|Analytical Use|
|---|---|---|
|Base word|Personal name|Dictionary seed|
|Digits|4-digit year|Hybrid suffix|
|Symbol|Trailing `!`|Rule-based append|
|Substitution|Common L33T|Rule expansion|
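
To make the indicators in Table 1 concrete, the sketch below combines them into a hybrid candidate list for auditing purposes. The two-entry substitution map, the year range, and the function names are assumptions chosen for illustration; production rule engines apply many more transformations, including partial substitutions.

```python
from itertools import product

# Only the two substitutions observed in the examples (assumed rule set).
LEET = {"o": "0", "e": "3"}


def leet_variants(word):
    """Yield the word itself and, if different, its fully substituted form."""
    yield word
    swapped = "".join(LEET.get(ch, ch) for ch in word)
    if swapped != word:
        yield swapped


def hybrid_candidates(names, years, symbols=("", "!")):
    """Dictionary seed + year suffix + optional trailing symbol."""
    for name, year, symbol in product(names, years, symbols):
        for variant in leet_variants(name.capitalize()):
            yield f"{variant}{year}{symbol}"


# Hypothetical inputs: two seed names and an assumed range of plausible years.
candidates = set(hybrid_candidates(["robert", "jennifer"], range(1950, 2025)))
print(len(candidates))
print("R0b3rt2017!" in candidates, "Jennifer1981!" in candidates)
```

With two names, 75 years, two symbol options, and two spellings per name, the list holds only 600 entries, yet it contains both example passwords.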
---
## 4. Macro-Pattern Analysis
### 4.1 Definition
Macro-patterns describe **statistical properties** of passwords independent of specific content, including:
- Total length
- Character classes used
- Mask structure
- Repeated constants
### 4.2 Examples
```
7482Sacrifice
Solitaire7482
```
**Structural observations:**
- Fixed numeric constant (`7482`)
- Word and number order inverted between the two passwords
- Consistent capitalization (first letter of the word only)
- Length of 13 characters (9 letters + 4 digits)
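These properties can be extracted mechanically. The following sketch derives length, character classes, and a hashcat-style mask (`?l`, `?u`, `?d`, `?s`) for each example; the helper names are assumptions, and any non-ASCII character is simply counted as `?s`, which is a simplification.

```python
import string


def char_class(ch: str) -> str:
    """Map one character to a hashcat-style mask token."""
    if ch in string.ascii_lowercase:
        return "?l"
    if ch in string.ascii_uppercase:
        return "?u"
    if ch in string.digits:
        return "?d"
    return "?s"  # everything else, including non-ASCII (simplification)


def mask(password: str) -> str:
    return "".join(char_class(ch) for ch in password)


def macro_profile(password: str) -> dict:
    m = mask(password)
    tokens = {m[i:i + 2] for i in range(0, len(m), 2)}
    return {"length": len(password), "mask": m, "classes": sorted(tokens)}


for pw in ["7482Sacrifice", "Solitaire7482"]:
    print(pw, macro_profile(pw))
```

The two masks differ only in where the four `?d` tokens sit, which is the inversion noted above.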
### 4.3 Analytical Implications
While the passwords appear long, the **effective search space** is reduced:
- A fixed 4-digit constant contributes no uncertainty once it is known
- Only the dictionary portion varies
- Special characters may be absent entirely
**Figure 2. Effective Search Space Reduction**
```
Total length: 13
Fixed digits: 4
Variable characters: 9
```
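
The size of the reduction is easiest to see as arithmetic. The sketch below compares a naive per-character model with a structural one; `DICTIONARY_SIZE` is an assumed figure, not a measurement from this chapter, and the structural model presumes the attacker has already recognized the word-plus-constant layout.

```python
import math

example = "7482Sacrifice"

# Naive model: every position drawn independently from upper + lower + digits.
naive = 62 ** len(example)

# Structural model. DICTIONARY_SIZE is an assumption made for illustration.
DICTIONARY_SIZE = 100_000
ORDERINGS = 2  # constant before the word, or after it

known_constant = DICTIONARY_SIZE * ORDERINGS
unknown_constant = known_constant * 10 ** 4  # every possible 4-digit value

for label, size in [("naive 62^length", naive),
                    ("word + known constant", known_constant),
                    ("word + unknown constant", unknown_constant)]:
    print(f"{label:<26}{size:.3e} candidates  ~{math.log2(size):.1f} bits")
```

Under these assumptions the naive estimate is about 77 bits, while the structural estimate is roughly 18 bits when the constant is known and about 31 bits when it is not.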
**Table 2. Macro-Pattern Characteristics**
|Property|Value|
|---|---|
|Length|13 (both examples)|
|Charset|Uppercase + lowercase + digits|
|Fixed component|4-digit constant|
|Likely attack|Hybrid (Dict + Digits)|
---
## 5. Micro-Pattern Analysis
### 5.1 Definition
Micro-patterns capture **subtle consistency and contextual meaning**, including:
- Themes (colors, animals, hobbies)
- Sequential numbering
- Consistent capitalization rules
- Personal or cultural associations
### 5.2 Examples
```
BlueParrot345
RedFerret789
```
**Observed micro-patterns:**
- Color + animal structure
- Title-case capitalization
- Three-digit suffixes made of consecutive digits (`345`, `789`)
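A minimal sketch of how such signals might be detected automatically follows. The `COLORS` and `ANIMALS` sets are hypothetical stand-ins for real theme lexicons, and the function only handles passwords made of title-case words followed by digits.

```python
import re

# Hypothetical theme lexicons; a real analysis would load much larger lists.
COLORS = {"red", "blue", "green", "black", "white"}
ANIMALS = {"parrot", "ferret", "tiger", "eagle"}

SHAPE = re.compile(r"((?:[A-Z][a-z]+)+)([0-9]*)")
TITLE_WORD = re.compile(r"[A-Z][a-z]+")


def is_consecutive_run(digits: str) -> bool:
    """True when each digit is exactly one more than the previous digit."""
    return len(digits) >= 2 and all(
        int(b) - int(a) == 1 for a, b in zip(digits, digits[1:])
    )


def micro_signals(password: str):
    """Return theme and suffix signals, or None if the shape does not fit."""
    match = SHAPE.fullmatch(password)
    if match is None:
        return None
    words = [w.lower() for w in TITLE_WORD.findall(match.group(1))]
    themes = ["color" if w in COLORS else "animal" if w in ANIMALS else "other"
              for w in words]
    return {
        "words": words,
        "themes": themes,
        "consecutive_digits": is_consecutive_run(match.group(2)),
    }


for pw in ["BlueParrot345", "RedFerret789"]:
    print(pw, micro_signals(pw))
```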
### 5.3 Analytical Implications
These patterns reveal:
- Likely reuse across accounts
- Predictable variant generation
- The presence of themed password “families”
**Table 3. Micro-Pattern Signals**
|Signal|Example|Inference|
|---|---|---|
|Theme|Color + animal|Personal preference|
|Digits|Sequential|Low entropy|
|Capitalization|Consistent|Habitual rule|
|Structure reuse|Yes|Variant family|
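
One way to surface variant families is to group passwords by a structural signature. The token scheme below (`T` for a title-case word, `l` for a lowercase run, `U` for an uppercase run, `D<n>` for a digit run of length n, `S` for symbols) is an assumption made for this sketch, not an established notation.

```python
import re
from collections import defaultdict

# Exactly one named group matches per token, so lastgroup identifies its kind.
TOKEN = re.compile(
    r"(?P<T>[A-Z][a-z]+)|(?P<l>[a-z]+)|(?P<U>[A-Z]+)"
    r"|(?P<D>[0-9]+)|(?P<S>[^A-Za-z0-9]+)"
)


def signature(password: str) -> str:
    """Collapse a password into a sequence of structural tokens."""
    parts = []
    for match in TOKEN.finditer(password):
        kind = match.lastgroup
        parts.append(f"D{len(match.group())}" if kind == "D" else kind)
    return "-".join(parts)


def families(passwords):
    """Group passwords that share the same structural signature."""
    groups = defaultdict(list)
    for pw in passwords:
        groups[signature(pw)].append(pw)
    return dict(groups)


print(families(["BlueParrot345", "RedFerret789", "R0b3rt2017!"]))
```

The two themed passwords share the signature `T-T-D3` and fall into one family, while `R0b3rt2017!` lands in a group of its own.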
---
## 6. Regional Password Trend Analysis
### 6.1 US / EU / CA Trends
#### Length Distribution
**Table 4. Western Password Length Distribution**
|Length|Percentage|
|---|---|
|7|15%|
|8|27%|
|9|15%|
|10|12%|
|11|4.8%|
|12|4.9%|
|13+|<1%|
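
A distribution like Table 4 is a straightforward tally over a corpus. The sketch below shows the computation; the sample list reuses this chapter's six example passwords purely as placeholder input and is not a representative dataset.

```python
from collections import Counter


def length_distribution(passwords):
    """Return {length: percentage of passwords with that length}."""
    counts = Counter(len(pw) for pw in passwords)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {length: 100 * n / total for length, n in sorted(counts.items())}


# Placeholder input only: the example passwords used earlier in the chapter.
sample = ["R0b3rt2017!", "Jennifer1981!", "7482Sacrifice",
          "Solitaire7482", "BlueParrot345", "RedFerret789"]

for length, pct in length_distribution(sample).items():
    print(f"{length:>2}  {pct:5.1f}%")
```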
#### Character Frequency
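Letters ordered from most to least frequent:
- **English text:** `etaoinshrdlcumwfgypbvkjxqz`
- **Passwords:** `aeionrlstmcdyhubkgpjvfwzxq`

In passwords the four most frequent letters are all vowels, and `t` falls from second place to ninth.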
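
A frequency ordering of this kind can be derived from any corpus with a character count. The function below is a sketch under simple assumptions: it folds case, ignores everything except ASCII letters, and breaks ties alphabetically.

```python
import string
from collections import Counter


def frequency_order(passwords):
    """Return letters sorted from most to least frequent (ties: alphabetical)."""
    counts = Counter(
        ch
        for pw in passwords
        for ch in pw.lower()
        if ch in string.ascii_lowercase
    )
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return "".join(letter for letter, _ in ranked)


print(frequency_order(["banana", "Bandana1!"]))
```

Letters that never occur in the corpus are omitted from the result, so a small sample yields a string shorter than 26 characters.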
#### Common Masks
**Table 5. Common Western Masks**
|Mask|Description|
|---|---|
|`?l?l?l?l?l?l`|6 lowercase|
|`?l?l?l?l?l?l?d?d`|6 lowercase + 2 digits|
|`?d?d?d?d?d?d`|6 digits|
|`?l?l?l?l?l?l?l?l?l?l?l?l`|12 lowercase|
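
Each mask implies a fixed number of candidates, which is what makes masks useful for estimating exposure. The sketch below computes that keyspace; the charset sizes follow hashcat's built-in `?l`, `?u`, `?d`, and `?s` sets, and the `keyspace` helper itself is an assumption of this example.

```python
# Sizes of hashcat's built-in charsets (?s includes the space character).
CHARSET_SIZES = {"?l": 26, "?u": 26, "?d": 10, "?s": 33}


def keyspace(mask: str) -> int:
    """Multiply the charset size of every two-character mask token."""
    size = 1
    for i in range(0, len(mask), 2):
        size *= CHARSET_SIZES[mask[i:i + 2]]  # KeyError on an unknown token
    return size


masks = {
    "6 lowercase": "?l" * 6,
    "6 lowercase + 2 digits": "?l" * 6 + "?d" * 2,
    "6 digits": "?d" * 6,
    "12 lowercase": "?l" * 12,
}

for description, m in masks.items():
    print(f"{description:<24}{keyspace(m):>22,}")
```

The six-digit mask covers only one million candidates, while twelve lowercase letters cover about 9.5 × 10^16, so masks of similar popularity can differ enormously in cost.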
---
### 6.2 CN (Chinese) Trends
#### Length Distribution
**Table 6. Eastern Password Length Distribution**
|Length|Percentage|
|---|---|
|7|21%|
|8|22%|
|9|12%|
|10|12%|
|11|4.2%|
|12+|<1%|
#### Observed Differences
- Strong preference for **numeric-only passwords**
- Higher digit density
- Lower reliance on symbols
**Table 7. Common Eastern Masks**
|Mask|Description|
|---|---|
|`?d?d?d?d?d?d?d?d`|8 digits|
|`?d?d?d?d?d?d`|6 digits|
|`?l?l?l?l?l?l`|6 lowercase|
|`?l?l?l?d?d?d?d?d?d`|3 lowercase + 6 digits|
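
The differences listed above can be quantified with two simple measures: the share of numeric-only passwords and the overall digit density. The sketch below computes both; the three sample strings are synthetic placeholders, not data from any corpus.

```python
def digit_stats(passwords):
    """Return the numeric-only share and the fraction of characters that are digits."""
    total_chars = sum(len(pw) for pw in passwords)
    if not passwords or total_chars == 0:
        return {"numeric_only_share": 0.0, "digit_density": 0.0}
    digit_chars = sum(ch.isdigit() for pw in passwords for ch in pw)
    numeric_only = sum(pw.isdigit() for pw in passwords)
    return {
        "numeric_only_share": numeric_only / len(passwords),
        "digit_density": digit_chars / total_chars,
    }


# Synthetic placeholder input, chosen only to exercise the function.
print(digit_stats(["12345678", "abc123456", "qwerty"]))
```

Running the same function over two regional corpora would give directly comparable figures for the trends described in this section.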
---
## 7. Password Manager Generation Trends
Password managers introduce **non-human structures** that are statistically distinct from human-chosen passwords.
**Table 8. Password Manager Defaults**
|Manager|Length|Charset|Distinguishing Traits|
|---|---|---|---|
|Safari|15|u/l/d + `-`|Grouped segments|
|Dashlane|12|u/l/d|No symbols|
|KeePass|20|u/l/d/s|High entropy|
|LastPass|12|u/l/d|Minimal complexity|
|1Password|24|u/l/d/s|Long, dense|
These passwords resist traditional pattern-based guessing but remain vulnerable if exposed through reuse or endpoint compromise.
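
The resistance to guessing can be put in rough numbers by computing nominal entropy from the lengths and charsets in Table 8. This is an upper-bound sketch that assumes every character is chosen uniformly and independently; the alphabet sizes in `CLASS_SIZES` (in particular 32 symbols) are assumptions, since the table does not specify them. Safari is left out because the table does not say how many of its 15 characters are fixed separators; the `fixed_chars` parameter exists for that case.

```python
import math

# Assumed alphabet sizes: upper, lower, digits, ASCII punctuation symbols.
CLASS_SIZES = {"u": 26, "l": 26, "d": 10, "s": 32}


def nominal_bits(length: int, classes: str, fixed_chars: int = 0) -> float:
    """Upper-bound entropy: random positions times log2 of the alphabet size."""
    alphabet = sum(CLASS_SIZES[c] for c in classes)
    return (length - fixed_chars) * math.log2(alphabet)


# Length and charset columns copied from Table 8.
TABLE_8 = {
    "Dashlane": (12, "uld"),
    "KeePass": (20, "ulds"),
    "LastPass": (12, "uld"),
    "1Password": (24, "ulds"),
}

for manager, (length, classes) in TABLE_8.items():
    print(f"{manager:<10}{nominal_bits(length, classes):6.1f} bits")
```

Even the shortest defaults come out above 70 bits under these assumptions, far beyond the structured human examples analyzed earlier.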
---
## 8. Implications for Security Analysis
### 8.1 Why This Matters
- Passwords cluster around **behavioral norms**
- Structure predicts guessability better than nominal entropy computed from length and character set
- Length alone does not imply strength
- Cultural and organizational patterns amplify risk at scale
### 8.2 Defensive Value
Understanding these structures enables:
- Better password policy design
- More accurate breach impact assessment
- Improved detection of reuse and variant families
- Realistic modeling of attacker success rates
---
## 9. Conclusion
Password security failures are rarely cryptographic. They are **behavioral and structural**. By applying text-analytic techniques across Basic, Macro, and Micro layers, defenders gain a predictive understanding of how and why passwords fail. This framework transforms password cracking from a purely technical exercise into a **measured analysis of human behavior at scale**.