HTML Character Sets
Understanding character encoding and special characters in HTML
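A character encoding defines how the bytes of a document map to characters, so it determines how browsers interpret and display text. UTF-8 is the recommended encoding for HTML5 because it can represent every Unicode character, covering all languages, symbols, and emoji.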
<!-- UTF-8 character encoding (recommended) -->
<meta charset="UTF-8">
<!-- Example with special characters -->
<p>Hello World! 🌍 Café © 2024</p>
<p>Math: 2 + 2 = 4, π ≈ 3.14</p>
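To make the role of the declared encoding concrete, the following is a minimal Python sketch (not part of the original tutorial) showing how a string becomes bytes under UTF-8 and what a reader sees when those bytes are decoded with a different encoding. The sample string is an assumption chosen for illustration.

```python
# Illustrative only: encode a string as UTF-8, then decode it
# with the right and the wrong encoding.
text = "Café © 2024"

utf8_bytes = text.encode("utf-8")

# Non-ASCII characters such as é and © take two bytes each in UTF-8.
print(len(text))        # 11 characters
print(len(utf8_bytes))  # 13 bytes

# Decoding with the same encoding round-trips correctly.
assert utf8_bytes.decode("utf-8") == text

# Decoding the same bytes as ISO-8859-1 yields garbled text (mojibake).
garbled = utf8_bytes.decode("iso-8859-1")
print(garbled)  # CafÃ© Â© 2024
```

This is the kind of garbling a page shows when its bytes are UTF-8 but the browser is told, or guesses, a different encoding.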
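Character Encoding
| Encoding | Description | Declaration |
|---|---|---|
| UTF-8 | Universal encoding that covers every Unicode character | <meta charset="UTF-8"> |
| ASCII | Basic English characters only (128 code points) | <meta charset="ASCII"> |
| ISO-8859-1 | Western European characters | <meta charset="ISO-8859-1"> |
| UTF-16 | Unicode encoding built on 16-bit code units | <meta charset="UTF-16"> |
Only the UTF-8 declaration is valid in HTML5. The others are listed so you can recognize them in older pages: browsers map the ASCII and ISO-8859-1 labels to windows-1252, and a UTF-16 label in a <meta> tag is treated as UTF-8.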
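The practical difference between these encodings is which characters they can represent at all. A small Python sketch, assuming a hypothetical helper named can_encode and an illustrative sample string, makes this visible:

```python
# Illustrative only: test which encodings can represent a given string.
# Note: Python's "iso-8859-1" codec is strict Latin-1, which has no euro sign.
sample = "Price: €29.99 at the Café"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for name in ("ascii", "iso-8859-1", "utf-8", "utf-16"):
    print(name, can_encode(sample, name))
# ascii False
# iso-8859-1 False
# utf-8 True
# utf-16 True
```

ASCII fails on both é and €, and strict ISO-8859-1 fails on €, while the Unicode encodings accept the whole string.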
🔹 HTML Entities
Special characters that have meaning in HTML must be encoded as entities:
<!-- Reserved HTML characters -->
<p>The &lt;p&gt; tag creates a paragraph.</p>
<p>Use &quot;quotes&quot; for attributes.</p>
<p>The &amp; symbol means &quot;and&quot;.</p>
<!-- Special symbols -->
<p>Copyright &copy; 2024</p>
<p>Registered &reg; trademark</p>
<p>Price: &euro;29.99</p>
Output:
The <p> tag creates a paragraph.
Use "quotes" for attributes.
The & symbol means "and".
Copyright © 2024
Registered ® trademark
Price: €29.99
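When HTML is generated by a program rather than typed by hand, the same escaping is usually done by a library call. As a hedged sketch, Python's standard html module can be used like this (the input string is an assumption for illustration):

```python
import html

# Text that contains characters with special meaning in HTML.
raw = 'The <p> tag & "quotes"'

# html.escape replaces &, <, > and, by default, quote characters.
escaped = html.escape(raw)
print(escaped)  # The &lt;p&gt; tag &amp; &quot;quotes&quot;

# html.unescape reverses the process.
assert html.unescape(escaped) == raw
```

Escaping at output time, as here, keeps user-supplied text from being interpreted as markup.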
🔹 Common HTML Entities
Frequently used HTML character entities:
Reserved Characters:
| Character | Entity Name | Entity Number | Description |
|---|---|---|---|
| < | &lt; | &#60; | Less than |
| > | &gt; | &#62; | Greater than |
| & | &amp; | &#38; | Ampersand |
| " | &quot; | &#34; | Quotation mark |
| (space) | &nbsp; | &#160; | Non-breaking space |
🔹 Symbol Entities
Common symbols and their HTML entities:
<!-- Currency symbols -->
<p>Dollar: $ Cent: &cent; Pound: &pound; Euro: &euro; Yen: &yen;</p>
<!-- Math symbols -->
<p>Plus: + Minus: &minus; Times: &times; Divide: &divide;</p>
<p>Less than or equal: &le; Greater than or equal: &ge;</p>
<!-- Arrows -->
<p>Left: &larr; Right: &rarr; Up: &uarr; Down: &darr;</p>
<!-- Other symbols -->
<p>Heart: &hearts; Spade: &spades; Club: &clubs; Diamond: &diams;</p>
Output:
Dollar: $ Cent: ¢ Pound: £ Euro: € Yen: ¥
Plus: + Minus: − Times: × Divide: ÷
Less than or equal: ≤ Greater than or equal: ≥
Left: ← Right: → Up: ↑ Down: ↓
Heart: ♥ Spade: ♠ Club: ♣ Diamond: ♦
🔹 Unicode Characters
With UTF-8, you can use Unicode characters directly or as numeric entities:
<!-- Direct Unicode (with UTF-8) -->
<p>Emojis: 😀 🎉 🌟 ❤️ 🚀</p>
<p>Languages: Español, Français, 中文, العربية, Русский</p>
<!-- Decimal numeric entities -->
<p>Smiley: &#128512; Star: &#11088; Heart: &#10084;</p>
<!-- Hexadecimal numeric entities -->
<p>Smiley: &#x1F600; Star: &#x2B50; Heart: &#x2764;</p>
Output:
Emojis: 😀 🎉 🌟 ❤️ 🚀
Languages: Español, Français, 中文, العربية, Русский
Smiley: 😀 Star: ⭐ Heart: ❤
Smiley: 😀 Star: ⭐ Heart: ❤
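A numeric entity is simply the character's Unicode code point written in decimal or hexadecimal. The following Python sketch derives both forms; numeric_refs is a hypothetical helper name introduced here for illustration.

```python
import html


def numeric_refs(ch: str) -> tuple[str, str]:
    """Return the decimal and hexadecimal numeric entity for one character."""
    code_point = ord(ch)
    return f"&#{code_point};", f"&#x{code_point:X};"


# Smiley (U+1F600), star (U+2B50), heart (U+2764), written as escapes
# so the source does not depend on the editor's encoding.
for ch in "\U0001F600\u2B50\u2764":
    decimal, hexadecimal = numeric_refs(ch)
    print(decimal, hexadecimal)
# &#128512; &#x1F600;
# &#11088; &#x2B50;
# &#10084; &#x2764;

# Both forms decode back to the same character.
assert html.unescape("&#128512;") == html.unescape("&#x1F600;") == "\U0001F600"
```

Both notations name the same code point, so a browser renders them identically.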
🔹 Character Set Best Practices
Follow these guidelines for proper character handling:
✅ Do:
- Always use UTF-8: <meta charset="UTF-8">
- Place charset early: Make it the first element in <head>, so the declaration falls within the first 1024 bytes of the file
- Encode reserved characters: < > & " '
- Save files as UTF-8: In your text editor
- Test with special characters: Verify display
❌ Don't:
- Mix character sets: Stick to UTF-8
- Forget charset declaration: Always include it
- Use legacy encodings: Avoid ISO-8859-1 and similar encodings for new pages
- Assume ASCII is enough: Use UTF-8 for future-proofing
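Two of these guidelines can be checked mechanically: that a file's bytes really are valid UTF-8, and that the charset declaration appears early. The sketch below is one possible check, not a full HTML parser; check_html_bytes and its result keys are hypothetical names, and the regular expression only recognizes the short <meta charset> form used in this tutorial.

```python
import re

# Matches the short declaration form, e.g. <meta charset="UTF-8">.
META_UTF8 = re.compile(rb'<meta\s+charset\s*=\s*["\']?utf-8', re.IGNORECASE)


def check_html_bytes(data: bytes) -> dict:
    """Report whether data is valid UTF-8 and declares UTF-8 early."""
    try:
        data.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False

    # The declaration should sit within the first 1024 bytes.
    declared_early = META_UTF8.search(data[:1024]) is not None
    return {"valid_utf8": valid_utf8, "declares_utf8_early": declared_early}


good = '<!DOCTYPE html><html><head><meta charset="UTF-8"><title>Café</title></head></html>'.encode("utf-8")
bad = "<p>Café</p>".encode("iso-8859-1")  # saved in a legacy encoding, no declaration

print(check_html_bytes(good))  # {'valid_utf8': True, 'declares_utf8_early': True}
print(check_html_bytes(bad))   # {'valid_utf8': False, 'declares_utf8_early': False}
```

To check a real file, read it in binary mode and pass the bytes to the function, so that no decoding happens before the check.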
🔹 Complete Example
A properly encoded HTML document with various characters:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Character Set Demo</title>
</head>
<body>
<h1>Welcome to Our Café! ☕</h1>
<p>We serve coffee &amp; pastries from around the world:</p>
<ul>
<li>Espresso - €2.50</li>
<li>Café au Lait - £3.00</li>
<li>Cappuccino - $4.25</li>
</ul>
<p>Hours: 7:00 AM → 9:00 PM</p>
<footer>
<p>© 2024 Global Café. All rights reserved.</p>
</footer>
</body>
</html>
Output:
Welcome to Our Café! ☕
We serve coffee & pastries from around the world:
- Espresso - €2.50
- Café au Lait - £3.00
- Cappuccino - $4.25
Hours: 7:00 AM → 9:00 PM