HTML Character Sets

Understanding character encoding and special characters in HTML

A character encoding tells the browser how to turn the bytes of a file into text. UTF-8 is the recommended encoding for HTML5 because it can represent every Unicode character, which covers all languages, symbols, and emoji.


<!-- UTF-8 character encoding (recommended) -->
<meta charset="UTF-8">

<!-- Example with special characters -->
<p>Hello World! 🌍 Café © 2024</p>
<p>Math: 2 + 2 = 4, π ≈ 3.14</p>
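To see why UTF-8 can hold both plain English and symbols like π or 🌍, it helps to look at the bytes. The sketch below uses Python (not part of the page itself, chosen here only because it makes the bytes easy to inspect); `utf8_length` is a hypothetical helper name. It shows that UTF-8 spends 1 to 4 bytes per character depending on the code point.

```python
# Characters taken from the example above, written as escapes so the
# code points are unambiguous.
samples = [
    "A",           # plain ASCII letter
    "\u00e9",      # é (as in Café)
    "\u00a9",      # ©
    "\u03c0",      # π
    "\u2248",      # ≈
    "\U0001F30D",  # 🌍
]


def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 needs to store the given text."""
    return len(ch.encode("utf-8"))


for ch in samples:
    encoded = ch.encode("utf-8")
    # Show the code point, the byte count, and the raw bytes in hex.
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s) -> {encoded.hex(' ')}")
```

ASCII characters keep their single-byte form, which is why an English-only page looks identical in ASCII and UTF-8.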
                                    

Character Encoding

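🌐 UTF-8

Universal encoding that covers every Unicode character. It is the only encoding a new HTML document should declare.

<meta charset="UTF-8">

🔤 ASCII

Basic English characters only (128 code points). Browsers treat the "ASCII" label as windows-1252.

<meta charset="ASCII">

🌍 ISO-8859-1

Western European characters. Browsers treat this label as windows-1252 as well.

<meta charset="ISO-8859-1">

🎯 UTF-16

Unicode encoding built from 16-bit code units (one or two per character). Browsers do not honor UTF-16 in a <meta> declaration and use UTF-8 instead, so avoid declaring it this way.

<meta charset="UTF-16">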
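The difference between these encodings is easiest to see by trying to encode the same text with each one. This is a minimal Python sketch, not browser behavior: `can_encode` is a hypothetical helper, and Python's `iso-8859-1` codec is strict Latin-1, whereas browsers map that label to windows-1252. The last lines show what happens when UTF-8 bytes are read with the wrong encoding.

```python
text = "Caf\u00e9 \u20ac5"  # "Café €5"


def can_encode(s: str, encoding: str) -> bool:
    """Return True if every character of s exists in the given encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for enc in ("ascii", "iso-8859-1", "utf-8", "utf-16"):
    print(f"{enc:12} can store the text: {can_encode(text, enc)}")

# Declaring one encoding while saving in another garbles the text:
# the two UTF-8 bytes of "é" are read as two separate Latin-1 characters.
garbled = "Caf\u00e9".encode("utf-8").decode("iso-8859-1")
print(garbled)  # CafÃ©
```

ASCII fails on é, strict ISO-8859-1 fails on €, and both Unicode encodings succeed.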

🔹 HTML Entities

Characters that have a special meaning in HTML markup must be written as entities (character references) when they should appear as text:

<!-- Reserved HTML characters -->
<p>The &lt;p&gt; tag creates a paragraph.</p>
<p>Use &quot;quotes&quot; for attributes.</p>
<p>The &amp; symbol means "and".</p>

<!-- Special symbols -->
<p>Copyright &copy; 2024</p>
<p>Registered &reg; trademark</p>
<p>Price: &euro;29.99</p>

Output:

The <p> tag creates a paragraph.

Use "quotes" for attributes.

The & symbol means "and".

Copyright © 2024

Registered ® trademark

Price: €29.99
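When text comes from a program rather than being typed by hand, the reserved characters are usually escaped automatically. As a sketch, Python's standard `html.escape` does this; `to_html_text` is a hypothetical wrapper name used only for illustration.

```python
import html


def to_html_text(raw: str) -> str:
    """Escape &, <, > and quotes so the text is safe inside HTML."""
    return html.escape(raw, quote=True)


print(to_html_text('The <p> tag & "quotes"'))
# The &lt;p&gt; tag &amp; &quot;quotes&quot;
```

The ampersand is escaped first, so an existing `&` never gets mixed up with the entities being produced.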

🔹 Common HTML Entities

Frequently used HTML character entities:

Reserved Characters:

Character    Entity Name    Entity Number    Description
<            &lt;           &#60;            Less than
>            &gt;           &#62;            Greater than
&            &amp;          &#38;            Ampersand
"            &quot;         &#34;            Quotation mark
(space)      &nbsp;         &#160;           Non-breaking space
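The entity name and the entity number in each row are two spellings of the same character. A quick way to confirm this, sketched in Python with the standard `html.unescape` function (`same_character` is a hypothetical helper):

```python
import html

# (entity name, entity number) pairs from the table above.
rows = [
    ("&lt;", "&#60;"),
    ("&gt;", "&#62;"),
    ("&amp;", "&#38;"),
    ("&quot;", "&#34;"),
    ("&nbsp;", "&#160;"),
]


def same_character(named: str, numbered: str) -> bool:
    """Return True if both references decode to the same character."""
    return html.unescape(named) == html.unescape(numbered)


for named, numbered in rows:
    ch = html.unescape(named)
    print(f"{named:8} {numbered:8} U+{ord(ch):04X} match={same_character(named, numbered)}")
```

The number is simply the character's Unicode code point in decimal, so `&#160;` is U+00A0.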

🔹 Symbol Entities

Common symbols and their HTML entities:

<!-- Currency symbols -->
<p>Dollar: &dollar; Cent: &cent; Pound: &pound; Euro: &euro; Yen: &yen;</p>

<!-- Math symbols -->
<p>Plus: &plus; Minus: &minus; Times: &times; Divide: &divide;</p>
<p>Less than or equal: &le; Greater than or equal: &ge;</p>

<!-- Arrows -->
<p>Left: &larr; Right: &rarr; Up: &uarr; Down: &darr;</p>

<!-- Other symbols -->
<p>Heart: &hearts; Spade: &spades; Club: &clubs; Diamond: &diams;</p>

Output:

Dollar: $ Cent: ¢ Pound: £ Euro: € Yen: ¥

Plus: + Minus: − Times: × Divide: ÷

Less than or equal: ≤ Greater than or equal: ≥

Left: ← Right: → Up: ↑ Down: ↓

Heart: ♥ Spade: ♠ Club: ♣ Diamond: ♦
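Not every name above is equally old: `&dollar;` and `&plus;` were added with HTML5, while the others already existed in HTML 4. Python ships both name tables, so the following sketch can look each name up; `describe` is a hypothetical helper, and the tables are Python's copies of the entity lists, not a browser's.

```python
from html.entities import html5, name2codepoint

names = [
    "dollar", "cent", "pound", "euro", "yen",
    "plus", "minus", "times", "divide", "le", "ge",
    "larr", "rarr", "uarr", "darr",
    "hearts", "spades", "clubs", "diams",
]


def describe(name: str):
    """Return (character, defined_in_html4) for an entity name."""
    char = html5.get(name + ";")           # HTML5 table, keys end with ";"
    in_html4 = name in name2codepoint      # HTML 4.01 table
    return char, in_html4


for name in names:
    char, in_html4 = describe(name)
    era = "HTML 4 and HTML5" if in_html4 else "HTML5 only"
    print(f"&{name}; -> {char} ({era})")
```

Since `$` and `+` have no special meaning in HTML, typing them directly is just as valid as using the entity.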

🔹 Unicode Characters

With UTF-8, you can type Unicode characters directly into the source, or write them as numeric entities in decimal or hexadecimal:

<!-- Direct Unicode (with UTF-8) -->
<p>Emojis: 😀 🎉 🌟 ❤️ 🚀</p>
<p>Languages: Español, Français, 中文, العربية, Русский</p>

<!-- Decimal numeric entities -->
<p>Smiley: &#128512; Star: &#11088; Heart: &#10084;</p>

<!-- Hex entities -->
<p>Smiley: &#x1F600; Star: &#x2B50; Heart: &#x2764;</p>

Output:

Emojis: 😀 🎉 🌟 ❤️ 🚀

Languages: Español, Français, 中文, العربية, Русский

Smiley: 😀 Star: ⭐ Heart: ❤

Smiley: 😀 Star: ⭐ Heart: ❤
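The decimal and hex forms are both derived from the character's Unicode code point. The following Python sketch generates them; `decimal_ref` and `hex_ref` are hypothetical helper names.

```python
def decimal_ref(ch: str) -> str:
    """Build the decimal numeric entity for a single character."""
    return f"&#{ord(ch)};"


def hex_ref(ch: str) -> str:
    """Build the hexadecimal numeric entity for a single character."""
    return f"&#x{ord(ch):X};"


for label, ch in [("Smiley", "\U0001F600"), ("Star", "\u2B50"), ("Heart", "\u2764")]:
    print(f"{label}: {decimal_ref(ch)} {hex_ref(ch)}")

# The red emoji heart is two code points: U+2764 plus the
# variation selector U+FE0F that requests emoji presentation.
emoji_heart = "\u2764\ufe0f"
print(len(emoji_heart))  # 2
```

This is why the heart produced by `&#10084;` alone can look plainer than the ❤️ typed directly: the entity carries only the first of the two code points.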

🔹 Character Set Best Practices

Follow these guidelines for proper character handling:

✅ Do:

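  • Always use UTF-8: <meta charset="UTF-8">
  • Place charset early: Put it first in <head>, within the first 1024 bytes of the file
  • Encode reserved characters: < > & in text, plus " and ' inside attribute values
  • Save files as UTF-8: Set this in your text editor so the bytes match the declaration
  • Test with special characters: Verify that accents, symbols, and emoji display correctly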

❌ Don't:

  • Mix character sets: Stick to UTF-8
  • Forget charset declaration: Always include it
  • Use legacy encodings: Avoid ISO-8859-1 and other single-byte encodings in new pages
  • Assume ASCII is enough: Use UTF-8 for future-proofing
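Two of these guidelines can be checked mechanically: that the file's bytes really are UTF-8, and that the charset declaration appears early. The following is a rough Python sketch under those assumptions; `check_document` and the `META_CHARSET` pattern are hypothetical, and the regular expression is a simplification, not a full HTML parser.

```python
import re

# Simplified pattern for <meta charset="UTF-8"> (quotes optional, any case).
META_CHARSET = re.compile(rb'<meta\s+charset\s*=\s*["\']?\s*utf-8', re.IGNORECASE)


def check_document(data: bytes) -> list:
    """Return a list of encoding problems found in raw HTML bytes."""
    problems = []
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("bytes are not valid UTF-8")
    # The declaration must fit inside the first 1024 bytes of the file.
    if not META_CHARSET.search(data[:1024]):
        problems.append('no <meta charset="UTF-8"> in the first 1024 bytes')
    return problems


good = '<!DOCTYPE html><html><head><meta charset="UTF-8"><title>Caf\u00e9</title></head></html>'.encode("utf-8")
bad = "<html><head><title>Caf\u00e9</title></head></html>".encode("iso-8859-1")

print(check_document(good))  # []
print(check_document(bad))
```

In practice the bytes would come from `open(path, "rb").read()`; the second document fails both checks because it was saved as ISO-8859-1 and declares nothing.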

🔹 Complete Example

A properly encoded HTML document with various characters:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Character Set Demo</title>
</head>
<body>
    <h1>Welcome to Our Café! ☕</h1>
    
    <p>We serve coffee &amp; pastries from around the world:</p>
    
    <ul>
        <li>Espresso - &euro;2.50</li>
        <li>Café au Lait - &pound;3.00</li>
        <li>Cappuccino - &dollar;4.25</li>
    </ul>
    
    <p>Hours: 7:00 AM &rarr; 9:00 PM</p>
    
    <footer>
        <p>&copy; 2024 Global Café. All rights reserved.</p>
    </footer>
</body>
</html>

Output:

Welcome to Our Café! ☕

We serve coffee & pastries from around the world:

  • Espresso - €2.50
  • Café au Lait - £3.00
  • Cappuccino - $4.25

Hours: 7:00 AM → 9:00 PM

© 2024 Global Café. All rights reserved.
