(X)HTML Tutorial

Special Characters and Entities

Consider the mathematical formula x < y. This is a challenge to represent in (X)HTML because the browser may assume that < is the opening of a tag. In most cases, the browser, not recognising an element called <y>, will simply ignore this "ghost" tag and not print anything. Not what you want.

To get around this problem, it is necessary to replace < with an entity, a code which tells the browser to display <. The code for a less than sign is &lt;. Note the format. The entity begins with an ampersand, is followed by a name ("lt" for "less than"), and is completed by a semicolon. You will not be surprised to find that the entity for > (greater than) is &gt;.

Since entities begin with ampersands, you run into the same problem representing "this & that" as you did representing x < y. In order to actually display an ampersand, you must use the entity &amp;.
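
As a minimal sketch of the three entities introduced so far, the fragment below displays the less than sign, the greater than sign, and the ampersand literally. The sentence itself is only an illustration and is not taken from any particular page.

```html
<!-- Renders as: If x < y and y > z, then this & that follow. -->
<p>If x &lt; y and y &gt; z, then this &amp; that follow.</p>
```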

We have already seen that typing "word 1     word 2" in (X)HTML will produce the text "word 1 word 2". In order to force the browser to display multiple spaces, we must use the non-breaking space entity &nbsp;. The code to produce the large gap between the words would be:


   word 1&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;word 2

Note that &nbsp; should not be used for indenting text. Use CSS for this purpose.
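
One way to do this is the CSS text-indent property. The sketch below assumes an inline style attribute and an indent of 2em, both chosen only for illustration; a rule in a stylesheet would work equally well.

```html
<!-- text-indent indents the first line of the paragraph; no &nbsp; is needed -->
<p style="text-indent: 2em;">This paragraph begins with an indented first line.</p>
```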

Entities may be referenced by their name, as above, or by numeric codes (preceded by #). The name-code equivalencies of each of the entities we have looked at so far are listed below.

Result  Description         Entity Name  Entity Number
        non-breaking space  &nbsp;       &#160;
<       less than           &lt;         &#60;
>       greater than        &gt;         &#62;
&       ampersand           &amp;        &#38;
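
To illustrate the equivalence, the two paragraphs in this short sketch should render identically: the first uses entity names and the second uses the corresponding entity numbers. The expression itself is just an example.

```html
<!-- Both paragraphs render as: x < y & y > z -->
<p>x &lt; y &amp; y &gt; z</p>
<p>x &#60; y &#38; y &#62; z</p>
```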

Entities can also be used to represent special characters such as accented letters like é. A large number of entities is available for this purpose. A good list of these entities is available at http://www.degraeve.com/reference/specialcharacters.php.

For convenience, this tutorial will list only those special characters commonly used to represent Old English.

Result  Description    Entity Name  Entity Number
æ       æsc            &aelig;      &#230;
Æ       capital æsc    &AElig;      &#198;
þ       thorn          &thorn;      &#254;
Þ       capital thorn  &THORN;      &#222;
ð       eth            &eth;        &#240;
Ð       capital eth    &ETH;        &#208;

Note that entity names are case sensitive: &aelig; (æ) is not the same as &AElig; (Æ), and a miscapitalised form such as &AELIG; is not a recognised entity at all.
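
As a brief sketch of these entities in use, the fragment below marks up the opening words of Beowulf with the lower-case entity names, followed by the three capital forms. The choice of text is only an illustration.

```html
<!-- Renders as: Hwæt! We Gardena in geardagum, þeodcyninga, þrym gefrunon -->
<p>Hw&aelig;t! We Gardena in geardagum, &thorn;eodcyninga, &thorn;rym gefrunon</p>

<!-- The capital forms use differently capitalised names and render as: Æ, Þ, Ð -->
<p>&AElig;, &THORN;, &ETH;</p>
```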