An HTML Primer
by John English
Contents:
- HTML basics
- Structure of an HTML document
- Document headings
- Preformatted text
- Lists
- Miscellaneous tags
- Including images in your text
- Hypertext links
- URLs
- Summary
HTML basics
HTML is the "markup language" used by web browsers to display documents. A web browser treats text as a continuous sequence of words separated by "white space" (one or more spaces, tabs or line breaks) and displays it according to the width of the display window, using "word wrapping" to fit as many words as possible on each line before starting the next. Changing the width of the window will reformat the text so it still fits inside the window (try it!).
Since line breaks are ignored, your document will end up as one long continuous paragraph if you don't do anything about it, regardless of how you laid it out when you wrote it. To tell the browser to start a new paragraph, you have to use markup tags which will be interpreted specially. HTML markup tags are written inside angle brackets "<...>"; the tag to tell a browser to start a new paragraph is <P>. It doesn't matter if you use capitals or not for tags, so <p> means the same thing as <P>.
You can also use markup tags to tell the browser about special formatting requirements (bold or italic text, and so on):
<B> ... </B>             Text between <B> and </B> will be displayed as bold text
<I> ... </I>             Text between <I> and </I> will be displayed as italic text
<center> ... </center>   Text between <center> and </center> will be centered
HTML tags are almost always used in pairs, like brackets; the closing tag is the same as the opening tag but preceded by "/", so <B> is the opening "boldface" tag and </B> is the closing "boldface" tag, and so on.
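As a small sketch of how these tags combine in practice, the fragment below uses them together. The sentences are invented for illustration. Pairs of tags nest like brackets, so the tag opened last is closed first.

```html
<!-- Illustrative text only; any wording would do. -->
<P>This paragraph contains <B>bold text</B>, <I>italic text</I>
and <B><I>bold italic text</I></B>.

<CENTER>This line is centered in the window.</CENTER>

<!-- Extra spaces and line breaks in the source are treated as
     ordinary white space, so this is displayed as one wrapped paragraph. -->
<P>Line breaks    and extra     spaces in the source
make no difference to the way the text is displayed.
```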
Because the characters "<" and ">" and a few others are treated specially by browsers, you have to encode them like this:
To display this:    write this:
<                   &lt;
>                   &gt;
&                   &amp;
"                   &quot;
Any tags that a browser doesn't recognise will just be ignored, so if you forget to encode "<" as "&lt;" the browser will treat what follows as a tag. If it doesn't recognise the text after "<" as a valid tag, everything up to the next occurrence of ">" will be ignored, which means that a chunk of your text will just disappear completely. The easiest way to write HTML is to use an HTML editor, which will take care of all these details automatically.
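To illustrate the encodings described above, here is a sketch of a paragraph that mentions a tag, a comparison and a quoted phrase without confusing the browser. The sentence itself is invented for the example.

```html
<!-- Each special character is written using its encoded form. -->
<P>To start a new paragraph, write &lt;P&gt;. The expression
a &lt; b &amp;&amp; b &gt; c is displayed as intended,
and so is the phrase &quot;markup language&quot;.
```

A browser should display this as: To start a new paragraph, write <P>. The expression a < b && b > c is displayed as intended, and so is the phrase "markup language".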
Structure of an HTML document
An HTML document is actually divided into two parts: a header (which is not displayed) and a body (the text that is actually displayed in the browser window). The overall structure looks like this:
<HTML> -- start of HTML document
<HEAD> -- start of document header
... -- header contents
</HEAD> -- end of header
<BODY> -- start of document body
... -- body contents
</BODY> -- end of body
</HTML> -- end of document
The only thing the document header needs to contain is a document title which will be displayed in the browser's title bar. A title is enclosed in <TITLE> ... </TITLE> like this:
<TITLE>This is a document title</TITLE>
In fact, the document structure tags given above (<HTML>, <HEAD> and <BODY>) are normally ignored by browsers; usually, as soon as a browser sees anything which can't be part of the document header, it assumes that it's got to the document body and starts displaying text in the browser window. All the same, it's good practice to put these tags in since some browsers might require them.
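Putting these pieces together, a minimal complete document might look like the sketch below. The title and the paragraph text are placeholders invented for the example.

```html
<HTML>
<HEAD>
<!-- The title appears in the browser's title bar, not in the window. -->
<TITLE>My first web page</TITLE>
</HEAD>
<BODY>
<!-- Everything in the body is displayed in the browser window. -->
<P>This is the first paragraph of the document body.
<P>This is the second paragraph. It contains some
<B>bold</B> and <I>italic</I> text.
</BODY>
</HTML>
```

Saving this in a file with a name ending in .htm or .html (the name is up to you) and opening it in a browser should show the title in the title bar and the two paragraphs in the window.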
Document headings
To provide headings like the one immediately above, you can use the tag <H1> ... </H1>. The text in between is displayed as a separate paragraph in a large font. For example, if you write this:
<H1>A Level 1 Heading</H1>
it will be displayed like this:
A Level 1 Heading
Level 1 headings like this are normally only used at the start of a document. There are five other levels for subheadings:
<H2>A Level 2 Heading</H2>
<H3>A Level 3 Heading</H3>
<H4>A Level 4 Heading</H4>
<H5>A Level 5 Heading</H5>
<H6>A Level 6 Heading</H6>
which will be displayed in progressively smaller fonts, from level 2 down to level 6.
Preformatted text
Sometimes you want text to be displayed exactly as you've written it (e.g. program code). To do this, enclose the text in <PRE> ... </PRE> like this:
<PRE>
This text will be displayed exactly as it was typed
    including any indentation
    or      alignment     into      columns
    like    this

Blank lines are also possible

You can still use <B>bold text</B> or <I>italic text</I> in
preformatted text.
</PRE>
This will be displayed as:
This text will be displayed exactly as it was typed
    including any indentation
    or      alignment     into      columns
    like    this

Blank lines are also possible

You can still use bold text or italic text in
preformatted text.
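Because tags are still interpreted inside preformatted text, as the bold and italic example shows, any "<", ">" or "&" characters in program code must still be encoded. The sketch below prepares a small fragment of C-like code for display. The code itself is just an illustration.

```html
<PRE>
if (a &lt; b &amp;&amp; b &gt; 0) {
    count = count + 1;
}
</PRE>
```

A browser should display the first line as: if (a < b && b > 0) { with the indentation of the second line preserved.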
Lists
If you want to write a bulleted list, you enclose the entire list in <UL> ... </UL> and then start individual list items with <LI>. For example:
<UL>
<LI>List item 1
<LI>List item 2
</UL>
will be displayed like this:
- List item 1
- List item 2
To produce a numbered list rather than a bulleted list, use <OL> ... </OL> instead of <UL> ... </UL>:
<OL>
<LI>List item 1
<LI>List item 2
</OL>
will be displayed like this:
1. List item 1
2. List item 2
You can also produce definition lists using <DL> ... </DL>. Each entry in a definition list is in two parts: a definition term which begins with <DT> and a definition part which begins with <DD>. For example,
<DL>
<DT>BTW
<DD>"By the way"
<DT>TTYL
<DD>"Talk To You Later"
</DL>
will be displayed like this:
BTW
    "By the way"
TTYL
    "Talk To You Later"
Miscellaneous tags
Here are a couple more useful tags to round things off:
<HR>    A horizontal rule (a line drawn across the width of the window)
<BR>    A line break
The line break starts a new line, but doesn't put a gap between lines the way that starting a new paragraph would.
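As a sketch of the difference, an address is a typical case where line breaks are wanted without the gaps that paragraphs would add. The name and address below are made up for the example.

```html
<!-- Each <BR> starts a new line without leaving a gap. -->
<P>John Smith<BR>
12 Example Street<BR>
Anytown

<!-- A rule separates the address from what follows. -->
<HR>

<P>The text after the rule starts in a new paragraph.
```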
Including images in your text
To include an image, you need to have the image available in a .GIF or .JPG (JPEG) file. To reference the file you use an IMG tag, like this:
<IMG SRC="filename.gif">
This will display the image in the file filename.gif as part of the current paragraph. If you want the image to be displayed as a separate paragraph, start a new paragraph before and after the IMG tag, or put line breaks (<BR>) before and after. The filename can be any URL (Uniform Resource Locator) so that it can be on any accessible machine anywhere in the world. URLs are described more fully below.
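The three cases described above can be sketched as follows. The file names icon.gif and photo.jpg and the server www.example.com are hypothetical and stand in for your own files.

```html
<!-- An image displayed as part of the current paragraph. -->
<P>Here is a small icon <IMG SRC="icon.gif"> in the middle of a sentence.

<!-- An image displayed as a separate paragraph. -->
<P><IMG SRC="photo.jpg">

<!-- An image fetched from another machine using a full URL (hypothetical address). -->
<P><IMG SRC="http://www.example.com/images/logo.gif">
```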
Hypertext links
Hypertext links are what make web documents so powerful. A link can be used to reference another document, which can be another local file or another document anywhere in the world.
Links are generated by using anchor tags. For example, a link whose highlighted text reads "like this" is written in HTML as follows:
<A HREF="../.././welldone.htm">like this</A>
The text between <A> and </A> is highlighted by the browser, and when you click on it the browser goes to the file specified by the HREF part of the tag. Simple, isn't it?
You can also use images as hypertext links with the following markup:
<A HREF="someURL"><IMG SRC="someFile.jpg"></A>
If you want to link to a specific section in a document, you need to put #section after the filename, which will go to the section called section in the specified document:
<A HREF="somefile.htm#index">The index in some file</A>
If the reference is to a section of the current document, you just use #section on its own:
<A HREF="#contents">Go to the table of contents</A>
which will be displayed like this:
Go to the table of contents
To attach a section name to part of a document, you need to use another variation of the <A> tag:
<A NAME="section-name">Some text</A>
For example, the bookmark "contents" was attached to the heading for the table of contents at the beginning of this document like this:
<P><B><A NAME="contents">Contents:</A></B>
This has no visible effect on the text. All the section headings in this document have bookmarks attached, which are referenced from the table of contents at the start of the document.
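The following sketch shows bookmarks and links working together in one small document body. The section names "top" and "intro" and the file other.htm are hypothetical, chosen only for the example.

```html
<!-- A bookmark called "top" attached to the contents heading. -->
<P><B><A NAME="top">Contents:</A></B>
<UL>
<!-- A link to a section of the current document. -->
<LI><A HREF="#intro">Introduction</A>
<!-- A link to a section called "details" in another (hypothetical) file. -->
<LI><A HREF="other.htm#details">Details (in another file)</A>
</UL>

<!-- A bookmark called "intro" attached to a section heading. -->
<H2><A NAME="intro">Introduction</A></H2>
<P>Some introductory text.
<P><A HREF="#top">Back to the contents</A>
```

Clicking "Introduction" should move the window to the heading carrying the "intro" bookmark, and "Back to the contents" should return to the top.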
URLs
As I mentioned earlier, images and hypertext links can both use Uniform Resource Locators (URLs) which can reference documents all over the world. A typical URL looks like this:
http://www.comp.it.bton.ac.uk/je/burks.html
which references the front page for the online copy of BURKS at the University of Brighton. The URL consists of:
- a protocol specification, which says which Internet protocol the browser needs to use to access the document (in this case http, the HyperText Transfer Protocol);
- the internet address of a server (in this case the server is a machine called www.comp.it.bton.ac.uk); and
- the document's filename on the server, in this case burks.html in the directory je.
In general, a URL looks like this:
protocol://server/document
If you leave out the protocol and server name, the protocol and server name from the current URL will be assumed. So by leaving out the protocol and server name and just providing a file name, you end up referring to a file whose location is relative to the document containing the link. Here is another example:
<A HREF="../rfc1738.htm">
The file rfc1738.htm is in the directory above the one where this document is located.
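To make the rule concrete, the sketch below shows how a few references would be interpreted, assuming the document containing them is located at http://www.example.com/docs/guide/index.htm. That address and all the file names are hypothetical.

```html
<!-- Assumption: this fragment is in a document whose own URL is
     http://www.example.com/docs/guide/index.htm (a hypothetical address). -->

<!-- Full URL: protocol, server and document are all given. -->
<P><A HREF="http://www.example.com/docs/guide/page2.htm">Page 2 (full URL)</A>

<!-- File name only: refers to http://www.example.com/docs/guide/page2.htm -->
<P><A HREF="page2.htm">Page 2 (relative)</A>

<!-- ".." means the directory above: http://www.example.com/docs/rfc1738.htm -->
<P><A HREF="../rfc1738.htm">A file in the directory above</A>

<!-- A subdirectory: http://www.example.com/docs/guide/images/logo.gif -->
<P><IMG SRC="images/logo.gif">
```

The first two links refer to the same document. The relative form has the advantage that it keeps working if the whole set of files is moved to another directory or server.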
Summary
Here's a quick roundup of the HTML tags covered in this document:
Paragraph types:
<P>                      Paragraph break
<H1> ... </H1>           Heading level 1
<H2> ... </H2>           Heading level 2
<H3> ... </H3>           Heading level 3
<H4> ... </H4>           Heading level 4
<H5> ... </H5>           Heading level 5
<H6> ... </H6>           Heading level 6
<UL> ... </UL>           Bulleted (unordered) list
<OL> ... </OL>           Numbered (ordered) list
<LI>                     List item in a bulleted or numbered list
<DL> ... </DL>           Definition list
<DT>                     Definition term
<DD>                     Definition
Text formatting:
<B> ... </B>             Bold text
<I> ... </I>             Italic text
<center> ... </center>   Centered text
Miscellaneous:
<TITLE> ... </TITLE>     Document title
<BR>                     Line break
<HR>                     Horizontal rule
Hyperlinks:
<IMG SRC="url">          Inline image
<A HREF="url"> ... </A>  Hyperlink to another document
<A NAME="tag"> ... </A>  Bookmark within a document