There’s much to love about HTML 5 (microdata, canvas animations, and a slew of new APIs), but at its core it also revisits the HTML syntax and makes it simpler. When you’re building web apps you need to juggle HTML, CSS, JavaScript and PHP/Ruby/whatever around in your head, and whatever gets easier is a plus. In this article we review what’s simpler in HTML 5: the new doctype definition, optional link/script attributes and a less rigorous syntax.

Let’s look at the minimum HTML 5 you need to write for a basic page:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Title</title>
    <link rel="stylesheet" href="style.css">
    <script src="script.js"></script>
  </head>
  <body>
  </body>
</html>

Now compare this to the minimum requirements for HTML 4:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
  <head>
    <meta http-equiv="content-type" content="text/html; charset=utf-8">
    <title>Title</title>
    <link rel="stylesheet" type="text/css" href="style.css">
    <script type="text/javascript" src="script.js"></script>
  </head>
  <body>
  </body>
</html>

The HTML 5 snippet looks a lot cleaner. So what are the differences?

Out with the complicated DOCTYPE

If you want to write valid HTML, you’ll need to include a DOCTYPE declaration. This declaration, also called a Document Type Declaration, tells HTML validation tools which Document Type Definition (DTD) to validate your HTML against. The DTDs are actually files that live at the W3C; if you’re interested, the one for HTML 4 strict is at http://www.w3.org/TR/html4/strict.dtd. These files contain the grammar rules that valid HTML documents must follow. For HTML 4 there were three different DTDs, namely transitional, strict and frameset, and the DOCTYPE declaration itself was difficult to remember (you would almost always have to look it up or copy it from another HTML document).

For HTML 5, the DOCTYPE line is very simple, and there is only one type:

<!DOCTYPE html>

Looking at the HTML 5 and HTML 4 snippets above, you may also notice that the <link> and <script> elements have become shorter. That’s because the type attribute is now optional. The HTML 5 specification now includes reasonable defaults for files referenced through these elements. For <link> elements with rel="stylesheet", it assumes that the type will be text/css, while for <script> elements it will be text/javascript. You can override these defaults if necessary.
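As a rough illustration of these defaults, each pair of lines below should be treated identically by a browser that follows HTML 5, and the last line sketches one way of overriding the default. The file names (style.css, script.js, app.js) are placeholders, and type="module" is only one example of a non-default type that current browsers understand:

```html
<!-- Equivalent: text/css is the default type for rel="stylesheet" -->
<link rel="stylesheet" href="style.css">
<link rel="stylesheet" type="text/css" href="style.css">

<!-- Equivalent: a script without a type attribute is treated as JavaScript -->
<script src="script.js"></script>
<script type="text/javascript" src="script.js"></script>

<!-- Overriding the default: load app.js as a JavaScript module instead -->
<script type="module" src="app.js"></script>
```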

Less rigorous syntax

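In XHTML, all attribute values must be surrounded by quotes, and HTML 4 allowed unquoted values only when they consisted solely of letters, digits, hyphens, periods, underscores and colons. HTML 5 is more relaxed. Instead of: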

<link rel="stylesheet" href="style.css">

…you can now write:

<link rel=stylesheet href=style.css>

Quotes are only required when the attribute value is empty or contains spaces or any of the characters ", ', `, =, < or >. It might be a good idea to put the quotes in anyway, though, in order to avoid any ambiguity. If an attribute value contains a space and you forget the quotes, you might spend unnecessary time debugging.
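To make the pitfall concrete, here is a small sketch using a made-up element with two hypothetical class names, note and warning. Without quotes, the attribute value ends at the first space, so the second word is parsed as a separate attribute:

```html
<!-- Intended: one class attribute with the value "note warning" -->
<div class="note warning">Check your quotes</div>

<!-- Without quotes the value ends at the space: class is just "note",
     and warning becomes a separate, empty attribute -->
<div class=note warning>Check your quotes</div>
```

Both lines are accepted by the browser without any error message, which is why a CSS rule for .warning that silently stops matching can be hard to track down.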

Also, in HTML 5, there are a bunch of elements that do not require a closing tag: </li>, </dt>, </dd>, </tr>, </th>, </td>, </thead>, </tfoot>, </tbody>, </option>, </optgroup>, </p>, </head>, </body> and </html> are automatically inserted by newer browsers. You must be careful with this: almost all browsers do close unclosed tags automatically, but you can never be sure the result is what you intended. Only newer browsers actually implement HTML 5 and close these tags because the specification requires it; older browsers close them as error recovery, on the assumption that the author forgot.
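As a minimal sketch of what this looks like in practice, the fragment below leaves out every optional closing tag and should still be valid HTML 5, assuming a parser that follows the specification. The content is made up for illustration:

```html
<!-- Each <li> is closed implicitly by the next <li> or by </ul> -->
<ul>
  <li>First item
  <li>Second item
</ul>

<!-- A <p> is closed implicitly by the next <p> or by the <table> below -->
<p>First paragraph
<p>Second paragraph

<!-- Rows and cells are closed by the next row, the next cell, or </table> -->
<table>
  <tr><td>A<td>B
  <tr><td>C<td>D
</table>
```

Note that the implied closing tag depends on what follows. A <p>, for instance, is closed automatically before another block such as <p>, <ul> or <table>, but not before an inline element such as <span>, so omitting tags only pays off if you know these rules.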

Summary

All in all, it seems the gains are due to the simplified DOCTYPE declaration and the optional type attributes for <link> and <script> elements. The less rigorous syntax might actually encourage people to write sloppy HTML, placing the onus of being forgiving on the browser.