Custom Data Attributes in HTML5

For years developers have needed a method for adding their own metadata to HTML elements, but no standardized technique existed.  One common practice was to encode metadata in class names.  Some developers would even go so far as to create their own attributes.  Since browsers didn’t understand these made-up attributes, they would simply ignore them and continue rendering the page.

These ad hoc attributes are not standards compliant, and therefore result in invalid HTML.  A bigger problem, however, is forward compatibility.  While a given attribute name may not be defined in the current HTML standard, no guarantees can be made about future standards.  For example, older versions of HTML did not allow developers to set a maximum length for <textarea> elements.  However, JavaScript could be used in conjunction with a non-standard “maxlength” attribute to work around this shortcoming.  When HTML5 standardized the “maxlength” attribute, existing pages that relied on their own interpretation of it were at risk of breaking.

Introduction of Custom Data Attributes

To help solve the problem of custom metadata, HTML5 introduced custom data attributes.  Custom data attributes are sometimes referred to as data-* attributes due to the way they are named.  Any attribute whose name starts with the characters “data-” is recognized by the browser, but has no effect on the layout of the page.  Although HTML attribute names are case insensitive, by convention data-* attribute names are specified in all lowercase letters.  If the name contains multiple words, the words are separated by hyphens.  The importance of following this naming convention will be revisited when I explain how JavaScript interacts with data-* attributes.

The following example shows how custom data attributes are added to HTML.  In this case, several types of animals are defined.  Since each animal has a different number of legs, a “data-number-of-legs” attribute has been created.

<span id="human" data-number-of-legs="2">Human</span>
<span id="dog" data-number-of-legs="4">Dog</span>
<span id="ant" data-number-of-legs="6">Ant</span>
<span id="spider" data-number-of-legs="8">Spider</span>

Scripting data-* Attributes

There are two APIs that allow JavaScript to interact with custom data attributes.  First, custom data attributes can be accessed using the same interface methods as any other attribute.  In the following example, the “data-number-of-legs” attribute is written, read, and finally deleted using the setAttribute(), getAttribute(), and removeAttribute() methods respectively.

var person = document.getElementById("human");
var legs;

person.setAttribute("data-number-of-legs", "2");    // Write the attribute
legs = person.getAttribute("data-number-of-legs");  // Read the attribute (always a string)
person.removeAttribute("data-number-of-legs");      // Delete the attribute

The second API is designed specifically to work with data-* attributes.  In modern browsers, individual elements contain a special object named “dataset” which is used to access custom data attributes.  Any data-* attributes become properties of the element’s dataset object.  The new property names are derived as follows:

  1. The attribute name is converted to all lowercase letters.
  2. The “data-” prefix is stripped from the attribute name.
  3. Each hyphen that is immediately followed by a lowercase letter is removed from the attribute name.
  4. The letter that followed each hyphen removed in Step 3 is converted to uppercase, so the resulting name is in camelCase.
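
The steps above can be sketched as a small function.  This is only an illustration of the naming rule, not something the browser exposes: the name toDatasetKey is made up for this example, and the sketch assumes the attribute comes from an HTML document, where attribute names are lowercased.

```javascript
// Illustrative helper (hypothetical name, not a browser API): derives the
// dataset property name that corresponds to a data-* attribute name.
function toDatasetKey(attributeName) {
  // Step 1: normalize the attribute name to lowercase.
  var name = attributeName.toLowerCase();

  if (name.indexOf("data-") !== 0) {
    throw new Error("Not a custom data attribute: " + attributeName);
  }

  // Step 2: strip the "data-" prefix.
  name = name.slice("data-".length);

  // Steps 3 and 4: remove each hyphen that is followed by a lowercase
  // letter, and uppercase that letter.
  return name.replace(/-([a-z])/g, function (match, letter) {
    return letter.toUpperCase();
  });
}

console.log(toDatasetKey("data-number-of-legs")); // numberOfLegs
console.log(toDatasetKey("data-numberOfLegs"));   // numberoflegs
```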

The properties can then be accessed like the properties of any other object.  To remove a data-* attribute through the dataset, use the delete operator on the corresponding property.  Assigning null does not remove the attribute; the value is converted to a string, so the attribute ends up holding the text “null”.  In the following code block, the previous example has been rewritten to use the dataset API.

var person = document.getElementById("human");
var legs;

person.dataset.numberOfLegs = "2";  // Set the attribute
legs = person.dataset.numberOfLegs; // Get the attribute
delete person.dataset.numberOfLegs; // Delete the attribute

Notice that the original attribute name “data-number-of-legs” has been converted to “numberOfLegs” in the dataset object.  The name conversion process must be accounted for when naming data-* attributes.  Since the attribute names are converted to lowercase, it is best to avoid using uppercase letters.  The following example shows how several attribute names translate to dataset properties.

"data-number-of-legs" translates to "numberOfLegs"
"data-numberOfLegs"   translates to "numberoflegs"
"data-NUMBER-OF-LEGS" translates to "numberOfLegs"
"data-NuMBeROfLeGs"   translates to "numberoflegs"

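The conversion also runs in the opposite direction: when a dataset property is written, the browser derives the attribute name from the property name.  The following is a minimal sketch of that reverse mapping, written from the naming rules rather than taken from any browser; toDataAttributeName is a hypothetical helper name.  It suggests why uppercase letters in markup are best avoided: markup written as “data-numberOfLegs” is reached through the “numberoflegs” property, whereas the “numberOfLegs” property refers to “data-number-of-legs”, a different attribute.

```javascript
// Illustrative helper (hypothetical name, not a browser API): derives the
// attribute name that a dataset property name maps to.
function toDataAttributeName(key) {
  // A hyphen followed by a lowercase letter could not survive the round
  // trip back to a property name, so such keys are rejected here.
  if (/-[a-z]/.test(key)) {
    throw new Error("Invalid dataset property name: " + key);
  }

  // Each uppercase letter becomes a hyphen followed by its lowercase form.
  var hyphenated = key.replace(/[A-Z]/g, function (letter) {
    return "-" + letter.toLowerCase();
  });

  return "data-" + hyphenated;
}

console.log(toDataAttributeName("numberOfLegs")); // data-number-of-legs
console.log(toDataAttributeName("numberoflegs")); // data-numberoflegs
```
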
Things to Remember

  1. Custom data attributes are typically used to store metadata that simplifies JavaScript code.
  2. An element can have any number of custom data attributes.
  3. Custom data attributes should only be used if a more appropriate element or attribute does not exist.  For example, you should not create a custom “text description” attribute on an image.  The existing “alt” attribute is a better choice.
  4. The HTML5 specification states that data-* attributes are intended for a site’s own scripts, and not for software that is independent of the site.  This means that programs such as search engines should not rely on custom data attributes when preparing search results.  Instead, third party applications should rely on microdata.
  5. Custom data attributes can be matched by attribute selectors, so they work with CSS and with selector APIs such as querySelector() and querySelectorAll().
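
As a rough illustration of the last point, the sketch below builds an attribute selector for a custom data attribute and, when a document is available, passes it to querySelectorAll().  The helper name dataAttributeSelector is made up for this example, and its escaping only handles backslashes and double quotes, which is an assumption that holds for simple values such as the leg counts used earlier.

```javascript
// Illustrative helper (hypothetical name): builds a selector such as
// [data-number-of-legs="4"] for use with CSS or the selector APIs.
function dataAttributeSelector(name, value) {
  var selector = "[data-" + name;

  if (value !== undefined) {
    // Escape backslashes and double quotes inside the quoted value.
    var escaped = String(value).replace(/\\/g, "\\\\").replace(/"/g, '\\"');
    selector += '="' + escaped + '"';
  }

  return selector + "]";
}

var fourLegged = dataAttributeSelector("number-of-legs", 4);
console.log(fourLegged); // [data-number-of-legs="4"]

// Only runs in a browser; other environments skip this part.
if (typeof document !== "undefined") {
  var matches = document.querySelectorAll(fourLegged);
  console.log(matches.length + " element(s) matched");
}
```

With the animal markup shown earlier, this selector would be expected to match the “dog” element only.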

