Removing HTMLCollection elements from the DOM

I have a collection of paragraph elements. Some are empty and some contain whitespace only, while others have content:

<p>Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum tortor quam, feugiat vitae, ultricies eget, tempor sit amet, ante. Donec eu libero sit amet quam egestas semper. Aenean ultricies mi vitae est. Mauris placerat eleifend leo.</p>
<p></p>
<p> </p>
<p>    </p>
<p>&nbsp;</p>
<p> &nbsp;</p>

I'm using getElementsByTagName to select them:

var paragraphs = document.getElementsByTagName('p');

This returns all paragraphs in the document. I want to remove all of them, so I'd like to run

for (var i = 0, len = paragraphs.length; i < len; i++) {
   paragraphs[i].remove();
}

but I get Uncaught TypeError: Cannot read property 'remove' of undefined errors. I think this is strange, but figure I'll try adding a guard and see what happens:

for (var i = 0, len = paragraphs.length; i < len; i++) {
   paragraphs[i] && paragraphs[i].remove();
}

No error, but not all elements are removed. So I run it again, and it removes some of the elements which weren't removed previously. I run it again and finally all of the paragraphs are removed from the document.

I'm wondering what obvious detail I'm missing here.


Solution 1:

The problem is that paragraphs is a live list. By removing a p element, you are also changing that list. A simple fix is to iterate over the list in reverse order:

for (var i = paragraphs.length - 1; i >= 0; --i) {
  paragraphs[i].remove();
}

The alternative solution is to create a static list (non-live list). You can do this by either:

converting the list into an Array:

var paragraphs =
  Array.prototype.slice.call(document.getElementsByTagName('p'), 0);

using document.querySelectorAll:

var paragraphs = document.querySelectorAll('p');

You can then iterate over the list in regular order (using a for loop):

for (var i = 0; i < paragraphs.length; ++i) {
  paragraphs[i].remove();
}

or (using a for...of loop):

for (var paragraph of paragraphs) {
  paragraph.remove();
}
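As a compact variation on the static-list idea, the snapshot and the removal can be wrapped in one function. This is only a sketch: removeAll is a hypothetical helper name, and it assumes an environment with ES2015 Array.from.

```javascript
// removeAll is a hypothetical helper name used for illustration.
function removeAll(collection) {
  // Array.from copies the current members into a static array,
  // so removing a node no longer shifts the items still to be visited.
  Array.from(collection).forEach(function (node) {
    node.remove();
  });
}
```

In a browser this would be called as removeAll(document.getElementsByTagName('p')).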

Note that .remove() is not supported in older browsers such as Internet Explorer. See the MDN documentation for more info.
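Where .remove() is missing, the long-standing parentNode.removeChild() call does the same job. A minimal sketch of a fallback, assuming a hypothetical helper named removeNode (it is not a built-in):

```javascript
// removeNode is a hypothetical helper name, not part of the DOM.
function removeNode(node) {
  if (typeof node.remove === 'function') {
    // Modern browsers: the node can remove itself.
    node.remove();
  } else if (node.parentNode) {
    // Older browsers: ask the parent to remove the child.
    node.parentNode.removeChild(node);
  }
  // A node without a parent is already detached, so nothing to do.
}
```

Any of the loops above can then call removeNode(paragraphs[i]) instead of paragraphs[i].remove().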

To illustrate the problem, let’s imagine we have a node list of three elements, paragraphs = [p0, p1, p2]. Then this is what happens when you iterate over the list:

i = 0, length = 3, paragraphs[0] == p0  => paragraphs = [p1, p2]
i = 1, length = 2, paragraphs[1] == p2  => paragraphs = [p1]
i = 2, length = 1, END

So in this example, p1 is not deleted because it is skipped.
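The trace above can be reproduced outside a browser by letting a plain array stand in for the live list. This is only a model: removeFromList is a hypothetical helper that mimics how removing an element also drops it from a live collection.

```javascript
// Hypothetical helper: removing an item also shrinks the list,
// which is what a live HTMLCollection does when a node is removed.
function removeFromList(list, item) {
  var index = list.indexOf(item);
  if (index !== -1) list.splice(index, 1);
}

// Forward iteration: p1 slides into index 0 and is never visited.
var forward = ['p0', 'p1', 'p2'];
for (var i = 0; i < forward.length; i++) {
  removeFromList(forward, forward[i]);
}
console.log(forward); // [ 'p1' ]

// Reverse iteration: a removal only shifts indexes already visited.
var reverse = ['p0', 'p1', 'p2'];
for (var j = reverse.length - 1; j >= 0; --j) {
  removeFromList(reverse, reverse[j]);
}
console.log(reverse); // []
```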

Solution 2:

The length of your HTMLCollection changes when you remove an item. One way around this is a while loop that keeps removing the first element until the collection is empty:

while(paragraphs.length > 0) {
   paragraphs[0].remove();
}

Solution 3:

Why this doesn't work for you:

The HTMLCollection is live: it changes while you are removing nodes, so the cached length (len) gets out of sync with the collection's real length.

Let's say you have a collection of 2 DOM nodes and you iterate over it. The loop should run 2 times. The demo below illustrates this and is easy to follow:

first iteration - removes the first node and then i is incremented.
second iteration - now i equals 1, but paragraphs.length is now also 1 because only one paragraph is left at this point.

The result is that a collection of length 1 is asked for the item at position 1, while the only valid position is 0 (indexes start at 0).

Accessing a position which doesn't exist in an Array (or in an Array-like object such as an HTMLCollection) returns undefined, and calling remove() on undefined throws the TypeError from the question.
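To see what an out-of-range access actually does, a plain array can stand in for the collection. That substitution is an assumption made so the snippet runs outside a browser; indexing past the end of an HTMLCollection likewise yields undefined.

```javascript
// Stand-in for a collection that has one element left.
var list = ['p1'];

// Reading past the end does not throw; it yields undefined.
var missing = list[1];
console.log(missing); // undefined

// The error only appears when a method is called on that undefined value.
var caughtTypeError = false;
try {
  missing.remove();
} catch (error) {
  caughtTypeError = error instanceof TypeError;
}
console.log(caughtTypeError); // true
```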

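var paragraphs = document.getElementsByTagName('p')

// len is cached as 2, but the live collection shrinks on every removal
for (var i = 0, len = paragraphs.length; i < len; i++) {
   console.log(i, paragraphs.length)
   paragraphs[i].remove() // throws on the second pass: paragraphs[1] is undefined
}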

<p>1</p>
<p>2</p>

Possible fix: delay the removal of the nodes

In the demo below, the nodes are removed only after the loop has finished (setTimeout defers the callback). The key is the third parameter of setTimeout: the node passed there is captured when setTimeout is called and handed to the callback as its argument:

var paragraphs = document.getElementsByTagName('p')

for (var i = 0, len = paragraphs.length; i < len; i++) {
   setTimeout(node => node.remove(), 0, paragraphs[i])
}

<p>Pellentesque habitant....</p>
<p></p>
<p> </p>
<p>    </p>
<p>&nbsp;</p>
<p> &nbsp;</p>

Possible fix: on each iteration check the node exists

Also, it's important not to increment i: the collection keeps shrinking, so the first item is removed on every iteration until none are left. The loop condition therefore has to test the node itself rather than a cached length, which would never become false:

var paragraphs = document.getElementsByTagName('p')

// i stays at 0; the loop ends once paragraphs[0] is undefined
for (var i = 0; paragraphs[i]; ) {
   paragraphs[i].remove()
}

<p>1</p>
<p>2</p>
<p>3</p>

Possible fix: iterate in reverse (the option I would almost always prefer)

var paragraphs = document.getElementsByTagName('p')

for (var i = paragraphs.length; i--; ){
    paragraphs[i].remove() // could also use `paragraphs[0]`. "i" index isn't necessary
}

<p>1</p>
<p>2</p>
<p>3</p>