DOM Traversal for Fun and Profit
During my time writing funny words in an IDE to make the computer do what I want, I dabbled in a little web scraping for cash.
I kept forgetting how to target certain parts of the page that I wanted to scrape and organise within my program.
So below, I'm putting together a few notes to share with my future self and you :)
Let's start with a little boilerplate HTML that we can work with.
<div class="grandparent" id="grandparent-id">
  <!-- top level grandparent -->
  <div class="parent"> <!-- first parent -->
    <div class="child" id="child-one"></div> <!-- child 1 -->
    <div class="child"></div> <!-- child 2 -->
  </div>
  <div class="parent"> <!-- second parent -->
    <div class="child"></div> <!-- child 3 -->
    <div class="child" id="child-four"></div> <!-- child 4 -->
  </div>
</div>
An ID should be unique within a page, so we call getElementById (singular), which returns a single element (or null if nothing matches).
const grandparent = document.getElementById("grandparent-id")
Calling getElementsByClassName (plural) returns an HTMLCollection of elements from the DOM (both the parents in the HTML above). However, an HTMLCollection has no Array methods such as map or filter, so trying to call them throws a TypeError.
We can get around this by copying the returned collection into a real array with Array.from, then we're able to use array methods on that content.
const parents = Array.from(document.getElementsByClassName("parent"))
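To see why the conversion matters without opening a browser, here's a minimal sketch that uses a hand-made array-like object as a stand-in for an HTMLCollection. The name `fakeCollection` is my own invention for illustration; a real collection would come from getElementsByClassName.

```javascript
// Hypothetical stand-in for an HTMLCollection: indexed entries plus a length,
// but none of the Array methods.
const fakeCollection = {
  0: { className: "parent" },
  1: { className: "parent" },
  length: 2,
};

// Array methods are missing on the array-like object...
const hasMap = typeof fakeCollection.map === "function";

// ...but Array.from copies the entries into a real array.
const parentsArray = Array.from(fakeCollection);
const classNames = parentsArray.map((el) => el.className);

console.log(hasMap, classNames);
```

Array.from works on anything with indexed entries and a length, which is why it handles HTMLCollections too.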
querySelector takes a CSS selector and returns a single element: the first one in the DOM tree that matches (or null if nothing matches).
const grandparentById = document.querySelector("#grandparent-id") // id
const grandparentByClass = document.querySelector(".grandparent") // class
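Similar to getElementsByClassName, querySelectorAll gives us all the elements that match our query. However, it returns a NodeList rather than an HTMLCollection. A NodeList supports forEach, but not other Array methods such as map or filter, so wrap it in Array.from when you need those.
const grandparentsById = document.querySelectorAll("#grandparent-id") // id
const grandparentsByClass = document.querySelectorAll(".grandparent") // class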
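Since a NodeList has forEach but not map or filter, I find a tiny wrapper handy. This is just a sketch: `queryAll` is a hypothetical helper name of my own, and it assumes `root` is the document or any element that has querySelectorAll.

```javascript
// Hypothetical helper: run querySelectorAll and return a real array.
// `root` defaults to the global document, but any element works too.
function queryAll(selector, root = document) {
  return Array.from(root.querySelectorAll(selector));
}

// In a browser, with the HTML above:
// const childIds = queryAll(".child").map((el) => el.id);
```

Because the result is a plain array, map, filter and friends all work on it directly.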
First, we want to target the top grandparent node. From there we can grab all of the children underneath.
Even though we found the grandparent with querySelector (which gives us a single element), calling children on it gives us back an HTMLCollection!! Annoying.
So we'll need to create an Array from the returned children.
const grandparent = document.querySelector(".grandparent")
const parents = Array.from(grandparent.children)
const parentOne = parents[0] // etc
We can also drill down into the parent's children, which again come back as an HTMLCollection:
const children = parentOne.children
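If you want every child under every parent in one go, the two steps above can be combined. A rough sketch, where `grandchildrenOf` is a hypothetical helper name and `root` is assumed to be any element with a children property:

```javascript
// Hypothetical helper: collect every element two levels below `root`.
function grandchildrenOf(root) {
  // Convert each HTMLCollection to an array so flatMap is available.
  return Array.from(root.children).flatMap((parent) =>
    Array.from(parent.children)
  );
}

// In a browser, with the HTML above:
// const allChildren = grandchildrenOf(document.querySelector(".grandparent"));
```

HTML comments are not elements, so they don't show up in children and won't end up in the result.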
To go the other way, from a child up to its direct parent, we can use the parentElement property.
const childFour = document.querySelector("#child-four")
const parent = childFour.parentElement
closest works very similarly to querySelector, but instead of going down the DOM it moves upwards.
It takes a CSS selector and returns the nearest element that matches it, checking the element itself first and then each ancestor in turn (or null if nothing matches).
const childFour = document.querySelector("#child-four")
const grandparent = childFour.closest(".grandparent")
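Roughly speaking, closest behaves like a loop over parentElement that tests each element with matches. The sketch below is my own illustration of that idea, not the browser's actual implementation, and `closestSketch` is a hypothetical name.

```javascript
// Hypothetical re-creation of closest() to show the idea.
// Assumes `el` has `matches(selector)` and `parentElement`, as DOM elements do.
function closestSketch(el, selector) {
  let current = el;
  while (current) {
    // The starting element itself is checked first, then each ancestor.
    if (current.matches(selector)) return current;
    current = current.parentElement;
  }
  return null; // ran out of ancestors without a match
}

// In a browser, with the HTML above:
// closestSketch(document.querySelector("#child-four"), ".grandparent")
```

In real code I'd just call the built-in closest; the loop is only here to show what is going on.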
We can also call querySelector on an element that we've already captured, which searches only that element's descendants. This lets us go straight to the child level and skip the parents.
const grandparent = document.querySelector(".grandparent")
const childOne = grandparent.querySelector(".child")
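nextElementSibling gets the next element along from where you currently are, and previousElementSibling gets the one before. Instead of going up and down, it's like we're going sideways through the DOM. Both return null when there's no sibling in that direction.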
const childOne = document.querySelector("#child-one")
const childTwo = childOne.nextElementSibling
const childFour = document.querySelector("#child-four")
const childThree = childFour.previousElementSibling
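Going sideways repeatedly is a common scraping chore, so here's a small sketch that collects every sibling after a given element. `nextSiblings` is a hypothetical helper name of my own; it only assumes the element exposes nextElementSibling.

```javascript
// Hypothetical helper: every element sibling that comes after `el`, in order.
function nextSiblings(el) {
  const siblings = [];
  let current = el.nextElementSibling;
  while (current) {
    siblings.push(current);
    current = current.nextElementSibling; // null once we reach the last sibling
  }
  return siblings;
}

// In a browser, with the HTML above:
// nextSiblings(document.querySelector("#child-one"))
```

Swapping nextElementSibling for previousElementSibling gives the same walk in the other direction.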