ActiveSupport's #descendants Method: A Deep Dive
This article was originally written by Jonathan Miles on the Honeybadger Developer Blog.
Rails adds many things to Ruby's built-in objects. This is what some call a "dialect" of Ruby and is what allows Rails developers to write lines like 1.day.ago.
Most of these extra methods live in ActiveSupport. Today, we're going to look at a perhaps lesser-known method that ActiveSupport adds directly to Class: descendants. This method returns all the subclasses of the class it is called on. For example, ApplicationRecord.descendants will return the classes in your app that inherit from it (e.g., all the models in your application). In this article, we'll take a look at how it works, why you might want to use it, and how it augments Ruby's built-in inheritance-related methods.
First, we'll provide a quick refresher on Ruby's inheritance model. Like other object-oriented (OO) languages, Ruby uses objects that sit within a hierarchy. You can create a class, then a subclass of that class, then a subclass of that subclass, and so on. When walking up this hierarchy, we get a list of ancestors. Ruby also has the nice feature that all entities are objects themselves (including classes, integers, and even nil), whereas some other languages often use "primitives" that are not true objects, usually for the sake of performance (such as integers, doubles, booleans, etc.; I'm looking at you, Java).
Ruby, like all OO languages, has to keep track of ancestors so that it knows where to look up methods and which ones take precedence.
class BaseClass
  def base
    "base"
  end

  def overridden
    "Base"
  end
end

class SubClass < BaseClass
  def overridden
    "Subclass"
  end
end
Here, calling SubClass.new.overridden gives us "Subclass". However, the base method called by SubClass.new.base is not defined in SubClass, so Ruby will go through each of the ancestors to see which one implements the method (if any). We can see the list of ancestors by simply calling SubClass.ancestors. In Rails, the result will be something like this:
[SubClass,
BaseClass,
ActiveSupport::Dependencies::ZeitwerkIntegration::RequireDependency,
ActiveSupport::ToJsonWithActiveSupportEncoder,
Object,
PP::ObjectMixin,
JSON::Ext::Generator::GeneratorMethods::Object,
ActiveSupport::Tryable,
ActiveSupport::Dependencies::Loadable,
Kernel,
BasicObject]
We won't dissect this whole list here; for our purposes, it's enough to note that SubClass is at the top, with BaseClass below it. Also, note that BasicObject is at the bottom; this is the root class of Ruby's hierarchy, so it will always be the last entry in the list.
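The lookup described above can also be inspected outside Rails. The following is a minimal sketch in plain Ruby that redefines the two classes from earlier and asks Ruby which ancestor supplies each method; without Rails loaded, the ancestor list is much shorter, so only its first two entries are shown.

```ruby
class BaseClass
  def base
    "base"
  end

  def overridden
    "Base"
  end
end

class SubClass < BaseClass
  def overridden
    "Subclass"
  end
end

# The first two ancestors are the class itself, then its superclass.
p SubClass.ancestors.first(2)

# instance_method(...).owner reports which ancestor supplies a method.
p SubClass.instance_method(:overridden).owner # => SubClass
p SubClass.instance_method(:base).owner       # => BaseClass
```

Because SubClass does not define base, Ruby walks the ancestor list and finds it on BaseClass.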
Things get a bit more complicated when we add modules into the mix. A module is not part of the superclass chain, yet we can "include" it in our class, so Ruby has to know when to check the module for a method, and which module to check first when several modules are included.
Some languages do not allow this kind of "multiple inheritance", but Ruby even goes a step further by letting us choose where the module gets inserted into the hierarchy by whether we include or prepend the module.
Prepended modules, as their name somewhat suggests, are inserted into the list of ancestors before the class, basically overriding any of the class' methods. This also means you can call "super" in a prepended module's method to call the original class' method.
module PrependedModule
  def test
    "module"
  end

  def super_test
    super
  end
end

# Re-using `BaseClass` from earlier
class SubClass < BaseClass
  prepend PrependedModule

  def test
    "Subclass"
  end

  def super_test
    "Super calls SubClass"
  end
end
The ancestors for SubClass now look like this:
[PrependedModule,
SubClass,
BaseClass,
ActiveSupport::Dependencies::ZeitwerkIntegration::RequireDependency,
...
]
With this new list of ancestors, our PrependedModule is now first in line, meaning Ruby will look there first for any methods we call on SubClass. This also means that if we call super within PrependedModule, we will be calling the method on SubClass:
> SubClass.new.test
=> "module"
> SubClass.new.super_test
=> "Super calls SubClass"
Included modules, on the other hand, are inserted into the ancestors after the class. This makes them ideal for intercepting methods that would otherwise be handled by the base class.
class BaseClass
  def super_test
    "Super calls base class"
  end
end

module IncludedModule
  def test
    "module"
  end

  def super_test
    super
  end
end

class SubClass < BaseClass
  include IncludedModule

  def test
    "Subclass"
  end
end
With this arrangement, the ancestors for SubClass now look like this:
[SubClass,
IncludedModule,
BaseClass,
ActiveSupport::Dependencies::ZeitwerkIntegration::RequireDependency,
...
]
Now, SubClass is the first port of call, so Ruby will only execute methods in IncludedModule if they are not present in SubClass. As for super, any calls to super in SubClass will go to IncludedModule first, while any calls to super within IncludedModule will go to BaseClass.
Put another way, an included module sits between a subclass and its base class in the ancestor hierarchy. This effectively means it can be used to 'intercept' methods that would otherwise be handled by the base class:
> SubClass.new.test
=> "Subclass"
> SubClass.new.super_test
=> "Super calls base class"
Because of this "chain of command", Ruby has to keep track of a class's ancestors. The reverse is not true, though. Given a particular class, Ruby does not need to track its children, or "descendants", because it will never need this information to execute a method.
Astute readers may have realized that if we were using multiple modules in a class, then the order we include (or prepend) them could produce different results. For example, depending on the methods, this class:
class SubClass < BaseClass
  include IncludedModule
  include IncludedOtherModule
end
and this class:
class SubClass < BaseClass
  include IncludedOtherModule
  include IncludedModule
end
could behave quite differently. If these two modules had methods with the same name, then the order here would determine which one takes precedence and where calls to super would be resolved. Personally, I'd avoid having methods that overlap each other like this as much as possible, specifically to avoid having to worry about things like the order in which the modules are included.
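To make the ordering effect concrete, here is a small plain-Ruby sketch. The class names FirstOrder and SecondOrder and the greeting method are hypothetical, introduced only for illustration; the two modules reuse the names from the snippets above.

```ruby
module IncludedModule
  def greeting
    "from IncludedModule"
  end
end

module IncludedOtherModule
  def greeting
    "from IncludedOtherModule"
  end
end

class BaseClass; end

class FirstOrder < BaseClass
  include IncludedModule
  include IncludedOtherModule
end

class SecondOrder < BaseClass
  include IncludedOtherModule
  include IncludedModule
end

# Each include inserts its module directly after the class,
# so the module included last is checked first.
p FirstOrder.ancestors.first(3)  # => [FirstOrder, IncludedOtherModule, IncludedModule]
p FirstOrder.new.greeting        # => "from IncludedOtherModule"
p SecondOrder.new.greeting       # => "from IncludedModule"
```

The module included last wins, which is easy to get wrong when the include lines are far apart or spread across files.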
While it's good to know the difference between include and prepend for modules, I think a more real-world example helps to show when you might choose one over the other. My main use-case for modules like these is with Rails engines.
One of the most popular Rails engines is probably Devise. Let's say we wanted to change the password digest algorithm being used, but first, a quick disclaimer:
My day-to-day use of modules has been to customize the behavior of a Rails engine that holds our default business logic. We are overriding the behavior of code we control. You can, of course, apply the same method to any piece of Ruby, but I would not recommend overriding code that you do not control (e.g., from gems maintained by other people), as any change to that external code could be incompatible with your changes.
Devise's password digest happens here in the Devise::Models::DatabaseAuthenticatable module:
def password_digest(password)
  Devise::Encryptor.digest(self.class, password)
end

# and also in the password check:
def valid_password?(password)
  Devise::Encryptor.compare(self.class, encrypted_password, password)
end
Devise allows you to customize the algorithm being used here by creating your own Devise::Encryptable::Encryptors, which is the correct way to do it. For demonstration purposes, however, we'll be using a module.
# app/models/password_digest_module
module PasswordDigestModule
  def password_digest(password)
    # Devise's default bcrypt is better for passwords,
    # using sha1 here just for demonstration
    Digest::SHA1.hexdigest(password)
  end

  def valid_password?(password)
    Devise.secure_compare(password_digest(password), self.encrypted_password)
  end
end

begin
  User.include(PasswordDigestModule)
  # Pro-tip - because we are calling User here, ActiveRecord will
  # try to read from the database when this class is loaded.
  # This can cause commands like `rails db:create` to fail.
rescue ActiveRecord::NoDatabaseError, ActiveRecord::StatementInvalid
end
To get this module loaded, you'll need to call Rails.application.eager_load! in development or add a Rails initializer to load the file. By testing it out, we can see it works as expected:
> User.create!(email: "one@test.com", name: "Test", password: "TestPassword")
=> #<User id: 1, name: "Test", created_at: "2021-05-01 02:08:29", updated_at: "2021-05-01 02:08:29", posts_count: nil, email: "one@test.com">
> User.first.valid_password?("TestPassword")
=> true
> User.first.encrypted_password
=> "4203189099774a965101b90b74f1d842fc80bf91"
In our case here, both include and prepend would have the same result, but let's add a complication. What if our User model implements its own password_salt method, but we want to override it in our module methods?
class User < ApplicationRecord
  # Include default devise modules. Others available are:
  # :confirmable, :lockable, :timeoutable, :trackable and :omniauthable
  devise :database_authenticatable, :registerable,
         :recoverable, :rememberable, :validatable

  has_many :posts

  def password_salt
    # Terrible way to create a password salt,
    # purely for demonstration purposes
    Base64.encode64(email)[0..-4]
  end
end
Then, we update our module to use its own password_salt method when creating the password digest:
def password_digest(password)
  # Devise's default bcrypt is better for passwords,
  # using sha1 here just for demonstration.
  # The salt is appended to the digest so it stays visible in the console output.
  Digest::SHA1.hexdigest(password) + "." + password_salt
end

def password_salt
  # an even worse way of generating a password salt
  "salt"
end
Now, include and prepend will behave differently because which one we use will determine which password_salt method Ruby executes. With prepend, the module will take precedence, and we get this:
> User.last.password_digest("test")
=> "a94a8fe5ccb19ba61c4c0873d391e987982fbbd3.salt"
Changing the module to use include will instead mean that the User class implementation takes precedence:
> User.last.password_digest("test")
=> "a94a8fe5ccb19ba61c4c0873d391e987982fbbd3.dHdvQHRlc3QuY2"
Generally, I reach for prepend first because, when writing a module, I find it easier to treat it more like a subclass and assume any method in the module will override the class' version. Obviously, this is not always desired, which is why Ruby also gives us the include option.
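The include/prepend difference can be reproduced without Devise or Rails. This is a minimal sketch in plain Ruby; SaltModule, IncludingUser, and PrependingUser are hypothetical names used only to isolate which password_salt implementation wins.

```ruby
module SaltModule
  def password_salt
    "salt"
  end
end

class IncludingUser
  include SaltModule

  def password_salt
    "from class"
  end
end

class PrependingUser
  prepend SaltModule

  def password_salt
    "from class"
  end
end

# With include, the class sits before the module, so the class wins.
p IncludingUser.new.password_salt   # => "from class"

# With prepend, the module sits before the class, so the module wins.
p PrependingUser.new.password_salt  # => "salt"
p PrependingUser.ancestors.first(2) # => [SaltModule, PrependingUser]
```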
We've seen how Ruby keeps track of class ancestors to know the order of precedence when executing methods, as well as how we insert entries into this list via modules. However, as programmers, it can be useful to iterate through all of a class' descendants, too. This is where ActiveSupport's #descendants method comes in. The method is quite short and easily duplicated outside Rails if needed:
class Class
  def descendants
    ObjectSpace.each_object(singleton_class).reject do |k|
      k.singleton_class? || k == self
    end
  end
end
ObjectSpace is a very interesting part of Ruby that gives access to every Ruby object currently in memory. We won't dive into it here, but if you have a class defined in your application (and it's been loaded), it will be reachable through ObjectSpace. ObjectSpace.each_object, when passed a module, yields only objects that are instances of that module or of one of its descendants. Passing singleton_class therefore yields the class itself and the loaded classes that inherit from it; the block then rejects singleton classes and the class itself (e.g., if we call Numeric.descendants, we don't expect Numeric to be in the results).
Don't worry if you don't quite get what's happening here, as more reading on ObjectSpace is probably required to really get it. For our purposes, it's enough to know that this method lives on Class and returns a list of descendant classes, or you may think of it as the "family tree" of that class' children, grandchildren, etc.
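For experimenting outside Rails without reopening Class, the same idea can be sketched as a standalone helper. The helper name descendants_of and the Animal classes are hypothetical, and this is a simplified variant rather than ActiveSupport's implementation: it scans every loaded class and keeps those that inherit from the given one.

```ruby
# Hypothetical helper, not part of ActiveSupport.
# Module#< is true only for strict descendants, so klass itself is excluded.
def descendants_of(klass)
  ObjectSpace.each_object(Class).select do |candidate|
    candidate < klass && !candidate.singleton_class?
  end
end

class Animal; end
class Dog < Animal; end
class Puppy < Dog; end
class Cat < Animal; end

# The order ObjectSpace yields objects in is not guaranteed, so sort for display.
p descendants_of(Animal).map(&:name).sort # => ["Cat", "Dog", "Puppy"]
p descendants_of(Puppy)                   # => []
```

As with the ActiveSupport version, only classes that have already been loaded can appear in the result.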
At RailsConf 2018, Ryan Laughlin gave a talk on 'checkups'. The video is worth a watch, but we'll just extract one idea, which is to periodically run through all rows in your database and check if they pass your models' validity checks. You may be surprised how many rows in your database don't pass the #valid? test.
The question, then, is how do we implement this check without having to manually maintain a list of models? #descendants is the answer:
# Ensure all models are loaded (should not be necessary in production)
Rails.application.eager_load! if Rails.env.development?

ApplicationRecord.descendants.each do |model_class|
  # abstract classes have no table of their own, so skip them
  next if model_class.abstract_class?

  # in the real world you'd want to send this off to background job(s);
  # find_each loads records in batches instead of all at once
  model_class.find_each do |record|
    if !record.valid?
      Honeybadger.notify("Invalid #{model_class.name} found with ID: #{record.id}")
    end
  end
end
Here, ApplicationRecord.descendants gives us a list of every model in a standard Rails application. In our loop, then, model_class is the class (e.g., User or Product). The implementation here is pretty basic, but the result is that it will iterate through every model (or, more accurately, every subclass of ApplicationRecord) and call .valid? for every row.
Most Rails developers rarely use modules to change the behavior of existing classes. This is for a good reason; if you own the code, there are usually easier ways to customize its behavior, and if you don't own the code, there are risks in changing its behavior with modules. Nevertheless, they have their use-cases, and it is a testament to Ruby's flexibility that not only can we change a class from another file, we also have the option to choose where in the ancestor chain our module appears.
ActiveSupport then comes in to provide the inverse of #ancestors with #descendants. This method is seldom used as far as I've seen, but once you know it's there, you'll probably find more and more uses for it. Personally, I've used it not just for checking model validity, but even with specs to validate that we are correctly adding attribute_alias methods for all our models.