Pluck is an ActiveRecord calculation method, introduced in Nov 2011, that returns a collection of values by performing a single-column SELECT query as a direct SQL call.
Now what problem could this help solve and how does it affect us?
How often have you run into some code similar to this?
users = User.all
=> [#<User id: 1, email: 'email@example.com', active: true>, #<User id: 2, email: 'firstname.lastname@example.org', active: false>]
users.map(&:email)
=> ['email@example.com', 'firstname.lastname@example.org']
# I've separated the AR result from the ruby #map for clarity
# this would normally be called as User.all.map(&:email)
What is happening here?
We are loading every record of the ActiveRecord model, with all of its columns, and then asking for a new collection containing each record’s email.
A first improvement is .select. It still returns AR records, but each record carries only the explicitly requested attributes. The User class is still instantiated for every result. Depending on your memory constraints this may not be a problem, but it still uses memory that never needed to be allocated.
emails = User.select(:email)
=> [#<User email: 'email@example.com'>, #<User email: 'firstname.lastname@example.org'>]
emails.map(&:email)
=> ['email@example.com', 'firstname.lastname@example.org']
We’re getting closer. At least we are now only pulling back the data we care about. This is only marginally better than our first attempt and it feels weird. We’ve asked to only select a particular attribute and then we have to ask for it to be the only thing in a collection.
Let’s try one more time using .pluck.
User.pluck(:email)
=> ['email@example.com', 'firstname.lastname@example.org']
Whoa! Well that’s a bit different. Definitely less code but what is it doing?
.pluck is a (misplaced?) calculation method that, as described above, returns an array of values for the requested database column. Where the previous examples built AR User records and then performed the map, this one goes directly to the database with a SQL call and returns only the values needed, without instantiating a User for each row. In many cases this is far cleaner and definitely more efficient.
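To make the difference concrete without a database, here is a minimal pure-Ruby sketch. ROWS, FakeUser, and the helper methods are hypothetical stand-ins invented for illustration, not ActiveRecord internals. The sketch only counts how many model objects each approach builds along the way.

```ruby
# Hypothetical in-memory "table"; each hash stands in for one database row.
ROWS = [
  { id: 1, email: 'email@example.com', active: true },
  { id: 2, email: 'firstname.lastname@example.org', active: false }
].freeze

# Hypothetical stand-in for a model class; it counts every instantiation.
class FakeUser
  @instances = 0

  class << self
    attr_accessor :instances
  end

  def initialize(attributes)
    @attributes = attributes
    self.class.instances += 1
  end

  def email
    @attributes.fetch(:email)
  end
end

# Like User.all.map(&:email): every column is loaded, one object per row.
def all_then_map
  ROWS.map { |row| FakeUser.new(row) }.map(&:email)
end

# Like User.select(:email).map(&:email): one column, but still one object per row.
def select_then_map
  ROWS.map { |row| FakeUser.new({ email: row[:email] }) }.map(&:email)
end

# Like User.pluck(:email): raw column values, no model objects at all.
def pluck_email
  ROWS.map { |row| row[:email] }
end

# Resets the counter, runs the block, and reports how many objects were built.
def instances_built
  FakeUser.instances = 0
  yield
  FakeUser.instances
end

puts instances_built { all_then_map }    # 2
puts instances_built { select_then_map } # 2
puts instances_built { pluck_email }     # 0
```

All three helpers return the same array of emails. Only the pluck-style version gets there without building an intermediate object per row, which is the saving the paragraph above describes.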
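In addition, other relation methods can be chained before .pluck, and they shape the query that is ultimately generated. For example, scoping to active users with where(active: true) before calling pluck(:email) generates:
SELECT email FROM "users" WHERE "users"."active" = 't'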
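In a Rails app, a chained call of this kind would typically be written User.where(active: true).pluck(:email). As a rough illustration of how chaining feeds into the final query, here is a toy sketch. TinyRelation and pluck_sql are hypothetical names, the quoting is deliberately simplified, and nothing here reflects how ActiveRecord builds SQL internally. The class only assembles strings and never touches a database.

```ruby
# Hypothetical toy relation: builds SQL strings only, no database involved.
class TinyRelation
  def initialize(table, conditions = [])
    @table = table
    @conditions = conditions
  end

  # Returns a NEW relation carrying the extra equality conditions,
  # which is what lets calls be chained.
  def where(filters)
    added = filters.map do |column, value|
      format('"%s"."%s" = %s', @table, column, quote(value))
    end
    TinyRelation.new(@table, @conditions + added)
  end

  # Builds the single-column SELECT a pluck-style call would issue.
  def pluck_sql(column)
    sql = format('SELECT %s FROM "%s"', column, @table)
    sql += " WHERE #{@conditions.join(' AND ')}" unless @conditions.empty?
    sql
  end

  private

  # Simplified literal quoting (an assumption for this sketch);
  # booleans use the 't'/'f' style seen in the query in the text.
  def quote(value)
    case value
    when true then "'t'"
    when false then "'f'"
    when Numeric then value.to_s
    else "'#{value.to_s.gsub("'", "''")}'"
    end
  end
end

users = TinyRelation.new('users')
puts users.pluck_sql(:email)
puts users.where(active: true).pluck_sql(:email)
```

Because where returns a new relation instead of running anything, the SQL is only assembled at the final pluck-style call, so every condition chained before it ends up in the single query that is sent.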