Rails Dispatch

Rails news delivered fresh

Presented by Engine Yard

ActiveRecord 3.0’s big new feature is a brand new finder API backed by ActiveRelation, or Arel for short. Arel is a Ruby implementation of relational algebra. This week’s screencast is a hands-on demonstration of the new ActiveRecord finder API, and this article covers the topic in further depth. The content is suitable for beginners.

Also, take note of the new Q&A feature at Rails Dispatch. If you’ve got a question for a Rails Dispatch contributor, be sure to submit it here. Take a look at the recently asked questions, too, and vote for the ones you’re most interested in!

The relational model

Let’s start with a gentle introduction. What is the relational model anyway? Most likely you already have a general feel for what it is: SQL is an implementation of the relational model, although not a perfectly faithful one. I’m going to ignore the inconsistencies between the two for the purpose of this article.

The relational model was first proposed by E.F. Codd roughly forty years ago as a way to represent data. Before his proposal, database systems used ad-hoc methods to represent data. He proposed the relational model to provide a declarative method for specifying data and queries.

The relational model consists of two major concepts: relations and operators. Relations structure the data; operators manipulate it. Relations are similar to SQL tables, views, and query results. Operators correspond to the various elements of the SQL language, such as SELECT, WHERE, and JOIN.

One of the most important properties of the relational model is closure: you can take any relation, perform any operation on it, and get back another relation. As a result, any query can be used as the input of another query (which is just the concept of subqueries in SQL).
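
The closure property can be sketched in plain Ruby. The ToyRelation class and its restrict and project operators below are hypothetical names invented for illustration (they are not part of Arel or ActiveRecord). The only point is that every operator returns another relation, so a result can always be fed into a further operation.

```ruby
require "set"

# A toy relation: a frozen set of tuples (hashes). Hypothetical, for illustration only.
class ToyRelation
  attr_reader :tuples

  def initialize(tuples)
    @tuples = Set.new(tuples).freeze
  end

  # Restriction (roughly SQL's WHERE): keep the tuples for which the block is true.
  def restrict(&predicate)
    ToyRelation.new(tuples.select(&predicate))
  end

  # Projection (roughly SQL's SELECT col, ...): keep only the named attributes.
  def project(*attributes)
    ToyRelation.new(tuples.map { |t| t.select { |key, _| attributes.include?(key) } })
  end
end

posts = ToyRelation.new([
  { :author => "carllerche", :category => "awesome", :title => "Arel" },
  { :author => "carllerche", :category => "lame",    :title => "Bugs" },
  { :author => "wycats",     :category => "awesome", :title => "Bundler" }
])

# Each operator returns a ToyRelation, so the output of one is the input of the next.
mine   = posts.restrict { |t| t[:author] == "carllerche" }
titles = mine.restrict { |t| t[:category] == "awesome" }.project(:title)
```

Here mine plays the role of a subquery: it is an ordinary relation that the next restriction and projection build on, and posts itself is left untouched.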

ActiveRecord 3.0

So now, instead of ActiveRecord being a tool to execute SQL queries, it is a tool for building relations. The difference is subtle, but important. Using the relational model is the correct abstraction, as we will see throughout the remainder of the article. First, let me point out a few important changes.

@posts = Post.order("created_at DESC")

In the previous example, @posts is an instance of the ActiveRecord::Relation class. This differs from previous versions of ActiveRecord, where @posts would have been an array of Post instances.

Also, it is important to note that relations should be considered immutable. A relation is to relational algebra just as a number (like 2) is to elementary algebra. It would be a crazy world if 2 could be mutated to represent a different number.

Chainability

The most obvious win of the new API is that operations can be chained. This is due to the property of closure that I mentioned above. For example, the two following code snippets are equivalent.

Post.where(:author => "carllerche", :category => "awesome")
Post.where(:author => "carllerche").where(:category => "awesome")

It’s not just #where that can be chained, but any query method. For example:

Post.where(:author => "carllerche").order("created_at desc")

The relation object could even be stored in a variable and used multiple times to define subqueries. Perhaps something like:

my_posts = Post.where(:author => "carllerche")
my_awesome_posts = my_posts.where(:category => "awesome")
my_lame_posts = my_posts.where(:category => "lame") # empty
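
To see why chaining and reuse are safe, here is a minimal sketch of an immutable, chainable query object. ToyQuery is a hypothetical stand-in, not ActiveRecord::Relation: it only records conditions and orderings and never talks to a database. Each call returns a new frozen object, which is why my_posts can be refined twice without the two results interfering.

```ruby
# Hypothetical stand-in for a relation: it only records what was asked for.
class ToyQuery
  attr_reader :conditions, :orderings

  def initialize(conditions = [], orderings = [])
    @conditions = conditions.dup.freeze
    @orderings  = orderings.dup.freeze
    freeze # the query object itself can no longer be modified
  end

  # Returns a NEW query with the extra [key, value] conditions appended.
  def where(extra)
    ToyQuery.new(conditions + extra.to_a, orderings)
  end

  # Returns a NEW query with one more ordering clause appended.
  def order(clause)
    ToyQuery.new(conditions, orderings + [clause])
  end

  def ==(other)
    other.is_a?(ToyQuery) &&
      conditions == other.conditions &&
      orderings == other.orderings
  end
end

base = ToyQuery.new

# One call with two conditions, or two chained calls: the recorded query is the same.
one_call  = base.where(:author => "carllerche", :category => "awesome")
two_calls = base.where(:author => "carllerche").where(:category => "awesome")

# Reusing a stored query: each refinement starts from an unchanged my_posts.
my_posts         = base.where(:author => "carllerche")
my_awesome_posts = my_posts.where(:category => "awesome")
my_lame_posts    = my_posts.where(:category => "lame")
```

In this sketch one_call == two_calls holds, and my_posts still carries only the author condition after both refinements.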

Laziness

Relations are lazily materialized. This means that the query is not triggered until the results are actually used. This makes creating relations virtually free. The obvious use case is fragment caching. The relations could be built whether or not a fragment is cached.

def index
  @posts = Post.where(:author => "carllerche")
end

<% cache do %>
  <% @posts.each do |post| %>
    <h2><%= post.title %></h2>
    <%= post.some_expensive_method %>
  <% end %>
<% end %>

The query is executed only at the moment @posts.each is called. If the template fragment is cached, that call never happens, so the query never runs. Previously, the query in the controller had to be wrapped in a conditional that checked whether or not the fragment was already cached.
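
The lazy behaviour can be imitated in a few lines of plain Ruby. LazyPosts below is a hypothetical class, not ActiveRecord’s implementation: the block passed to it stands in for the database query, and a counter records how often it runs. Memoizing the result after the first load is a design choice of this sketch.

```ruby
# Hypothetical lazy collection: the loader block plays the role of the SQL query.
class LazyPosts
  include Enumerable

  def initialize(&loader)
    @loader  = loader
    @records = nil
  end

  # The "query" runs the first time the results are enumerated, then is memoized.
  def each(&block)
    @records ||= @loader.call
    @records.each(&block)
  end

  def loaded?
    !@records.nil?
  end
end

query_count = 0
posts = LazyPosts.new do
  query_count += 1 # pretend this is the trip to the database
  ["First post", "Second post"]
end

# Cached fragment: the template body is skipped, so posts is never enumerated.
fragment_cached = true
posts.each { |title| puts title } unless fragment_cached
count_after_cached_render = query_count

# Uncached render: enumerating the collection triggers the "query" once.
titles = posts.map { |title| title.upcase }
```

Building posts costs nothing in this sketch; count_after_cached_render stays at zero because nothing ever asked for the records.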

The new named_scope

#named_scope has always been one of my favorite features. It’s a clean and elegant way to abstract often-used queries. In ActiveRecord 3.0, it’s even cleaner and more elegant. Take the following example:

class Post < ActiveRecord::Base
  scope :my_awesome_posts, where(:author => "carllerche").where(:category => "awesome")
end

Note how the query API is exactly the same between building scopes and building queries. Everything that can be done with building queries can be done with building scopes. All a scope really is, is a relation that can be accessed via a method. So, it can be manipulated just the same as any other relation.

Post.my_awesome_posts.order("created_at desc")

In my opinion, the uniformity between the various APIs that has been achieved is quite nice. The above scope example is more or less equivalent to doing:

class Post < ActiveRecord::Base
  def self.my_awesome_posts
    where(:author => "carllerche").where(:category => "awesome")
  end
end

But it’s such a common idiom that the single-line API is definitely worth it. The upshot is that all three ways of querying (finders, named scopes, and with_scope) use exactly the same API. One nice side effect is that refactoring code from a controller to a model is literally a matter of copy-and-paste.
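
The equivalence between scope and a class method can be illustrated with a toy version of the macro. Everything below is a hypothetical sketch rather than the Rails implementation: ToyScoping defines a singleton method that returns the object it was given, ConditionList stands in for a relation, and this Post is a plain Ruby class, not an ActiveRecord model.

```ruby
# Hypothetical sketch of a `scope` macro; the real Rails implementation differs.
module ToyScoping
  # Defines a class-level method `name` that returns the given relation.
  def scope(name, relation)
    define_singleton_method(name) { relation }
  end
end

# Stand-in for a relation: a list of [key, value] conditions.
ConditionList = Struct.new(:conditions) do
  # Returns a new list with the extra conditions appended.
  def where(extra)
    ConditionList.new(conditions + extra.to_a)
  end
end

class Post
  extend ToyScoping

  def self.where(conditions)
    ConditionList.new([]).where(conditions)
  end

  # One-line form, written the same way as in the article.
  scope :my_awesome_posts, where(:author => "carllerche").where(:category => "awesome")

  # Hand-written class method building the same relation.
  def self.my_awesome_posts_by_hand
    where(:author => "carllerche").where(:category => "awesome")
  end
end

# A scope returns a relation, so it can be refined like any other.
refined = Post.my_awesome_posts.where(:title => "Arel")
```

One difference is visible in this sketch: the argument to scope is evaluated once, when the class body runs, whereas the body of the hand-written method is evaluated on every call.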

Available finder methods

Relations have the following finder methods available to them:

  • where
  • order
  • limit
  • offset
  • includes
  • joins
  • select
  • having
  • group
  • lock
  • readonly
  • from

These methods are analogous to the hash keys by the same name in ActiveRecord 2.x, except for #where and #having, which replace the :conditions hash key. The main difference, as explained above, is that they return new relation objects. More detailed documentation on how to use these new query methods can be found at the Rails guide on ActiveRecord querying.

Under the hood

Behind ActiveRecord’s shiny new API is Arel. Arel is composed of two major parts. The first is the algebra side of things which handles representing relations and operators. This is what ActiveRecord uses behind the scenes for its query API. The second component is the set of engines, which materialize a relation. This materialization happens when a relation is enumerated (usually via a call to each).

An engine is a class that responds to CRUD. In other words, it must implement #create, #read, #update, and #delete, where each method accepts a single argument: a relation instance. There are currently two engines that are built for Arel: a SQL engine and an in-memory engine. The SQL engine is what ActiveRecord uses. There should be no reason why custom engines could not be built. For example, it should be possible to build an engine that implements Twitter’s API. This would be an exercise left up to the community.
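
As a rough sketch of that contract, the class below responds to the four CRUD methods, each taking a single relation argument. Everything here is hypothetical: ToyMemoryEngine and the ToyRel struct are invented for illustration and do not mirror Arel’s actual engine classes or relation objects.

```ruby
# Hypothetical relation description: which rows to match, and what to write.
ToyRel = Struct.new(:matching, :attributes)

# Hypothetical engine that keeps its rows (hashes) in an array.
class ToyMemoryEngine
  def initialize
    @rows = []
  end

  # Insert the relation's attributes as a new row.
  def create(relation)
    @rows << relation.attributes.dup
    self
  end

  # Return copies of the rows that match the relation's conditions.
  def read(relation)
    @rows.select { |row| matches?(row, relation) }.map(&:dup)
  end

  # Merge the relation's attributes into every matching row; returns the count.
  def update(relation)
    targets = @rows.select { |row| matches?(row, relation) }
    targets.each { |row| row.merge!(relation.attributes) }
    targets.size
  end

  # Remove every matching row; returns the number of rows removed.
  def delete(relation)
    before = @rows.size
    @rows.reject! { |row| matches?(row, relation) }
    before - @rows.size
  end

  private

  # A row matches when every condition key has the expected value.
  def matches?(row, relation)
    relation.matching.all? { |key, value| row[key] == value }
  end
end

engine = ToyMemoryEngine.new
engine.create(ToyRel.new({}, { :author => "carllerche", :category => "awesome" }))
engine.create(ToyRel.new({}, { :author => "carllerche", :category => "lame" }))

awesome   = engine.read(ToyRel.new({ :category => "awesome" }, {}))
updated   = engine.update(ToyRel.new({ :category => "lame" }, { :category => "awesome" }))
deleted   = engine.delete(ToyRel.new({ :author => "carllerche" }, {}))
remaining = engine.read(ToyRel.new({}, {}))
```

An engine for a remote service would keep the same four entry points and swap the array operations for network calls.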

Conclusion

This has only been a brief introduction to the new ActiveRecord 3.0 query API and Arel integration that’s coming with Rails 3.0. The current features are just the start. As work continues, Arel will be developed further and the integration will become tighter. If you want to follow along, the source for Arel can be found here on GitHub. If you want to learn more about the relational model, the wiki page is a great introduction. So is Database in Depth: Relational Theory for Practitioners.