Arel and Nested Queries

Arel is good at facilitating conditional clauses. This post mainly focuses on how to write nested queries with Arel.

Let's look at a simple nested select example:

SELECT *
FROM (
  SELECT * 
  FROM users
  WHERE users.id < 100 ) as old_users

One way to implement this is mixing ActiveRecord model and Arel.

inside_query = User.where("'id' < 100")
outside_query = User.scoped 
outside_query = outside_query.from(Arel.sql("(#{inside_query.to_sql})")).project(Arel.star)

But this solution drags ActiveRecord into the mix. It might have impact on performance.

Another way is to use Arel directly.

user = Arel::Table.new('user')
query = user.where(user[:id].lt(100)).project(Arel.star)
Arel::SelectManager.new(Arel::Table.engine, Arel.sql("(#{query.to_sql})"))

Last line of code is to select from an inner query defined earlier. The inner query needs to be firstly cast to SQL and then wrapped in Arel. Then SelectManager can view that as a table where to select from.

Here is another more complicated example of using Arel for nested queries. This query is to count No. of regular customers and No. of repeated customers, given a time span. The invoices table contains customer id info and payments table has payment information (paid, unpaid). Those two tables join at invoices id. If a customer has made 1 payment, he's a regular customer, and if he's made at least 2 payments, he's a repeated customer.

SELECT Sum(CASE 
             WHEN query.count > 0 
                  AND ( query.status = 'paid' ) THEN 1 
             ELSE 0 
           END) AS customer, 
       Sum(CASE 
             WHEN query.count >= 2 
                  AND ( query.status = 'paid' ) THEN 1 
             ELSE 0 
           END) AS repeat_customer 
FROM   (SELECT invoices.customer_id, 
               Count(invoices.customer_id) AS count, 
               payment.pay_status as status
        FROM   invoices 
               LEFT OUTER JOIN payments 
                            ON invoices.id = payments.invoice_id 
        WHERE  invoices.created_at >= '2014-07-01' 
               AND invoices.created_at <= '2015-07-03' 
        GROUP  BY invoices.customer_id, 
                  payment.pay_status) query 

Translate SQL to Arel:

invoices = Arel::Table.new('invoices')
payments = Arel::Table.new('payments')
query = invoices
  .join(payments, Arel::Nodes::OuterJoin)
    .on(invoices[:id].eq(payments[:invoice_id]))
  .where(invoices[:created_at].gteq('2014-07-01'))
  .where(invoices[:created_at].lteq('2015-07-03'))
  .project(invoices[:customer_id], invoices[:customer_id].count.as('count'))
  .group(invoices[:customer_id], payments[:status].as('newstatus'))

query = query.as(Arel.sql('query'))

manager = Arel::SelectManager.new(invoices.engine)
manager.project(
  Arel::SqlLiteral.new(<<-SQL
    sum(case when query.count >=1 and (query.newstatus='paid') then 1 else 0 end) as customer,
    sum(case when query.count >=2 and (query.newstatus='paid') then 1 else 0 end) as repeat_customer
    SQL
    )
  )
  .from(query)