Francis's Octopress Blog

A blogging framework for hackers.

Ruby Performance Tricks – Posted by Sergey Potapov

http://greyblake.com/blog/2012/09/02/ruby-perfomance-tricks/

I did some benchmarks to find out which ways of writing code work faster, and I want to share the results with you. All benchmarks were run against Ruby 1.9.3p194 MRI.

Do not use exceptions for control flow

The next example is contrived, but it shows how much slower exceptions are than conditional statements.

require 'benchmark'

class Obj
  def with_condition
    respond_to?(:mythical_method) ? self.mythical_method : nil
  end

  def with_rescue
    self.mythical_method
  rescue NoMethodError
    nil
  end
end

obj = Obj.new
N = 10_000_000

puts RUBY_DESCRIPTION

Benchmark.bm(15, "rescue/condition") do |x|
  rescue_report     = x.report("rescue:")    { N.times { obj.with_rescue  } }
  condition_report  = x.report("condition:") { N.times { obj.with_condition } }
  [rescue_report / condition_report]
end

MRI 1.9.3:

ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-linux]
                        user     system      total        real
rescue:           111.530000   2.650000 114.180000 (115.837103)
condition:          2.620000   0.010000   2.630000 (  2.633154)
rescue/condition:  42.568702 265.000000        NaN ( 43.991767)

MRI 1.8.7 (REE has similar result):

ruby 1.8.7 (2011-12-28 patchlevel 357) [x86_64-linux]
                        user     system      total        real
rescue:            80.510000   0.940000  81.450000 ( 81.529022)
if:                 3.320000   0.000000   3.320000 (  3.330166)
rescue/condition:  24.250000        inf       -nan ( 24.481970)

String concatenation

Avoid using += to concatenate strings in favor of the << method. The end result is the same: a string appended to the end of an existing one. So what is the difference?

See the example:

str1 = "first"
str2 = "second"
str1.object_id       # => 16241320

str1 += str2    # str1 = str1 + str2
str1.object_id  # => 16241240, id is changed

str1 << str2
str1.object_id  # => 16241240, id is the same

When you use +=, Ruby creates a temporary object which is the result of str1 + str2. Then it overwrites the str1 variable with a reference to the newly built object. On the other hand, << modifies the existing object. Using += therefore has the following disadvantages:

  • More calculation to join strings.
  • A redundant string object in memory (the previous value of str1), which brings the next GC run closer.

How slow is +=? Basically, it depends on the length of the strings you operate on.

require 'benchmark'

N = 1000
BASIC_LENGTH = 10

5.times do |factor|
  length = BASIC_LENGTH * (10 ** factor)
  puts "_" * 60 + "\nLENGTH: #{length}"

  # (the rest of this block was truncated; a minimal reconstruction matching the output below)
  Benchmark.bm(10, '+= VS <<') do |x|
    str1, str2 = "", "s" * length
    concat_report = x.report("+=") { N.times { str1 += str2 } }

    str1, str2 = "", "s" * length
    shift_report = x.report("<<") { N.times { str1 << str2 } }

    [concat_report / shift_report]
  end
end

Output:

____________________________________________________________
LENGTH: 10
                 user     system      total        real
+=           0.000000   0.000000   0.000000 (  0.004671)
<<           0.000000   0.000000   0.000000 (  0.000176)
+= VS <<          NaN        NaN        NaN ( 26.508796)
____________________________________________________________
LENGTH: 100
                 user     system      total        real
+=           0.020000   0.000000   0.020000 (  0.022995)
<<           0.000000   0.000000   0.000000 (  0.000226)
+= VS <<          Inf        NaN        NaN (101.845829)
____________________________________________________________
LENGTH: 1000
                 user     system      total        real
+=           0.270000   0.120000   0.390000 (  0.390888)
<<           0.000000   0.000000   0.000000 (  0.001730)
+= VS <<          Inf        Inf        NaN (225.920077)
____________________________________________________________
LENGTH: 10000
                 user     system      total        real
+=           3.660000   1.570000   5.230000 (  5.233861)
<<           0.000000   0.010000   0.010000 (  0.015099)
+= VS <<          Inf 157.000000        NaN (346.629692)
____________________________________________________________
LENGTH: 100000
                 user     system      total        real
+=          31.270000  16.990000  48.260000 ( 48.328511)
<<           0.050000   0.050000   0.100000 (  0.105993)
+= VS <<   625.400000 339.800000        NaN (455.961373)

Be careful with calculations within iterators

Assume you need to write a function that converts an array into a hash where the keys and values are the same as the elements of the array:

 

func([1, 2, 3])  # => {1 => 1, 2 => 2, 3 => 3}

The next solution would satisfy the requirements:

def func(array)
  array.inject({}) { |h, e| h.merge(e => e) }
end

This would be extremely slow for big inputs because it contains nested iteration (inject and merge), making it an O(n²) algorithm, although the problem can obviously be solved in O(n). Consider the next version:

 

def func(array)
  array.inject({}) { |h, e| h[e] = e; h }
end

In this case we do only one iteration over the array without any heavy calculation within the iterator.

See the benchmark:

require 'benchmark'

def n_func(array)
  array.inject({}) { |h, e| h[e] = e; h }
end

def n2_func(array)
  array.inject({}) { |h, e| h.merge(e => e) }
end

BASE_SIZE = 10

4.times do |factor|
  size   = BASE_SIZE * (10 ** factor)
  params = (0..size).to_a
  puts "_" * 60 + "\nSIZE: #{size}"
  Benchmark.bm(10) do |x|
    x.report("O(n)" ) { n_func(params)  }
    x.report("O(n2)") { n2_func(params) }
  end
end

Output:

____________________________________________________________
SIZE: 10
                user     system      total        real
O(n)        0.000000   0.000000   0.000000 (  0.000014)
O(n2)       0.000000   0.000000   0.000000 (  0.000033)
____________________________________________________________
SIZE: 100
                user     system      total        real
O(n)        0.000000   0.000000   0.000000 (  0.000043)
O(n2)       0.000000   0.000000   0.000000 (  0.001070)
____________________________________________________________
SIZE: 1000
                user     system      total        real
O(n)        0.000000   0.000000   0.000000 (  0.000347)
O(n2)       0.130000   0.000000   0.130000 (  0.127638)
____________________________________________________________
SIZE: 10000
                user     system      total        real
O(n)        0.020000   0.000000   0.020000 (  0.019067)
O(n2)      17.850000   0.080000  17.930000 ( 17.983827)



It’s an obvious and trivial example. Just keep in mind not to do heavy calculations within iterators when you can avoid it.

Use bang! methods

In many cases bang methods do the same as their non-bang analogues but without duplicating an object. The previous example would be much faster with merge!:

require 'benchmark'

def merge!(array)
  array.inject({}) { |h, e| h.merge!(e => e) }
end

def merge(array)
  array.inject({}) { |h, e| h.merge(e => e) }
end

N = 10_000
array = (0..N).to_a

Benchmark.bm(10) do |x|
  x.report("merge!") { merge!(array) }
  x.report("merge")  { merge(array)  }
end

Output:

                 user     system      total        real
merge!       0.010000   0.000000   0.010000 (  0.011370)
merge       17.710000   0.000000  17.710000 ( 17.840856)
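One caveat worth keeping in mind (a small sketch, not part of the benchmark above): bang methods gain their speed by mutating the receiver, so only use them when nothing else relies on the original object.

defaults = { :per_page => 20 }
options  = defaults.merge(:per_page => 50)   # defaults is untouched
defaults.merge!(:per_page => 50)             # defaults itself is modified in place
defaults  # => {:per_page=>50}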



 

Use instance variables

Accessing instance variables directly is about twice as fast as accessing them through accessor methods:

require 'benchmark'

class Metric
  attr_accessor :var

  def initialize(n)
    @n   = n
    @var = 22
  end

  def run
    Benchmark.bm(10) do |x|
      x.report("@var") { @n.times { @var } }
      x.report("var" ) { @n.times { var  } }
      x.report("@var =")     { @n.times {|i| @var = i     } }
      x.report("self.var =") { @n.times {|i| self.var = i } }
    end
  end
end

metric = Metric.new(100_000_000)
metric.run

Output:

                 user     system      total        real
@var         6.980000   0.010000   6.990000 (  7.193725)
var         13.040000   0.000000  13.040000 ( 13.131711)
@var =       7.960000   0.000000   7.960000 (  8.242603)
self.var =  14.910000   0.010000  14.920000 ( 15.960125)

Parallel assignment is slower

require 'benchmark'

N = 10_000_000

Benchmark.bm(15) do |x|
  x.report('parallel') do
    N.times do
      a, b = 10, 20
    end
  end

  x.report('consequentially') do
    N.times do
      a = 10
      b = 20
    end
  end
end

Output:

                      user     system      total        real
parallel          1.900000   0.000000   1.900000 (  1.928063)
consequentially   0.880000   0.000000   0.880000 (  0.879675)



 

Dynamic method definition

What is the faster way to define a method dynamically: class_eval with a string of code, or define_method? And which way produces methods that run faster?

require 'benchmark'

# (the original listing was truncated; a minimal reconstruction that
#  matches the four reports in the output below)
class Metric
  N = 1_000_000

  def self.class_eval_with_string
    N.times { |i| class_eval("def string_method_#{i}; #{i}; end") }
  end

  def self.with_define_method
    N.times { |i| define_method("dynamic_method_#{i}") { i } }
  end
end

metric = Metric.new

Benchmark.bm(22) do |x|
  x.report("class_eval with string") { Metric.class_eval_with_string }
  x.report("define_method")          { Metric.with_define_method }
  x.report("string method")          { Metric::N.times { metric.string_method_0 } }
  x.report("dynamic method")         { Metric::N.times { metric.dynamic_method_0 } }
end

Output:

                             user     system      total        real
class_eval with string 219.840000   0.720000 220.560000 (221.933074)
define_method           61.280000   0.240000  61.520000 ( 62.070911)
string method            0.110000   0.000000   0.110000 (  0.111433)
dynamic method           0.150000   0.000000   0.150000 (  0.156537)

So class_eval is slower at definition time, but it’s preferred since methods generated with class_eval and a string of code run faster.

Danke.

37signals Earns Millions Each Year. Its CEO’s Model? His Cleaning Lady

 

Don’t build a fast company, Jason Fried tells Fast Company. Build a slow one.

 

Jason Fried is a founder and CEO of 37signals, a software company based in Chicago. Fried also treats 37signals as something of a laboratory for innovative workplace practices–such as a recent experiment in shortening the summer workweek to just four days. We caught up with Fried to learn how employees are like fossil fuels, how a business can be like a cancer, and how one of the entrepreneurs he admires most is his cleaning lady.

FAST COMPANY: You have your employees only work four-day weeks in the summer.

JASON FRIED: Sometimes people are not really used to working just four days and actually want to stay to get more work done.

You’re saying you have people who actually want to stay the fifth day?

When we first started this a few years ago, there was a small sense of guilt in a few corners. People were like, “I have stuff to get done, it’s Thursday, so I’m gonna work Friday and just get it done.” But we actually preferred that they didn’t. There are very few things that can’t wait till Monday.

How many employees would stay to work Fridays?

I don’t know.

Because you weren’t there!

We don’t track things in that way. I don’t look at that. I don’t want to encourage that kind of work. I want to encourage quality work.

As CEO, wouldn’t it simply be rational to let people work the fifth day for you if they wanted?

If you’re a short-term thinker you’d think so, but we’re long-term thinkers. We’re about being in business for the long haul and keeping the team together over the long haul. I would never trade a short-term burst for a long-term decline in morale. That happens a lot in the tech world: They burn people out and get someone else. I like the people who work here too much. I don’t want them to burn out. Lots of startups burn people out with 60, 70, 80 hours of work per week. They know that either the people or the company will flame out or be bought or whatever, and they don’t care, they just burn their resources. It’s like drilling for as much oil as you possibly can. You can look at people the same way.

Are we reaching “peak people”?

It seems like in a lot of companies we are. There’s a shortage of talent out there, and if there’s a shortage of resources, you want to conserve those resources.

So you think there’s a slash-and-burn mentality in the tech world?

For sure. I think there’s a lot of lottery-playing going on right now. Companies staffing up, raising a bunch of money, hiring a bunch of people, and burning them out in the hopes that they’ll hit the lottery.

You seem like too nice a guy to name names–but do you have certain companies in mind?

I won’t name names. I used to name names. But I think all you have to do is read TechCrunch. Look at what the top stories are, and they’re all about raising money, how many employees they have, and these are metrics that don’t matter. What matters is: Are you profitable? Are you building something great? Are you taking care of your people? Are you treating your customers well? In the coverage of our industry as a whole, you’ll rarely see stories about treating customers well, about people building a sustainable business. TechCrunch to me is the great place to look to see the sickness in our industry right now.

Our magazine is called Fast Company, but it sounds like you want to build a slow company.

I’m a fan of growing slowly, carefully, methodically, of not getting big just for the sake of getting big. I think that rapid growth is typically a symptom of… there’s a sickness there. There’s a great quote by a guy named Ricardo Semler, author of the book Maverick. He said that only two things grow for the sake of growth: businesses and tumors. We have 35 employees at 37signals. We could have hundreds of employees if we wanted to–our revenues and profits support that–but I think we’d be worse off.

What industries do you look to for inspiration, if not the tech world?

I take my inspiration from small mom-and-pop businesses that have been around for a long time. There are restaurants all over the place that I like to go to that have been around a long time, 30 years or more, and thinking about that, that’s an incredible run. I don’t know what percentage of tech companies have been around 30 years. The other interesting thing about restaurants is you could have a dozen Italian restaurants in the city and they can all be successful. It’s not like in the tech world, where everyone wants to beat each other up, and there’s one winner. Those are the businesses I find interesting–it could be a dry cleaner, a restaurant, a clothing store. Actually, my cleaning lady, for example, she’s great.

Your business icon is your cleaning lady?

She’s on her own, she cleans people’s homes, she’s incredibly nice. She brings flowers every time she cleans, and she’s just respectful and nice and awesome. Why can’t more people be like that? She’s been doing it some twenty-odd years, and that’s just an incredible success story. To me that’s far more interesting than a tech company that’s hiring a bunch of people, just got their fourth round of financing for 12 million dollars, and they’re still losing money. That’s what everyone talks about as being exciting, but I think that’s an absolutely disgusting scenario when it comes to business.

Google Chrome Keyboard and Mouse Shortcuts

Windows keyboard shortcuts

Tab and window shortcuts

Ctrl+N Opens a new window.
Ctrl+T Opens a new tab.
Ctrl+Shift+N Opens a new window in incognito mode.
Press Ctrl+O, then select file. Opens a file from your computer in Google Chrome.
Press Ctrl and click a link. Or click a link with your middle mouse button (or mousewheel). Opens the link in a new tab in the background.
Press Ctrl+Shift and click a link. Or press Shift and click a link with your middle mouse button (or mousewheel). Opens the link in a new tab and switches to the newly opened tab.
Press Shift and click a link. Opens the link in a new window.
Ctrl+Shift+T Reopens the last tab you’ve closed. Google Chrome remembers the last 10 tabs you’ve closed.
Drag a link to a tab. Opens the link in the tab.
Drag a link to a blank area on the tab strip. Opens the link in a new tab.
Drag a tab out of the tab strip. Opens the tab in a new window.
Drag a tab out of the tab strip and into an existing window. Opens the tab in the existing window.
Press Esc while dragging a tab. Returns the tab to its original position.
Ctrl+1 through Ctrl+8 Switches to the tab at the specified position number on the tab strip.
Ctrl+9 Switches to the last tab.
Ctrl+Tab or Ctrl+PgDown Switches to the next tab.
Ctrl+Shift+Tab or Ctrl+PgUp Switches to the previous tab.
Alt+F4 Closes the current window.
Ctrl+W or Ctrl+F4 Closes the current tab or pop-up.
Click a tab with your middle mouse button (or mousewheel). Closes the tab you clicked.
Right-click, or click and hold either the Back or Forward arrow in the browser toolbar. Displays your browsing history in the tab.
Press Backspace, or Alt and the left arrow together. Goes to the previous page in your browsing history for the tab.
Press Shift+Backspace, or Alt and the right arrow together. Goes to the next page in your browsing history for the tab.
Press Ctrl and click either the Back arrow, Forward arrow, or Go button in the toolbar. Or click either button with your middle mouse button (or mousewheel). Opens the button destination in a new tab in the background.
Double-click the blank area on the tab strip. Maximizes or minimizes the window.
Alt+Home Opens your homepage in your current window.

Google Chrome feature shortcuts

Alt+F or Alt+E Opens the wrench menu, which lets you customize and control settings in Google Chrome.
Ctrl+Shift+B Toggles the bookmarks bar on and off.
Ctrl+H Opens the History page.
Ctrl+J Opens the Downloads page.
Shift+Esc Opens the Task Manager.
Shift+Alt+T Sets focus on the first tool in the browser toolbar. You can then use the following shortcuts to move around in the toolbar:
  • Press Tab, Shift+Tab, Home, End, right arrow, and left arrow to move focus to different items in the toolbar.
  • Press Space or Enter to activate toolbar buttons, including page actions and browser actions.
  • Press Shift+F10 to bring up any associated context menu (e.g. browsing history for the Back button).
  • Press Esc to return focus from the toolbar back to the page.
F6 or Shift+F6 Switches focus to the next keyboard-accessible pane. Panes include:
  • Highlights the URL in the address bar
  • Bookmarks bar (if visible)
  • The main web content (including any infobars)
  • Downloads bar (if visible)
Ctrl+Shift+J Opens Developer Tools.
Ctrl+Shift+Delete Opens the Clear Browsing Data dialog.
F1 Opens the Help Center in a new tab (our favorite).
Ctrl+Shift+M Switch between multiple users.

Address bar shortcuts

Use the following shortcuts in the address bar:

Type a search term, then press Enter. Performs a search using your default search engine.
Type a search engine keyword, press Space, type a search term, and press Enter. Performs a search using the search engine associated with the keyword.
Begin typing a search engine URL, press Tab when prompted, type a search term, and press Enter. Performs a search using the search engine associated with the URL.
Ctrl+Enter Adds www. and .com to your input in the address bar and opens the resulting URL.
Type a URL, then press Alt+Enter. Opens the URL in a new tab.
Ctrl+L or Alt+D Highlights the URL.
Ctrl+K or Ctrl+E Places a ‘?’ in the address bar. Type a search term after the question mark to perform a search using your default search engine.
Press Ctrl and the left arrow together. Moves your cursor to the preceding key term in the address bar.
Press Ctrl and the right arrow together. Moves your cursor to the next key term in the address bar.
Ctrl+Backspace Deletes the key term that precedes your cursor in the address bar.
Select an entry in the address bar drop-down menu with your keyboard arrows, then press Shift+Delete. Deletes the entry from your browsing history, if possible.
Click an entry in the address bar drop-down menu with your middle mouse button (or mousewheel). Opens the entry in a new tab in the background.
Press Page Up or Page Down when the address bar drop-down menu is visible. Selects the first or last entry in the drop-down menu.

Webpage shortcuts

Ctrl+P Prints your current page.
Ctrl+S Saves your current page.
F5 or Ctrl+R Reloads your current page.
Esc Stops the loading of your current page.
Ctrl+F Opens the find bar.
Ctrl+G or F3 Finds the next match for your input in the find bar.
Ctrl+Shift+G, Shift+F3, or Shift+Enter Finds the previous match for your input in the find bar.
Click the middle mouse button (or mousewheel). Activates auto-scrolling. As you move your mouse, the page automatically scrolls according to the direction of the mouse.
Ctrl+F5 or Shift+F5 Reloads your current page, ignoring cached content.
Press Alt and click a link. Downloads the target of the link.
Ctrl+U Opens the source of your current page.
Drag a link to bookmarks bar Saves the link as a bookmark.
Ctrl+D Saves your current webpage as a bookmark.
Ctrl+Shift+D Saves all open pages as bookmarks in a new folder.
F11 Opens your page in full-screen mode. Press F11 again to exit full-screen.
Ctrl and +, or press Ctrl and scroll your mousewheel up. Enlarges everything on the page.
Ctrl and -, or press Ctrl and scroll your mousewheel down. Makes everything on the page smaller.
Ctrl+0 Returns everything on the page to normal size.
Space bar Scrolls down the web page.
Home Goes to the top of the page.
End Goes to the bottom of the page.
Press Shift and scroll your mousewheel. Scrolls horizontally on the page.

Text shortcuts

Ctrl+C Copies highlighted content to the clipboard.
Ctrl+V or Shift+Insert Pastes content from the clipboard.
Ctrl+Shift+V Pastes content from the clipboard without formatting.
Ctrl+X or Shift+Delete Deletes the highlighted content and copies it to the clipboard.

Advanced Caching: Part 1 – Caching Strategies

First, let’s start with a brief overview of the different types of caching. We’ll start from 50,000ft and work our way down.

HTTP Caching: Uses HTTP headers (Last-Modified, ETag, If-Modified-Since, If-None-Match, Cache-Control) to determine if the browser can use a locally stored version of the response or if it needs to request a fresh copy from the origin server. Rails makes it easy to use HTTP caching; however, the cache is managed outside your application. You may have noticed the config.cache_control setting and the Rack::Cache, Rack::ETag, and Rack::ConditionalGet middlewares. These are used for HTTP caching.

Page Caching: PRAISE THE GODS if you actually can use page caching in your application. Page caching is the holy grail. Save the entire thing. Don’t hit the stack & give some prerendered stuff back. Great for worthless applications without authentication and other highly dynamic aspects. This essentially works like HTTP caching, but the response will always contain the entire page. With page caching the application is skipping the work.

Action Caching: Essentially the same as page caching, except all the before filters are run, allowing you to check authentication and other things that may have prevented the request from rendering.

Fragment Caching: Store parts of views in the cache. Usually used for caching partials or large bits of HTML that are independent of other parts, e.g. a list of top stories or something like that.

Rails.cache: All cached content except cached pages is stored in Rails.cache. We’ll use this fact later. You can cache arbitrary content in the Rails cache, for example a large, complicated query whose results you don’t want to reinstantiate into a ton of ActiveRecord::Base objects on every request.
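As a sketch of that last point (the model and query here are made up for illustration; the fetch pattern itself is covered below), caching the instantiated records avoids rebuilding them from a slow query on every request:

popular_posts = Rails.cache.fetch('popular-posts', :expires_in => 10.minutes) do
  Post.where(:published => true).limit(100).all  # plain Array, can be marshaled into the cache
end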

Under the Hood

All the caching layers are built on top of the next one. Page caching and HTTP caching are different because they do not use Rails.cache. The cache is essentially a key-value store. Different things can be persisted. Strings are most common (for HTML fragments). More complicated objects can be persisted as well. Let’s go through some examples of manually using the cache to store things. I am using memcached with dalli for all these examples. Dalli is the default memcached driver.
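For reference, pointing the Rails cache at memcached through dalli is a one-line configuration; a minimal sketch (the server address is an assumption):

# config/environments/production.rb
config.cache_store = :dalli_store, 'localhost:11211'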

Rails.cache.write takes two arguments: a key and a value.

Rails.cache.write 'foo', 'bar'  # => true

We can read an object back

Rails.cache.read 'foo'  # => "bar"

We can store a complicated object as well

hash = { :this => { :is => 'a hash' } }
Rails.cache.write 'complicated-object', hash
Rails.cache.read 'complicated-object'  # => {:this=>{:is=>"a hash"}}

If we want something that doesn’t exist, we get nil

Rails.cache.read 'we-havent-cached-this-yet'  # => nil

"Fetch" is the most common pattern. You give it a key and a block to execute on a cache miss. The block's return value is then written to the cache. The block is not executed if there is a hit.

Rails.cache.fetch 'huge-array' do
  huge_array = Array.new
  1_000_000.times { |i| huge_array << i }
  huge_array  # return value is stored in the cache
end
# => [huge array]  # took some time to generate

Rails.cache.read 'huge-array'
# => [huge array]  # but returned instantly

You can also delete everything from the cache:

Rails.cache.clear  # => true

Those are the basics of interacting with the Rails cache. The Rails cache is a wrapper around whatever functionality is provided by the underlying storage system. Now we are ready to move up a layer.

Understanding Fragment Caching

Fragment caching is taking rendered HTML fragments and storing them in the cache. Rails provides a cache view helper for this. Its most basic form takes no arguments besides a block. Whatever is rendered during the block will be written back to the cache. The basic principle behind fragment caching is that it takes much less time to fetch pre-rendered HTML from the cache than it takes to generate a fresh copy. If you haven’t noticed, view generation can be very costly. If you have cacheable content and are not using fragment caching, then you need to implement it right away! Let’s say you have generated a basic scaffold for a post:

$ rails g scaffold post title:string content:text author:string

Let’s start with the most common use case: caching information specific to one thing, e.g. one post. Here is a show view:

Title: <%= @post.title %>

Content: <%= @post.content %>

Let’s say we want to cache this fragment. Simply wrap it in cache and Rails will do it.

<%= cache "post-#{@post.id}" do %>
  <b>Title:</b>
  <%= @post.title %>

  <b>Content:</b>
  <%= @post.content %>
<% end %>

The first argument is the key for this fragment. The rendered HTML is stored with this key: views/post-1. Wait, what? Where did that ‘views’ come from? The cache view helper automatically prepends ‘views’ to all keys. This is important later. When you first load the page you’ll see this in the log:

Exist fragment? views/post-2 (1.6ms)
Write fragment views/post-2 (0.9ms)

You can see the key and the operations. Rails is checking to see if the specific key exists. It will fetch or write it. In this case, it has not been stored so it is written. When you reload the page, you’ll see a cache hit:

Exist fragment? views/post-2 (0.6ms)
Read fragment views/post-2 (0.0ms)

There we go. We got HTML from the cache instead of rendering it. Look at the response times for the two requests:

Completed 200 OK in 17ms (Views: 11.6ms | ActiveRecord: 0.1ms)
Completed 200 OK in 16ms (Views: 9.7ms | ActiveRecord: 0.1ms)

Very small differences in this case: 2ms difference in view generation. This is a very simple example, but it can make a world of difference in more complicated situations.

You are probably asking the question: “What happens when the post changes?” This is an excellent question! Well, if the post changes, the cached content will no longer be correct. It is up to us to remove stuff from the cache or figure out a way to get new content from the cache. Let’s assume that our blog posts now have comments. What happens when a comment is created? How can we handle this?

This is a very simple problem. What if we could figure out a solution to this problem: How can we create a cache miss when the associated object changes? We’ve already demonstrated how we can explicitly set a cache key. What if we made a key that’s dependent on the time the object was last updated? We can create a key composed of the record’s ID and its updated_at timestamp! This way the cache key will change as the content changes and we will not have to expire things manually. (We’ll come back to sweepers later). Let’s change our cache key to this:

<% cache "post-#{@post.id}", @post.updated_at.to_i do %>

Now we can see we have a new cache key that’s dependent on the object’s timestamp. Check out the Rails log:

Exist fragment? views/post-2/1304291241 (0.5ms)
Write fragment views/post-2/1304291241 (0.4ms)

Cool! Now let’s make it so creating a comment updates the post’s timestamp:

class Comment < ActiveRecord::Base
  belongs_to :post, :touch => true
end

Now all comments will touch the post and change the updated_at timestamp. You can see this in action by touch’ing a post.

Post.find(1).touch

Exist fragment? views/post-2/1304292445 (0.4ms)
Write fragment views/post-2/1304292445 (0.4ms)

This concept is known as auto-expiring cache keys. You create a composite key from the normal key and a timestamp. This will create some memory build-up as objects are updated and are no longer fresh. Here’s an example: you have that fragment, it is cached, then someone updates the post. You now have two versions of the fragment cached. If there are 10 updates, then there are 10 different versions. Luckily for you, this is not a problem for memcached! Memcached uses an LRU replacement policy. LRU stands for Least Recently Used. That means the key that hasn’t been requested in the longest time will be replaced by newer content when needed. For example, assume your cache can only hold 10 posts. The next update will create a new key and hence new content. Version 0 will be deleted and version 11 will be stored in the cache. The total amount of memory is cycled between things that are requested.

There are two things to consider in this approach. First, you will not be able to ensure that content is kept in the cache as long as possible. Second, you will never have to worry about expiring things manually as long as timestamps are updated in the model layer. I’ve found it is orders of magnitude easier to add a few :touch => true’s to my relationships than it is to maintain sweepers. More on sweepers later. We must continue exploring cache keys.

Rails uses auto-expiring cache keys by default. The problem is they are not mentioned at all in the documentation or in the guides. There is one very handy method: ActiveRecord::Base#cache_key. This will generate a key like this: posts/2-20110501232725. This is the exact same thing we did ourselves. This method is very important because, depending on what type of arguments you pass into the cache method, a different key is generated. For the time being, this code is functionally equal to our previous examples:

<%= cache @post do %>

The cache helper takes different forms for arguments. Here are some examples:

cache 'explicit-key'       # views/explicit-key
cache @post                # views/posts/2-1283479827349
cache [@post, 'sidebar']   # views/posts/2-2348719328478/sidebar
cache [@post, @comment]    # views/posts/2-2384193284878/comments/1-2384971487
cache :hash => :of_things  # views/localhost:3000/posts/2?hash_of_things

If an Array is the first argument, Rails will use cache key expansion to generate a string key. This means calling cache-key logic on each object, then joining the results together with a ‘/’. Essentially, if the object responds to cache_key, it will use that. Else it will do various things. Here’s the source for expand_cache_key:

def self.expand_cache_key(key, namespace = nil)
  expanded_cache_key = namespace ? "#{namespace}/" : ""

  prefix = ENV["RAILS_CACHE_ID"] || ENV["RAILS_APP_VERSION"]
  if prefix
    expanded_cache_key << "#{prefix}/"
  end

  expanded_cache_key <<
    if key.respond_to?(:cache_key)
      key.cache_key
    elsif key.is_a?(Array)
      if key.size > 1
        key.collect { |element| expand_cache_key(element) }.to_param
      else
        key.first.to_param
      end
    elsif key
      key.to_param
    end.to_s

  expanded_cache_key
end

This is where all the magic happens. Our simple fragment caching example could easily be converted into an idea like this: the post hasn’t changed, so cache the entire result of /posts/1. You can do this with action caching or page caching.

Moving on to Action Caching

Action caching is an around filter for specific controller actions. It is different from page caching since before filters are run and may prevent access to certain pages. For example, you may only want to cache if the user is logged in. If the user is not logged in, they should be redirected to the log in page. This is different than page caching. Page caching bypasses the Rails stack completely. Most web applications of legitimate complexity cannot use page caching. Action caching is the next logical step for most web applications. Let’s break the idea down: if the post hasn’t changed, return the entire cached page as the HTTP response; else render the show view, cache it, and return that as the HTTP response. Or in code:

# Note: you cannot run this code! This is just an example of what's
# happening under the covers using concepts we've already covered.
Rails.cache.fetch 'views/localhost:3000/posts/1' do
  @post = Post.find params[:id]
  render :show
end

Declaring action caching is easy. Here’s how you can cache the show action:

class PostsController < ApplicationController
  caches_action :show

  def show
    # do stuff
  end
end

Now refresh the page and look at what’s been cached.

Started GET "/posts/2" for 127.0.0.1 at 2011-05-01 16:54:43 -0700
Processing by PostsController#show as HTML
  Parameters: {"id"=>"2"}
Read fragment views/localhost:3000/posts/2 (0.5ms)
Rendered posts/show.html.erb within layouts/application (6.1ms)
Write fragment views/localhost:3000/posts/2 (0.5ms)
Completed 200 OK in 16ms (Views: 8.6ms | ActiveRecord: 0.1ms)

Now that the show action for post #2 is cached, refresh the page and see what happens.

Started GET "/posts/2" for 127.0.0.1 at 2011-05-01 16:55:27 -0700
Processing by PostsController#show as HTML
  Parameters: {"id"=>"2"}
Read fragment views/localhost:3000/posts/2 (0.6ms)
Completed 200 OK in 1ms

Damn. 16ms vs 1ms. You can see the difference! You can also see Rails reading that cache key. The cache key is generated from the URL with action caching. Action caching is a combination of a before and an around filter. The around filter is used to capture the output and the before filter is used to check to see if it’s been cached. It works like this:

  • Execute the before filter to check if the cache key exists.
  • Key exists? Read from the cache and return the HTTP response. This triggers a render and prevents any further code from being executed.
  • No key? Call all controller and view code. Cache the output using Rails.cache and return the HTTP response.

Now you are probably asking the same question as before: “What do we do when the post changes?” We do the same thing as before: we create a composite key with a string and a timestamp. The question now is, how do we generate a special key using action caching?

Action caching generates a key from the current url. You can pass extra options using the :cache_path option. Whatever is in this value is passed into url_for using the current parameters. Remember in the view cache key examples what happened when we passed in a hash? We get a much different key than before:

views/localhost:3000/posts/2?hash_of_things

Rails generated a URL-based key instead of the standard views key. This is because you may have different servers. This ensures that each server has its own cache key, i.e. server one does not collide with server two. We could generate our own URL for this resource by doing something like this:

url_for(@post, :tag => @post.updated_at.to_i)

This will generate this URL:

http://localhost:3000/posts/1?tag=234897123978

Notice the ?tag=234897123978. This is a hack that aims to stop browsers from using HTTP caching on specific URLs. If the URL has changed (timestamp changes) then the browser knows it must request a fresh copy. Rails 2 used to do this for assets like CSS and JS. Things have changed with the asset pipeline.

Here’s an example of generating a proper auto-expiring key for use with action caching.

caches_action :show, :cache_path => proc { |c|
  # c is the instance of the controller. Since action caching
  # is declared at the class level, we don't have access to instance
  # variables. If cache_path is a proc, it will be evaluated in
  # the context of the current controller. This is the same idea
  # as validations with the :if and :unless options.
  #
  # Remember, what is returned from this block will be passed in as
  # extra parameters to the url_for method.
  post = Post.find c.params[:id]
  { :tag => post.updated_at.to_i }
}

This calls url_for with the parameters already assigned by the router and whatever is returned by the block. Now if you refresh the page, you’ll have this:

Started GET "/posts/2" for 127.0.0.1 at 2011-05-01 17:11:22 -0700
Processing by PostsController#show as HTML
  Parameters: {"id"=>"2"}
Read fragment views/localhost:3000/posts/2?tag=1304292445 (0.5ms)
Rendered posts/show.html.erb within layouts/application (1.7ms)
Write fragment views/localhost:3000/posts/2?tag=1304292445 (0.5ms)
Completed 200 OK in 16ms (Views: 4.4ms | ActiveRecord: 0.1ms)

And voila! Now we have an expiring cache key for our post! Let’s dig a little deeper. We know the key. Let’s look into the cache and see what it actually is! You can see the key from the log. Look it up in the cache:

Rails.cache.read 'views/localhost:3000/posts/2?tag=1304292445'
# => "<!DOCTYPE html>\n\n....."

It’s just a straight HTML string. Easy to use and return as the body. This method works well for singular resources. How can we handle the index action? I’ve created 10,000 posts. It takes a good amount of time to render that page on my computer. It takes over 10 seconds. The question is, how can we cache this? We could use the most recently updated post for the time stamp. That way, when one post is updated, it will move to the top and create a new cache key. Here is the code without any action caching:

Started GET "/posts" for 127.0.0.1 at 2011-05-01 17:18:11 -0700
Processing by PostsController#index as HTML
  Post Load (54.1ms)  SELECT "posts".* FROM "posts" ORDER BY updated_at DESC LIMIT 1
Read fragment views/localhost:3000/posts?tag=1304292445 (1.5ms)
Rendered posts/index.html.erb within layouts/application (9532.3ms)
Write fragment views/localhost:3000/posts?tag=1304292445 (36.7ms)
Completed 200 OK in 10088ms (Views: 9535.6ms | ActiveRecord: 276.2ms)

Now with action caching:

Started GET "/posts" for 127.0.0.1 at 2011-05-01 17:20:47 -0700
Processing by PostsController#index as HTML
Read fragment views/localhost:3000/posts?tag=1304295632 (1.0ms)
Completed 200 OK in 11ms

Here’s the code for action caching:

caches_action :index, :cache_path => proc { |c|
  { :tag => Post.maximum('updated_at') }
}

We’ll come back to this situation later. There is a better way to do this. Points to the reader if they know the problem.

These are simple examples designed to show you how you can create auto-expiring keys for different situations. At this point we have not had to expire anything ourselves! The keys have done it all for us. However, there are some times when you want more precise control over how things exist in the cache. Enter sweepers.

Sweepers

Sweepers are HTTP-request-dependent observers. They are loaded into controllers and observe models the same way standard observers do. However, there is one very important difference: they are only used during HTTP requests. This means that if you have things being created outside the context of HTTP requests, sweepers will do you no good. For example, say you have a background process running that syncs with an external system. Creating a new model there will never reach any sweeper, so if you have anything cached, it is up to you to expire it. Everything I’ve demonstrated so far can be done with sweepers.

Each cache* method has an opposite expire* method. Here’s the mapping:

caches_page   -> expire_page
caches_action -> expire_action
cache         -> expire_fragment

Their arguments work the same way, using cache key expansion to find a key to read or delete. Depending on the complexity of your application, it may be easy to use sweepers or it may be impossible. It’s easy to use sweepers with these examples. We only need to tie into the save event. For example, when an update or delete happens we need to expire the cache for that specific post. When a create, update, or delete happens we need to expire the index action. Here’s what the sweeper would look like:

class PostSweeper < ActionController::Caching::Sweeper
  observe Post

  def after_create(post)
    expire_action :index
    expire_action :show, :id => post
    # this is the same as the previous line
    expire_action :controller => :posts, :action => :show, :id => post.id
  end
end

Then, in the controller, load the sweeper:

class PostsController < ApplicationController
  cache_sweeper :post_sweeper
end

I will not go into much depth on sweepers because they are the only thing covered in the Rails caching guide. They work, but I feel they are clumsy for complex applications. Let’s say you have comments for posts. What do you do when a comment is created for a post? Well, you have to either create a comment sweeper or load the post sweeper into the comments controller. You can do either. However, depending on the complexity of your model layer, it may quickly become infeasible to do cache expiration with sweepers. For example, let’s say you have a Customer. A customer has 15 different types of associated things. Do you want to put the sweeper into 15 different controllers? You can, but you may forget to at some point.
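For example, loading the same sweeper into the comments controller is just another declaration; a sketch (the :only list is an assumption):

class CommentsController < ApplicationController
  cache_sweeper :post_sweeper, :only => [:create, :update, :destroy]
end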

The real problem with sweepers is that they cannot be used once your application works outside of HTTP requests. They can also be clumsy. I personally feel it’s much easier to create auto-expiring cache keys and only use sweepers when I want to tie into very specific events. I’d also argue that any well designed system does not need sweepers (or at least needs them only minimally).

Now you should have a good grasp on how the Rails caching methods work. We’ve covered how fragment caching uses the current view to generate a cache key. We introduced the concept of auto expiring cache keys using ActiveRecord#cache_key to automatically expire cached content. We introduced action caching and how it uses url_for to generate a cache key. Then we covered how you can pass things into url_for to generate a time stamped key to expire actions automatically. Now that we understand these lower levels we can move up to page caching and HTTP caching.

Page Caching

Page caching bypasses the entire application by serving up a file in /public from disk. It is different from action or fragment caching for two reasons: content is not stored in memory and content is stored directly on the disk. You use page caching the same way you use action caching. This means you can use sweepers and all the other things associated with them. Here’s how it works.

  • The webserver accepts an incoming request: GET /posts.
  • The file exists: /public/posts.html.
  • posts.html is returned.

Your application code is never called. Since pages are written like public assets, they are served as such. You will explicitly have to expire them. Warning! Forgetting to expire pages will cause you grief because your application code will not be called. Here’s an example of page caching:

class PostsController < ApplicationController
  caches_page :index

  def index
    # do stuff
  end
end

When the server receives a request to GET /posts it will write the response from the application to /public/posts.html. The .html part is the format for that request. For example, you can use page caching with JSON: GET /posts.json would generate /public/posts.json.

Page caching is basically poor man’s HTTP caching without any real benefits. HTTP caching is more useful.

I’ve not covered page caching in much depth because it’s very likely that, if you’re reading this, page caching is not applicable to your application. The Rails guides cover page caching in decent fashion. Follow up there if you need more information.

HTTP Caching

HTTP caching is the most complex and powerful caching strategy you can use. With great power comes great responsibility. HTTP caching works at the protocol level. You can configure HTTP caching so the browser doesn’t even need to contact your server at all. There are many ways HTTP caching can be configured. I will not cover them all here. I will give you an overview of how the system works and cover some common use cases.

How It Works

HTTP caching works at the protocol level. It uses a combination of headers and response codes to indicate whether the user agent should make a request or use a locally stored copy instead. The invalidation or expiring is based on ETags and Last-Modified timestamps. ETag stands for “entity tag”. It’s a unique fingerprint for this request, usually a checksum of the response body. Origin servers (computers sending the source content) can set either of these fields along with a Cache-Control header. The Cache-Control header tells the user agent what it can do with this response. It answers questions like: how long can I cache this for, and am I allowed to cache it? When the user agent needs to make a request again, it sends the ETag and/or the Last-Modified date to the origin server. The origin server decides, based on the ETag and/or Last-Modified date, if the user agent can use the cached copy or if it should use new content. If the server says use the cached content, it will return status 304: Not Modified (aka fresh). If not, it should return a 200 (cache is stale) and the new content, which can be cached.

Let’s use curl to see how this works out:

$ curl -I http://www.example.com
HTTP/1.1 200 OK
Cache-Control: max-age=0, private, must-revalidate
Content-length: 822
Content-Type: text/html
Date: Mon, 09 Jul 2012 22:46:29 GMT
Last-Modified: Mon, 09 Jul 2012 21:22:11 GMT
Status: 200 OK
Vary: Accept-Encoding
Connection: keep-alive

The Cache-Control header is a tricky thing. There are many, many ways it can be configured. Here are the two easiest ways to break it down: private means only the final user agent can store the response; public means any server can cache this content. (You know requests may go through many proxies, right?) You can specify an age or TTL. This is how long it can be cached for. Then there is another common distinction: don’t check with the server, or do check with the server. This particular Cache-Control header means: this is private (think per-user cache) and check with the server every time before using it.
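In a Rails controller the simplest way to influence this header is expires_in; a minimal sketch using the Post model from the earlier scaffold (the ten-minute TTL is an arbitrary choice):

def show
  @post = Post.find params[:id]
  expires_in 10.minutes, :public => true  # sends "Cache-Control: max-age=600, public"
  respond_with @post
end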

We can trigger a cache hit by sending the appropriate headers with the next request. The response above only has a Last-Modified date. We can send this date for the server to compare, in the If-Modified-Since header. If the content hasn’t changed since that date, the server should return a 304. Here’s an example using curl:

$ curl -I -H "If-Modified-Since: Mon, 09 Jul 2012 21:22:11 GMT" http://www.example.com
HTTP/1.1 304 Not Modified
Cache-Control: max-age=0, private, must-revalidate
Date: Mon, 09 Jul 2012 22:55:53 GMT
Status: 304 Not Modified
Connection: keep-alive

This response has no body. It simply tells the user agent to use the locally stored version. We could change the date and get a different response.

$ curl -I -H "If-Modified-Since: Sun, 08 Jul 2012 21:22:11 GMT" http://www.example.com
HTTP/1.1 200 OK
Cache-Control: max-age=0, private, must-revalidate
Content-length: 822
Content-Type: text/html
Date: Mon, 09 Jul 2012 22:57:19 GMT
Last-Modified: Mon, 09 Jul 2012 21:22:11 GMT
Status: 200 OK
Vary: Accept-Encoding
Connection: keep-alive

Caches determine freshness based on the If-None-Match and/or If-Modified-Since date. Using our existing 304 response we can supply a random ETag to trigger a cache miss:

$ curl -I -H 'If-None-Match: "foo"' -H "If-Modified-Since: Mon, 09 Jul 2012 21:22:11 GMT" http://www.example.com
HTTP/1.1 304 Not Modified
Cache-Control: max-age=0, private, must-revalidate
Date: Mon, 09 Jul 2012 22:55:53 GMT
Status: 304 Not Modified
Connection: keep-alive

ETags are sent using the If-None-Match header. Now that we understand the basics we can move on to a higher level discussion.

Rack::Cache

HTTP caching is implemented in the webserver itself or at the application level. In Rails it is implemented at the application level. Rack::Cache is a middleware that sits at the top of the stack and intercepts requests. It will pass requests down to your app and store their contents, or it will call down to your app and see what ETag and/or timestamps it returns for validation purposes. Rack::Cache acts as a proxy cache. This means it must respect the caching rules described in the Cache-Control headers coming out of your app: it cannot cache private content, but it can cache public content. Cacheable content is stored in memcached. Rails configures this automatically.

I’ll cover one use case to illustrate how a request flows through the middleware stack to the actual app code and back up. Let’s use a private per-user cache example. Here’s the Cache-Control header: max-age=0, private, must-revalidate. Pretend this is some JSON API.

1. The client sends an initial request to /api/tweets.json.
2. Rack::Cache sees the request and ignores it since there is no caching information along with it.
3. Application code is called. It returns a 200 response with a date and some Cache-Control header.
4. The client makes another request to /api/tweets.json with an If-Modified-Since header matching the date from the previous request.
5. Rack::Cache sees that this request has cache information associated with it. It checks to see how it should handle this request. According to the Cache-Control header it has expired and needs to be revalidated before use.
6. Rack::Cache calls the application code. The application returns a response with the same date.
7. Rack::Cache receives the response, compares the dates, and determines that it’s a hit.
8. Rack::Cache sends a 304 back. The client uses the response body from the request in step 1.

HTTP Caching in Rails

Rails makes it easy to implement HTTP caching inside your controllers. Rails provides two methods: stale? and fresh_when. They both do the same thing but in opposite ways. I prefer to use stale? because it makes more sense to me; stale? reminds me more of Rails.cache.fetch, so I stick with that. stale? works like this: it checks to see if the incoming request’s ETag and/or Last-Modified date matches. If they match, it calls head :not_modified. If not, it can call a block of code to render a response. Here is an example:

def show
  @post = Post.find params[:id]
  stale? @post do
    respond_with @post
  end
end

Using stale? with an ActiveRecord object will automatically set the ETag and Last-Modified headers. The ETag is set to an MD5 hash of the object’s cache_key method. The Last-Modified date is set to the object’s updated_at value. The Cache-Control header is set to max-age=0, private, must-revalidate by default. All these values can be changed by passing in options to stale? or fresh_when. The methods take three options: :etag, :last_modified, and :public. Here are some more examples:

# allow proxy caches to store this result
stale? @post, :public => true do
  respond_with @post
end
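The same action can also be written with fresh_when, which sets the ETag and Last-Modified headers and lets the default render happen only when the cache is stale; a minimal sketch:

def show
  @post = Post.find params[:id]
  fresh_when @post, :public => true
end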

# Let's say your posts are frozen and have no modifications
stale? @post, :etag => @post.posted_at do
  respond_with @post
end

Now you should understand how HTTP caching works. Here are the important bits of code inside Rails showing how it all works.

# File actionpack/lib/action_controller/metal/conditional_get.rb, line 39
def fresh_when(record_or_options, additional_options = {})
  if record_or_options.is_a? Hash
    options = record_or_options
    options.assert_valid_keys(:etag, :last_modified, :public)
  else
    record  = record_or_options
    options = { :etag => record, :last_modified => record.try(:updated_at) }.merge(additional_options)
  end

  response.etag                   = options[:etag] if options[:etag]
  response.last_modified          = options[:last_modified] if options[:last_modified]
  response.cache_control[:public] = true if options[:public]

  head :not_modified if request.fresh?(response)
end

Here is the code for fresh?. This code should help you if you are confused about how requests are validated. I found this code much easier to understand than the official spec.

def fresh?(response)
  last_modified = if_modified_since
  etag          = if_none_match

  return false unless last_modified || etag

  success = true
  success &&= not_modified?(response.last_modified) if last_modified
  success &&= etag_matches?(response.etag) if etag
  success
end

Index

  • Caching Strategies: http://www.broadcastingadam.com/2012/07/advanced_caching_part_1-caching_strategies
  • Using Strategies Effectively: http://www.broadcastingadam.com/2012/07/advanced_caching_part_2-using_strategies
  • Handling Static Assets: http://www.broadcastingadam.com/2012/07/advanced_caching_part_3-static_assets
  • Stepping Outside the HTTP Request: http://www.broadcastingadam.com/2012/07/advanced_caching_part_4-stepping_outside_the_http_request
  • Tag Based Caching: http://www.broadcastingadam.com/2012/07/advanced_caching_part_5-tag_based_caching
  • Fast JSON APIs: http://www.broadcastingadam.com/2012/07/advanced_caching_part_6-fast_json_apis
  • Tips and Tricks: http://www.broadcastingadam.com/2012/07/advanced_caching_part_7-tips_and_tricks
  • Conclusion: http://www.broadcastingadam.com/2012/07/advanced_caching_part_8-conclusion

Contact Me

Find a problem or have a question about this post? @adman65 on Twitter or Adman65 on #freenode. Find me in (#rubyonrails or #sproutcore). You can find my code on GitHub or hit me up on Google+.

Rails Counter Cache

This post is about using a _count column to cache the count of a has_many association.

Look at the Project and Task example:

Projects

<% for project in @projects %>
  <%= link_to project.name, project_path(project) %>
  (<%= pluralize project.tasks.size, 'task' %>)
<% end %>
The view code above displays tasks.size for every project in @projects. Look at the log:
SQL (0.006385)  SELECT count(*) AS count_all FROM tasks WHERE (tasks.project_id = 326)
SQL (0.000220)  SELECT count(*) AS count_all FROM tasks WHERE (tasks.project_id = 327)
SQL (0.000383)  SELECT count(*) AS count_all FROM tasks WHERE (tasks.project_id = 328)
SQL (0.000197)  SELECT count(*) AS count_all FROM tasks WHERE (tasks.project_id = 329)
SQL (0.000215)  SELECT count(*) AS count_all FROM tasks WHERE (tasks.project_id = 330)
The log shows a separate SQL count of tasks for each project. Let's try eager loading and see whether it improves performance:
class ProjectsController < ApplicationController
  def index
    @projects = Project.find(:all, :include => :tasks)
  end
end
Check the log again:
Project Load Including Associations (0.000954)  SELECT projects.'id' AS t0_r0, projects.'name' AS t0_r1, tasks.'id'
AS t1_r0, tasks.'name' AS t1_r1, tasks.'project_id' AS t1_r2 FROM projects LEFT OUTER JOIN tasks ON tasks.project
_id = projects.id
Eager loading indeed gets the job done with a single SQL statement, but the downside is that it fetches every column of the tasks table, much of which we don't need.

Let's look at a better solution:
ruby script/generate migration add_tasks_count
We create a new migration that adds a tasks_count column to the projects table:
class AddTasksCount < ActiveRecord::Migration
  def self.up
    add_column :projects, :tasks_count, :integer, :default => 0

    Project.reset_column_information
    Project.find(:all).each do |p|
      p.update_attribute :tasks_count, p.tasks.length
    end
  end

  def self.down
    remove_column :projects, :tasks_count
  end
end
We also need to tell the Task class to enable the counter cache:
class Task < ActiveRecord::Base
  belongs_to :project, :counter_cache => true
end
Now change the ProjectsController index action back to lazy loading, reload the page, and check the log:
Project Load (0.000295)  SELECT * FROM projects
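A quick sketch of what the counter cache buys you: size reads the cached column, while count always queries the database.

project = Project.find(326)
project.tasks.size   # uses the cached projects.tasks_count column, no COUNT query
project.tasks.count  # still runs SELECT COUNT(*) FROM tasks WHERE project_id = 326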

Magic Column Names in Rails

Active Record gives special "magic" behavior to certain column names:

created_at, created_on, updated_at, updated_on: when a row is created or updated, Rails automatically fills the _at columns with a timestamp and the _on columns with a date.

lock_version: if a table has a lock_version column, Rails tracks the row's version number and performs optimistic locking (see the sketch at the end of this list).

type: tracks the type of a row when using single table inheritance.

id: the default primary key name for a table.

xxx_id: the default foreign key name for a reference to a table named with the plural form of xxx.

xxx_count: maintains a counter cache for the child table xxx.

position: with acts_as_list, holds the row's position in the list.

parent_id: with acts_as_tree, holds the id of the row's parent.
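A small sketch of the optimistic locking behavior mentioned for lock_version above (assuming a Project model whose table has a lock_version column):

first  = Project.find(1)
second = Project.find(1)
first.update_attribute(:name, "Renamed")    # saves and bumps lock_version
second.update_attribute(:name, "Conflict")  # raises ActiveRecord::StaleObjectError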

The Rails flash isn’t just for messages

The Rails flash is typically used for short messages:

app/controllers/sessions_controller.rb

redirect_to root_url, notice: "You have been logged out."

But it can be used for more than that, any time that you redirect and want to pass along some state without making it part of the URL.

These are some things I’ve used it for.

Identifiers for more complex messages

Maybe you want to show a more complex message after signing up, containing things like links and bullet points.

Rather than send all that in the flash, you can send some identifier that your views know how to handle.

This could be the name of a partial:

app/controllers/users_controller.rb

class UsersController < ApplicationController
  def create
    @user = actually_create_user
    flash[:partial] = "welcome"
    redirect_to some_path
  end
end

  app/views/layouts/application.html.haml

- if flash[:partial]
  = render partial: "shared/flashes/#{flash[:partial]}"

app/views/shared/flashes/_welcome.html.haml

%p Welcome!
%ul
  %li= link_to("Do this!", this_path)
  %li= link_to("Do that!", that_path)

 

Or just a flag:

 

app/controllers/users_controller.rb

flash[:signed_up] = true
redirect_to root_path

app/views/welcomes/show.html.haml

- if flash[:signed_up]
  %p Welcome!


Pass on the referer

Say you have some filter redirecting incoming requests. Maybe you’re detecting the locale and adding it to the URL, or verifying credentials.

You can use the flash to make sure the redirected-to controller gets the original referer.

 

app/controllers/application_controller.rb
class ApplicationController < ActionController::Base
 before_filter :make_locale_explicit

  private
  def make_locale_explicit
    if params[:locale].blank? && request.get?
      flash[:referer] = request.referer
      redirect_to params.merge(locale: I18n.locale)
    end
  end
end


Now, any controller that cares about the referer could get it with:

flash[:referer] || request.referer
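
For instance, a sketch of a controller that sends the user back to wherever they came from after signing in (the controller and the sign_in_user helper are hypothetical, not from the original post):

class SessionsController < ApplicationController
  def create
    sign_in_user  # hypothetical helper that authenticates and stores the session
    # Prefer the referer preserved by the redirecting filter, then the plain
    # referer header, and finally fall back to the root path.
    redirect_to(flash[:referer] || request.referer || root_path)
  end
end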

Google Analytics events

Say you want to track a Google Analytics event with JavaScript when a user has signed up. You could do something like this.

Send event data from the controller:

 

app/controllers/users_controller.rb
class UsersController < ApplicationController
  def create
    @user = actually_create_user
    flash[:events] = [ ["_trackEvent", "users", "signup"] ]
    redirect_to some_path
  end
end

 

Then turn it into JavaScript in your view:

 

app/helpers/layout_helper.rb
def analytics_events
  Array(flash[:events]).map do |event|
    "_gaq.push(#{raw event.to_json});"
  end.join("\n")
end

app/views/layouts/application.html.haml

:javascript
  #{analytics_events}


The flash vs. params

You may have considered that any of the above could have been done with query parameters instead, including common flash messages:

app/controllers/sessions_controller.rb
redirect_to root_url(notice: "You have been logged out.")

app/views/layouts/application.html.haml

- if params[:notice]
  %p= params[:notice]

 

Using the flash means that the passed data doesn’t show in the URL, so it won’t happen twice if the link is shared, bookmarked or reloaded. Also the URL will be a little cleaner.

Additionally, the user can’t manipulate the flash, as it’s stored in the session. This adds some protection. If the flash partial example above used params, a user could pass in ../../admin/some_partial to see things they shouldn’t.
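
Even so, if you want to be defensive about what ends up in flash[:partial], a whitelist is cheap. A minimal sketch, with hypothetical partial names:

# app/helpers/flash_helper.rb (hypothetical helper, not from the original post)
module FlashHelper
  # Only these partials may ever be rendered from flash[:partial].
  ALLOWED_FLASH_PARTIALS = %w[welcome upgraded].freeze

  def flash_partial_path
    name = flash[:partial].to_s
    "shared/flashes/#{name}" if ALLOWED_FLASH_PARTIALS.include?(name)
  end
end

The layout would then render the partial only when flash_partial_path returns something.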

Fin

I’d love to hear about what unconventional uses you’ve put the flash to!

A Practical Guide to Using Spine.js in a Real-World App

A practical guide to using Spine.js in a real-world app

To give users the most fluid experience possible, we built Pragmatic.ly as a single-page application. We believe this lets users focus on building their product rather than on project management itself. A wide range of technologies is available for building a single-page application; the current trend is to move the core of the application from the server to the client and to keep the server load at a minimum by exposing pure data APIs. Pragmatic.ly follows this approach: the server side is built with Rails and the client side with Spine.js.

Why Spine.JS

There are many different JavaScript MVC frameworks, such as Backbone.js, Spine.js, Knockout.js, Ember.js, etc. With so many choices, when I started Pragmatic.ly I struggled with which one to pick. Instead of wasting time on the choice, I did a quick review by comparing the documentation and decided to start with Spine.js. After months of development, I'm glad to say Spine.js works pretty well; below are the main benefits I've found in using it.

  • Simple and lightweight. It's easy to dive into the core and extend it as you need to.
  • MVC pattern at its core. It's very similar to its Rails counterparts, so I was comfortable with it from day one.
  • Rails integration. It couldn't be easier to use Rails as the backend data API for a Spine.js app, and the spine-rails gem is another great addition.
  • Asynchronous UI. The UI never blocks, and data is updated in the backend automatically in the background. This makes for a fast and very responsive user interface.

If you want a brief comparison of the different frameworks, check out this article written by Gordon L. Hempton.

How we use Spine.js in Pragmatic.ly

We use spine-rails to generate the Spine app structure, which is very similar to a Rails app structure.

 

├── app
│   ├── controllers
│   │   ├── center
│   │   │   ├── filter_controller.js.coffee
│   │   │   └── tickets_controller.js.coffee
│   │   ├── center_content_controller.coffee
│   │   ├── comments_controller.js.coffee
│   │   ├── header
│   │   │   └── project_nav_controller.js.coffee
│   │   ├── header_controller.coffee
│   │   ├── iterations_controller.coffee
│   │   ├── left_sidebar_controller.coffee
│   │   ├── projects_controller.coffee
│   │   ├── right_sidebar_controller.coffee
│   │   ├── sidebars
│   │   │   ├── left_iteration.js.coffee
│   │   │   ├── left_people.js.coffee
│   │   │   ├── right_activities.js.coffee
│   │   │   └── right_detail_section.js.coffee
│   │   ├── tickets_controller.coffee
│   │   └── users_controller.js.coffee
│   ├── index.js.coffee
│   ├── lib
│   │   ├── constants.js.coffee
│   │   ├── eco-helpers.js
│   │   └── view.js.coffee
│   ├── models
│   │   ├── comment.js.coffee
│   │   ├── iteration.js.coffee
│   │   ├── project.js.coffee
│   │   ├── ticket.js.coffee
│   │   └── user.js.coffee
│   └── views
│       ├── comments
│       │   ├── audit.jst.eco
│       │   ├── form.jst.eco
│       │   └── plain.jst.eco
│       ├── iterations
│       │   ├── section.jst.eco
│       │   └── show.jst.eco
│       ├── projects
│       │   ├── edit.jst.eco
│       │   ├── form.jst.eco
│       │   ├── new.jst.eco
│       │   └── switch.jst.eco
│       ├── tickets
│       │   ├── section.jst.eco
│       │   ├── show.jst.eco
│       │   └── toolbar.jst.eco
│       └── users
│           ├── people.jst.eco
│           └── show.jst.eco
├── application.js
├── bootstrap.js.coffee
└── dashboard.js.coffee

 

So basically it’s controllers, models and views.

Controllers

There are two kinds of Controllers in Pragmatic.ly. In Spine, Controllers are considered the glue of an application, adding and responding to DOM events, rendering templates and ensuring that views and models are kept in sync. For example,

 

class App.LeftIterationController extends Spine.Controller
  el: '.sidebar #iterations'
  elements:
    'ul.list': 'list'
  constructor: ->
    super
    App.Iteration.bind 'create', @addIteration
    App.Iteration.bind 'refresh', @refreshIterations
  release: ->
    super
    App.Iteration.unbind 'create', @addIteration
    App.Iteration.unbind 'refresh', @refreshIterations
  addIteration: (iteration) =>
    iteration.unbind()
    view = new App.IterationItem(item: iteration)
    @list.append(view.render().el)
  refreshIterations: (iterations) =>
    @addIteration iteration for iteration in iterations

 

We split the page into multiple blocks and each block is a Spine controller. Taking the above example, LeftIterationController is the controller that manages the iterations list in the left sidebar.

So what's the other kind? The answer is routes! We now extract the routes into dedicated controllers. Such a controller sets up the routes and responds to navigation events, then prepares the data and triggers an event so that another controller can render the templates. For example:

 

class App.TicketsController extends Spine.Controller
  constructor: ->
    super
    @routes
      "/tickets": @index
      "/tickets/:id": (params) ->
        @show(params.id)
  index: ->
    tickets = App.Ticket.all()
    App.Ticket.trigger "tickets:index", tickets
  show: (id) ->
    ticket = App.Ticket.find(id)
    $.publish 'ticket:switch', ticket

 

Models

Models manage the data for the application and are very similar to Rails models. One thing worth mentioning: as we moved logic from the server side to the client side, there was no need to translate the server models 1:1 on the client. Instead, encapsulate the data into models that suit the page and the current user.

 

class App.Project extends Spine.Model
  @configure 'Project', 'id', 'name', 'description', 'owner_id', 'uid'
  @extend Spine.Model.Ajax
  @extend Spine.Model.Dirty
  validate: ->
    'name required' unless @name
  inviteUser: (email) ->
    App.Invitation.create(project_id: @id, email: email)

 

Views

Views are about building and maintaining DOM elements. Views in Spine are very simple and have no built-in UI binding, so most of the time you let the controller observe the model, get notified when the model changes, and update the view accordingly.

Since all view rendering happens client-side, you should use a JavaScript templating solution to define view templates as markup containing template variables. There are a number of good candidates, such as Mustache, jQuery.tmpl and Eco.

I use Eco in Pragmatic.ly. The ERB-like syntax and CoffeeScript support are a big win. However, you should know that every Eco template generates the same helper functions, which increases the file size. You can use the following gist to avoid the problem; it registers the helpers globally and injects them into the Eco templates.

 

# Put this file in lib/
require 'sprockets/eco_template'
class CleanEcoTemplate < Sprockets::EcoTemplate
  FROM = " (function() {"
  TO = "}).call(__obj);"
  def evaluate(scope, locals, &block)
    content = Eco.compile(data)
    from = content.index(FROM)
    to = content.rindex(TO)
    content = content[from...to] + TO
    <<-JS
function(__obj) {
if (!__obj) __obj = {};
var __helpers = window.ecoHelpers;
var __out = [];
var __sanitize = __helpers.sanitize;
var __capture = __helpers.captureFor(__obj, __out);
var __rememberSafe = __obj.safe;
var __rememberEscape = __obj.escape;
__obj.safe = __helpers.safe;
__obj.escape = __helpers.escape;
#{content}
__obj.safe = __rememberSafe;
__obj.escape = __rememberEscape;
return __out.join('');
};
JS
  end
end
// Must include eco-helpers.js before eco files
(function(global) {
  var ecoHelpers = {
    sanitize: function(value) {
      if (value && value.ecoSafe) {
        return value;
      } else if (typeof value !== 'undefined' && value != null) {
        return ecoHelpers.escape(value);
      } else {
        return '';
      }
    },
    safe: function(value) {
      if (value && value.ecoSafe) {
        return value;
      } else {
        if (!(typeof value !== 'undefined' && value != null)) value = '';
        var result = new String(value);
        result.ecoSafe = true;
        return result;
      }
    },
    escape: function(value) {
      return ('' + value)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
    },
    captureFor: function(obj, out) {
      return (function(callback) {
        var length = out.length;
        callback.call(obj);
        return ecoHelpers.safe(out.splice(length, out.length - length).join(''));
      });
    }
  };
  global.ecoHelpers = ecoHelpers;
})(window);
# Put this file in config/initializers
require 'clean_eco_template'
Rails.application.assets.register_engine '.eco', CleanEcoTemplate

 

Problems

So that's how we use Spine.js to power Pragmatic.ly. It works very well but still has some limitations.

  • By default, you can only bind to a model's overall change event and update the view accordingly. For example, even if the username has not changed, you still have to update the views containing that data. Backbone.js has a "change:field" event which lets you update the view only when that particular field changes. I like that, so I made a plugin to support it. Check out the Gist:
    Spine ?= require('spine')
    Include =
      savePrevious: ->
        @constructor.records[@id].previousAttributes = @attributes()
    Spine.Model.Dirty =
      extended: ->
        @bind 'refresh', ->
          @each (record) -> record.savePrevious()
        @bind 'save', (record) ->
          if record.previousAttributes?
            for key in record.constructor.attributes when key of record
              if record[key] isnt record.previousAttributes[key]
                record.trigger('change:' + key, record[key])
          record.savePrevious()
        @include Include
    So a model object can bind to the "change:#{field}" event, which is triggered whenever that field's value changes.
    By default this is off; if you need the feature, the model should extend Spine.Model.Dirty.
    A sample case:
    class User extends Spine.Model
      @extend Spine.Model.Dirty
  • The Ajax plugin in Spine.js plays very nicely with backend REST APIs such as Rails. For example, creating a model seamlessly sends a POST /collections request to the server, and updating it sends a PUT /collections/:id request. However, nested resources are very common in Rails and Spine lacks support for them: either you send requests to a top-level URL or set up the request yourself. I made a dirty hack to support scoping. It's dirty but it works.
    class App.Ticket extends Spine.Model
      @configure 'Ticket', "id", "project_id"
      @scope: ->
        "projects/#{current.project_id}"
      scope: ->
        "projects/#{@project_id}"
  • Asynchronous UI is cool and works in 99% of situations. But in a real-world app you have to deal with errors such as bugs or network failures. Spine has no default error handling for this situation and leaves all the work to you. That's fine, but be aware that for the remaining 1% you may have to spend a lot of time containing the impact.

Test

I would like to cover how testing works in Pragmatic.ly in another post. To give a quick overview, we use Jasmine for JavaScript tests and JSCoverage for measuring code coverage. Nice pair!

 

About Pragmatic.ly

Pragmatic.ly is a fast and easy to use project management tool featuring real time collaboration. It’s an elegant project management service built for developers with love.

 

Now that you’ve read so far, you should follow me @yedingding!

Conditional Comments in Html

Conditional comments in html

Conditional comments only work in IE, and are thus excellently suited to give special instructions meant only for IE. They are supported from IE 5 onwards.

Conditional comments work as follows:

<!--[if IE 6]>
Special instructions for IE 6 here
<![endif]-->
  1. Their basic structure is the same as an HTML comment (<!-- -->). Therefore all other browsers will see them as normal comments and will ignore them entirely.
  2. IE, though, has been programmed to recognize the special <!--[if IE]> syntax; it resolves the if and parses the content of the conditional comment as if it were normal page content.
  3. Since conditional comments use the HTML comment structure, they can only be included in HTML files, and not in CSS files. I’d have preferred to put the special styles in the CSS file, but that’s impossible. You can also put an entire new <link> tag in the conditional comment referring to an extra style sheet.

Example

Below I added a lot of conditional comments that print out messages according to your IE version.

Note however, that if you use multiple Explorers on one computer, the conditional comments will render as if all these Explorer versions are the highest Explorer version available on your machine (usually Explorer 6.0).

Test

Below are a few conditional comments that reveal the IE version you’re using.

According to the conditional comment this is not IE

Code

The syntax I use is:

<p>
<!--[if IE]>
According to the conditional comment this is IE<br />
<![endif]-->
<!--[if IE 6]>
According to the conditional comment this is IE 6<br />
<![endif]-->
<!--[if IE 7]>
According to the conditional comment this is IE 7<br />
<![endif]-->
<!--[if IE 8]>
According to the conditional comment this is IE 8<br />
<![endif]-->
<!--[if IE 9]>
According to the conditional comment this is IE 9<br />
<![endif]-->
<!--[if gte IE 8]>
According to the conditional comment this is IE 8 or higher<br />
<![endif]-->
<!--[if lt IE 9]>
According to the conditional comment this is IE lower than 9<br />
<![endif]-->
<!--[if lte IE 7]>
According to the conditional comment this is IE lower or equal to 7<br />
<![endif]-->
<!--[if gt IE 6]>
According to the conditional comment this is IE greater than 6<br />
<![endif]-->
<!--[if !IE]> -->
According to the conditional comment this is not IE<br />
<!-- <![endif]-->
</p>

Note the special syntax:

  • gt: greater than
  • lte: less than or equal to

Also note the last one. It has a different syntax, and its contents are shown in all browsers that are not IE:

<!--[if !IE]> -->

CSS hack?

Are conditional comments CSS hacks? Strictly speaking, yes, since they can serve to give special style instructions to some browsers. However, they do not rely on one browser bug to solve another one, as all true CSS hacks do. Besides, they can be used for more than CSS hacks only (though that rarely happens).

Since conditional comments are not based on a browser hack but on a deliberate feature I believe they are safe to use. Sure, other browsers could implement conditional comments, too (though as yet none have done so), but they're unlikely to react to the specific query <!--[if IE]>.

I use conditional comments, though sparingly. First I see if I can find a real CSS solution to an Explorer Windows problem. If I can’t, though, I don’t hesitate to use them.

Comment tag

A reader told me IE8 and below also support the (non-standard) <comment> tag.

<p>This is <comment>not</comment> IE.</p>

This is not IE.

This tag might be a replacement for the !IE conditional comment, but only if you target IE8 and below.

Journey Through the JavaScript MVC Jungle – by Addy Osmani

Journey Through The JavaScript MVC Jungle — By Addy Osmani

When writing a Web application from scratch, it’s easy to feel like we can get by simply by relying on a DOM manipulation library (like jQuery) and a handful of utility plugins. The problem with this is that it doesn’t take long to get lost in a nested pile of jQuery callbacks and DOM elements without any real structure in place for our applications.

In short, we’re stuck with spaghetti code. Fortunately there are modern JavaScript frameworks that can assist with bringing structure and organization to our projects, improving how easily maintainable they are in the long-run.

What Is MVC, Or Rather MV*?

These modern frameworks provide developers an easy path to organizing their code using variations of a pattern known as MVC (Model-View-Controller). MVC separates the concerns in an application down into three parts:

  • Models represent the domain-specific knowledge and data in an application. Think of this as being a ‘type’ of data you can model — like a User, Photo or Note. Models should notify anyone observing them about their current state (e.g Views).
  • Views are typically considered the User-interface in an application (e.g your markup and templates), but don’t have to be. They should know about the existence of Models in order to observe them, but don’t directly communicate with them.
  • Controllers handle the input (e.g clicks, user actions) in an application and Views can be considered as handling the output. When a Controller updates the state of a model (such as editing the caption on a Photo), it doesn’t directly tell the View. This is what the observing nature of the View and Model relationship is for.

JavaScript ‘MVC’ frameworks that can help us structure our code don’t always strictly follow the above pattern. Some frameworks will include the responsibility of the Controller in the View (e.g Backbone.js) whilst others add their own opinionated components into the mix as they feel this is more effective.

For this reason we refer to such frameworks as following the MV* pattern, that is, you’re likely to have a View and a Model, but more likely to have something else also included.

Note: There also exist variations of MVC known as MVP (Model-View-Presenter) and MVVM (Model-View ViewModel). If you’re new to this and feel it’s a lot to take in, don’t worry. It can take a little while to get your head around patterns, but I’ve written more about the above patterns in my online book Learning JavaScript Design Patterns in case you need further help.

When Do You Need A JavaScript MV* Framework?

When building a single-page application using JavaScript, whether it involves a complex user interface or is simply trying to reduce the number of HTTP requests required for new Views, you will likely find yourself inventing many of the pieces that make up an MV* framework like Backbone or Ember.

At the outset, it isn’t terribly difficult to write an application framework that offers some opinionated way to avoid spaghetti code, however to say that it is equally as trivial to write something of the standard of Backbone would be a grossly incorrect assumption.

There’s a lot more that goes into structuring an application than tying together a DOM manipulation library, templating and routing. Mature MV* frameworks typically not only include many of the pieces you would find yourself writing, but also include solutions to problems you’ll find yourself running into later on down the road. This is a time-saver that you shouldn’t underestimate the value of.

So, where will you likely need an MV* framework and where won’t you?

If you’re writing an application that will likely only be communicating with an API or back-end data service, where much of the heavy lifting for viewing or manipulating that data will be occurring in the browser, you may find a JavaScript MV* framework useful.

Good examples of applications that fall into this category are GMail and Google Docs. These applications typically download a single payload containing all the scripts, stylesheets and markup users need for common tasks and then perform a lot of additional behavior in the background. It’s trivial to switch between reading an email or document to writing one and you don’t need to ask the application to render the whole page again at all.

If, however, you’re building an application that still relies on the server for most of the heavy-lifting of Views/pages and you’re just using a little JavaScript or jQuery to make things a little more interactive, an MV* framework may be overkill. There certainly are complex Web applications where the partial rendering of views can be coupled with a single-page application effectively, but for everything else, you may find yourself better off sticking to a simpler setup.

The Challenge Of Choice: Too Many Options?

The JavaScript community has been going through something of a renaissance over the last few years, with developers building even larger and more complex applications with it as time goes by. The language still greatly differs from those more classic Software engineers are used to using (C++, Java) as well as languages used by Web developers (PHP, Python, .Net etc). This means that in many cases we are borrowing concepts of how to structure applications from what we have seen done in the past in these other languages.

In my talk “Digesting JavaScript MVC: Pattern Abuse or Evolution”, I brought up the point that there’s currently too much choice when it comes to what to use for structuring your JavaScript application. Part of this problem is fueled by how different JavaScript developers interpret how a scalable JavaScript application should be organized — MVC? MVP? MVVM? Something else? This leads to more frameworks being created with a different take on MV* each week and ultimately more noise because we’re still trying to establish the “right way” to do things, if that exists at all. Many developers believe it doesn’t.

We refer to the current state of new frameworks frequently popping up as ‘Yet Another Framework Syndrome’ (or YAFS). Whilst innovation is of course something we should welcome, YAFS can lead to a great deal of confusion and frustration when developers just want to start writing an app but don’t want to manually evaluate 30 different options in order to select something maintainable. In many cases, the differences between some of these frameworks can be very subtle if not difficult to distinguish.

TodoMVC: A Common Application For Learning And Comparison

There’s been a huge boom in the number of such MV* frameworks being released over the past few years.

Backbone.js, Ember.js, AngularJS, Spine, CanJS… The list of new and stable solutions continues to grow each week and developers can quickly find themselves lost in a sea of options. From minds who have had to work on complex applications that inspired these solutions (such as Yehuda Katz and Jeremy Ashkenas), there are many strong contenders for what developers should consider using. The question is, what to use and how do you choose?

We understood this frustration and wanted to help developers simplify their selection process as much as possible. To help solve this problem, we created TodoMVC — a project which offers the same Todo application implemented in most of the popular JavaScript MV* frameworks of today — think of it as speed dating for frameworks. Solutions look and feel the same, have a common feature set, and make it easy for us to compare the syntax and structure of different frameworks, so we can select the one we feel the most comfortable with or at least, narrow down our choices.

This week we’re releasing a brand new version of TodoMVC, which you can find more details about lower down in the apps section.

In the near future we want to take this work even further, providing guides on how frameworks differ and recommendations for which options to consider for particular types of applications you may wish to build.

Our Suggested Criteria For Selecting A Framework

Selecting a framework is of course about more than simply comparing the Todo app implementations. This is why, once we’ve filtered down our selection of potential frameworks to just a few, it’s recommended to spend some time doing a little due diligence. The framework we opt for may need to support building non-trivial features and could end up being used to maintain the app for years to come.

  • What is the framework really capable of? Spend time reviewing both the source code of the framework and official list of features to see how well they fit with your requirements. There will be projects that may require modifying or extending the underlying source and thus make sure that if this might be the case, you’ve performed due diligence on the code.
  • Has the framework been proved in production? i.e. have developers actually built and deployed large applications with it that are publicly accessible? Backbone has a strong portfolio of these (SoundCloud, LinkedIn) but not all frameworks do. Ember is used in a number of large apps, including the user tools in Square. JavaScriptMVC has been used to power applications at IBM amongst other places. It’s not only important to know that a framework works in production; it also helps to be able to look at real-world code and be inspired by what can be built with it.
  • Is the framework mature? We generally recommend developers don’t simply “pick one and go with it”. New projects often come with a lot of buzz surrounding their releases but remember to take care when selecting them for use on a production-level app. You don’t want to risk the project being canned, going through major periods of refactoring or other breaking changes that tend to be more carefully planned out when a framework is mature. Mature projects also tend to have more detailed documentation available, either as a part of their official or community-driven docs.
  • Is the framework flexible or opinionated? Know what flavor you’re after as there are plenty of frameworks available which provide one or the other. Opinionated frameworks lock you into (or suggest) doing things in a specific way (theirs). By design they are limiting, but place less emphasis on the developer having to figure out how things should work on their own.
  • Have you really played with the framework? Write a small application without using frameworks and then attempt to refactor your code with a framework to confirm whether it’s easy to work with or not. As much as researching and reading up on code will influence your decision, it’s equally as important to write actual code using the framework to make sure you’re comfortable with the concepts it enforces.
  • Does the framework have a comprehensive set of documentation? Although demo applications can be useful for reference, you’ll almost always find yourself consulting the official framework docs to find out what its API supports, how common tasks or components can be created with it and what the gotchas worth noting are. Any framework worth its salt should have a detailed set of documentation which will help guide developers using it. Without this, you can find yourself heavily relying on IRC channels, groups and self-discovery, which can be fine, but is often overly time-consuming when compared to a great set of docs provided upfront.
  • What is the total size of the framework, factoring in minification, gzipping and any modular building that it supports? What dependencies does the framework have? Frameworks tend to only list the total filesize of the base library itself, but don’t list the sizes of the library’s dependencies. This can mean the difference between opting for a library that initially looks quite small, but could be relatively large if it, say, depends on jQuery and other libraries.
  • Have you reviewed the community around the framework? Is there an active community of project contributors and users who would be able to assist if you run into issues? Have enough developers been using the framework that there are existing reference applications, tutorials and maybe even screencasts that you can use to learn more about it?

Dojo And Rise Of The JavaScript Frameworks

As many of us know, the Dojo toolkit was one of the first efforts to provide developers a means of developing more complex applications, and some might say it in part inspired us to think more about the needs of non-trivial applications. I sat down to ask Dojo’s Dylan Schiemann, Kitson Kelly, and James Thomas what their thoughts were on the rise of JavaScript MV* frameworks.

Q: Didn’t Dojo already solve all of this? Why hasn’t it been the dominant solution for developers wishing to build more structured (and more non-trivial) applications?

Years ago, while the JavaScript landscape evolved from adding simple Ajax and chrome to a page, Dojo was evangelizing a “toolkit” approach to building complex Web applications.

Many of those features were way ahead of most developers’ needs. With the emergence of the browser as the dominant application platform, many of the innovations pioneered in The Dojo Toolkit now appear in newer toolkits. MVC was just another package that Dojo has provided for quite some time, along with modular code packages, OO in JS, UI widgets, cross-browser graphics, templating, internationalization, accessibility, data stores, testing frameworks, a build system and much, much more.

JavaScript libraries shouldn’t end at “query”, which is why Dojo, early on, focussed on completing the picture for enterprise-grade application development. This is the same focus that it has today with MVC; it’s just another “tool in the arsenal”.

Why is Dojo not the dominant toolkit? Its goal was never to be the only choice. The goal was to provide an open collection of tools that could be used with anything else, within projects, and liberally copied into other work as well. Dojo was criticized for being slow and even after that was addressed, it was criticized for being slow. Trying to shake that perception is challenging. It is very hard to document a feature-rich toolkit. There are 175 sub-packages in Dojo 1.8 and over 1,400 modules.

That is not only a challenge from a documentation perspective, it also means that there isn’t one thing that Dojo does, which is good if you are building software, but very difficult when you are starting out trying to figure out where to start. These are all things we have been trying to work on for Dojo 1.8, in the form of tutorials and significantly improved documentation.

Q: Why should developers still consider Dojo and what ideas do you have lined up for the future of the project? I hear 1.8 will be another major milestone.

In Dojo 1.8, dojox/mvc takes another step towards full maturity. There has been a lot of investment in time, effort, testing and community awareness into the package. It focuses on providing an MVC model that leverages the rest of Dojo. Coupled with dojox/app, an application framework that is designed to make it easier to build rich applications across desktop and mobile, it makes a holistic framework for creating a client side application.

In the typical Dojo way, this is just one of many viable ways in which to build applications with Dojo.

In 1.8, not only does the MVC sub-module become more mature, it is built upon a robust framework. It doesn’t just give you a markup language to create your views, express your models or develop a controller. It is far more than just wiring up some controls to a data source. Because it is leveraging the rest of Dojo, you can draw in anything else you might need.

In Dojo 2.0 we will be looking to take modularity to a new level, so that it becomes even easier to take a bit of this and a bit of that and string it all together. We are also exploring the concepts of isomorphism, where it should be transparent to the end-user where your code is being executed, be it client side or server side and that ultimately it should be transparent to the developer.

The TodoMVC Collection

In our brand new release, Todo implementations now exist for the most popular frameworks with a large number of other commonly used frameworks being worked on in Labs. These implementations have gone through a lot of revision, often taking on board best practice tips and suggestions from framework authors, contributors and users from within the community.

Following on from comments previously made by Backbone.js author Jeremy Ashkenas and Yehuda Katz, TodoMVC now also offers consistent implementations based on an official application specification as well as routing (or state management).

We don’t pretend that more complex learning applications aren’t possible (they certainly are), but the simplicity of a Todo app allows developers to review areas such as code structure, component syntax and flow, which we feel are enough to enable a comparison between frameworks and prompt further exploration with a particular solution or set of solutions.

Our applications include:

For those interested in AMD versions:

And our Labs include:

Note: We’ve implemented a version of our Todo application using just JavaScript and another using primarily jQuery conventions. As you can see, whilst these applications are functionally equivalent to something you might write with an MVC framework, there’s no separation of concerns and the code becomes harder to read and maintain as the codebase grows.

We feel honored that over the past year, some framework authors have involved us in discussions about how to improve their solutions, helping bring our experience with a multitude of solutions to the table. We’ve also slowly moved towards TodoMVC being almost a de facto app that new frameworks implement, and this means it’s become easier to make initial comparisons when you’re reviewing choices.

Frameworks: When To Use What?

To help you get started with narrowing down frameworks to explore, we would like to offer the below high-level framework summaries which we hope will help steer you towards a few specific options to try out.

I want something flexible which offers a minimalist solution to separating concerns in my application. It should support a persistence layer and RESTful sync, models, views (with controllers), event-driven communication, templating and routing. It should be imperative, allowing one to update the View when a model changes. I’d like some decisions about the architecture left up to me. Ideally, many large companies have used the solution to build non-trivial applications. As I may be building something complex, I’d like there to be an active extension community around the framework that have already tried addressing larger problems (Marionette, Chaplin, Aura, Thorax). Ideally, there are also scaffolding tools (grunt-bbb, brunch) available for the solution. Use Backbone.js.

I want something that tries to tackle desktop-level application development for the web. It should be opinionated, modular, support a variation of MVC, avoid the need to wire everything in my application together manually, support persistence, computed properties and have auto-updating (live) templates. It should support proper state management rather than the manual routing solution many other frameworks advocate being used. It should also come with extensive docs and of course, templating. It should also have scaffolding tools available (ember.gem, ember for brunch). Use Ember.js.

I want something more lightweight which supports live-binding templates, routing, integration with major libraries (like jQuery and Dojo) and is optimized for performance. It should also support a way to implement models, views and controllers. It may not be used on as many large public applications just yet, but has potential. Ideally, the solution should be built by people who have previous experience creating many complex applications. Use CanJS.

I want something declarative that uses the View to derive behavior. It focuses on achieving this through custom HTML tags and components that specify your application intentions. It should support being easily testable, URL management (routing) and a separation of concerns through a variation of MVC. It takes a different approach to most frameworks, providing a HTML compiler for creating your own DSL in HTML. It may be inspired by upcoming Web platform features such as Web Components and also has its own scaffolding tools available (angular-seed). Use AngularJS.

I want something that offers me an excellent base for building large scale applications. It should support a mature widget infrastructure, modules which support lazy-loading and can be asynchronous, simple integration with CDNs, a wide array of widget modules (graphics, charting, grids, etc) and strong support for internationalization (i18n, l10n). It should have support for OOP, MVC and the building blocks to create more complex architectures. Use Dojo.

I want something which benefits from the YUI extension infrastructure. It should support models, views and routers and make it simple to write multi-view applications supporting routing, View transitions and more. Whilst larger, it is a complete solution that includes widgets/components as well as the tools needed to create an organized application architecture. It may have scaffolding tools (yuiproject), but these need to be updated. Use YUI.

I want something simple that values asynchronous interfaces and lacks any dependencies. It should be opinionated but flexible on how to build applications. The framework should provide bare-bones essentials like model, view, controller, events, and routing, while still being tiny. It should be optimized for use with CoffeeScript and come with comprehensive documentation. Use Spine.

I want something that will make it easy to build complex dynamic UIs with a clean underlying data model and declarative bindings. It should automatically update my UI on model changes using two-way bindings and support dependency tracking of model data. I should be able to use it with whatever framework I prefer, or even an existing app. It should also come with templating built-in and be easily extensible. Use KnockoutJS.

I want something that will help me build simple Web applications and websites. I don’t expect there to be a great deal of code involved and so code organisation won’t be much of a concern. The solution should abstract away browser differences so I can focus on the fun stuff. It should let me easily bind events, interact with remote services, be extensible and have a huge plugin community. Use jQuery.

 

What Do Developers Think About The Most Popular Frameworks?

As part of our research into MV* frameworks for TodoMVC and this article, we decided to conduct a survey to bring together the experiences of those using these solutions. We asked developers what framework they find themselves using the most often and more importantly, why they would recommend them to others. We also asked what they felt was still missing in their project of choice.

We’ve grouped some of the most interesting responses below, by framework.

EMBER.JS

Pros: The combination of live templates and observable objects has changed the way I write JavaScript. It can be a bit much to wrap your head around at first, but you end up with a nice separation of responsibility. I found that once I have everything set up, adding fairly complex features only takes a couple lines of code. Without Ember, these same features would’ve been hellish to implement. Cons: Ember has yet to reach 1.0. Many things are still in flux, such as the router and Ember data. The new website is very helpful, but there’s still not as much documentation for Ember as there is for other frameworks, specifically Backbone. Also, with so much magic in the framework, it can be a little scary. There’s the fear that if something breaks you won’t be able to figure out exactly why. Oh, and the error messages that ember gives you often suck.
Pros: The key factors: a) Features that let me avoid a lot of boilerplate (bindings, computed properties, view layer with the cool handlebars). b) the core team: I’m a Rails developer and know the work of Yehuda Katz. I trust the guy =) Cons: Documentation. It’s really sad that Ember doesn’t have good documentation, tutorials, screencasts like Backbone, Angular or other frameworks. Right now, we browse the code looking for docs which isn’t ideal.
Pros: Convention over configuration. Ember makes so many small decisions for you it’s by far the easiest way to build a client-side application these days. Cons: The learning curve. It is missing the mass of getting started guides that exist for other frameworks like Backbone, this is partly because of the small community, but I think more because of the state of flux the codebase is in pre-1.0.
Pros: Simplicity, bindings, tight integration with Handlebars, ease of enabling modularity in my own code. Cons: I’d like to have a stable integration with ember-data, and integrated localStorage support synced with a REST API, but hey that’s fantasy that one day will surely come true ;-)

BACKBONE.JS

Pros: Simplicity — only 4 core components (Collection, Model, View, Router). Huge community (ecosystem) and lots of solutions on StackOverflow. Higher order frameworks like Marionette or Vertebrae with lots of clever code inside. Somebody might like “low-levelness” — need to write lots of boilerplate code, but get customized application architecture. Cons: I don’t like how extend method works — it copies content of parent objects into new one. Prototypal inheritance FTW. Sometime I miss real world scenarios in docs examples. Also there is a lot of research needed to figure out how to build a bigger app after reading the TODO tutorial. I’m missing official AMD support in projects from DocumentCloud (BB, _). [Note: this shouldn’t be an issue with the new RequireJS shim() method in RequireJS 2.0].
Pros: After the initial brain-warp of understanding how Backbone rolls, it is incredibly useful. Useful as in, well supported, lightweight, and constantly updated in a valid scope. Ties in with natural friends Underscore, jQuery/Zepto, tools that most of my studio’s projects would work with. Cons: The amount of tutorials on how to do things with Backbone is inconsistent and at different periods of Backbones lifespan. I’ve asked other devs to have a look at Backbone, and they would be writing code for v0.3. Un-aware. Whilst not a problem Backbone can fix itself, it is certainly a major dislike associated with the framework. I suppose in theory, you could apply this to anything else, but, Backbone is a recurrent one in my eyes. Hell, I’ve even seen month old articles using ancient Backbone methods and patterns. Whatever dislikes I would have on the framework strictly itself, has been rectified by the community through sensible hacks and approaches. For me, that is why Backbone is great, the community backing it up.
Pros: Provides just enough abstraction without unreasonable opinions — enabling you to tailor it to the needs of the project. Cons: I would re-write (or possibly remove) Backbone.sync. It has baked in assumptions of typical client-initiated HTTP communications, and doesn’t adapt well to the push nature of WebSockets.
Pros: It’s extremely easy to get into, offering a nice gateway to MV* based frameworks. It’s relatively customizable and there are also tons of other people using it, making finding help or support easy. Cons: The fact that there’s no view bindings by default (although you can fix this). Re-rendering the whole view when a single property changes is wasteful. The RESTful API has a lot of positives, but the lack of bulk-saving (admittedly a problem with REST itself, but still) and the difficulty in getting different URI schemes to work on different types of operations sucks.

ANGULARJS

Pros: a) 2-way data binding is incredibly powerful. You tend to think more about your model and the state that it is in instead of a series of events that need to happen. The model is the single source of truth. b) Performance. AngularJS is a small download. Its templating uses DOM nodes instead of converting strings into DOM nodes and should perform better. c) If you are targeting modern browsers and/or are a little careful, you can drop jQuery from your dependencies too. Cons: I’d like to be able to specify transitions for UI state changes that propagate from a model change. Specifically for elements that use ng-show or ng-hide I’d like to use a fade or slide in in an easy declarative way.
Pros: It’s very intuitive, has excellent documentation. I love their data binding approach, HTML based views, nested scopes. I switched from Backbone/Thorax to Angular and never looked back. A new Chrome extension Batarang integrates with Chrome Developer’s Tools and provides live access to the Angular data structures. Cons: I’d like to have built-in support for such functions as drag’n’drop, however this can be added using external components available on GitHub. I’d also like to see more 3rd party components available for reuse. I think it’s just a matter of time for the ecosystem around AngularJS to get more mature and then these will be available just like they are in communities like jQuery.
Pros: It minimizes drastically the boilerplate code, allows for nice code reuse through components, extends the HTML syntax so that many complex features end up being as simple as applying a directive (attribute) in the HTML, and is super-easily testable thanks to a full commitment to dependency injection. You can write a non-trivial app without jQuery or without directly manipulating the DOM. That’s quite a feat. Cons: Its learning curve is somewhat steeper than Backbone (which is quite easy to master), but the gain is appreciative. Documentation could be better.

KNOCKOUTJS

Pros: I don’t necessarily use it all the time, but KnockoutJS is just fantastic for single page applications. Extremely easy subscribing to live sorting; much better API for so called “collection views” in Backbone using observable arrays. And custom event on observables for effects, etc. Cons: Feel like the API is quite hard to scale, and would probably prefer to wrangle Backbone on the bigger applications. (But that’s also partially due to community support).
Pros: I like the data binding mechanism and feel very comfortable using it. In particular I like how they have replaced templates with control flow binding. Cons: I don’t like that there is no guidance or best practice in terms of application structure. Aside from having a view model, the framework doesn’t help you in defining a well structured view model. It’s very easy to end up with a large unmaintainable function.

DOJO

Pros: Syntactically, Dojo is very simple. It allows for dynamic and robust builds, with the initial loader file being as low as 6k in some cases. It is AMD compatible, making it extremely portable, and comes out-of-the-box with a ton of features ranging from basic DOM interactions to complex SVG, VML, and canvas functionality. The widget system, Dijit, is unmatched in its ease of use and ability to be extended. It’s a very well-rounded and complete toolkit. Cons: The dojo/_base/declare functionality is not 100% strict mode compliant and there is currently some overhead due to backwards compatibility, though this will mostly go away in the Dojo 2.0 release.
Pros: Good components: tabs, datagrid, formManager… Renders the same cross-browser. AMD compliant. Easy to test with mocks. Integrates well with other frameworks thanks to AMD (I’ll integrate with JMVC). Cons: Default design for components is out of fashion. Not fully HTML5. So-so documentation. Poor templating system (no auto binding).

YUI

Pros: YUI3 is a modular and use-at-will type of component library which includes all of the goodies of Backbone and more. It even (in my opinion) improves upon some of the concepts in Backbone by de-coupling some things (i.e. attribute is a separate module that can be mixed into any object – the event module can be mixed in similarly). Cons: I’d love to see YUI3 support some of the auto-wiring (optional) of Ember. I think that is really the big win for Ember; otherwise, I see YUI3 as a superior component library where I can cherry-pick what I need. I’d also like to see a more AMD-compatible module loader. The loader today works very well; however, it would be nicer if I could start a new projects based on AMD modules and pull in certain YUI3 components and other things from other places that are also using AMD.

JAVASCRIPTMVC

Pros: Has all tools included; just run the commands and start building. I have used it for the last 6 months and it’s been really good. Cons: The only thing I would do is speed up development of the next version. The developers are aware of the problems and are fixing issues, but it’s going to be another 3 or 4 months before some issues I want fixed are addressed; then again, I could probably patch and do a pull request.

MARIA

Pros: Because Maria is a pure MVC framework that is focused on being just an MVC framework. No more and no less. It’s clean and simple. Cons: A little more usage documentation outside of the source code, plus a few more test cases. A tutorial that drives home the real use of MVC with Maria would be good too.

CUJO.JS

Pros: Real apps almost never fit perfectly into an MV* box, and the most important stuff is often outside the box. With cujo.js, you define the box. Yes, cujo.js has high-level MV*-like features for creating views, models, controllers, etc., but every app is different, and no framework can ever be a 100% solution. Rather than try to be all things, cujo.js also provides lower level tools, architectural plumbing, and a rich plugin system that can even be used to integrate and extend other MV* frameworks. Create the architecture that best suits your application, rather than constraining your app to fit inside someone else’s predefined architecture. Cons: The broader JavaScript community is totally unprepared and untrained to take on large-scale applications. Most of us don’t even know that design patterns and architectural patterns exist. Since cujo.js is so different from other frameworks, it needs more than a simple API reference and code snippets. Without tutorials, educational materials, and step-by-step examples, cujo.js might look strange and overwhelming to the untrained eye but documentation is supposed to be coming soon.

EXTJS

Pros: I think ExtJS works best in combination with Ext Designer. It gives it an edge beyond the other GUI frameworks by letting non-programmers mock up the UI so programmers can fill in the blanks. I think comparing it to MVC frameworks like Backbone doesn’t do it justice – its strength lies in creating rich GUIs, not lean Web apps. For rich, commercial back-office applications I think ExtJS remains the best choice when it comes to JavaScript solutions (i.e. not GWT etc). For public-facing Web apps I’d rather have something that gives me more control over the markup (and ideally something that degrades gracefully). Cons: It has a steeper learning curve than many of the other modern structural frameworks. One can argue that if you’re investing in ExtJS for the long-term this time spent learning will pay off, however I think solutions like it should aim to better minimize the time it takes to train teams up in using it.
Pros: I think a big feature of ExtJS 4 is that it throws you into the MVC mindset and the preferred filesystem structure right from the bat. With Dojo the initial tutorials seem to be mostly about augmenting existing websites whereas ExtJS assumes you’re starting from scratch. Using ExtJS doesn’t really “feel” like you’re dealing with HTML at all. The component library is rich enough to let you go a long way without touching more HTML than what is needed to bootstrap your app. It’d be interesting to see how both compare when Web components become more widely supported. This would finally allow manipulating the DOM without being afraid of breaking any widgets or causing your app’s internal state to become inconsistent. Cons: The licensing is considered restrictive and difficult to understand by some. More people would be investing in ExtJS if it was clearer what the upfront and long-term costs of using it are. This isn’t a concern with some other structural solutions but probably isn’t as much a worry for larger businesses.
Pros: ExtJS is a fantastic package for rapidly building out RIAs for internal use. I for one, love to build with HTML and JavaScript, and for me there’s great satisfaction in mucking around at that level. Even though ExtJS makes it feel like you’re not really working with HTML it still offers a great deal of power, especially if you’re using it to create a complex UI. Cons: That said…I absolutely agree that it’s very heavy and I don’t think I’d recommend it for an external facing Web application. My biggest beef with the package overall is actually that it’s more of a PITA to test with than I’d like. Our tester actually ended up switching to Sikuli because it was becoming too much of a battle trying to work with it in Selenium.

BATMAN

Pros: It has a great and easy to use view bindings system. Plays with Rails very nicely and is all about convention over configuration. Cons: The documentation could be a lot better and I feel Shopify won’t be adding the features that they say that they will.

Don’t Be Afraid To Experiment

Whilst it’s unlikely for a developer to need to learn how to use more than a handful of these frameworks, I do encourage exploration of those you’re unfamiliar with. There’s a mountain of interesting facts and techniques that can be learned in this process.

In my case: I discovered that Batman.js required the least hand-written lines of code for an implementation. I’m neither a frequent CoffeeScript nor Batman.js user but that in itself gave me some food for thought. Perhaps I could take some of what made this possible and bring it over to the frameworks I do use. Or, maybe I’d simply use Batman.js in a future project if I found the community and support around it improved over time.

Regardless of whether you end up using a different solution, at the end of the day all you have to gain from exploration is more knowledge about what’s out there.

Going Beyond MV* Frameworks

Whilst the MV* family of patterns are quite popular for structuring applications, they’re limited in that they don’t address any kind of application layer, communication between Views, services that perform work or anything else. Developers may thus find that they sometimes need to explore beyond just MVC — there are times when you absolutely need to take what they have to offer further.

We reached out to developers who have been taking MVC further with their own patterns or extensions to existing frameworks, to get some insight into where you might need something more.

“In my case, I needed something Composite. I noticed a recurring pattern in Backbone apps where developers realized they needed an object that coordinated the various parts of an application. Most of the time, I’ve seen developers try to solve this using a Backbone construct (e.g. a View), even when there isn’t really a need for one. This is why I instead explored the need for an Application Initializer.

I also found that MVC didn’t really describe a way to handle regions of a page or application. The gist of region management is that you can define a visible area of the screen and build out the most basic layout for it without knowing what content will be displayed in it at runtime.

I created solutions for region management, application initialization and more in my extension project, Marionette. It’s one of a number of solutions that extend a framework (or architecture pattern) in the ways developers end up needing when they’re building relatively complex single-page applications.

There’s even a TodoMVC Marionette app available for anyone wishing to compare the standard Backbone application with one that goes beyond just MV*.”

Derick Bailey — Author of Marionette
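
To make the two ideas above a little more concrete, here is a minimal sketch of application initializers and region management using the classic Backbone.Marionette API (addRegions, addInitializer, start). The #content selector and TodoListView are hypothetical names used only for illustration.

// A minimal sketch of Marionette-style initializers and regions.
// '#content' and TodoListView are hypothetical names for illustration.
var app = new Backbone.Marionette.Application();

// Region management: declare a visible area of the page up front,
// without knowing yet which view will be displayed in it at runtime.
app.addRegions({
  content: '#content'
});

// Application initialization: register start-up work with the application
// itself instead of coordinating it from an arbitrary Backbone View.
app.addInitializer(function (options) {
  var view = new TodoListView({ collection: options.todos });
  app.content.show(view); // the region takes care of rendering and swapping views
});

app.start({ todos: new Backbone.Collection() });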

“While a good portion of problems can be decomposed into JavaScript MVC, there are some which simply cannot. For example, an application might consume a third-party API at runtime without being given any information about how the data will be structured.

I spent almost a year trying to solve that very problem, but eventually I came to the realization that shoehorning it into MV* was not a viable solution. I was dealing with an “amorphous model” and that’s where it all fell apart. In other words, if you don’t have a well-defined model, most modern JavaScript frameworks can’t help you.

That’s where Core J2EE Patterns come in. I got turned on to them while reading PHP Objects, Patterns, and Practice by Matt Zandstra, and I’m glad I did! The J2EE Patterns basically outline a request-driven process, where the URL drives the behavior of the application. In a nutshell, a request is created, modified, and then used to determine the view to render.

I’ve expanded on my experiences with request-driven JavaScript applications and J2EE patterns for anyone who would like to learn more.”

Dustin Boston — co-author, Aura
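
As a rough, framework-agnostic illustration of that request-driven flow (this is not Aura’s actual API, nor code from a real J2EE library), here is a tiny front-controller sketch in plain JavaScript: a request object is created from the URL, intercepting filters may modify it, and the request then determines which view is rendered.

// A toy front controller for the request-driven flow described above.
// All names here (Request, frontController, the '/users' route) are illustrative.
function Request(url) {
  this.url = url;
  this.params = {};
}

var frontController = {
  routes: {},
  filters: [],

  register: function (pattern, view) {
    this.routes[pattern] = view;
  },

  // Create a request from the URL, pass it through filters that may modify it,
  // then dispatch it to whichever view the URL maps to.
  dispatch: function (url) {
    var request = new Request(url);
    this.filters.forEach(function (filter) { filter(request); });

    var view = this.routes[request.url];
    if (view) {
      view.render(request);
    }
  }
};

// Usage: the URL alone drives which view handles the request.
frontController.register('/users', {
  render: function (req) { console.log('users view', req.params); }
});
frontController.filters.push(function (req) { req.params.authenticated = true; });
frontController.dispatch('/users');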

Conclusions

While there are several choices for structuring your JavaScript Web applications these days, it’s important to be diligent in the selection process: spend time thoroughly evaluating your options in order to make a decision that results in sustainable, maintainable code. Framework diversity fosters innovation, while too much similarity just creates noise.

Projects like TodoMVC can help narrow down your selection to those frameworks you feel might be the most interesting or most comfortable for a particular project. Remember to take your time choosing; don’t feel too constrained by a specific pattern, and keep in mind that it’s completely acceptable to build on the solution you select to best fit the needs of your application.

Experimenting with different frameworks will also give you different perspectives on how to solve common problems, which will in turn make you a better programmer.

Thanks to my fellow TodoMVC team-member Sindre Sorhus for his help with tweaks and a technical review of this article.