Strip Tags off Rails

Disclaimer: I am not a Ruby on Rails basher. In fact, I use it every day (with some reluctance). My intention is to point out why I do not use Ruby on Rails for most of my work.

Rails has some small gotchas, and not testing everything can really drag you down. I still think Rails is one of the most sensible frameworks that exist. Even so, we should rethink why we use Ruby on Rails for every case when other frameworks were specifically designed to solve certain problems in a better way. I’m referring to Meteor.js for Single Page Applications.

Why are there so many Ruby on Rails developers even though it is slow?

I’d like to repeat my answer on Quora: Ruby on Rails is no longer the fastest framework for getting a specific task done. In many cases I found myself happier with other frameworks. Still, it remains one of the best tools for the job.

A more sensible way to strip HTML tags from content

This sounds simple, but the Rails code doesn’t seem to treat it that way. I found a blog post (or rant) about the slow method in Rails. I tried out the benchmarks and was not surprised that it’s still that slow with Rails 4.0.3.

I’ve kept the benchmarking code from the blog post, but this time I added the method I use.

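require 'rubygems'
require 'benchmark' # Benchmark.bmbm is used below
require 'action_view'
require 'nokogiri'

include ActionView::Helpers::SanitizeHelper

# "news" is the sample file whose content gets stripped.
f = File.read("news")

# The method I use: one regex pass that removes anything shaped like a tag.
class SimpleStripTags
  def self.strip_tags(content)
    content.gsub(/<\/?[^>]*>/, "")
  end
end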

LOOPS = 1000
Benchmark.bmbm do |x|
  x.report("strip_tags (Rails)") { LOOPS.times { strip_tags f }}
  x.report("nokogiri") { LOOPS.times { Nokogiri::HTML(f).text }}
  x.report("simple strip tags") { LOOPS.times { SimpleStripTags.strip_tags(f) }}
end

The author of the post recommends Nokogiri, but it is not a good option, as you can see in the screenshot. Sure, it is fast, but it does not work.

In which ways?

  1. It converts HTML entities, which makes a script executable. That is of course dangerous.
  2. It fails to remove the other script tag because it is invalid HTML.
  3. It is slower than the simple method.
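
To make the first point concrete without depending on Nokogiri, here is a minimal sketch that uses CGI.unescapeHTML from Ruby's standard library as a stand-in for any text extractor that decodes entities. The input string and the simple_strip_tags name are made up for illustration; the regex is the one from the benchmark.

```ruby
require 'cgi'

# Same regex as the SimpleStripTags helper in the benchmark.
def simple_strip_tags(content)
  content.gsub(/<\/?[^>]*>/, "")
end

# Hypothetical user input: a script tag written with HTML entities.
escaped = "Hello &lt;script&gt;alert(1)&lt;/script&gt;"

# The regex finds no literal tags, so the entities stay encoded and inert.
stripped = simple_strip_tags(escaped)

# Stand-in for an entity-decoding extractor (an assumption, not Nokogiri
# itself): the decoded text is live markup again.
decoded = CGI.unescapeHTML(escaped)

puts stripped
puts decoded
```

If the decoded string is later written into a page without escaping, the browser sees a real script tag, which is the danger described in the first point.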

I have been using the simple method for years and it has always worked. It’s so simple that I do not understand why Rails treats it differently. The Padrino framework takes the same approach. I do not use Padrino, but I keep similar helpers within my apps.
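
As a rough idea of what such a helper might look like in an app, here is a small sketch. The PlainTextHelper module name and the to_s call for nil input are my own assumptions for illustration; this is not Padrino's code.

```ruby
# Hypothetical helper module built around the same regex as the benchmark.
module PlainTextHelper
  TAG_PATTERN = /<\/?[^>]*>/

  module_function

  # Removes anything shaped like a tag. Entities are left untouched.
  # to_s is an added convenience so nil input returns an empty string.
  def strip_tags(content)
    content.to_s.gsub(TAG_PATTERN, "")
  end
end

puts PlainTextHelper.strip_tags("<p>Hello <b>world</b></p>")
puts PlainTextHelper.strip_tags("<a href=\"x\">link</a> &amp; more")
puts PlainTextHelper.strip_tags(nil).inspect
```

One trade-off to keep in mind: the regex treats any text between "<" and ">" as a tag, so a string like "1 < 2 and 3 > 2" loses the middle part. For content that is known to be HTML this has not been a problem for me.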