How to truncate HTML with Ruby on Rails

This was holding me up for a while. I asked around online and did some code searches on Google, but couldn’t find a good Rails solution for truncating a string of HTML coming out of Textile without cutting off tags and making a general mess of things. Links in particular were problematic.
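To make the problem concrete, here is a minimal sketch of what goes wrong when you cut the raw string at a fixed character count. The sample markup is made up for illustration; any HTML containing a link should behave the same way.

```ruby
# Hypothetical sample input; any HTML with a link would do.
html = '<p>Read <a href="/categories/boats">our boat guide</a> for details.</p>'

# Naive approach: cut the raw string at a fixed number of characters.
naive = html[0, 20]

puts naive
# Prints: <p>Read <a href="/ca
# The cut lands inside the <a> tag, so the browser receives an
# unterminated tag and an unclosed <p>.
```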

In the end I came up with a solution that was pretty easy to implement in my application helper and that I can call from any view.

Details below,
Enjoy!

  # in application_helper.rb

  # Does NOT behave identically to the current Rails truncate method:
  # the length is a positional argument and :omission goes in an options hash.
  # Sample usage:
  #   <%= html_truncate(category.description, 120, :omission => "(continued...)") -%>

  module ApplicationHelper
    def html_truncate(html, truncate_length, options={})
      result = []
      # must live outside the block so it survives between iterations
      previous_tag = nil
      # get all text (including punctuation) and tags and stick them in an array
      tokens = html.scan(/<\/?[^>]*>|[A-Za-z0-9.,\/&#;\!\+\(\)\-"'?]+/)
      tokens.each do |str|
        if truncate_length > 0
          if str =~ /\A</
            # remember the name of the most recently opened tag
            previous_tag = $1 if str =~ /\A<([A-Za-z][A-Za-z0-9]*)/
            result << str
          else
            result << str
            truncate_length -= str.length
          end
        elsif previous_tag && str =~ /\A<\/#{Regexp.escape(previous_tag)}\s*>\z/
          # past the limit: append only the next closing tag that
          # matches the previous open tag
          result << str
          previous_tag = nil
        end
      end
      result.join(" ") + options[:omission].to_s
    end
  end
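To see what the scan step actually produces, here is a small sketch in plain Ruby (no Rails needed). The constant name TOKEN_PATTERN and the sample markup are my own for illustration; the pattern itself is the one used in html_truncate.

```ruby
# Same pattern as in html_truncate: either a whole tag, or a run of
# "text" characters (letters, digits, and the listed punctuation).
TOKEN_PATTERN = /<\/?[^>]*>|[A-Za-z0-9.,\/&#;\!\+\(\)\-"'?]+/

# Hypothetical sample input.
html = '<p>Visit <a href="/about">our site</a> today!</p>'

tokens = html.scan(TOKEN_PATTERN)
# tokens => ["<p>", "Visit", "<a href=\"/about\">", "our", "site", "</a>", "today!", "</p>"]

rejoined = tokens.join(" ")
# rejoined => "<p> Visit <a href=\"/about\"> our site </a> today! </p>"

puts rejoined
```

Two things are worth noticing. Joining with a space puts whitespace between tags and the text next to them, which is usually harmless in rendered HTML but does change the markup. Also, any character that is not in the character class (a colon or an underscore, for example) is silently dropped from the text, so the class may need extending for your content.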

I’m open to improvements on this, perhaps by extending “acts_as_textiled” or something similar. There may also be a neater way to put the two arrays together.

As of this writing, I have just finished this and will be putting it into a production environment so that I can evaluate whether it’s quick enough for real traffic. Since I went with regex, I’m hoping so.
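For a rough idea of the cost before deploying, something like the following sketch could time the scan step using Ruby's standard Benchmark module. The input string, the repeat count, and the name TOKEN_PATTERN are assumptions for illustration, and it measures only the regex scan, not the whole helper, so treat the number as a ballpark.

```ruby
require 'benchmark'

# Same pattern as in html_truncate.
TOKEN_PATTERN = /<\/?[^>]*>|[A-Za-z0-9.,\/&#;\!\+\(\)\-"'?]+/

# Hypothetical input: one paragraph repeated to simulate a long description.
html = '<p>Some <a href="/x">linked text</a> in a paragraph.</p>' * 200

runs = 200
elapsed = Benchmark.realtime do
  runs.times { html.scan(TOKEN_PATTERN) }
end

# Average wall-clock time per scan, in milliseconds.
puts format("%.3f ms per scan", elapsed * 1000 / runs)
```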

You can see it in action on Seaview Global
