The Empty String Code Smell in Rails

Sometimes you look at some source code and it just doesn't look right. The code might seem inordinately complex for the simple task being accomplished. Or the developer might be performing certain actions repeatedly, such as checking variables for nil values. This is what is known as a code smell..

Code smells should be corrected at the earliest possible opportunity. In this article, I'll talk about some real-world coding issues that my friend, Robb Kidd, and I recently encountered.

The Repetitive Value-Checking Code Smell

Here's a code smell that I found in a Rails view:

   <% if @item.summary.nil? || @item.summary == '' %>
      <p>Not Defined</p>
   <% else %>
      <p><%= @item.summary %></p>
   <% end %>

This type of code was provided for every field associated with an "item." Not good. Not surprising, though. This is a pretty common code smell from Ruby newbies.

Here's a slightly better modification:

   <% if @item.summary.blank? %>
      <p>Not Defined</p>
   <% else %>
      <p><%= @item.summary %></p>
   <% end %>

The blank? method checks for both nil and empty strings in one method. It even works on arrays:

   arr = []
   arr.blank?

If the array is empty, as it is in the code above, then blank? returns true.

The blank? method is awfully convenient for these types of checks. It also has a companion method, present?, that does the inverse check, i.e. — it checks if a value is present (ensuring that the value is not blank).

We can shorten the original code even further:

   <p><%= @item.summary.blank? ? 'Not Defined' : @item.summary %></p>

Now we've got the original code sample down to one line using what amounts to an inline if-statement. That's pretty good.

But we're still repeating that if-statement for every field displayed. To DRY the code up even further, we could extract that statement into a helper method:

   def format_text(txt)
      txt.blank? ? 'Not Defined' : txt
   end

In the view, the code then becomes:

   <%= format_text(@item.summary) %>

Now the view is simple. Just the way I like it.

The Empty String Code Smell

I still wasn't happy though. You see, the developer had told me that the reason he needed to check for empty strings everywhere was because empty strings were being physically stored in the database.

This was something that Robb Kidd and I began investigating immediately.

Once you start putting empty strings into database fields, you have compromised the usefulness of NULL to represent the absence of information.To me, this code smell was far worse than the original view-related one that had triggered this deep dive into the codebase. I was appalled that empty strings were being stored in the database.

This might not seem like a big deal to many Rails developers. But with the types of high-end enterprise systems that I typically build, high-quality data is vital. Relational databases represent empty fields as NULL. Once you start putting empty strings into database fields, you have compromised the usefulness of NULL to represent the absence of information. You've also doomed everybody who ever works with that database to check for both NULL's and empty strings if they want to determine whether a valid value is present.

You've stored a code smell in the database for all eternity. Not good.

Even worse, the database may be an underlying component of a Rails application, but there's nothing in an enterprise setting that ties it exclusively to the Rails application. If the Rails application is successful, then much of the application's value derives from the integrity of the data that has been stored. Other applications, sometimes implemented in other languages, may be built on top of that database.

By pushing a code smell into the database, we're forcing all future developers, not just Rails developers, to deal with the compromised usefulness of NULL. We may very well hear this from some future Python developer, "Jeez. Those Rails dweebs were amateurs. This wouldn't have happened if they'd used Django."

We should definitely deal with this problem. But first, how does it happen?

The application allows users to enter data via forms. Upon submission, Rails collects all submitted values into the standard params hash that is made available to controllers. Any form field that is a text field or text area will be assigned the empty string as a value. the following controller code perpetuates the empty string into the database:

   Item.create!(params['item'])

Whoa. So I can hear you asking: "So this is a Rails problem?"

Yes. It is a Rails problem, and one that in my humble opinion should never have slipped through the cracks.

But we still need to correct it in our application. So, Robb and I started researching possible solutions.

The best solution we came across was from Henrik Nyh, a Ruby developer in Stockholm, Sweden. He published an excellent code snippet on GitHub's gist sub-site. This is a site at gist.github.com that allows developers to publish short snippets of code that solve problems.

His code is a mix-in module that updates ActiveRecord's write_attribute method to properly handle empty strings.

   module NullifyBlankAttributes

      def write_attribute(attr_name, value)
         new_value = value.presence
         super(attr_name, new_value)
      end

   end

Note the use of the presence method, which takes an argument and returns either the argument itself or nil if the argument was blank.

Now what? What do we do with the module?

First, drop the module in the lib directory of the Rails application. Second, create an initialization file in the config/initializers directory of the Rails application. Let's call it active_record_fixes.rb. The contents of the file will be:

   class ActiveRecord::Base
      include NullifyBlankAttributes
   end

We have just monkey-patched Rails to correct our empty string code smell.

Conclusions

We discovered a code smell in our view code that masked a more pernicous underlying code smell. We got rid of both code smells with the help of a developer from Sweden that we've never met.

There are a few lessons to be learned from all this. First, code smells increase the maintenance profile of your application over time, making it more difficult to add new features and maintain existing ones. Code smells are an indication of a coding problem, and should be fixed as soon as feasible, even if the code is functionally working just fine.

Second, when you have a problem, don't re-invent the wheel. We're an international coding community. Somebody may very well have solved the problem already. Spend some time researching using Google in case somebody's already developed a solution. In this case, Henrik solved the problem nicely a few years ago.

Finally, give back to the community. If you solve a problem that might reasonably impact others in the community, post a solution. Write a blog entry (as I have done). Or post a code snippet on gist.github.com, or answer a question posed on sites like Stack Overflow. Coding is hard enough as it is. Let's share some solutions to make it easier on all of us.



Comments

David Keener By Charles Calvert on Wednesday, July 13, 2011 at 10:50 AM EST

Nice post, Dave. Too many developers treat the database as a dumb persistence layer and don't realize how small things like this can result in dirty data and overly complex code. It's also a good example of how mix-ins can make your life easier in Ruby.


David Keener By Rodrigo Leote on Monday, August 01, 2011 at 08:06 PM EST

Hi,

I'm a C# developer trying to get into the ruby on rails world. I'm quite confused about many things that I know how to do in ASP.Net but have no idea on how to do it in Rails. One of them is the best approach to prevent the user from inserting empty values in the database. That's how I got in your blog. The solution provided is great. I followed the steps you provided but now my rails server doesn't even start. Here is the error message: "uninitialized constant ActiveRecord::Base::NullifyBlankAttributes."

I have no clue how to initialize the constant.




David Keener By Ryan McGeary on Sunday, August 14, 2011 at 03:47 PM EST

Hey David. I've had a Rails plugin around that solves this problem for a while now. I just finally updated it as a gem. Give it a look.

   https://github.com/rmm5t/strip_attributes

It not only converts blanks to nil, but it also strips all attributes that you specify. This is good for when people accidentally add a space to the beginning or end of an email address (typical unique field used for lookups).

It does however solve the problem in a slightly different way. It works as a before_validation hook instead of overriding write_attribute. I've vacillated between the two options -- both have pros and cons.

It requires that you be a little more explicit with which fields should be stripped. After all, you don't always want to strip every attribute, e.g. pre-encrypted passwords or text in a format like markdown.




David Keener By Mark Berry on Wednesday, August 22, 2012 at 10:49 PM EST

Thanks for the NullifyBlankAttributes module. I discovered one issue: .blank? considers "false" to be blank, thus .presence replaces every explicit "false" value with nil. Not good for Boolean fields! In Rails 3.2.3, I dropped this into a single file under config/initializers to get the module working: module NullifyBlankAttributes def write_attribute(attr_name, value) if value.kind_of? FalseClass new_value = value else new_value = value.presence end super(attr_name, new_value) end end class ActiveRecord::Base include NullifyBlankAttributes end


David Keener By Peter Gillard-Moss on Friday, October 19, 2012 at 03:54 PM EST

"Other applications, sometimes implemented in other languages, may be built on top of that database." No, no, no. Shared databases are a really bad idea and breaks encapsulation and all sorts of other good engineering principles. That poor Rails app should be happy in the knowledge that no one else is going to put its dirty hands directly onto its database. See more here: http://ianfnelson.com/archives/2010/11/08/shared-database/


David Keener By dkeener on Saturday, October 20, 2012 at 03:22 PM EST

Hi Peter. Thanks for the insightful comment.

You're preaching to the choir here. I agree that the Rails application should not have to share its database with other applications. Nevertheless, this is the situation that I have often encountered in large companies, particularly those without a widespread understanding of Rails or agile practices.

I generally promote the concept of an API, so that other applications can build on top of a well-designed API that 1) provides a suitable level of abstraction to other applications, and 2) provides the necessary security mechanisms to manage who can access the data and/or change it.

No matter the situation, though, I contend that databases support NULL for a reason, and that we should use if effectively in our applications.




David Keener By dkeener on Saturday, October 20, 2012 at 03:28 PM EST

For Mark, I'm sorry I didn't see your comment sooner. I was on travel at the time; more specifically, I was in Atlanta because I was speaking at the GFIRST Cyber Security Conference hosted by the Department of Homeland Security.

I guess I never envisioned the method being used on boolean fields. The main problem I was trying to solve was all those blank strings that were being recorded in the database, and the defensive code that was required to deal with them. So I really only ever tried it on Strings.

This looks like a great enhancement to the code, so I'll add it in shortly. Thank you very much.



Leave a Comment

Comments are moderated and will not appear on the site until reviewed.

(not displayed)