ArticleS. UncleBob.
TheHungarianAbhorrencePrinciple [add child]

The Hungarian Abhorance Principle

In my previous blog (RubarianNotation) I had posted this little code snippet from a Mad Libs program:
  def translate(text)
questions_and_text = split_questions_from_text(text)
answers_and_text = replace_questions_with_answers(questions_and_text)
answers_and_text.join
end

The variable questions_and_text held on to a list of strings which alternated between questions and text. The question strings looked like this ((an adjective)), and the text was just raw text. So, for example, if my original document looked liks this
((name of someone in room)) took a ((a noun)) today and ((a past-tense verb)) his back.
We would expect the questions_and_text variable to look like this:
["((name of someone in room))",
" took a ",
"((a noun))",
" today and ",
"((a past-tense verb))",
" his back."]
So this variable holds a list of strings of alternating questions and answers.

I did not find the name questions_and_answers satisfying, and started to think that it would be better to use a name more like list_of_strings_of_alternating_questions_and_answers. This variable name is long, so I thought it might be better to create a set of standard abbreviations like ls_alternating_questions_and_answers. And then I stopped myself and realized that I was just reinventing the horror of Hungarian notation. I took this to mean that my program had a design flaw.

So I refactored the program to improve the design. Here's the impact of that refactoring on the previous snippet of code:
  def translate(document)
game = make_game(document)
game.play(GameContext.new(@asker))
end

What a difference! Remarkably, the first line of both snippets parses the incoming text into a convenient form. The second line of both processes that convenient form, asking the approriate questions and replacing the questions with the answers. And yet the implication of the names between the two snippets is remarkably different. In the first case they are rife with implementation detail. In the second, they tell you whats going on at an abstract level.

There is a more important difference between the two snippets. In the first we are operating on data structures, which is why we want to identify the structure of those data structures. In the second we are telling objects what to do, which is why we don't care about the structure of the objects and can focus on their intent instead.

Of course this is just good old OO. But for an old C++/Java/C#-head, like me, this is something of an eye-opener. Ruby has very rich primitive types like lists and maps. You can use them to create lists of lists and maps of lists and lots of other deep and convoluted structures. Indeed, that's what I had done in the first snippet. I had created a semantically rich data structure composed of a list of strings that alternated between raw text and questions. What I have come to realize is that although slinging these rich data structures around is fun, it's also bad.

As I said in the previous post, we want our variables to hold objects that we can tell what to do, we don't want our variables holding data structures that we operate on. If we use the former strategy, then the variable names don't have to encode the structure of the data. Rather they, coupled with the methods that are invoked against them, let us know what we expect the object to do.

So: A general principle:

 The Hungarian Abhorrence Principle

When you are tempted to encode data structure in a variable name (e.g. Hungarian notation), you need to create an object that hides that structure and exposes behavior.


For your edification I have included both versions of the Mad Libs program, as well as the rspecs that define the behavior.
 madlibs_spec.rb
#!/usr/bin/env ruby
require 'spec'
require 'madlibs'

context "Match QUESTION_DELIMITER: " do
specify "no questions" do
"no questions".match(MadLibs::QUESTION_DELIMITER).should_be nil
end
specify "delimit simple question" do
match = "this ((that)) other".match(MadLibs::QUESTION_DELIMITER)
match.should_not_be nil
match[1].should_equal "((that))"
end
specify "delimit complex question" do
match = "this ((that:thing)) other".match(MadLibs::QUESTION_DELIMITER)
match.should_not_be nil
match[1].should_equal "((that:thing))"
end
specify "delimit one question from multiple questions" do
match = "this ((that)) other ((next)) one".match(MadLibs::QUESTION_DELIMITER)
match.should_not_be nil
match[1].should_equal "((that))"
end
specify "split works as expected" do
tokens = "this ((question)) is".split(MadLibs::QUESTION_DELIMITER)
tokens.should_equal ["this ","((question))", " is"]
end

specify "delimits questions with line break" do
match = "this ((that\nthing)) other".match(MadLibs::QUESTION_DELIMITER)
match.should_not_be nil
match[1].should_equal "((that\nthing))"
end
end

context "Match QUESTION_PARSER: " do
specify "question parser matches" do
"no question".should_not_match(MadLibs::QUESTION_PARSER)
"this has ((a question)) in it".should_match(MadLibs::QUESTION_PARSER)
end
specify "question parser parses" do
match = "This has ((a question)) in it".match(MadLibs::QUESTION_PARSER)
match.should_not_be nil
match[1].should_equal "a question"
end
specify "question has line break" do
match = "This has ((a\nquestion)) in it".match(MadLibs::QUESTION_PARSER)
match.should_not_be nil
match[1].should_equal "a\nquestion"
end
specify "question has surrounding whitespace" do
match = "This has (( a question )) in it".match(MadLibs::QUESTION_PARSER)
match.should_not_be nil
match[1].should_equal "a question"
end
end

context "Translate" do
setup do
@asker = mock "asker"
@ml = MadLibs.new(@asker)
end

specify "empty document" do
@ml.translate("").should_equal ""
end

specify "document with no questions" do
@ml.translate("Bob").should_equal "Bob"
end

specify "document with one simple question" do
@asker.should_receive(:ask).once.with("color").and_return("blue")
@ml.translate("The sky is ((color)).").should_equal "The sky is blue."
end

specify "document with two simple questions" do
@asker.should_receive(:ask).with("color").and_return("red")
@asker.should_receive(:ask).with("texture").and_return("rough")
@ml.translate("My ((color)) hankie is ((texture)).").
should_equal "My red hankie is rough."
end

specify "line break inside question treated like space" do
@asker.should_receive(:ask).once.with("a color").and_return("blue")
@ml.translate("The sky is ((a\ncolor)).").should_equal "The sky is blue."
end

specify "named question with no replacement" do
@asker.should_receive(:ask).once.with("color").and_return("blue")
@ml.translate("The sky is ((skycolor:color)).").should_equal "The sky is blue."
end

specify "named question with two colons" do
@asker.should_receive(:ask).once.with("color:sky").and_return("blue")
@ml.translate("The sky is ((skycolor:color:sky)).").should_equal "The sky is blue."
end

specify "named question with replacement" do
@asker.should_receive(:ask).once.with("color").and_return("blue")
@ml.translate("The sky is ((skycolor:color)) on ((skycolor)).").
should_equal "The sky is blue on blue."
end

specify "named question with named replacement" do
@asker.should_receive(:ask).once.with("a noun").and_return("stone")
@ml.translate("I found a ((thing:a noun)), and a ((b:thing)), and a ((b)).").
should_equal "I found a stone, and a stone, and a stone."
end

specify "null question" do
@ml.translate("This (()) is a null question").
should_equal "This is a null question"
end

specify "null name" do
@asker.should_receive(:ask).once.with("flavor").and_return("sour")
@ml.translate("This is ((:flavor)).").should_equal "This is sour."
end

specify "null name and question" do
@ml.translate("This is a null question too ((:)).").
should_equal("This is a null question too .")
end
end

context "console asker:" do
setup do
@reader = mock "reader"
@writer = mock "writer"
@asker = MadLibs::Asker.new(@reader, @writer)
end

specify "asker delivers prompt and asks for input" do
@writer.should_receive(:puts).once.with("Please enter a flavor:")
@reader.should_receive(:gets).once.and_return("sour")
@asker.ask("a flavor").should_equal "sour"
end
end

 madlibs.rb (original)
class MadLibs
QUESTION_DELIMITER=/(\(\(.*?\)\))/m #multiline mode
QUESTION_PARSER=/\(\(\s*(.*?)\s*\)\)/m #multiline mode

def initialize(asker)
@asker = asker
@named_answers = {}
end

def translate(text)
questions_and_text = split_questions_from_text(text)
answers_and_text = replace_questions_with_answers(questions_and_text)
answers_and_text.join
end

def split_questions_from_text(text)
text.split(QUESTION_DELIMITER)
end

def replace_questions_with_answers(questions_and_text)
questions_and_text.map do |question_or_text|
replace_question_with_answer(question_or_text)
end
end

def replace_question_with_answer(question_or_text)
question = extract_question(question_or_text)
if (question)
get_answer(question)
else #text
question_or_text
end
end

def extract_question(question_or_text)
question_match = question_or_text.match(QUESTION_PARSER)
if (question_match)
extracted_question = question_match[1]
return squeeze_whitespace(extracted_question)
else
return nil
end
end

def squeeze_whitespace(text)
text.gsub(/\s+/," ")
end

def get_answer(question)
question,name = parse_question(question)
named_answer = @named_answers[question]
if named_answer
answer = named_answer
else
answer = @asker.ask(question) if question
end
@named_answers[name] = answer if name
answer
end

def parse_question(question)
return [nil,nil] if question.empty?
colon = question.index(":")
if colon
return separate_question_from_name(question,colon)
else
return [question,nil]
end
end

def separate_question_from_name(question,colon)
name = question[0..colon-1]
question = question[colon+1..question.length]
name = nil if name.empty?
question = nil if question.empty?
return [question,name]
end
end

class Asker
def initialize(reader, writer)
@reader = reader
@writer = writer
end

def ask(element)
@writer.puts("Please enter " + element + ":")
response = @reader.gets
return response.chomp
end
end


 madlibs.rb (refactored)
class MadLibs
QUESTION_DELIMITER=/(\(\(.*?\)\))/m #multiline mode
QUESTION_PARSER=/\(\(\s*(.*?)\s*\)\)/m #multiline mode
attr_reader :asker, :named_answers
def initialize(asker)
@asker = asker
@named_answers = {}
end

def translate(document)
game = make_game(document)
game.play(GameContext.new(@asker))
end

def make_game(document)
q_and_t_strings = document.split(QUESTION_DELIMITER)
tokens = q_and_t_strings.map {|s| parse_string_into_token(s)}
MadLibsGame.new(tokens)
end

def parse_string_into_token(the_string)
question_string = match_question_string(the_string)
if (question_string)
parse_question(question_string)
else
Text.new(the_string)
end
end

def match_question_string(question_or_text)
question_match = question_or_text.match(QUESTION_PARSER)
if (question_match)
extracted_question = question_match[1]
return squeeze_whitespace(extracted_question)
else
return nil
end
end

def squeeze_whitespace(text)
text.gsub(/\s+/," ")
end

def parse_question(question)
return Question.new(nil,nil) if question.empty?
colon = question.index(":")
if colon
query,name = separate_question_from_name(question,colon)
return Question.new(query,name)
else
return Question.new(question,nil)
end
end

def separate_question_from_name(question,colon)
name = question[0..colon-1]
question = question[colon+1..question.length]
name = nil if name.empty?
question = nil if question.empty?
return [question,name]
end

class GameContext
attr_reader :asker, :named_answers
def initialize(asker)
@asker = asker
@named_answers = {}
end
end

class MadLibsGame
def initialize(tokens)
@tokens = tokens
end

def play(context)
text_and_answer_strings = @tokens.map {|token| token.process(context)}
text_and_answer_strings.join
end
end

class Question
attr_reader :query, :name

def initialize(query, name)
@query = query
@name = name
end

def process(context)
named_answer = context.named_answers[@query]
if named_answer
answer = named_answer
else
answer = context.asker.ask(@query) if @query
end
context.named_answers[@name] = answer if @name
answer
end

end

class Text
attr_reader :text

def initialize(text)
@text = text
end

def process(context)
@text
end
end

class Asker
def initialize(reader, writer)
@reader = reader
@writer = writer
end

def ask(element)
@writer.puts("Please enter " + element + ":")
response = @reader.gets
return response.chomp
end
end
end


!commentForm -r

 Sat, 21 Oct 2006 13:42:21, Brian Slesinsky, But can you always apply the principle?
I don't think there's a class missing because at some point a string has to enter the system. But you could make a MadLibsGame[?] a top-level class:

hello = MadLibsGame.[?]new("Hello, ((person)).")
goodbye = MadLibsGame.[?]new("Goodbye, ((person)).")

asker = AskerWithMemory.[?]new(stdin, stdout)
hello.play(asker)
goodbye.play(asker)

In this design, the AskerWithMemory[?] would keep track of which questions have already been asked. Maybe it would wrap a bare asker.

According to HAP, I shouldn't want to name them "helloGame" and "goodbyeGame", just "hello" and "goodbye". But, perhaps because I'm new to Ruby, that's what I did at first. (I wouldn't do that in Java because the type declarations serve that purpose.)
 Sat, 21 Oct 2006 11:14:29, Uncle Bob, Applying the principle
Brian said:However, if I had to call your improved translate function, I would want to know what I could pass in as "document". Is this a string, XML object, or some new domain-specific class?

I'm going to put that back on you. How can we apply the HAP to solve this problem. According to the principle the fact that you'd like to see translate(documentString) indicates that there is an object missing from the design. Or prehaps it's just the translate function's name that is to blame?
 Fri, 20 Oct 2006 23:31:48, Brian Slesinsky,
Hmm... I absolutely agree that we should be thinking in terms of higher-level, domain specific classes; I consider passing bare maps and lists around to be a code smell. However, if I had to call your improved translate function, I would want to know what I could pass in as "document". Is this a string, XML object, or some new domain-specific class? I could read the tests for that, but it's convenient to say so right there in the code, and the IDE finds it convenient too.

Also, not knowing Ruby well, I was confused at first by the first batch of tests because I didn't realize that MadLibs[?]::QUESTION_DELIMITER was a regular expression.

For another take on types and naming, see: http://slesinsky.org/brian/code/wrap_strings_for_security.html

Also: s/Abhorance/Abhorrence/ Done. -UB