Another reason to love objects

A major benefit of object-oriented programming is that there’s one and only one place to modify something. If a program’s objects are designed and named intelligently and you need to fix a bug or add new functionality, you know exactly where to go.

A less frequently cited benefit is that users can quickly find out how to use your software, even if you provide little or no documentation (or, even worse, completely outdated and misleading documentation).
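As a small illustration of that kind of discovery, here is a sketch in Ruby. The Browser class and its method names are hypothetical stand-ins, not part of any real library; the point is only that an object can be asked what it knows how to do.

```ruby
# Hypothetical stand-in class; the method names are illustrative only.
class Browser
  def page_source
    "<html></html>"
  end

  def title
    "Example"
  end

  def current_url
    "http://example.com/"
  end
end

# List only the methods Browser itself defines (not inherited ones),
# then narrow the list down by name.
own_methods = Browser.instance_methods(false).sort
matches = own_methods.grep(/source/)

p own_methods
p matches
```

Run against a real class in irb, the same two calls give a quick list of candidate methods without opening any documentation.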

Over several years of screen scraping, I’ve used a series of tools, starting with Hpricot, Nokogiri and Mechanize. These don’t work well with JavaScript-based, formless logins, so I’ve just switched some of my code over to Selenium WebDriver. Unfortunately, Selenium’s documentation is a mess because “Selenium” is really several different programs that have been mashed together, and it’s evolving quickly. The documentation itself warns: “We are currently updating this document for the Selenium 2.0 release. This means we are currently writing and editing new material, and revising old material.”

When I Googled for how to save a page’s HTML to a file, I kept coming across suggestions to use “.getBodyText()”. This, I figured, was the Java form, so “.get_body_text” should be the Ruby form. I found the .get_body_text method in the API docs but discovered, after a while, that it has apparently been deprecated: it is defined in the “legacy_driver.rb” file, which belongs to the Client version of Selenium, not the WebDriver version. The API docs should not mix the two, but they do.

So then I went looking in the Selenium::WebDriver::Driver class and, sure enough, there was the replacement. The new method is called “page_source”. Problem solved.
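For the record, saving the HTML then takes only a few lines. This is a minimal sketch, not my exact code: save_page_source and FakeDriver are names made up for illustration, and the fake driver stands in for a real one so the example runs without a browser.

```ruby
require "tmpdir"

# Writes the current page's HTML to +path+ and returns the number of
# bytes written. +driver+ can be anything that responds to #page_source,
# such as a Selenium::WebDriver::Driver.
def save_page_source(driver, path)
  File.write(path, driver.page_source)
end

# Hypothetical stand-in for a real driver, used only to keep the
# sketch self-contained.
FakeDriver = Struct.new(:page_source)

driver = FakeDriver.new("<html><body>Hello</body></html>")
path = File.join(Dir.tmpdir, "page.html")
bytes_written = save_page_source(driver, path)

puts File.read(path)
```

With the selenium-webdriver gem installed, a real driver (for example, one created with Selenium::WebDriver.for :firefox and pointed at a page with driver.get) should be able to take the place of FakeDriver unchanged.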

I was slowed down and led astray by outdated documentation, naming changes, and the conflation of two different APIs into one set of API docs. But I found the right answer reasonably quickly, thanks to the beauty of objects.

Posted by James on Tuesday, July 12, 2011