Articles

Reason #427 why I hate proprietary operating systems

At home, I run Linux machines, my wife is on a Mac, and my kids and in-laws are on Windows laptops. (I’ll transition my kids to Linux as they move into programming.)

Because of this heterogeneity, I like to format my external hard drives with multiple partitions, each for a different OS.

But this can be a huge pain. I formatted a 3 TB hard drive with a Windows partition, a Linux partition, and space for a Mac (HFS+) partition. But my wife’s MacBook Pro’s Disk Utility refused to create an HFS+ partition on the third physical partition, complaining that the hard drive has a Master Boot Record. I wasn’t trying to create a bootable partition, but Mac OS didn’t care. Using my Linux machine (and the “hfsprogs” package), I managed to format the partition as HFS+. It shocked me that Linux could create a Mac-formatted partition where a Mac couldn’t.

My wife’s MacBook agreed it was an HFS+ partition in good condition, but it still refused to let Time Machine back up to it because the partition was not journaled. GParted can’t create a journaled HFS+ partition.

I finally surrendered, threw away all my partitions, and let the MacBook Pro take the first physical spot on the hard drive. Time Machine is finally running. I won’t know whether I can use the unformatted 2 TB of space for Windows or Linux until it finishes. Proprietary OSes are so annoying!

Posted by James on Mar 21, 2014

Database tuning: Triggers & materialized views

Had fun today at work tuning a Postgres database that has gotten very slow over the years as it has accumulated many gigabytes of data. This app is frequently rendered inoperable by just one or two users visiting its home page, which is obviously a bad situation. (Luckily, it’s an in-house tool used almost exclusively by a single user, which is why it hasn’t received more love before now.)

Over the past week, I’ve been recording the slowest queries, and today I started attacking them. The easiest-to-fix were the ones caused by missing indexes. Another problem I found was unnecessary overhead from two compound indexes that were indexing the same two columns with opposite orderings; I turned one into a single-column index, which should produce similar read performance and superior write performance.

A third fix I proposed was adding a field for the calculated value of md5(email). Some queries have been doing full-table searches of md5(email). I don’t understand why that’s necessary, but having to calculate md5() for every row in the table and then scanning the whole table sounds pretty inefficient. So I created a named function for calculating md5(email) and a trigger that calls the function whenever a table record is added or modified. Doing this at the database layer makes sense because Rails doesn’t need to know anything about md5(email).

I also created my first Postgres materialized view today. Another query can occasionally take 40+ seconds on our server. The same query normally runs orders of magnitude faster, so I’m not sure what causes such long delays. But it’s doing a join that involves calculating a count on a large table. My first thought was to add a counter cache, but that didn’t make sense when I looked at the table layout. I instead made a materialized view, which worked well on my static copy of the production database. But when I went to the Postgres documentation, I discovered two flaws with Postgres 9.3’s materialized view implementation: 1) Updating the materialized view is a manual process; and, 2) Updating the materialized view takes a full lock on the view. So I’m not sure it’s worth pushing to production, but I’m glad to read that Postgres devs are already working to improve the implementation of materialized views.

Posted by James on Mar 21, 2014

Chef pain point: Modifying 2+ lines but not an entire file

I’m suffering some pain modifying server configuration files with Chef.

Chef::Util::FileEdit is great for replacing one line with another, as many times as desired:

ruby_block "provide dovecot with custom MySQL connection info" do
  block do
    file = Chef::Util::FileEdit.new("/etc/dovecot/dovecot-sql.conf.ext")
    # Each call replaces every line matching the regex with the given string.
    file.search_file_replace_line(/#driver = /, "driver = mysql")
    file.search_file_replace_line(/#connect = /, "connect = host=127.0.0.1 dbname=mail user=mailuser password=new_pw")
    file.search_file_replace_line(/default_pass_scheme/, "default_pass_scheme = SHA512-CRYPT")
    file.search_file_replace_line(/password_query/, "password_query = select email as user, password from users where email = '%u';")
    # Write the modified contents back to disk.
    file.write_file
  end
end

And templates are great for replacing entire files:

template "/etc/dovecot/conf.d/10-master.conf" do
  source "10-master.conf.erb"
  mode 0640
  owner "vmail"
  group "dovecot"
end

But I can’t figure out how to replace a multi-line code block in a file. The articles I’ve read suggest Chef tries to force users into replacing whole files. cassianoleal answers a question about how to do so with, “As you said yourself, the recommended Chef pattern is to manage the whole file.” Why must we copy entire files to replace one block of code with another? Chef 11 apparently includes “partials,” which let you insert multi-line code elements. Chef::Util::FileEdit also lets you do that. But the ability to insert multiple lines doesn’t enable deleting multi-line code blocks. I probably could copy the entire file into memory, search-and-replace the multi-line segment with a regex and write the modified file back, but shouldn’t this be a built-in Chef tool?
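
The read, replace and write-back workaround mentioned above could be sketched in plain Ruby roughly as follows. This is only a sketch, not a built-in Chef facility: replace_block is a hypothetical helper name, and the Dovecot-style snippet in the usage note is made up for illustration.

```ruby
# Hypothetical helper (not part of Chef): replace the lines from the first
# line matching start_pattern through the next line matching end_pattern,
# inclusive, with the replacement text.
def replace_block(text, start_pattern, end_pattern, replacement)
  lines = text.lines

  # Locate the first line of the block; leave the text untouched if absent.
  first = lines.index { |line| line =~ start_pattern }
  return text if first.nil?

  # Locate the closing line, searching from the start of the block onward.
  offset = lines[first..-1].index { |line| line =~ end_pattern }
  return text if offset.nil?
  last = first + offset

  # Make sure a non-empty replacement ends with a newline so the
  # following line is not glued onto it.
  replacement += "\n" unless replacement.empty? || replacement.end_with?("\n")

  # Swap the old block for the new lines (an empty string deletes the block).
  lines[first..last] = replacement.lines
  lines.join
end
```

Inside a ruby_block, this could be combined with File.read and File.write on the target path, for example replace_block(File.read(path), /unix_listener auth-userdb/, /^\s*\}/, new_block). Passing an empty string as the replacement deletes the block, which is the case that inserting lines alone can't cover. Note that the sketch does not track nested braces: it simply stops at the first line matching the end pattern.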

Posted by James on Mar 17, 2014

Reverting commits with Git without losing history

Today at work, I decided to roll back the last few commits I had made. I didn’t want to git reset --hard, which would throw away history and could mess up anyone who had pulled from my branch, so I decided to use git revert. But I wasn’t quite sure of the syntax.

I pulled out my normally reliable name-brand search engine and, after an unusually long search, found the “answer.” But it wasn’t quite right. It failed to revert one of the commits I wanted to revert. So I’m putting the answer here in hopes it saves someone else some pain.

I make three commits below and then revert the last two…

mkdir test_git_revert
cd test_git_revert/
git init .
vim a.txt
git add a.txt
git commit -m "create a.txt"
vim b.txt
git add b.txt
git commit -m "create b.txt"
vim c.txt
git add c.txt
git commit -m "create c.txt"
git log

  commit b59b5ecddc5284358da38635dc0829f629be11a7
  Author: James Lavin <james@fakedomain.com>
  Date:   Thu Mar 13 16:58:42 2014 -0400

      create c.txt

  commit 31fec743e007f94eb4738d1108c79b38dfa6cff0
  Author: James Lavin <james@fakedomain.com>
  Date:   Thu Mar 13 16:58:19 2014 -0400

      create b.txt

  commit a9d88ae06cedf5296297705142020a5264c839b8
  Author: James Lavin <james@fakedomain.com>
  Date:   Thu Mar 13 16:57:56 2014 -0400

      create a.txt

To revert the previous two commits and keep the first, I ran the following:

git revert --no-edit a9d88ae06cedf..b59b5ecddc5284

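which follows this pattern (the commit to the left of the two dots is excluded from the range, so it must be the last good commit, not the first bad one):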

git revert --no-edit <last_good_commit_SHA>..<last_bad_commit_SHA>

The output:

[master 4323ac0] Revert "create c.txt"
 1 file changed, 1 deletion(-)
 delete mode 100644 c.txt
[master c49aa86] Revert "create b.txt"
 1 file changed, 1 deletion(-)
 delete mode 100644 b.txt

I then confirmed with git log:

commit c49aa86bd04addb0a585417534bdb02638800e17
Author: James Lavin <james@fakedomain.com>
Date:   Thu Mar 13 16:59:26 2014 -0400

    Revert "create b.txt"

    This reverts commit 31fec743e007f94eb4738d1108c79b38dfa6cff0.

commit 4323ac0c5bccda28fc263ca7c8ff9d4d9f88a14c
Author: James Lavin <james@fakedomain.com>
Date:   Thu Mar 13 16:59:26 2014 -0400

    Revert "create c.txt"

    This reverts commit b59b5ecddc5284358da38635dc0829f629be11a7.

commit b59b5ecddc5284358da38635dc0829f629be11a7
Author: James Lavin <james@fakedomain.com>
Date:   Thu Mar 13 16:58:42 2014 -0400

    create c.txt

commit 31fec743e007f94eb4738d1108c79b38dfa6cff0
Author: James Lavin <james@fakedomain.com>
Date:   Thu Mar 13 16:58:19 2014 -0400

    create b.txt

commit a9d88ae06cedf5296297705142020a5264c839b8
Author: James Lavin <james@fakedomain.com>
Date:   Thu Mar 13 16:57:56 2014 -0400

    create a.txt

To check again, I ran git diff a9d88ae06cedf52 and got blank output, indicating that my working tree was back to its state after the first commit.

To triple check, I ran ls and saw only the file I added to Git in the first commit:

a.txt

Posted by James on Mar 13, 2014

Banging my head over a Chef cookbook glitch and inflexible Librarian

After frustration over the deprecation of Berkshelf 2 (“WARNING: It is advised at this time that you use Berkshelf 3. Berkshelf 2 is no longer being actively developed and has a number of significant issues related to dependency resolution that Berkshelf 3 fixes”), deprecation of Vagrant Berkshelf and the minimal documentation of still-in-beta Berkshelf 3, I decided to give librarian-chef a try.

Librarian initially seemed to work beautifully. But it fell down when it hit a bug in a cookbook I was trying to use:

================================================================================
Recipe Compile Error in /root/chef-solo/cookbooks-2/postfix-dovecot/recipes/default.rb
================================================================================

ArgumentError
-------------
You must supply a name when declaring a package resource

Cookbook Trace:
---------------
  /root/chef-solo/cookbooks-2/postfixadmin/recipes/default.rb:60:in `from_file'
  /root/chef-solo/cookbooks-2/postfix-dovecot/recipes/postfixadmin.rb:22:in `from_file'
  /root/chef-solo/cookbooks-2/postfix-dovecot/recipes/default.rb:23:in `from_file'

Relevant File Content:
----------------------
/root/chef-solo/cookbooks-2/postfixadmin/recipes/default.rb:

 59:  
 60>> package pkg_php_mbstring do
 61:    not_if do pkg_php_mbstring.nil? end
 62:    action :install
 63:  end

For my OS, pkg_php_mbstring is nil. But that means “package pkg_php_mbstring” translates to “package nil,” which is illegal.
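
To illustrate why the not_if guard inside the block can't help here, the plain-Ruby sketch below uses a stand-in package method. It is not Chef's real implementation, only an assumption-laden imitation of the behavior shown in the error above: the name argument is checked before the block is ever evaluated. declare_optional_package is likewise a hypothetical helper name.

```ruby
# Stand-in for Chef's package resource, for illustration only (hypothetical):
# like the error shown above, it rejects a nil name as soon as it is called,
# before the block (and any guard inside it) gets a chance to run.
def package(name)
  raise ArgumentError, "You must supply a name when declaring a package resource" if name.nil?
  yield if block_given?
  name
end

# Hypothetical helper: skip the declaration entirely when there is no
# package name, instead of guarding from inside the block.
def declare_optional_package(name)
  return nil if name.nil?
  package(name) { }
end
```

With this stand-in, package(nil) raises ArgumentError no matter what the block contains, while declare_optional_package(nil) quietly does nothing. In a recipe, the same idea could presumably be expressed by wrapping the whole package declaration in a check that pkg_php_mbstring is not nil, though I'm assuming here that the cookbook has no other reason to declare the resource.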

Fixing this tiny bug would be easy if I controlled the cookbook, but Librarian does. I cloned the GitHub repo and modified the offending lines, but Librarian wouldn’t let me use my version of the cookbook because it’s included by another cookbook, which is in turn included by another cookbook. I would have to clone them all and redefine the entire dependency chain to get my tiny bug fix in.

This is a known weakness of Librarian:

one of the major annoyances with Chef: since librarian-chef managed these cookbooks and the directories
in which they lived, the workflow for editing them was tedious and hackish. Editing stuff you don't own
[involves]:
* clone down other cookbook
* move librarian-managed cookbook somewhere else
* symlink your cloned cookbook in its place so knife can find it
* knife cookbook upload <cookbook>
* whack moles
* git commit
* remove the symlink to the cloned cookbook
* put the librarian-managed cookbook back where it was (or delete it)
* bundle exec librarian-chef update <- to update your Cheffile.lock to have the right version of the
  cookbook you just edited
* bundle exec librarian-chef install <- to install the version you just specified

There are ways to make the above process shorter such as scripting steps 2, 3, 7, and 8 as well as
saving steps 9 and 10 until you are totally done working. And, to be fair, librarian-chef served the
very important purpose at one time. However, the process of editing stuff you don't own is still
tedious and less than ideal.

I really want to love Chef, but I keep hitting issues like this. Sigh.

Posted by James on Mar 04, 2014