PURVEYOR OF FINE WORDS

September 29, 2008

Spammers

The spammers are getting tricky: they’ve somehow managed to edit my WordPress templates to include hidden UL elements with spam links now… yeesh.

One lonely comment

December 23, 2007

The best page in the universe

According to Google, I am the second-most popular “purveyor of [insert genre here]” in the world, bested only by the purveyor of the world’s finest teas, Upton Tea Import. Being second in this list is lamentable, but under the circumstances not a terrible position considering that I have a better rank than the leading purveyor of fine needlework and supplies, and the purveyor of EarthBalls and Giant Globes. Gloating aside, how the moniker “purveyor of” came into being merits some discussion. C.M. recently asked,

You use the line “purveyor of fine words.” Before commandeering this line, did you look into its etymology? For example, what is correct “fine purveyor of…” or “purveyor of fine…”? Oddly, there is not much online by way of a discussion. There are of course several instances of people using the phrases both ways. I did come across a book about the history of purveyance and it talked about “fine purveyors” as those who procured better cuts of meat or poultry, as opposed to the “coarse purveyors.” However, these days, everyone claims to be a “purveyor of fine something”. I just wonder if they are interchangeable or if one is more correct than the other. For obvious reasons, you seemed like a good person to ask, being a self-titled “purveyor of fine words” and all.

Well, I chose the tagline ‘purveyor of fine words’ as a response to the typical self-deprecating blog name that is so common these days, the kind that mixes and matches words like ‘rambling’, ‘thoughts’, ‘random’, ‘drivel’, ‘brain farts’. I subscribe to one blog that is titled “Continuing Intermittent Incoherency”, which sounds like the author picked up some kind of Mad-Lib for blog names for inspiration. “Randomised nonsense” and “The Solipsistic Sayings of a Random Infidel” also seem to have been derived from the same template.

Perhaps these titles are a byproduct of today’s disclaimer-ridden society, where consumers are too moronic to realize that a cup of coffee contains scalding hot liquid, or that a pack of peanuts “may contain nuts”, or that power tool enthusiasts should not “attempt to stop a chainsaw with [their] hand”. In the online world, this warning zealotry translates into prefacing statements with redundant acronyms like FWIW or IMHO, which authors use to ostensibly indemnify themselves against criticism. “IMHO, you’re nothing but a fucktard and the best part of you ran down the crack of your momma’s ass”, becomes a quaint jest I suppose. In order to buck this trend, I opted to go big instead and inflate myself to gourmet proportions, and thus I promoted myself to a purveyor of fine words.

In response to C.M.’s question, I don’t have any more insight into the etymology of the phrase, as mine merely parodies Dean & Deluca’s tag line of “Purveyors of Fine Foods and Kitchenware”. I would say that “purveyors of fine…” is much more prevalent than “fine purveyors…” insofar as it’s difficult to explain the difference between a “purveyor” and a “fine purveyor” (maybe the purveyor is very attractive?), whereas the difference between “food” and “fine food” immediately conjures up contrasting images of corn dogs and Iranian caviar.

4 editorials

July 16, 2007

IE does not bubble form <select> element onchange events

When developing dynamically generated forms, you often want to attach a single event handler to the main form object, and have that handle the events generated by the form elements, thus saving you the trouble of constantly attaching event handlers to newly generated elements. However, IE 6 and 7 do not bubble the onchange event beyond the originating select element, meaning that you have to explicitly attach an onchange handler to every select you generate. All other current browsers bubble the event properly.

Here is a test form for checking if your browser registers the onchange event beyond the firing select element. Changing the select options should trigger an alert dialog box.

onchange listener attached to parent <div> node

onchange listener attached to parent <form> node

onchange listener attached to actual <select> node
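The two approaches can be sketched roughly as follows, assuming plain DOM scripting with no library. The helper names (changedSelect, delegateChange, bindChangeDirectly) are hypothetical and are not taken from the test form above; choosing between the two paths for a given browser is left to the caller.

```javascript
// Hypothetical helper: return the <select> that fired the event, or null.
function changedSelect(event) {
  // Older IE exposes the originating node as srcElement instead of target.
  var node = event.target || event.srcElement;
  if (node && node.tagName && node.tagName.toLowerCase() === "select") {
    return node;
  }
  return null;
}

// Delegation: one handler on the parent form covers every select inside it,
// including selects generated later. This relies on the change event bubbling.
function delegateChange(form, callback) {
  form.onchange = function (event) {
    var select = changedSelect(event || window.event);
    if (select) {
      callback(select);
    }
  };
}

// Fallback for browsers that do not bubble the event (IE 6 and 7, per the
// post): attach the handler to each select as it is generated.
function bindChangeDirectly(select, callback) {
  select.onchange = function () {
    callback(select);
  };
}
```

With delegation, newly generated selects need no extra wiring; with the fallback, bindChangeDirectly has to be called once for every select that is created.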

2 editorials

January 14, 2007

Concise Adblock Filter Set Explained

Adblock is the single most useful Firefox plugin available today. Just like watching sitcoms with automatic commercial-skip, adblock’s banner ad suppression system elicits a smug sense of satisfaction even after browsing through your 10,000th ad-free web page. However, a huge barrier to adoption seems to be the lack of a default filter set, so when you first install adblock, nothing happens.

The main issue is that adblock does not have any intelligence as to the content that is included with a webpage; it is just a generic regex-based filter system, so it is only as effective as the filters that you provide. There are plenty of pre-made lists available but they tend to be overly aggressive in what is suppressed, resulting in occasional broken pages and/or pages that dead-end because adblock has removed the “Next” button. The most dangerous public set seems to be the EasyList, which has a 360+ item block list. Evidence that the creators know of its greedy nature is their inclusion of a 20+ item whitelist to manually compensate for what was initially blocked. Even more unstable is the EasyElement list that searches through the DOM to remove suspected elements directly from the main document, using a list of 570+ substrings to search for.

Instead of using such a large, reactive list of simple and site-specific string matches that tries to suppress 100% of ads, I posit that you only need 2 adblock filters to eliminate 70-80% of ads, and still be confident that legitimate content isn’t being flagged as a false positive. By getting into the heads of HTML writers, we can pick out the most common patterns used to include ads and create regex patterns to suppress the ads.

  1. /(\b|_)ad(x|s?)(\b|_)/
    This regex looks for any element that contains the string ‘ad’, ‘ads’, or ‘adx’ bounded on each side by a word boundary or an underscore, because the vast majority of web sites partition their ads into a single directory or serve them through a single script. The boundary check is crucial to this filter because just searching for the characters ‘ad’ would also match ordinary words that happen to contain them. Instead, the boundary restriction means that adblock will suppress elements that contain strings like ‘ads.server.com’ or ‘www.server.com/ads/’ or ‘server.com/ad_server.php’, but not ‘adobe.com’ or ‘server.com/adjustment’.
  2. /ad.*\d+[xX]\d+/
    This regex exploits the common technique of ad designers to use the image dimensions in their element name, e.g., “server.com/newads.php?location=top&size=468x80”. Like the previous rule, we don’t just exclude any element that has dimensions, but qualify that by searching for the string ‘ad’ as well.
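
To see how the two filters behave, here is a small sketch that runs them as JavaScript regex literals against the sample addresses mentioned above. The isSuppressed helper is hypothetical and only approximates how adblock applies a filter to an element's address.

```javascript
// The two filters from the list above, written as JavaScript regex literals.
var adWord = /(\b|_)ad(x|s?)(\b|_)/;
var adWithSize = /ad.*\d+[xX]\d+/;

// Sample addresses taken from the examples in the text.
var samples = [
  "ads.server.com",
  "www.server.com/ads/",
  "server.com/ad_server.php",
  "adobe.com",
  "server.com/adjustment",
  "server.com/newads.php?location=top&size=468x80"
];

// Hypothetical helper: true if either filter matches the address.
function isSuppressed(address) {
  return adWord.test(address) || adWithSize.test(address);
}

for (var i = 0; i < samples.length; i++) {
  var verdict = isSuppressed(samples[i]) ? "blocked" : "allowed";
  console.log(samples[i] + " -> " + verdict);
}
```

The first three samples are caught by the first filter, ‘adobe.com’ and ‘server.com/adjustment’ pass through, and the last sample is caught only by the second filter, since ‘newads’ has no boundary in front of ‘ad’.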

At this point, your browsing experience will be significantly improved, but you can bump up your block rate to about 80-90% with a few more simple substring matches. There are many well known ad providers that exist solely to deliver ads, so we can consolidate those in composite filter rules:

  1. /a(2\.yimg|dserv|dvert|tdmt|twola)/
    This rule collects all the ad serving systems that start with ‘a’: Yahoo, Atlas, AOLTimeWarner, and generic ad serving systems.
  2. /b(anners|logads)/
    falkag.net
    These pick up anything labeled with ‘banners’, the ‘blogads’ network, or Falk AdSolutions.
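
As a rough illustration of what the composite rules cover, the sketch below checks a few addresses against them. The addresses are made-up examples, the matchesProviderRule helper is hypothetical, and treating the plain string ‘falkag.net’ as a substring match is an assumption about how a non-regex filter is applied.

```javascript
// Composite rules from the list above. Regexes are tested as-is; the plain
// string is treated as a substring match (an assumption for this sketch).
var providerRules = [
  /a(2\.yimg|dserv|dvert|tdmt|twola)/, // a2.yimg, adserv, advert, atdmt, atwola
  /b(anners|logads)/,                  // banners, blogads
  "falkag.net"
];

// Hypothetical helper: true if any provider rule matches the address.
function matchesProviderRule(address) {
  for (var i = 0; i < providerRules.length; i++) {
    var rule = providerRules[i];
    var hit = typeof rule === "string"
      ? address.indexOf(rule) !== -1
      : rule.test(address);
    if (hit) {
      return true;
    }
  }
  return false;
}

// Made-up addresses for illustration only.
console.log(matchesProviderRule("adserver.example.com/top.gif"));
console.log(matchesProviderRule("example.com/images/banners/1.gif"));
console.log(matchesProviderRule("example.com/about.html"));
```

Factoring out the shared first letter keeps each rule to one line while still matching five separate substrings.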

Realistically, reducing the ad load by 90% should be more than sufficient for anyone. Chasing that last 10% — and whitelisting the collateral damage — will always be a losing battle. Your time is better used reading the content that is on the page you requested in the first place.

3 editorials

September 9, 2006

Positions filled

Effective immediately, I have a new title at work — actually 6 new titles…

Internets Strategerist

Sr. Tube Developer

The Decider

Guapo

Scrabblista

Assistant to the Regional Manager

Bonus points if you can match all the cards with their respective references:

  • The Office
  • Senator Ted Stevens
  • G.W. Bush & Will Ferrell
  • Victor’s Taqueria
  • G.W. Bush
  • My desk at work

5 editorials

August 21, 2006

Yum install GPG error

When using yum install, sometimes the old GPG keys installed with rpm are obsolete, resulting in an error like the following:

warning: rpmts_HdrFromFdno: Header V3 DSA signature: NOKEY, key ID db42a60e
public key not available for autoconf-2.59-5.noarch.rpm
Retrieving GPG key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora
The GPG key at file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora (0x4F2A6FD2)
is already installed but is not the correct key for this package.
Check that this is the correct key for the "Fedora Core 4 - i386 - Base" repository.

To fix, just add the new keys to rpm (changing the path for your particular install):

rpm --import /etc/pki/rpm-gpg/RPM*

Some forum posts have suggested disabling GPG (i.e. gpgcheck=0), which can be a foolish thing to do. You want to maintain some level of assurance that what you’re downloading is legit.

2 editorials

July 1, 2006

del.icio.us direc.tor update

del.icio.us has changed their API host, which breaks the current direc.tor because of XMLHttpRequest’s same-domain security policy. To fix this, the del.icio.us guys have added a couple of public pages to allow direc.tor to continue functioning. Now, you must first browse to https://api.del.icio.us before starting the direc.tor bookmarklet. Let me know if you are having any issues.

10 editorials

May 24, 2006

How to create PHP extensions (aka .so objects)

Although PHP has a great library of functions, many of them are not included in the standard build, or haven’t been included in the popular package installers like yum or apt-get. The man page doesn’t leave you with much instruction, other than something like “compile PHP with the flag --with-pspell[=dir]”. At this point you have 2 options:

  • Recompile PHP
    Find your existing PHP configure command, append this new flag, and recompile PHP. This takes a while and is generally quite bothersome, if not unacceptable, such as when you’re in a production environment.
  • Create a dynamic extension
    Compile a separate file (usually ending in .so) that you copy into a PHP directory, and edit php.ini. If you are running multiple machines on the same OS, you can just copy the file to all those machines as well. Much easier, and you can turn it on and off at will.

Here’s how to create the extension for modules that appear in the PHP manual on a Linux-based system (for third-party extensions, it’s most likely the same).

  1. Check that you have the PHP development package, which often comes in a separate package. Yum lists it as php-devel. You’ll need its components in a few steps.
  2. Download the PHP source code for whatever version you’re currently running
  3. In the source code, there is an ext/ directory that should contain a subdirectory for the module that you’re looking for. Change to that subdirectory, e.g. ext/pspell/.
  4. Type phpize
    (This won’t work if you didn’t verify step 1)
  5. Type ./configure --with-pspell=/usr
    Replace the --with-pspell=/usr portion with the text that is specified in the PHP man page for the module you want. For example, MySQL improved would be something like --with-mysqli=/usr/local/mysql/bin/mysql_config. Be aware that the path is sometimes a base directory, and sometimes needs to point to a specific file. Read the PHP docs carefully.
  6. Type make
  7. When finished, the compiler should tell you where it created the .so file (most likely in the modules/ subdirectory of your current location). Copy the .so file to your PHP extensions directory, e.g. /usr/lib/php/modules. If you don’t know where this is, it’s listed in your php.ini file under the extension_dir parameter. You’ll need root access to do this.
  8. Finally, tell PHP about your new extension by adding one line to php.ini:
    extension=pspell.so
    Alternatively, if you already have a bunch of extensions installed, you can place it in your /etc/php.d directory in its own ini file for a cleaner installation approach.
  9. Restart Apache, if you’re using it
  10. Check phpinfo() to verify that your new module is installed

3 editorials

May 24 2006

Ajax spell check as you type

I’ve been looking for a nice web JavaScript spell checker, and came across a great implementation by Emil that he named LiteSpellChecker. It mimics the spell checker in MS Word by underlining misspelled words and presenting a nice substitute word selection menu. The JavaScript takes a standard <textarea> element, erases the background, and inserts a shadow <div> underneath that holds the redline segments.

[Screenshot: spell check in Firefox on OS X]

Bug Fixes (based on 2005-7-24 version of LiteSpellChecker)

The current implementation on his site has some bugs, so I’ve started to tackle some of them:

  • Fixed: Redlines get misaligned when scrolling with arrow keys in Firefox
    When scrolling long text using Firefox, the redlines become misaligned as you move towards the bottom of the page. Once they begin to misalign, you have to manually refresh in order to get it right again.
  • Fixed: Ignore word function breaks if there are non-word segments in the text
    If the text contains characters like ++ or @@, then the ignore words function fails. This problem crops up whenever you have wiki markup syntax.
  • Fixed: Numbers are flagged as misspelled
    Any word that contains a number is now ignored by the spell checker.

Performance Improvements

Since many of my users will be working with long documents, performance is key.

  • Sped up ignore function by almost 5x
    The ignore word function loops through all words and feeds them back into the spellchecker. In long documents, the lag becomes noticeable, and also causes a flicker of all the redlines. Instead of blindly looping through all words, I changed it to a case-insensitive search for the ignored word, and only processed the matching words. Venkman reports a speed up from 278.2ms to 57.95ms on a PowerBook G4.
  • Improved responsiveness while editing long text
    Since the spell checker updates the spelling status on every keystroke, the UI becomes unusable when editing long text. I added a spell check delay that suspends the spell checker while typing continuously. You can adjust the sensitivity via the SPELL_CHECK_DELAY variable.
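
The two changes above could look something like the sketch below. This is not the code from the download, just an illustration of the ideas: findIgnoredWordIndexes, delaySpellCheck, and the 500 ms value are hypothetical, and only the SPELL_CHECK_DELAY name comes from the description above.

```javascript
// Hypothetical value; the post names SPELL_CHECK_DELAY but not its default.
var SPELL_CHECK_DELAY = 500; // milliseconds of idle time before checking

// Case-insensitive search: return the positions of the words that match the
// ignored word, so only those need to be re-processed.
function findIgnoredWordIndexes(words, ignoredWord) {
  var target = ignoredWord.toLowerCase();
  var matches = [];
  for (var i = 0; i < words.length; i++) {
    if (words[i].toLowerCase() === target) {
      matches.push(i);
    }
  }
  return matches;
}

// Default timers, wrapped so they are always called as plain functions.
var realTimers = {
  set: function (fn, ms) { return setTimeout(fn, ms); },
  clear: function (id) { clearTimeout(id); }
};

// Returns a keystroke handler that restarts a timer on every call, so the
// spell check only runs once typing has paused for `delay` milliseconds.
// The optional `timers` argument exists only to make the sketch testable.
function delaySpellCheck(runSpellCheck, delay, timers) {
  timers = timers || realTimers;
  var pending = null;
  return function onKeystroke() {
    if (pending !== null) {
      timers.clear(pending);
    }
    pending = timers.set(function () {
      pending = null;
      runSpellCheck();
    }, delay);
  };
}
```

In a page this would be wired up with something like textarea.onkeyup = delaySpellCheck(checkSpelling, SPELL_CHECK_DELAY), where checkSpelling stands in for whatever function performs the actual check.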

Download my source code. Important: read Emil’s demo page before attempting to do anything with these files.

5 editorials

December 7, 2005

Flickr Prints Review

I ordered a couple prints from Flickr’s new print service to check out the quality. Here are my comments:

  • Wait time: 8 days via USPS: 11/28 - submitted order; 11/30 - order shipped; 12/7 - order received; The package was delivered from Norcross, GA, so I’m assuming that the west coast has the longest wait.
  • Package contents: index print, photos, Flickr sticker, “Inspected by #179” note.
  • Print quality: The glossy 4×6 prints I ordered were printed on Fujifilm Fujicolor Crystal Archive paper, most likely by a Frontier laser system. The photo quality is what you’d expect from this standard pairing, and works great for snapshot prints. No Ofoto-style color tweaking going on here.
  • Metadata: Neither the index print nor the text on the back of the photos shows the title that you enter into Flickr, or the EXIF capture date. Only the date of development is printed, along with “Yahoo!_Flickr” on the back, which is not too helpful.

Flickr print service sample

Overall, a well-executed service that has been long awaited on Flickr. I’ll probably try a matte print and larger sizes soon.

One lonely comment
