AppleScript: Getting unique items in a list update

I love getting questions about the contents of, or topics related to, my site. Most recently, I was emailed a question about one of the older functions I have in the AppleScript section. In particular, it was the one for getting unique items in a list. Here’s the function…

on GetUniqueItems(sourceList)
	set itemCount to (get count of items in sourceList)
	set compiledList to {}
	--get the first item to kick off the list
	repeat with x from 1 to itemCount
		set itemFound to false
		set itemX to item x of sourceList
		if x < itemCount then
			repeat with y from (x + 1) to itemCount
				set itemY to item y of sourceList
				if itemY is itemX then set itemFound to true
			end repeat
		else
			repeat with y from 0 to (itemCount - 1)
				set itemY to item y of sourceList
				if itemY is itemX then set itemFound to true
			end repeat
		end if
		if itemFound is false then
			set end of compiledList to itemX
			exit repeat
		end if
	end repeat
	--if no items are found
	if (get count of items in compiledList) is 0 then
		return compiledList
	end if
	--find the rest of the unique items
	repeat with x from 1 to itemCount
		set itemFound to false
		set itemX to item x of sourceList
		set resultCount to (get count of items in compiledList)
		repeat with y from 1 to resultCount
			set itemY to item y of compiledList
			if itemY is itemX then set itemFound to true
		end repeat
		if itemFound is false then set end of compiledList to itemX
	end repeat
	return compiledList
end GetUniqueItems

The question was focused on why I go through the source list more than once. As soon as I looked at the function again after reading the question, I knew they were right that something was wrong. My answer essentially explained that this was one of the first useful home-brewed functions I wrote, and since it worked, it stuck, as working code is wont to do. But, honestly, I’ve reviewed this code a dozen times and I am completely baffled as to how it works. I think there is even a whole block in there that could come out without changing anything.

I started writing my first AppleScripts in 2005, which was also my first serious foray into programming. The last time I had thought about if...then statements was in high school, writing BASIC for the Commodore 64. This function, according to my notes, was written in 2007, when my needs and skills were becoming more robust. It is still in use in several scripts today with nary an error. But with nine more years of experience behind me, the function is immediately and absolutely cringe-worthy (though only to a point, considering when I wrote it), so I rewrote it. Et voilà…

on getUniqueItems(src)

	set srcCount to (count src)

	set unq to {}

	repeat with x from 1 to srcCount

		set srcItem to item x of src

		set unqCount to (count unq)
		set match to false

		repeat with y from 1 to unqCount
			set unqItem to item y of unq
			if srcItem = unqItem then
				set match to true
			end if
		end repeat

		if match is false then
			set end of unq to srcItem
		end if

	end repeat

	return unq

end getUniqueItems

Hindsight being 20/20 and all that, this is a “duh!” moment. There are a couple of important things to note about this.

First, my test data for these types of functions is reliable but small. The nested loops make this O(n^2), which is on the low end of polynomial time, but AppleScripts very rarely deal with data sets large enough for O(n^k) to have enough of an impact that I could go get a coffee and a sandwich while waiting. My personal preference, borne out by experience, is that if that were the case, I would need to go find a more appropriate tool for data prep.

Second (and last), this block…

set unqCount to (count unq)
set match to false

repeat with y from 1 to unqCount
	set unqItem to item y of unq
	if srcItem = unqItem then
		set match to true
	end if
end repeat

if match is false then
	set end of unq to srcItem
end if

…could be replaced with this common AppleScript idiom…

if unq does not contain srcItem then
	set end of unq to srcItem
end if
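For reference, here is a minimal sketch of how the whole handler might look with that swap made. The handler name getUniqueItemsWithContains is made up for this example; the rest follows the rewritten function above.

```applescript
-- Sketch only: the handler name is hypothetical, not one of the site's functions.
on getUniqueItemsWithContains(src)
	set unq to {}
	repeat with x from 1 to (count src)
		set srcItem to item x of src
		-- "contains" does the inner search that the explicit loop did by hand
		if unq does not contain srcItem then
			set end of unq to srcItem
		end if
	end repeat
	return unq
end getUniqueItemsWithContains

getUniqueItemsWithContains({"a", "b", "a", "c", "b"})
-- result: {"a", "b", "c"}
```

One thing to keep in mind: AppleScript ignores case when comparing text by default, so "A" and "a" count as the same item here unless the comparison is wrapped in a considering case block.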

The only problem with this, as I see it, comes when comparing custom data types as opposed to core data types. It would be great if I only ever worked with AppleScript’s core data types, like string, number, date, and the like. But almost all of my AppleScript code has been targeted at Adobe’s Creative Cloud, which brings a wealth of custom objects with loads of properties to work with. I think leaving in the extra code (and accepting any possible hit on speed, since the explicit loop is not baked into the language the way contains is) is reasonable for the sake of easy customization later. By way of example, this…

-- compare the items themselves
if srcItem = unqItem then

…becomes this in a pinch…

-- compare object properties
if foo of srcItem = foo of unqItem then

…or even…

-- deep comparison
if my customCompare(foo of srcItem, foo of unqItem) then
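To make those last two variations concrete, here is a rough sketch with the comparison factored out into its own handler. Everything specific in it is an assumption for illustration: the record properties foo and bar, the handler name getUniqueItemsByFoo, and the body of customCompare, which here does nothing fancier than a case-sensitive match. Real Creative Cloud objects would need their own property checks.

```applescript
-- Hypothetical comparison: two values match only if they are equal, case included.
on customCompare(valueA, valueB)
	considering case
		return valueA = valueB
	end considering
end customCompare

-- Hypothetical variant of getUniqueItems that keys uniqueness on the foo property.
on getUniqueItemsByFoo(src)
	set unq to {}
	repeat with x from 1 to (count src)
		set srcItem to item x of src
		set match to false
		repeat with y from 1 to (count unq)
			set unqItem to item y of unq
			if my customCompare(foo of srcItem, foo of unqItem) then
				set match to true
				exit repeat -- stop looking once a match is found
			end if
		end repeat
		if match is false then set end of unq to srcItem
	end repeat
	return unq
end getUniqueItemsByFoo

getUniqueItemsByFoo({{foo:"a", bar:1}, {foo:"A", bar:2}, {foo:"a", bar:3}})
-- keeps the first two records; the third repeats foo:"a"
```

Only the one comparison line differs from the rewritten function, which is the point of keeping the explicit loop around.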

So, a bit of extra code for the win. I suppose I could set up a hash table implementation to improve upon the O(n^2) to O(n^k) range of complexity, but with AppleScript work, again, it’s really not worth it.
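For the curious, if a hash-based approach ever did become worth it, one option that avoids hand-rolling a hash table is to lean on Foundation through AppleScriptObjC. The sketch below is an assumption on my part rather than anything used in the scripts above: it needs OS X 10.10 or later, the handler name is hypothetical, and it only suits values that bridge cleanly to Cocoa (text, numbers, and the like), not references to application objects.

```applescript
use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

-- Sketch only: the handler name is hypothetical.
on getUniqueItemsViaOrderedSet(src)
	-- NSOrderedSet drops duplicates while keeping first-seen order
	set orderedSet to current application's NSOrderedSet's orderedSetWithArray:src
	return (orderedSet's array()) as list
end getUniqueItemsViaOrderedSet

getUniqueItemsViaOrderedSet({"a", "b", "a", "c", "b"})
```

Unlike AppleScript's own text comparison, the Cocoa side is case-sensitive, so "A" and "a" would be kept as separate items.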

That was a really great question on a number of levels. It shows that people actually read the site on occasion and find something useful, which is the core goal of the site (this blog is really more of a place to vent that offers me more flexibility than other blogging sites or social media). It also compelled me to review and improve old code and to see just how far I have advanced over the years. Win-win.