I've spent a bit of time this week thinking about QuickSort. More specifically, I've been thinking about the complexity of the Quicksort worst case. On an already sorted set, Quicksort has a complexity of O(n^2), which I guess I understand on a basic level. Generally speaking, when the set is already sorted, Quicksort needs to compare every element to every other element each time it partitions the set. Partition n times, doing n comparisons, then you have n^2 comparisons. Makes sense, but instead of just leaving it, I've been trying to grasp it more definetly. In my mind, the comparisons on subsequent sets should be getting smaller. ie n-1, n-2, n-3 comparisons and so on.
I thought perhaps coding up a measurement of the complexity would help me wrap my head around it.
After a bit of messing around trying to decide what should be part of the complexity measurement, you'll see I decided on counting the check on the length of the array, and the greater than and less than checks on values. It all seemed to make perfect sense, but my numbers kept coming out wrong.
On an already sorted array of length 3, instead of getting back 9 operations, I got 11. A set of 6? 41 operations. It was making me insane, but then I realized that the actual complexity of the worst case ended up being O(n^2 + (n - 1)). The n-1 comes from the length checks on all of the empty sets. This promptly made me remember the way big O notation is handled, where the slowest growing elements of the complexity are discarded. What is the growth of n-1 in comparison to the speed of n^2? Insignificant.
It was a nice bonus to figure that all out, but I still don't think I'm satisifed. The issue with n-1 should be a lesson that understanding the complexity of these algorithms is less about actual exact measurement, and more about mathematical understanding of the situation. So, while I think I "get" it enough to get by, I'm going to spend a bit of time re-familiarizing myself with the math.
Anyway, I put the code up here. There is also an implementation of the in-place version of Quicksort there as a bonus.
As a quick aside, I realize that picking the first element in the array as a pivot is bad practice, as it often triggers worst case behavior. That was my intention here.
Update: I've spent a bit more time reading after writing this post, and I've come to the following understanding of Quicksort: in the best case, we evenly divide the array into roughly half the size, so we recursively build log n levels before we reach our base case, a list of size one. On each level, there is O(n) amount of work done, which gives us the case, O(n log n). By the same token, when we encounter the worst case, we must recursively divide the set n times before we get to the base case, giving us n levels with O(n) complexity, coming out to the worst case O(n^2).
Instead of a blog post here, this week I opted to write about bread over at my (and Hillary's) cooking blog: Us Versus Dinner. If you're into that sort of thing, or just like looking at pictures of tasty bread, give it a look. I'll be back here next week with something more appropriately nerdy.
Version control is the software miracle that nobody seems to know about until they get their first job. I've read plenty of forum posts from computer science students asking "What don't they teach you in school?", and inevitably the first response is about version control. It was true for me.
Even for solo development VC is great, and since I've known about it, I've hosted all of my code in a private Subversion repository. This makes working on my code from anywhere easy, and has the added benefit of keeping a backup of my work.
Just as I was becoming comfortable with Subversion, newer distributed version control systems popped up. As the hype around both Git and Mercurial grew, I decided it would be in my best interest to bring myself up to speed. At work we considered switching to Mercurial, mainly on the grounds that it had wider operating system compatibility, so I created myself a Bitbucket account to aid in learning. At the time, Github was exclusively git, and Bitbucket exclusively mercurial.
Both Bitbucket and Github brand themselves as "social coding" websites, but even as I began to use my Bitbucket for all my non-private projects, I never really thought about it. My main reasoning for using it was as an added facet to my resume. Because of the nature of programming, it is sometimes difficult to express your ability level, especially in an interview. Being able to point someone at a big pile of code you've written is helpful.
I really discovered why the "social" aspect of these sites is a big selling point a couple of weeks ago. I posted about GreyHolder, a little jQuery plugin that I wrote and released on my Bitbucket. What I didn't realize was that is contained a bug that could potentially throw away user input. Not very cute behavior for something that is supposed to make life easier on the user. I didn't notice the bug, but my friend Dan did. Because the code was on Bitbucket, he was able to grab it, fix the bug and then send me a pull request to grab the fix. That meant that all I had to do to fix the problem was click a button accepting his new code.
I don't know if you've ever fixed a bug with the press of a button, but it's great, and it should be (and is) the main selling point for code hosting sites. That capability is the reason I'm going to be using code hosting for all my future projects, even the private ones.
I've never been the biggest fan of Facebook. I have an account, and I check in on it occasionally, but it has never really rubbed me the right way. Why that was the case wasn't clear to me until recently.
Facebook reminds me of AOL. When internet penetration was really ramping up, but before broadband was widely available, AOL was the way that almost everyone got online. AOL wasn't (isn't?) just internet access, it was an entire monolithic internet user experience. For most people, AOL email, AOL chatrooms, AOL instant messaging and AOL content was the internet, and the AOL internet was not very compatible with the rest of the internet. This is what Facebook reminds me of.
Facebook seems to be aiming for the same all encompassing online experience. Facebook can feasibly replace your email, twitter, instant messager, blog, photo sharing service and any online games you might play. There have also been changes recently to try and make facebook your source for news. I suppose all of this is convenient for the user, but I don't like having my entire online presence tied up with one company. Not only does it give them a lot of power, but it nudges you into using whatever implementation of a given technology they are pushing, with no mind to quality. I don't mind facebook keeping track of my friends comings and goings, but when it comes to disseminating information, I prefer twitter. I'd rather maintain my own blog and use a third party instant messaging client because while those service are on facebook, other services do them better.
When I look at the online presences of prominent technology people, they also seem to be spread across different services, picking and choosing the ones that they like, and unlike facebook, I can still check them out without strapping into the service they use. While part of the reason people are so locked into facebook is the lower technological bar for entry, I think as people become more savvy and other services become easier, facebook will start to lose some traction.
Now, AOL is still around, but in an extremely hobbled form. Will the same thing happen to facebook? It doesn't seem super likely, but at the peak of AOL, it didn't seem very likely they were going very far either. AOL had the misfortune of being struck down by broadband, and maybe facebook will have it's own broadband come and strike it down. Perhaps everyone will just leave. Tough to say.
The other day, in line with my goals for the year, I was working on my family gift distribution project White Elephant. One of my goals for the project has been to push myself toward better page and user interaction design. I've always been a back end sort of guy, telling myself that because I had the know how to code things that I had no reason to care about how those things looked or felt to a user. I've come to realize that assumption is pretty stupid. Taking time to not only make things look pretty, but also make them easy to use is what sets apart outstanding projects from the crummy ones. At this point, I'm a ways from creating the cool things that I want, so actively making myself practice is a step in the right direction.
So back to the main point. I was putting a very stripped down form into a confined space, and instead of having a label next to each input, I thought it would be neat to have some greyed out text that told the user what went in the box. When they clicked the input, the text would disappear and the text they typed would look normal, not greyed out. If you clicked away without typing anything, the original hint text would come back. It looks like this: