Sorting and efficiency are two central topics in computer science, yet they manage to be overlooked by most computer scientists of the 21st century. Sorting simply means organizing objects in a particular order, generally ascending or descending, and efficiency is a measure of performance with respect to time; in computer science we apply it to programs. This week we combined the two in order to see how something we take to be trivial (sorting) is actually measured, and to pit different sorting algorithms against one another in terms of efficiency.
Before we can analyze anything, we need to be aware of these "sorting algorithms". It turns out Wikipedia lists a plethora of sorting algorithms, including bubble, insertion, selection, quick, merge, and Timsort; that list is roughly ordered from least to most efficient. But what does it mean to be efficient? As computer scientists we measure efficiency by how the number of basic steps an algorithm performs grows with the size of its input. This "time complexity" is written in a notation called "Big-O", which can describe an algorithm's worst, average, or best expected running time. The most common orders we use to analyze programs are O(1), O(lg n), O(n), O(n lg n), O(n^2), O(2^n), and O(n!), listed here from most efficient to least efficient: the slower the function inside the brackets grows, the better the algorithm scales.
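To make that ordering concrete, here is a small sketch of my own (not from lecture; the helper name growth_table is made up) that prints a few of these growth rates for increasing n. The exponential and factorial classes are left out only because their values dwarf everything else even for tiny n.

import math

def growth_table(sizes=(10, 100, 1000)):
    # Print how a few common complexity classes grow as n increases.
    # O(2^n) and O(n!) are omitted because they explode even for small n.
    for n in sizes:
        print(f"n={n}: O(1)=1, O(lg n)={math.log2(n):.0f}, O(n)={n}, "
              f"O(n lg n)={n * math.log2(n):.0f}, O(n^2)={n ** 2}")

growth_table()

Running this shows, for example, that going from n=100 to n=1000 multiplies an O(n) cost by 10 but an O(n^2) cost by 100, which is exactly the scaling behaviour Big-O is meant to capture.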
Let's analyze selection sort; the algorithm can be found below:
def select(L):
    """Produce copy of L in sorted order

    >>> select([3, 1, 2])
    [1, 2, 3]
    """
    L1 = L[:]                       # work on a copy so L is untouched
    for i in range(len(L1)):
        m = min(L1[i:])             # smallest value in the unsorted tail
        j = L1.index(m, i)          # its position, searching from i onward
        L1[i], L1[j] = L1[j], L1[i] # swap it into place
    return L1
We have a list <L1> which the for loop iterates through; assuming our list length is n, the loop runs n times. The work inside the loop, however, is not constant: the min call scans the remaining slice L1[i:], and the index call can scan it again, so each iteration costs on the order of O(n) by itself, and the loop as a whole costs O(n^2). The line where we copy <L> into <L1> takes another O(n) steps, as we could see by imagining it written as its own for loop. This is to say the program takes O(n^2) + O(n) time, but the key to time complexity is that we only care about how the cost scales: we drop constant factors and lower-order terms, so the O(n) disappears and we end up with O(n^2) time for selection sort.
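As a rough sanity check on that O(n^2) claim, a sketch like the following (my own illustration; select_count is a made-up variant of select) tallies how many element comparisons the min calls perform. Doubling n should roughly quadruple the count if the quadratic analysis is right.

def select_count(L):
    # Same algorithm as select(), but also count the comparisons
    # performed by min() over the shrinking unsorted tail.
    L1 = L[:]
    comparisons = 0
    for i in range(len(L1)):
        comparisons += max(len(L1) - i - 1, 0)  # min() on a slice of length k does k-1 comparisons
        m = min(L1[i:])
        j = L1.index(m, i)
        L1[i], L1[j] = L1[j], L1[i]
    return L1, comparisons

for n in (100, 200, 400):
    _, c = select_count(list(range(n, 0, -1)))
    print(n, c)  # counts roughly quadruple each time n doubles

On a reverse-sorted input this prints 4950, 19900, and 79800 comparisons, each about four times the previous, which matches the O(n^2) prediction.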
With this being said, Timsort is the most interesting sorting algorithm brought up this week, seeing that it is a hybrid of several sorts and uses whichever one suits the situation, yielding the best overall results. For example, merge sort can actually be slower than insertion sort on really small lists despite its better Big-O, so Timsort runs insertion sort on short runs and merges the sorted runs afterwards, picking the better of the two in each case.
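Python's built-in sorted() and list.sort() actually use Timsort under the hood. Real Timsort is fairly involved, but the hybrid idea can be sketched with a toy version like this (my own simplification, not the real algorithm; the names hybrid, insertion, and the cutoff of 32 are all made up for illustration): insertion sort below a cutoff, merge sort above it.

def insertion(L):
    # Plain insertion sort: O(n^2) in the worst case, but very fast on tiny lists.
    L1 = L[:]
    for i in range(1, len(L1)):
        x, j = L1[i], i
        while j > 0 and L1[j - 1] > x:
            L1[j] = L1[j - 1]
            j -= 1
        L1[j] = x
    return L1

def hybrid(L, cutoff=32):
    # Toy hybrid in the spirit of Timsort: small pieces are insertion-sorted,
    # larger lists are split, sorted recursively, and merged back together.
    if len(L) <= cutoff:
        return insertion(L)
    mid = len(L) // 2
    left, right = hybrid(L[:mid], cutoff), hybrid(L[mid:], cutoff)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(hybrid([5, 2, 9, 1]))  # [1, 2, 5, 9]

The design choice is the same one the paragraph above describes: use the algorithm with the better constant factors when the input is small, and the algorithm with the better asymptotic growth when it is large.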