Tuesday, November 12, 2013

Since the university is closed on Monday and Tuesday, today's post will focus on last week's material.
Last week's lecture focused mostly on big-O analysis. One thing I have learned is that although big-O analysis gives a rough idea of an algorithm's performance, it cannot by itself determine how efficient the algorithm is in practice. For instance, selection sort and bubble sort are both O(n^2) in terms of the number of comparisons made across their passes. But because of the way bubble sort works, many more exchanges occur during each pass than in selection sort. Taking this into consideration, selection sort is the more efficient algorithm of the two, even though the big-O analysis alone does not show it.

Another thing I have learned is that for tasks such as searching through a list, the form of the input often does not affect the result of a big-O analysis, so the worst-case efficiency is frequently the same. For example, when doing a linear search through an unsorted list and through a sorted list, the worst case for the two input formats is identical: a full pass through the list may be needed, making n comparisons in total, so both searches are O(n). If the best cases are considered, however, the sorted input can be more efficient, since the search can stop as soon as the current item is larger than the target; if the target is smaller than the first item in the list, no pass is required at all. Thus, to evaluate the efficiency of an algorithm, we need to consider not only the worst case but also the best cases. To choose the ideal method, one should evaluate the form of the input not just at one instance in time but over an entire period of use; the average form of the input can then be used to make this decision.
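To convince myself of the bubble sort versus selection sort point, I put together a small sketch of my own (instrumented versions of the two sorts, not code from lecture) that counts the comparisons and exchanges each one makes on the same random list:

```python
import random

def bubble_sort_counts(lst):
    """Bubble sort a copy of lst; return (comparisons, exchanges)."""
    lst = lst[:]
    comparisons = exchanges = 0
    for end in range(len(lst) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if lst[i] > lst[i + 1]:
                lst[i], lst[i + 1] = lst[i + 1], lst[i]
                exchanges += 1
    return comparisons, exchanges

def selection_sort_counts(lst):
    """Selection sort a copy of lst; return (comparisons, exchanges)."""
    lst = lst[:]
    comparisons = exchanges = 0
    for start in range(len(lst) - 1):
        smallest = start
        for i in range(start + 1, len(lst)):
            comparisons += 1
            if lst[i] < lst[smallest]:
                smallest = i
        if smallest != start:
            lst[start], lst[smallest] = lst[smallest], lst[start]
            exchanges += 1
    return comparisons, exchanges

if __name__ == '__main__':
    data = [random.randint(0, 1000) for _ in range(200)]
    print('bubble:   ', bubble_sort_counts(data))     # same comparison count...
    print('selection:', selection_sort_counts(data))  # ...far fewer exchanges
```

On a random list both sorts make the same n(n-1)/2 comparisons, but bubble sort typically reports thousands of exchanges while selection sort makes at most n - 1.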
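Similarly, for the searching point, here is a minimal sketch (again my own illustration, not the lecture code) of a plain linear search next to a version that takes advantage of a sorted list by stopping early. Both are O(n) in the worst case, but the sorted version can finish after a single comparison in the best case:

```python
def linear_search(lst, target):
    """Return the index of target in lst, or -1. Works on any list; O(n) worst case."""
    for i, item in enumerate(lst):
        if item == target:
            return i
    return -1

def sorted_linear_search(lst, target):
    """Linear search over a sorted list: stop as soon as an item exceeds target.
    Still O(n) in the worst case, but the best case (target smaller than the
    first item) needs only one comparison."""
    for i, item in enumerate(lst):
        if item == target:
            return i
        if item > target:   # target cannot appear later in a sorted list
            return -1
    return -1

if __name__ == '__main__':
    unsorted = [7, 2, 9, 4, 1]
    ordered = sorted(unsorted)               # [1, 2, 4, 7, 9]
    print(linear_search(unsorted, 9))        # may need a full pass
    print(sorted_linear_search(ordered, 0))  # stops after the first comparison
```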
In terms of the assignment, I found the regular expression matching part rather challenging. Regular expressions bring a combinatorial explosion of possibilities as the length of the string grows. While working on the assignment, I also found that the regular expression describing a given string is by no means unique. Therefore, when matching a regular expression against a string, decomposing the target string itself can lead to ambiguities depending on how the string is split up. A better approach, I think, is to start from the regular expression itself and generate the possibilities it could potentially match; once that is done, the target string can simply be looked up in the generated list of possibilities. Although inefficient for long strings, this method always works, since it decodes the regular expression itself rather than guessing at how to decompose the target. With this approach I was able to match the basic regular expressions with their corresponding targets. However, due to time constraints, I was unable to finish the entire second part. If time permits in the future, I will try to finish the assignment and hopefully come up with a more efficient approach to this problem.
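To make the "generate the possibilities first" idea concrete, here is a minimal sketch of that strategy using my own toy representation: nested tuples rather than the assignment's actual regex classes, supporting only literal characters, concatenation ('.'), alternation ('|'), and a star ('*') that I cap at a few repetitions so the set of possibilities stays finite:

```python
def possibilities(regex, max_reps=3):
    """Return a set of strings the toy regex could match.

    The regex is a nested tuple of my own design, not the assignment's
    representation: a single character matches itself, ('.', r1, r2) is
    concatenation, ('|', r1, r2) is alternation, and ('*', r) repeats r
    between 0 and max_reps times (the cap keeps the set finite).
    """
    if isinstance(regex, str):                 # a leaf: one literal character
        return {regex}
    op = regex[0]
    if op == '.':                              # concatenation: every pairing
        return {a + b
                for a in possibilities(regex[1], max_reps)
                for b in possibilities(regex[2], max_reps)}
    if op == '|':                              # alternation: union of both sides
        return (possibilities(regex[1], max_reps) |
                possibilities(regex[2], max_reps))
    if op == '*':                              # star: 0 up to max_reps copies
        pieces = possibilities(regex[1], max_reps)
        result = {''}
        for _ in range(max_reps):
            result |= {a + b for a in result for b in pieces}
        return result
    raise ValueError('unknown regex form: {}'.format(regex))

def matches(regex, target, max_reps=3):
    """Generate the possibilities first, then look the target up among them."""
    return target in possibilities(regex, max_reps)

if __name__ == '__main__':
    r = ('.', ('*', '1'), ('|', '0', '2'))     # roughly 1*(0|2)
    print(matches(r, '110'))    # True
    print(matches(r, '12'))     # True
    print(matches(r, '111'))    # False
```

Because every candidate string up to the repetition cap is generated before the lookup, this blows up quickly for longer regular expressions, which matches my experience that the approach is correct but inefficient.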

My friend in CSC148 has talked about Big-Theta notation on his blog, which can supplement the information I have provided here. Here is the link to his blog: http://athameemcsc148.wordpress.com/

1 comment:

  1. You're definitely correct about Big-O not being sufficient to truly and accurately map the efficiency of an algorithm. This is because Big-O always refers to the worst case scenario for an algorithm.

    What's interesting to note though -- despite it not being a part of CSC148's syllabus -- is that there is another measure, Big-Theta, which identifies both an upper- AND lower-bound in terms of efficiency. Effectively, it gives us an efficiency range within which an algorithm is expected to behave. I'm sure as we keep delving deeper into Computer Science, we'll learn a whole lot more about this!
    http://athameemcsc148.wordpress.com
