Page 3: Is ‘Seperate’ Spelled Correctly?

Unit 5, Lab 1, Page 3

On this page, you will use a binary search algorithm to search efficiently in a sorted list.

Answering this question is simple in Snap! because you can just ask a dictionary: 100000-words-contains-seperate.png. But “simple” doesn’t mean fast. The contains block still has to look at each item in the list until it finds the one you are looking for (and says true) or reaches the end of the list (and says false). It is still a linear search.

When you are only looking for one thing in a list (for example, checking whether a particular word is spelled correctly) rather than finding all the words with some characteristic (for example, looking for all five-letter words), you can use the strategy from Page 1: Guess My Number to make your algorithm faster. The best strategy for that problem is a binary search algorithm: always guess the middle value and then adjust your guess based on whether it was too high or too low. That way, you eliminate half the possibilities with each guess. We can use a similar strategy to look for a word in a word list.

You could have written a simpler number guessing program: the computer could guess the number 1, then 2, then 3, and so on until it finds the number. That would be a linear search algorithm; it’s easier to code, but it takes longer to run because every time it makes a wrong guess, it eliminates only that guess. With binary search, every wrong guess eliminates half the possibilities at once.

: Binary Search

AAP-2.P.1, AAP-2.P.2

A binary search algorithm starts in the middle of a sorted list and repeatedly eliminates half the list until either the desired value is found or all elements have been eliminated.

AAP-2.O.1

You learned about traversing a list on Unit 2 Lab 2 Page 3: Checking Each Quiz Answer.

Linear search does a complete traversal of the list. Binary search saves time by doing a partial traversal of the list.

The one thing that almost everyone knows about computers is that they use binary: ones and zeros. Binary search has nothing to do with that. The word “binary” just means “two,” whether it’s two digits or two halves.

  1. Write Out Your Thoughts Compare this binary search block with your code from Page 1: Guess My Number. What parts are the same? What parts are different?

    binary search for (value) in (data) {
        script variables (low) (high) (current index) (current item)
        set (low) to (1)
        set (high) to (length of (data))
        repeat until ((low) > (high)) {
                    set (current index) to (average of (low) and (high)) #comment: find the middle word
                    set (current item) to (item (current index) of (data))
                    if ((current item) = (value)) { report (true) }
                    else {
                                if ((current item) > (value)) #comment: eliminate half the list {
                                            set (high) to ((current index) – (1))
                                } else { set (low) to ((current index) + (1)) }
                    }
        }
        report (false)
}

    The > block (as well as the < and = blocks) compares words alphabetically:

    (carrot) > (banana) reporting true

    (apple) > (banana) reporting false

  2. Check whether “seperate” is spelled correctly by using binary search to look for the word in the sorted list 100,000 words (sorted).

  3. Try binary search with some words that you know are spelled correctly and some that you know are incorrect.

  4. Now use binary search to search for the same words in the unsorted 100,000 words.

  5. Talk with Your Partner Why does binary searching work on one list but not the other? Consider how the binary search algorithm works.

The contains block can’t make any assumptions about the ordering in list you are searching, but if you are looking for one thing in a sorted, list you can use binary search.

Two things affect whether you can use a binary search algorithm to make your program more efficient:

  • What question you are trying to answer? Are you searching for one thing in a list or are you finding all the things in the list with some characteristic?

  • What is the condition of your data? Are they sorted or unsorted?

find one value find many values
sorted data can use binary search must use linear search
unsorted data must use linear search must use linear search

If you are working with short lists, it’s not so important which algorithm you use. It’s when you are dealing with a lot of data that you have to think carefully about the algorithm.

  1. Build a spell-checker.

    Use split () by (whitespace) to convert the input text into a list.

  2. Talk with Your PartnerShould your spell-checker look through the dictionary for each word of the text or look through the text for each word of the dictionary?