Searching and sorting analysis lab

20 Oct 2021 - 14:04 | Version 1 |

Learning Objective(s)

  1. Thoroughly understand sequential search, binary search, an n^2 sort, merge sort, heap sort, and binary search tree
  2. See the implications of Big O on real data, esp. on large amounts of data
  3. Learn to write a quality report analyzing amount of work.

Overview of the problem

You are to implement a number of search and sorts and then do an analysis of the work done and write a resulting report

Specifics

  • For this task, you are reading a file of integers (one per line) as the data being searched. You can assume that there will be no more than 1,000,000 numbers in the file.
  • A second file will continue another list of integers that you search to see if each number if in the first list.
  • The files are specified via command line arguments with the first argument being the data to be possibly sorted and searched. The second argument is the list of numbers to be searched for.
  • Searches
    • implement a sequential search on unordered data
    • implement a binary search on ordered data
    • implement a search on a binary search tree
  • Sorts
    • implement a n^2 sort of your choice (selection, insertion, bubble)
    • implement the merge sort
    • implement the heap sort
    • implement the add to the binary search tree. It is recommended that you use a linked implementation here
  • Tasks:
    • Load the data into an array.
    • For the sequential search on unordered data, report the number of compares it takes to look up each number and the total number of compares it takes to look up all numbers (using the second file)
    • For the binary search tree, report the number of compares and moves (or swaps) it takes to build the binary search tree. Then report the number of compares it takes to look up each number and the total number of compares it takes to look up all numbers (using the second file).
    • For each sort, sort the data from the first files and report the number of compares and moves (or swaps → be consistent when possible). Then report the number of compares it takes to look up each number and the total number of compares it takes to look up all numbers (using the second file).
  • Run the tasks on a decent number of cases including:
    • data that is almost in order
    • data that is almost in reverse order
    • data that is in random order
    • tiny amount of data in the first file (< 100)
    • small amount of data in the first file
    • medium amount of data in the first file
    • large amount of data in the first file
    • a few numbers in the second file
    • a medium number of numbers in the second file
    • a large number of numbers in the second file
  • Number of classes
    • the file(s) that contain the code to sort and search can do no I/O. Therefore you will need at least two different classes.
    • it is recommended but not required that you consider a separate class for each sort. If you choose not to, make sure you employ quality loose coupling!

Design components

  • Indicate where you are counting the compares and moves
  • Be sure to provide ways to get back the number of compares and the number of moves
  • Spend effort on your testing data
  • For this implementation, your tree can know that it is storing integers if you wish but you may also keep it independent.

Report

  • Run enough examples of each type to get a good set of data points. (3 or 4 data points per type is not enough!)
  • Plot them on a graph (with the horizontal axis being the size of the data and the vertical being the number of comparisons and/or swaps)
  • Compare the generated results to the expected Big O results
  • Discuss the following:
    • How each algorithm compared to its Big O
    • When sorting first then searching is better than just searching
    • A comparisons of the sorts with each other
  • It is recommended that you use graphs in a spreadsheet to visualize the data

Optional improvements

  • Implement a sort not covered in class
This site is powered by FoswikiCopyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding Foswiki? Send feedback
This website is using cookies. More info. That's Fine