Name______________________________

Data Structures and Algorithms I - Lab 8

This lab covers some of the important ideas related to Searching. You should print a copy of this lab so you can record your answers to questions. Turn in the completed questions and any supporting printouts as required in the lab. Here is a link to the API documentation in case you need it.

P1: Open the project named SearchTest in the provided archive (which expands to a folder named Lab12). Open the class named SearchTest. The main method contains a loop to create an array of random integers of user selected length. The array will be used to test various searching algorithms. There is also a timer class that will be used to time the searching process.

Run the program and enter 100 as the array size. Notice the elapsed time output. The values seem to jump by large amounts. This is because the time information is extracted from one of the operating system's clocks, but not all system clocks updated in real time. The smallest amount of time that can be measured by a clock is called its resolution. Based on your observation, what is the resolution of the clock accessed by the System.currentTimeMillis() method?

Answer: It appears to be accurate to about _____________________________ of a second. What calculation did you perform to get this answer?

 

 

In Java 5, another method was added to the System class. It is nanoTime(). This returns a time value in nanoseconds. Modify the TimerFacility class to use this method. You should divide all time values by 1,000,000 to convert the numbers back to milliseconds. Integer division is fine. Verify that you get similar answers, but notice whether or not the resolution of the timer increases.

Does the timer resolution increase? ____________ What evidence leads you to make this conclusion? Answer:

 

We want to measure the time needed to search for numbers in the array. Even a sequential (linear) search on a very large array will take a very small amount of time. Using a timer to do this is not exactly reliable, but that is the method we will employ. Remember that while your program is running, many other things are using up additional processor resources. Is an instant messenger program running? Is there music streaming through your speaker? Is the operating system performing disk management or downloading important updates? Is the mouse moving, requiring repaints of the screen? Because of these tasks that may interfere with accurate timing of the search algorithm, we will have to measure the search time for many searches, and then take the average.

Complete the sequentialSearch method to find the target integer. Feplace the for loop (in main) to iterate TRIALS (this is a constant in the class) times.

for (TRIALS times){
  re = random element from array
  loc = sequential search for re
  display re, the returned location, 
      and the element at that location in the array
}

Use the ArrayUtil's getElement() method to pick a random element. Use sequentialSearch to find the location of the element. Display the element sought, the location returned by the search, and to be sure your search works, display the element at that location.

Test your program on a small array (5 values) and verify that the results are correct (that you are actually finding items at the correct location).

Comment out the statement that displays the array contents (before the loop) and comment out the output you just added in the body of the for loop. Change TRIALS to 10,000. Your program will now perform 10,000 searches and report the total elapsed time. Test your program on an array of various sizes, 5000, 10000, 50000, 100000, 500000, 1000000. Record the average time (in milliseconds) to search for one value (divide total elapsed time by TRIALS) in the first column. Be sure no output is generated during the searching. Note: depending on your processor speed, you may need to change TRIALS to a larger or smaller number. Feel free to do this. If some of the lists are too large, indicate this in the table.

Sequential Search Timing Data (all data in milliseconds)
List Size Raw Average Time per Search Loop Overhead per Search Adjusted Average Time per Search
5000      
10000      
50000      
100000      
500000      
1000000      

Now, comment out the actual call to the search (in the body of the loop) and rerun the program using the same sizes. You will get new timing data (which indicates the overhead of the looping required to perform the multiple trials). Record these in the second column. You should subtract these from the total times using the search to get accurate times for the search itself.

The time to search using the sequential search algorithm is proportional to the size of the list. At least, that is what the theoretical analysis tells us. Does you data support the theoretical expectation that sequential search time is proportional to the size of the list being searched? Write a linear equation that expresses the time (in milliseconds) to the size of the list (n)? i.e. t = mn + b

 

 

 

P2: In a list of size 100,000, how long does it take (on the average) to locate the minimum value in the array? (Note that there is a method in ArrayUtil to tell you the minimum entry in the array). What about the maximum?

Answer: Using sequential search in a list of 100,000 integers,

it takes __________ milliseconds to find the minimum value and _________ milliseconds to find the maximum value.

Add a statement to main that will sort the array of data before you start the search trial. java.util.Arrays.sort will do the job nicely. Be sure to display the array and search results for a small example to be sure the answers are correct. Then answer the question about finding the max and min value again.

Answer: Using sequential search in a sorted list of 100,000 integers,

it takes __________ milliseconds to find the minimum value and _________ milliseconds to find the maximum value.

Explain the differences in times between this experiment and the previous one.:

 

 

 

P3: Included in the program is a method named binarySearch that can be used to find elements in a sorted array. It is generally more efficient than sequential search. Modify the call in main to use the new search method. Subject it to the same sequence of tests as above. Remember that the array must be sorted for binary search to work. Be sure to test the algorthm on a simple case and display the results for verification before going on. Fill in the data as before.

Binary Search Timing Data (all data in milliseconds)
List Size log(size) Raw Average Time per Search Loop Overhead per Search Adjusted Average Time per Search
5000        
10000        
50000        
100000        
500000        
1000000        

Theoretically, the running time for this algorithm should be t = m log(n) + c (for some constants m and c). Fill in the log (n) column (use common logarithms) and see if you can write a linear equation connecting t to log(n). What is your equation? i.e. t = m x + b (where x is the log(list size))

 

 

Redo the tests for locating the max and min values:

Answer: Using binary search in a list of 100,000 integers,

it takes __________ milliseconds to find the minimum value and _________ milliseconds to find the maximum value.

Comment on these results:

P4: Modify the call for the search to use the Arrays.binarySearch method. Rerun the timing tests and compare the speeds of your implementation to those of the algorithm provided in the Arrays class. What do you find?

 

Print the file you modified. Attach it to the lab answers. Be sure your file contains your name and the current date. Submit your answers and program solutions in a single stapled package.