This assignment involves the writing of two files — when submitting your work, please make sure that you have some reference to the part number in your file names (like part1.cpp or timingStudy.1.cpp for example).
Input for Both Parts
An input file of 1,000,000 randomly generated names (one per line) is posted in this module. There are a few different versions of the file and it will serve as the input to both parts of the assignment. Choose the one that best suits you:
Windows users: names_windows.txtPreview the document
Mac or Linux users: names.txtPreview the document (this has the control-M’s (extra carriage return) removed)
If you use the Cloud9 IDE, there is an upload limit which may prevent you from uploading large files. Use this compressed version, which you will need to unzip after uploading: names.zip
Part 1, Perform A Simple Timing Study
In this lab you will apply the “4-cycle timing code” posted to Canvas in this module here: timingtests.cpp.Preview the document
The purpose is to prepare you for making “big oh” determinations and confirming them with timing studies.
How to Use timingtests.cpp
Th code in this file reads a predetermined number of lines from the data file into a vector. This can be changed by changing the constant NLINES at the top of the file. You probably won’t need to change NLINES but keep in mind that the data file itself contains only one million lines so NLINES cannot be larger than that.
For each part of this lab, use the file like this:
Add your code to the timed section. The clock() calls should come before and after your code, to record the timing.
Set the string variable ‘bigOh’ to your guess as to the “big O” of your code (see the file for the options).
Set the variable ‘n’ to the input size. Start with the default size, then increase if you see 0’s or numbers that look strange.
Run your program.
The output should show growth of the time according to what you guessed:
The first entry is a baseline.
The program will double the input every time through the loop.
Each subsequent line is the timing of your code with the doubling of the input. For example, if you selected “n” and the first output timing was 2 seconds, the next one would be 4, then 8, 16, etc.
Re-adjust ‘n’ and/or ‘bigOh’ as needed and run again, if the numbers do not match your expectations.
Your Code for Part 1
Write a C++ console app, applying the timing test code to an operation that requires scanning the entire vector of names read from the file.
You’re looking for some operation that’s close to O(n) here. Here are a few examples of loops that would work:
Finds the shortest name in the array
Finds the longest name in the array
Counts how many times a particular first name occurs, like “Victoria”
To perform the timing tests, you will set the starting size with the variable ‘n’ at the top of main(), which controls how many lines your algorithm will use for input. Also set ‘bigOh’, which represents your “guess” about your algorithm.
Start with n = 50,000 for the first cycle, and set bigOh to O(n) — that is, we expect that the time it takes to read the file is directly proportional to the number of lines read from the file.
It is possible that caching may throw off the timing for the first cycle, so run your program more than once to see it work correctly. Your timings will not be an exact match for the ‘expected’ amounts each time, but should be close.
0.0002 (expected O(n)) for n=50000
0.0004 (expected 0.0004) for n=100000
0.0008 (expected 0.0007) for n=200000
0.0015 (expected 0.0015) for n=400000
As n increases each time through the loop, you should see the timing increase proportionally, i.e. doubling the input should double the processing time.
If you’re seeing numbers that don’t make sense, your computer may be too fast! Try increasing NLINES and the initial ‘n’ to a larger number. Remember that ‘n’ is doubled every time through the loop, so make sure your last doubling doesn’t exceed the available data (1M lines).
Part 2, Benchmark a Nested Loop Sort
In the second part you will make your own determination for the big oh of nested for-loop sorting of an array, and confirm your determination.
The nested for()-loop sort should use to this algorithm (with n as the number of strings):
for (int i = 0; i < n; i++)
for (int j = i + 1; j < n; j++)
if (a[j] < a[i])
Your Code for Part 2
Write a C++ console app, applying the same timing test code from the module, to sort the names in your vector in ascending order. You decide what you expect the “big oh” to be, and what n to use for the first cycle.
Write your app to do the following:
Start the timer, perform the nested for()-loop sort, and stop the timer.
Write a few lines of test code to verify that each string in the array is greater or equal to the string before it, starting with the 2nd string. Use assert, like this:
for (int i = 1; i < n; i++)
assert (a[i – 1] <= a[i]);
To use assert() you may need to include the header file . These “assertions” are used to abort the program if some unknown condition occurs, and are mainly enabled in debug/developer builds of the code, where the programmer wants to catch abnormal conditions as soon as possible. In “production” code (deployed into the real world), assertions are usually removed.
Output the results, which will look similar to part 1 but with different numbers.
Submit 2 Files
Submit the two .cpp files (one for each part). Do not submit the TXT input file — it should be unchanged.