An important problem in computing is optimising how we search for information.
- Understand different methods of searching arrays
- Implement different methods of searching arrays
The information we're interested in finding has to be stored somehow. Finding optimal ways of somehow is where different types of data structures come in - we'll look at different structures later in the course. However, in essence, the idea is to create useful data structures for storing information and to develop efficient algorithms for processing that information - often, one leads to the other and visa-versa.
Here, the process we're interested in is searching, and we can consider that by using one of the simpler data structures (one which we've considered often, so far) - arrays.
In computer science, the obvious way to store a collection of items is in an array, whereby elements of the array are typically stored in a continuous sequence of computer memory locations.
As you know, a list is represented in javascript thus:
const xs = [2187, 27, 19683, 9, 243, 3, 729, 81, 6561];
The array xs
has 9 elements. In javascript, we find the size of an array via the array prototype property length, so here, xs.length
is 9. In everyday life, we usually start counting from 1; unfortunately, with arrays in computer science, we more often than not start counting from 0, and that is true for javascript. Hence, for our array xs
, its positions are 0, 1, 2, . . . , 7, 8. The element in the 8th (last) position is 6561, and although we could use the notation xs[8]
to denote this element, it is usually more useful to use the array's length, e.g. xs[xs.length - 1]
(since we start counting from 0, not 1). More generally, for any integer i denoting a position, we write xs[i]
to denote the element in the ith position. This position i is called an index, and we say that the individual items xs[i]
in the array xs
are accessed using their index i.
We can access our array elements by jumping straight to a location via its index, e.g. xs[0]
(2187), xs[1]
(27), xs[2]
(19683) etc., but since algorithms that process data stored in arrays will typically need to visit all the items in the array and apply appropriate operations on those items, it's often more useful to move sequentially through the array by incrementing or decrementing that index automatically.
In pseudo code, this would typically take the general form:
for i = 0, 1, ..., n-1
do something
In javascript (and many other languages), we have this basic form (among others):
for( initialisation ; condition ; update ) {
//do something
}
e.g.
let sum = 0;
for( let i = 0; i < xs.length ; i++ ) {
sum += xs[i];
}
(you can also do more interesting initialisations, e.g.)
for (let [i, sum] = [0, 0]; i < xs.length; i++) {
sum += xs[i];
console.log(i, sum);
}
It's this basic form of looping through an array that we're going to use to explore searching, next.
Consider again our array:
const xs = [2187, 27, 19683, 9, 243, 3, 729, 81, 6561];
If we ask where the item 27 is in the array xs
, the answer is 1 because that's the index of that item. If we ask where the item 60 is in the array, the answer is nowhere (-1, null, undefined, etc.), because the item 60 is not present in xs
.
You are now asked to write an algorithm that finds whether the number x is in the array xs
, and if so, return its index.
We can use pseudo code to express a simple algorithm for searching through an array:
for i = 0, 1, ..., n-1
if xs[i] is equal to x
return i
If we reach this point then x is not in xs
return -1
What's the time complexity of the algorithm above?
While the simple linear seach satisfies the requirements of finding x in xs, might there be better ways of reaching the same result? Indeed, you should always consider whether it is possible to improve upon the performance of a particular algorithm. In this case, a more efficient method is a binary search.
A binary search relies on an ordered list, so first, we need to sort xs
:
const xs = [3, 9, 27, 81, 243, 729, 2187, 6561, 19683];
Below describes the binary search algorithm:
Set left = 0 and right = n-1 (where n is the length of the list, xs)
While left is less than right
set mid to be the integer part of (left+right)/2
if x is greater than xs[mid]
set left to mid+1,
otherwise
set right to mid
If xs[left] is equal to x
return left,
otherwise
return -1
A binary search has time complexity of O(log n). To realise the implication of this, consider an array of 1 million items. Now, log21000000 is approximately 20. Hence, for an array of size 1000000, only 20 iterations are needed in the worst case of the binary-search algorithm, whereas 1000000 are needed in the worst case of the linear-search algorithm.
- Fork this repository and clone the fork to your machine
- Run
npm ci
to install project dependencies - Implement each of the empty search functions inside the src directory
- Run
npm run test
to test your code - Add timing results for each of your search algorithms into the file searchResults.md (tip:
performance.now()
might be useful here, see: https://developer.mozilla.org/en-US/docs/Web/API/Performance/now). Which ran fastest? And slowest? Given your understanding of Big O complexity of the different search algorithms, were those results what you expected? Add that summary to the file, too