Bootcamp
Search…
A.3: Complexity Analysis, Big-O Notation

# Learning Objectives

By the end of this lesson, you should:
• be familiar with Big-O
• understand how space and time complexity is calculated
• be familiar with the common complexity functions and their ranking order

# Introduction

After you finish writing some code, how can you describe how fast it is? How can you describe how much memory your code takes up?
In this section we'll begin to describe the way that computer science defines the efficiency of a given set of code.
We'll begin with this simple example and describe its properties.
// a function that takes an array as a parameter
// and returns true if the array contains the integer 2
function arrayHasTwo(arr){
// set a default value of not found
var found = false;
for(var i=0; i<arr.length; i++){
if( arr[i] === 2 ){
// if 2 is found, set to true
found = true;
}
}
// return result
return found;
}

# Time Complexity

When we run this code it could take differing amounts of time.
var list = [1,2,3,3];
// the function iterates 4 times in the loop
var result = arrayHasTwo(list);
var list2 = [1,2,3,4,5,6,7,8,9,3];
// the function iterates 10 times in the loop
var result2 = arrayHasTwo(list2);
Our function runs slower if the input is bigger. What we are working towards is a generic way to describe how long something takes to run. Mathematically we can express this as a (mathematical) function of the length of the parameter array. The longer the array is, the longer it will take the code to run.
We refer to the size of the input array as n.
Big-O - linear complexity.
If we plotted the growth of time and of data in n on a graph it would look linear, i.e., a straight line. For each additional item in the input array, the "speed" of the function increases in step. The behaviour of time as n grows is the first property of complexity analysis. It assumes an input of varying length, and we can analyze what happens when that input grows or shrinks.

### What are We Measuring?

The Computer Science analysis of this speed / complexity makes no assumptions about the properties of this data, so it always assumes that the input data to the function can be zero to infinity length, as opposed to real-world data, which is very dependant on the specific case. The code that looks for a letter in the alphabet needs to be very different from the code that looks for a name amongst millions of users. Also note the lack of graph axis units in the figure above. The kind of measurement we're doing is called asymptotic analysis. It also assumes a theoretical "unit" of computation where the absolute time in seconds or hours doesn't matter. When we use Big-O to analyze code a naive approach would be to count each line of code and count each loop iteration or if a function is called that has a loop in it. Sometimes we'll count built-in functions such as `Array.unshift` because we'll know that the built-in function already has some inherent run-time complexity. (We'll note each of these special cases when they come up.)

# Big-O

So generally speaking, we measure the speed of a function or given section of code by counting the:
• number of lines of code
• iterations in a loop
• also count functions that we know contain loops
In Computer Science, the definition of speed is more specific than that. There are several specific measures of speed, including Big Theta and Big Omega, but we'll only be talking about Big-O. All of these notations define a mathematical expression for the speed behaviour of a function. We won't really be concerned with the mathematical definition of these, since for non-theoretical programs, Big-O is the most useful.

## Worst Case

When talking about Big-O we always talk about the worst-case performance of the function. That is, we don't consider what happens when the input data happens to be formatted in our favor- e.g., what happens if 2 is randomly at the beginning of some of our arrays? Big-O isn't concerned with these probabilities.
If we edit our function above we can see how it becomes more efficient, but that the Big-O remians the same. We can get rid of line 5, and just return true if and when two is found.
// a function that takes an array as a parameter
// and returns true if the array contains the integer 2
function arrayHasTwo(arr){
// set a default value of not found
// var found = false;
for(var i=0; i<arr.length; i++){
if( arr[i] === 2 ){
// if 2 is found, set to true
return true;
}
}
// return result
return false;
}
Then we can run our function:
var list = [1,2,4,4];
// the function iterates 4 times in the loop
var result = arrayHasTwo(list);
var list2 = [1,2,3,4,5,6,7,8,0,3];
// the function iterates 10 times in the loop
var result2 = arrayHasTwo(list2);
With the early return statement we've actually saved 2 iterations for `list` and 8 for `list2`! But Computer Science doesn't care- we haven't defined anywhere in our `arrayHasTwo` function or anywhere else about where the 2 might fall inside the array. The worst case performance of this function is still O(n).
Big-O doesn't care about any specific set of data, it also doesn't care about if the data is sorted.
Note that when calculating Big-O, if the problem statement itself defines the input array as always being sorted then this would be taken into account. If, given the above code it was stated that the input array is always sorted and starts with a value of zero, then the Big-O of the above code would be O(3) (and thus O(1))- the worst case performance is that the loop runs 3 times from zero to 2.

## Coefficients

When expressing Big-O notation, we only consider the variable with the largest order of complexity, and remove any summed variables of lower complexity. When evaluating the variable of largest order of complexity, we also remove any coefficients attached to that variable, since those coefficients make minimal difference in the complexity when we consider large numbers where the complexity matters. For example, if I determined that my algorithm ran in complexity of 2n² + c time, I would remove the c because it is of lower complexity than , and I would remove the 2 in 2n² because it is a coefficient. I would then be left with O(n²).

### What is c?

When calculating the Big-O of a function we would consider whether the iterations of a loop are constant or if they increase with the size of n. In this example where n is the size of the larger input array, an increase in the size of the input arrays do not change the fact that there are 2 loops in the function. Changing the size of n only increases the number of times the loops run- not the number of loops running.
// a function that takes an array as a parameter
// and returns true if the array contains the integer 2
function arraysHaveTwo(arr1, arr2){
// set a default value of not found
var found1 = false;
for(let i=0; i<arr.length; i++){
if( arr[i] === 2 ){
// if 2 is found, set to true
found1 = true;
}
}
var found2 = false;
for(let i=0; i<arr.length; i++){
if( arr[i] === 2 ){
// if 2 is found, set to true
found2 = true;
}
}
return found1 && found2;
}
Run the code:
var list = [1,2,4,4];
var list2 = [1,2,3,4,5,6,7,8,0,3];
var result = arraysHaveTwo(list1,list2);
Nor does n necessarily increase if the loops are nested. In this example the loop will only ever run twice. So c is 2, and for Big-O we would reduce O(2n) to O(n).
// a function that takes an array as a parameter
// and returns true if the array contains the integer 2
function arrayHasZeroOrOne(arr){
var found = false;
// run the array search twice
// once for zero, once for 1
for(let i=0; i<2; i++){
// set the value we want to find
var find = i;
// look through the array
for(let i=0; i<arr.length; i++){
if( arr[i] === find ){
// if 2 is found, set to true
found = true;
}
}
}
return found;
}
Run the code:
var list = [1,2,3,4,5,6,7,8,0,3];
var result = arraysHaveTwo(list);
Vocabulary
We'll be describing code (Python and JavaScript) functions and the mathematical definition of a function interchangeably when discussing Big-O. For our purposes here they behave equivalently.
Although technically incorrect, we'll talk about time, speed and complexity interchangeably, because, in the end, Big-O is telling us about the amount of time our code runs in. We don't need to worry about the theoretical distinction between these three words.
The "O" in Big-O apparently stands for "Order".

## Space Complexity

The other thing we can measure with Big-O is the amount of space, or memory a function takes up.
// a function that takes an array as a parameter
// and returns true if the array contains the integer 2
function buildArray(count){
// set a default value of not found
var result = [];
for(var i=0; i<count; i++){
result.push(Math.random());
}
return result
}
We can say that this function has O(n) complexity and also takes up O(n) space. When this function runs the result variable will take up n space. (Note that it doesn't matter whether or not we return anything at the end, just that a growing array existed while the function was running.)

### Reference vs. Value Space Complexity

Manipulating an array is also not the same as "taking up space". In the following code, although we create new variables for each loop iteration, we are not allocating any new space with this function. Thus it has a space complexity of O(1).
function shuffleArray(array){
for(var i=0; i<array.length; i++){
// save the first element value
var temp = array;
var tempIndex = Math.floor( Math.random() * array.length );
// switch a random location into the first element
array = array[tempIndex];
// replace the value in that random location
array[tempIndex] = temp;
}
// no need to return bc we did not allocate a
// new array- we manipulated this array by value.
//return array;
}
When we call this function we don't need to capture the return value- we manipulated the array by reference- running the function hasn't taken any new space.
var list = [1,2,3,4,5,6,7,8,0,3];
console.log(list); //this is the original version of the array.
shuffleArray(list);
console.log(list); // this will be the shuffled array.

# Big-O Order of Complexity

From the following chart by Geeks for Geeks, we can visualise the differences in complexity for various Big-O values from O(1) to O(n!). This shows us that there is indeed a large difference between each level of Big-O complexity.

# What is a "good" algorithm?

As defined above, the Big-O of a given function may seem like it's unviable or a "wrong" solution. However, many algorithms in the real-world run in "bad" or "worst" time complexity. The nature of the problem itself dictates the complexity of the solution, and there is no correct or incorrect solution to a problem.
One of the most classically intractable Computer Science algorithm problems, the Travelling Salesman Problem is a kind of problem that has to be solved constantly in real life. There is no perfect work-around for the fact that it runs in a minimum of exponential time.

# Big-O Common Functions

## Constant Time: `O(1)`

### Description

Also known as constant time, O(1) time complexity means that regardless of how big the input of an algorithm is, the algorithm will still be able to complete in a fixed number of iterations.
O(1) space complexity means regardless of input size, the algorithm can complete using a fixed amount of additional space, e.g. a fixed number of variables or data structure size excluding the input data.

### Example: Find the median element in a sorted array

Given a sorted input array of length n, find the median element of the array. Median refers to the element that is ranked n/2 in the sorting order. One might be tempted to do a for loop over the array and return when the index is _n/2, but this would be an O(n_) algorithm because the number of iterations is linear based on n. Instead, since the array is sorted, we can access the middle element directly with `arr[len(arr)/2]`. This operation does not depend on the size of n, since the array length is readily available and accessing elements in an array by index is a single operation. This algorithm that immediately retrieves the middle element by index runs in O(1) time.
The above algorithm also runs in O(1) space complexity because it does not require additional variables other than the input array.

## Log Time: `O(logn)`

### Description

Also known as logarithmic or logn ("log n") time, O(logn) complexity means that the algorithm will need to perform a number of operations proportional to the log of the size of the input. Without getting into too much math, log(n) represents the exponent necessary to raise a constant number (typically 2 or 10, the exact number doesn't matter in Big-O) to the value of n. For example, assuming a log of base 2, log(2) would be 1, and _log(1024) would be 10. Notice how even though the value of `n` increased by 1000+ between the 2 examples, the value of log(n) only increased by roughly 10. Thus log(n_) is significantly less than n at large numbers.

### Example: Find the element with value `x` in a sorted array

Given a sorted input array of length n, find the position of an element with value x. A naïve solution would be to iterate over every element in the array until we find x, but this would run in O(n) or linear time because we would need to search through n elements at most, in the event that x were at the end of the array. A more optimal solution would be to perform binary search over the array, which runs in O(log_n**) **time. Binary search works by splitting a sorted array into halves recursively until it finds the relevant element. This would require us to compare at most on the order of log(n) elements. For example, if my input array were numbers from 1 to 1000, e.g. [1, 2, ..., 1000], to perform binary search on this array, I would first look for the middle element at index _n/2, and compare it with x. If `x == arr[n/2]`, return `n/2` as the index of x. If `x > arr[n/2]`, repeat the same process on the right half of arr. If `x < arr[n/2]`, repeat the same process on the left half of `arr`. Read more about binary search here.
The above algorithm runs in O(1) space because it does not require additional variables beyond the input elements.

## Linear Time: `O(n)`

### Description

Known as linear time, O(n) complexity means that the algorithm needs to inspect at most each element of the input a constant number of times.

### Example: Find the largest number in an unsorted array

Given an unsorted input array of numbers of length n, find the position of the largest number. We cannot perform binary search on this array because it is unsorted and we do not know the value of the largest number. If the array were sorted, we could extract the last element of the array which would be the largest number in O(1) time. The fastest sorting algorithms take O(nlogn) time complexity. Thus the best method we have is to iterate over each element of the array linearly (i.e. a for loop) until we reach the end of the array and confirm the largest number, which would take O(n**) **time complexity.
The above algorithm runs in O(1) space because it does not require additional variables beyond the input elements.

## Loglinear Time: `O(nlogn)`

### Description

O(nlogn) complexity typically involves performing a binary-search-like operation for every element in the input.

### Example: Sort an unsorted array of numbers

It's rare to invent a sorting algorithm from scratch. Computer scientists have been working on efficient sorting algorithms since the start of the field, and so far they have concluded that Merge Sort is the fastest sorting algorithm (in the worst case) that exists today, and it runs in O(nlogn) time. Thus whenever we are given a problem to sort an unordered list, our first instinct should be to think that this could run in O(nlogn) time.
Companies rarely ask candidates to implement sorting algorithms such as merge sort in interviews because it's the kind of answer that can be memorised. It doesn't hurt to memorise how common sorting algorithms such as merge sort work, but there's no need to spend too much time on them other than to understand how they work.
The above Merge Sort algorithm runs in O(n) space because at any given time it requires temporary storage space on the order of n elements in addition to the input data.

## Quadratic Time: `O(n²)`

### Description

O(n²) complexity typically involves nested loops over an input collection.

### Example: Find all pairs of numbers in an array

Given an array of numbers, find all unordered pairs of those numbers. This problem requires us to loop through each element of the array, and for each element of the array, loop through each other element of the array. This is a classic O(n²) -complexity algorithm.
The above algorithm requires O(n) space during calculation, but its output will require space proportional to the size of the output, roughly nC2 or "n choose 2".

## Exponential Time: `O(2ⁿ)`

### Description

Also known as exponential complexity, `O(2ⁿ)` complexity (and any `O(Cⁿ)` complexity where C is a constant) is the highest order of complexity (without combining it with itself or others above) and typically bad. If you find that your algorithm runs in exponential complexity in a DS&A interview, there is probably a more efficient algorithm to solve the problem.

### Example: Calculate the n-th Fibonacci Number

The Fibonacci sequence is defined as Fib(n) = Fib(n-1) + Fib(n-2), where** Fib(0) == 1** and Fib(1) == 1. The naïve way to solve Fibonacci is to write a recursive algorithm that calculates Fib(n) by calling Fib(n-1) and Fib(n-2). However, this results in 2 operations per Fib operation, and because each Fib operation runs recursively, we end up with 2ⁿ operations because it would take n levels of Fib operations to reach the base cases of Fib(0) and Fib(1). Interestingly, this problem can be solved in O(_n_**) **time complexity. See here for solution.
The O(2ⁿ) time complexity algorithm above also requires O(2ⁿ) space complexity because at any given time, there will be up to O(2ⁿ) simultaneous function calls, thus O(2ⁿ) intermediate Fib calculations in memory. However, the optimal O(n) time complexity solution only requires O(1) space complexity because it only uses a finite number of variables to calculate Fib(n), regardless of _n_.

## Factorial Time: `O(n!)`

A naive solution to finding all permutations of an array:
permutations([1, 2, 3])
where the solution would be:
[1, 2, 3]
[1, 3, 2]
[2, 1, 3]
[2, 3, 1]
[3, 1, 2]
[3, 2, 1]
is an example of a factorial complexity algorithm.
The Travelling Salesman algorithm referenced above is also an example of a problem that can be written naively as n!.

Learning Objectives
Introduction
Time Complexity
Big-O
Worst Case
Coefficients
Space Complexity
Big-O Order of Complexity
What is a "good" algorithm?
Big-O Common Functions
Constant Time: O(1)
Log Time: O(logn)
Linear Time: O(n)
Loglinear Time: O(nlogn)