
3.1 Complexity growth and algorithmic cost

Develop a rigorous cost model, simplify asymptotic expressions correctly, and use concrete code to read growth rates instead of guessing them.

Course contents

CSCI2520: Data structures

Structured notes for CSCI2520 data-structure foundations with operation-level reasoning and selective interactive demonstrations.

Chapter 0: Programming foundations
Chapter 2: Lists and recursion
Chapter 4: Trees and BSTs

Why complexity comes before optimization

The lecture slides are right to begin with growth, not with micro-optimizations. When input size n becomes large, the way cost scales usually matters more than the exact machine that happens to run the code.

Big-O analysis is therefore a model of growth, not a stopwatch reading. Different machines will change constant factors, but they do not change whether the underlying pattern is linear, quadratic, or logarithmic.

Definition

Asymptotic upper bound (Big-O)

T(n) = O(f(n)) means there exist constants C > 0 and n0 such that T(n) <= C f(n) for every n >= n0.

That definition is a proof obligation. To use Big-O correctly, you need to show the bound with a valid threshold, not just say “it looks quadratic”.

The simplification rules are not shortcuts; they are the point

The tutorial slides repeatedly emphasize two rules:

  • drop terms whose contribution becomes insignificant for large n,
  • drop constant factors.

That is why expressions such as

0.1n^3 + 10000n^2 + n + 3

still simplify to O(n^3). The cubic term eventually dominates everything else. Likewise,

(n^2 + n)/2

still has growth rate O(n^2) because the factor 1/2 does not change the asymptotic shape.

Worked example

Why n^2 dominates n

Compare the ratio

n^2 / n = n.

The ratio grows without bound, so n^2 eventually becomes much larger than n. That is why linear terms can be dropped when a quadratic term is present.

Common growth classes in CSCI2520

The course slides and tutorials repeatedly use the same five classes.

  • O(1): direct field access, known-pointer re-linking, simple arithmetic.
  • O(log n): halving-style search, balanced-tree lookup, repeated division by 2.
  • O(n): full scan over an array or linked list.
  • O(n log n): divide-and-conquer with linear work per level.
  • O(n^2): nested scans, especially comparison-heavy double loops.

Read and try

Compare asymptotic growth at one n

The widget ties each growth class to a concrete code shape, so readers can change n and see how the chosen sample behaves against the comparison table.


Code sample

for (int width = 1; width < n; width *= 2) {
  for (int i = 0; i < n; i += 2 * width) {
    merge_block(i, width);
  }
}

Growth class: O(n log n)

Estimated primitive steps at n = 16: 64.00

Interpretation: There are about log n rounds, and each round still touches a linear amount of data.

Class         Value at n = 16
O(1)          1.00
O(log n)      4.00
O(n)          16.00
O(n log n)    64.00
O(n^2)        256.00

The comparator widget is a visual aid, not a substitute for derivation. Use it to confirm intuition after you have already written down the growth model.

Reading complexity from code

The fastest way to analyze a function is usually to find the piece executed most often and count how many times that piece runs.

Worked example

Constant-time statements

int x = a + b;
int y = x * 2;
return y;

Each statement executes once, so the function is O(1). The exact CPU cost is not important; the growth does not depend on n.

Worked example

Linear scan

int sum(const int a[], int n) {
   int total = 0;
   for (int i = 0; i < n; i++) {
      total += a[i];
   }
   return total;
}

The loop body runs n times, and each body step is constant time, so the whole function is O(n).

Worked example

Nested loops create quadratic cost

int countPairs(int n) {
   int c = 0;
   for (int i = 0; i < n; i++) {
      for (int j = 0; j < n; j++) {
         c++;
      }
   }
   return c;
}

The inner statement runs n * n times. That is why the function is O(n^2).

A concrete sorting example: selection sort

The tutorial deck uses selection sort to show where quadratic growth comes from. The algorithm scans the remaining suffix to find the next minimum and then swaps it into place.

void selectionSort(int a[], int n) {
   for (int i = 0; i < n - 1; i++) {
      int min = i;
      for (int j = i + 1; j < n; j++) {
         if (a[j] < a[min]) {
            min = j;
         }
      }
      if (min != i) {
         int tmp = a[i];
         a[i] = a[min];
         a[min] = tmp;
      }
   }
}

The total number of comparisons is

(n-1) + (n-2) + ... + 1 = n(n-1)/2.

That sum is quadratic, so the algorithm is O(n^2). The slide deck’s message is the same in words: doubling n makes the time roughly four times larger.

Worked example

Selection-sort growth

If n doubles, n^2 becomes 4n^2. If n increases by a factor of 10, n^2 increases by a factor of 100.

That is the reason quadratic algorithms quickly become expensive on large inputs.

Comparison sorts versus non-comparison sorts

The sorting slides add an important refinement: merge sort, heapsort, and quicksort are all comparison sorts.

For comparison sorts, the worst-case number of comparisons has a lower bound of Ω(n log n). That means no comparison sort can beat that asymptotic barrier in the worst case.
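The usual justification is a decision-tree argument, sketched here in outline: every run of a comparison sort follows one root-to-leaf path in a binary tree of comparisons, and the leaves must distinguish all n! input permutations, so the tree height h obeys

```latex
% Decision-tree lower bound for comparison sorting (sketch).
2^h \ge n! \quad\Longrightarrow\quad
h \ge \log_2(n!) \ge \frac{n}{2}\log_2\frac{n}{2} = \Omega(n \log n)
```

The middle inequality uses the fact that n! contains at least n/2 factors that are each at least n/2.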

If we want to sort faster than that lower bound, we have to avoid comparisons as the primary mechanism. That is exactly why counting sort and radix sort matter.

  • Counting sort is linear time when the key range k is manageable.
  • Radix sort repeatedly sorts by digit, typically using a stable subroutine.
  • Counting sort is not an in-place comparison sort.

The practical lesson is not “every sort is the same”. The lesson is that the data domain and the operation model determine the correct complexity class.

Why lower terms and constants disappear

The tutorial’s simplification slides use expressions like

N + 1 = O(N)

and

N^3 + 1000N^2 + N = O(N^3).

The reason is that the fastest-growing term controls the long-run growth. Lower-order terms still exist, but they stop mattering once n is large enough.

Worked example

A clean simplification proof

To show that (n^2 + n)/2 = O(n^2), choose C = 1 and n0 = 1.

For every n >= 1,

(n^2 + n)/2 <= (n^2 + n^2)/2 = n^2.

So the function is bounded above by a constant multiple of n^2 for all large enough n.

Best case, worst case, and average case

One complexity label is often not enough. The same algorithm may behave differently depending on the input arrangement.

  • Best case describes the most favorable legal input.
  • Worst case describes the most expensive legal input.
  • Average case describes the expected cost under a probability model.

Binary search is a clean example.

Worked example

Binary search does not need a full scan

int binarySearch(const int a[], int n, int target) {
   int left = 0, right = n - 1;
   while (left <= right) {
      int mid = left + (right - left) / 2;  /* same midpoint, avoids overflow of left + right */
      if (a[mid] == target) return mid;
      if (a[mid] < target) left = mid + 1;
      else right = mid - 1;
   }
   return -1;
}

In the best case, the first middle element already matches the target, so the cost is O(1). In the worst case, the loop keeps halving the remaining search interval, so the cost is O(log n). The crucial idea is that we do not remove one candidate at a time; we discard about half the search space at every step.

When you read complexity claims in code, always ask which case is being quoted. A worst-case statement is the default unless the analysis explicitly says otherwise.

Divide-and-conquer and the n log n pattern

The slide deck mentions merge-sort-style behavior because it is the standard example of an n log n algorithm.

The usual reasoning has two layers:

  • each round processes a linear amount of data,
  • the number of rounds is about log n.

That is why the total becomes n log n.

Worked example

Why merge-style structure gives n log n

Suppose one algorithm repeatedly splits the input in half until subproblems are size 1, and every level of the recursion does at most cn units of merge or combination work.

If there are about log2 n levels, then the total work is approximately

cn + cn + ... + cn

repeated log2 n times, which is

O(n log n).

The point is not the exact constant c. The point is that “linear work per level” and “logarithmically many levels” multiply into the standard n log n shape.

A checklist for reading complexity from code

When you meet a new piece of code, the safest order is:

  1. identify the operation that repeats most often;
  2. count how many times the surrounding loop, recursion, or traversal executes;
  3. separate constant work from size-dependent work;
  4. simplify only after the counting step is explicit.

If you skip the counting step, you usually end up guessing. Good complexity work is slow reading first, short classification second.

When constants still matter

Asymptotic notation describes the long-run shape, but it does not say constants never matter. For small n, a simple O(n^2) routine may still beat an O(n log n) routine with a very large hidden constant.

The disciplined conclusion is therefore two-part: first classify the growth correctly, then discuss whether the current input size is small enough that constant factors still influence the practical choice.

Cost of stack and queue operations

The previous section on ADTs gives useful concrete examples for this section. Once the representation is fixed, the same interface can have different costs.

  • Array-based stack push / pop are typically constant time except when a resize happens.
  • Linked-list queue enqueue / dequeue can be constant time when head and tail are maintained correctly.
  • QueueLength() can be constant time if the length is stored, or linear time if it must scan the list.

That is why asymptotic analysis belongs next to implementation details, not apart from them.

Quick check

If an algorithm has two nested loops each running n times, what class is the total cost?

Assume the loop body is O(1).

Solution

O(n^2). The O(1) body executes n * n = n^2 times, so the total cost is quadratic.

Quick check

Why does O(n log n) beat O(n^2) for sufficiently large n?

Compare the ratio (n^2)/(n log n).

Solution

The ratio (n^2)/(n log n) = n / log n grows without bound, so n^2 eventually exceeds any constant multiple of n log n.

Quick check

What is the asymptotic class of 0.0001n^3 + n?

Drop constants and lower-order terms.

Solution

O(n^3). The constant factor 0.0001 and the lower-order term n are both dropped.

Quick check

Why does selection sort have quadratic cost even though only one element is placed each pass?

Count the comparisons inside the shrinking inner loop.

Solution

The inner loop performs (n-1) + (n-2) + ... + 1 = n(n-1)/2 comparisons across all passes, and that sum is quadratic.

Takeaway

Complexity analysis is a way to reason about scaling, not a way to guess which implementation “feels faster”. If you can identify the dominant repeated action, write the count, and simplify correctly, you can usually classify the algorithm without running it.