
Algorithms


Quotes

"A problem well-seen is a problem well solved."

"Stop feeling Sorry for Yourself and just do it."

Time and Space Complexity

  • Best Case (Omega Ω) (Lower Bound)

    • f(n) = Ω(g(n)) => f>=g
  • Average Case (Theta θ) -> Exact (tight) bound -> used when every case gives the same order. E.g. the average case of linear search is θ(n), but quickSort is quoted as O(nlogn), since its worst case (n^2) differs.

    • f(n) = θ(g(n)) => f=g
  • Worst Case (Big O) -> upper Bound (O)

    • f(n) = O(g(n)) => f <= g
  • Small O

    • 2n = O(n) ✔
    • 2n = o(n) ❌
    • 2n = o(n^2) ✔
      • g(n) must be of strictly higher order of growth: any function strictly above the tight O bound works.
    • f(n) = o(g(n)) => f<g
  • Small Omega (ω)

    • 2n = Ω(n) ✔
    • 2n = ω(n) ❌
    • 2n^2 = ω(n) ✔
      • g(n) must be of strictly lower order of growth: any function strictly below the tight Ω bound works.
    • f(n) = ω(g(n)) => f>g
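
A compact way to remember all five notations is the limit test (a standard characterization, assuming the limit of f(n)/g(n) exists):

    \lim_{n\to\infty} \frac{f(n)}{g(n)} =
    \begin{cases}
    0 & \Rightarrow f = o(g) \text{ (hence also } f = O(g)\text{)} \\
    c,\; 0 < c < \infty & \Rightarrow f = \Theta(g) \text{ (hence also } f = O(g) \text{ and } f = \Omega(g)\text{)} \\
    \infty & \Rightarrow f = \omega(g) \text{ (hence also } f = \Omega(g)\text{)}
    \end{cases}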

Algebraic Properties:

  • Reflexive property holds for O, θ, Ω but not for small o and ω.

    • f(n) = O(f(n)), ✔
    • f(n) = θ(f(n)), ✔
    • f(n) = Ω(f(n)), ✔
    • f(n) = o(f(n)), ❌
    • f(n) = ω(f(n)), ❌
  • Symmetric property only holds for θ.

    • f(n) = O(g(n)) => g(n) = O(f(n)) ❌
    • f(n) = Ω(g(n)) => g(n) = Ω(f(n)) ❌
    • f(n) = θ(g(n)) => g(n) = θ(f(n)) ✔
  • Transitive Property: all five of them follow it.

    • If f = any_of_five_symbol(g) and g = any_of_five_symbol(h) then f = any_of_five_symbol(h) ✔
  • If f(n) = O(g(n)) && h(n) = O(i(n))

    • then, f(n) + h(n) = O(max(g(n), i(n))) ✔
    • then, f(n) * h(n) = O(g(n) * i(n)) ✔
  • If f(n) = O(g(n))

    • then, f(n)h(n) = O(g(n) h(n)) ✔

Order of Function


1 < loglogn < logn < (logn)^c [c > 1] < n^(1/3) < n^(1/2) < n < nlogn < n^2 < (n^2)logn < n^3 < 2^n < n! < n^n

Note:

  • max global array can be of size 10^8.
  • max stack array that can be created is of size 10^6, otherwise segmentation fault.
  • In 1 sec, number of operations done ≈ 2*10^8, but to be on the safer side, take it as 5*10^7.
  • range of int is ~10^9.
  • range of long long int is ~10^18.
  • Space complexity counts the stack space used in recursion as well.
| Value of n  | Feasible order of function |
| ----------- | -------------------------- |
| n < 10      | n!, n^7, n^6               |
| n < 20      | 2^n * n, n^5               |
| n < 80      | n^4                        |
| n < 400     | n^3                        |
| n < 7500    | n^2                        |
| n < 10^5    | n*sqrt(n)                  |
| n < 5*10^5  | n*logn                     |
| n < 5*10^6  | n                          |
| n < 10^12   | sqrt(n), sqrt(n)*logn      |
| n < 10^18   | logn, O(1), log^2(n), ...  |

Solving Recurrence Relations/Equations:

  • Substitution Method

  • Master Theorem

    Master's Theorem for Subtract and Conquer Recurrences (with Trick)

    T(n) = aT(n - b) + f(n), where f(n) is a polynomial function (n^k), i.e. not something like n/logn.

    Steps/Trick:
    • If a < 1, T(n) = θ(f(n))
    • If a = 1, T(n) = θ(f(n) * n)
    • If a > 1, T(n) = θ(f(n) * a^(n/b))

    Master's Theorem for Divide and Conquer Recurrences (with Trick)

    T(n) = aT(n / b) + f(n) * (logn)^k

    • where f(n) is a polynomial function (n^k).
    • and let f(n) * (logn)^k = g(n)
    Steps/Trick:
    • calculate n^(log_b a)
    • If it is of the same order as f(n), return θ(g(n) * logn)
    • else return the bigger of the two:
      • If f(n) is bigger, return θ(g(n)).
      • If n^(log_b a) is bigger, return θ(n^(log_b a)).
    • (two worked examples are given after this list)
  • Recursion Tree Method.

    • calculate the number of function calls and how much each call costs. T(n) = a T(n / b) + c

    • where c can be anything of the form O(n^k * (logn)^p)

    • and c is a polynomial function, i.e. not something like n/logn.

    • a should be >= 1
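
Two worked examples of the divide-and-conquer trick above, using the merge sort and binary search recurrences:

    % Merge sort: T(n) = 2T(n/2) + n, so a = 2, b = 2, f(n) = n, k = 0
    n^{\log_b a} = n^{\log_2 2} = n = f(n) \;\Rightarrow\; T(n) = \Theta(n \log n)

    % Binary search: T(n) = T(n/2) + 1, so a = 1, b = 2, f(n) = 1, k = 0
    n^{\log_b a} = n^{\log_2 1} = n^0 = 1 = f(n) \;\Rightarrow\; T(n) = \Theta(\log n)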

Stability & Inplace (Sorting)

  • Stability: elements that compare equal appear in the output in the same order as they were in the input.

  • Stable Sort (if the 5 of clubs comes before the 5 of hearts in the input, that order is maintained in the output as well, by guarantee and not by chance.)

    • Bubble Sort, O(n^2)

      • optimized Bubble Sort(isSwapped flag)
      • Best case complexity O(n)
      • Inplace
    • Insertion Sort O(n^2) -(Best case complexity O(n))

      • (Inplace)
      • If Array is sorted or almost sorted, insertion sort is best as the number of swaps and comparisons will be less.
    • Selection Sort O(n^2)

      • (Inplace)
      • (all cases O(n^2))
    • Merge Sort, O(nlogn)

      • (Divide and Conquer Algorithm),
      • (O(nlogn) in time and O(n) in space)
      • (Not Inplace)
      • used in external merge sorting: no need to bring the complete array into memory at once; divide the array into chunks that fit into RAM, sort them individually, then perform a k-way merge (merging k sorted arrays in one go; plain merge sort uses a 2-way merge).
    • Counting Sort

      • O(n + range_of_numbers); made stable by keeping the cumulative count of elements <= the ith value and filling the output array from the back (see the sketch after this list).
      • not comparison based; not inplace: extra O(n + k) space, k for counting occurrences and n for the output array, which is then copied back into the original array.
      • useful only when k (the range of numbers) is linear in n (e.g. n/2, 2n, 5n); otherwise counting sort becomes n^2 or worse, depending on k = n^2, n^3, ...
      • used as a subroutine for radix sort.
      • both space and time complexity are O(n + k).
    • Radix Sort

      • works even when the range of numbers is very large.
      • works in linear time even when the numbers are in the range (n^2, n^3).
      • uses counting sort as a subroutine.
      • O(d * (n + k)): d is the number of digits in the largest number, k is the base of the numbers (the biggest possible digit value; usually 10, decimal, so it can be ignored) => O(d * n).
      • space is the same as counting sort: O(n + k), with k = b (the base).
      • Question: What is the time complexity to sort n integers using radix sort when they are in the range [n^(k/2), n^k], where k is independent of n?
        • Ans: T(n) in radix sort is O(d * (n + range)),
        • range = 10 (decimal number system)
        • d = log(n^k) = k * logn = number of digits in the largest number.
        • T(n) = O(k * logn * n) = O(nlogn)
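
A minimal sketch of the stable counting sort described above (cumulative counts, output filled from the back); the range bound k and all names are illustrative:

    #include <bits/stdc++.h>
    using namespace std;

    // Stable counting sort for values in [0, k).
    vector<int> countingSort(const vector<int>& a, int k) {
        vector<int> count(k, 0), out(a.size());
        for (int x : a) count[x]++;                           // occurrences of each value
        for (int i = 1; i < k; i++) count[i] += count[i - 1]; // cumulative: #elements <= i
        for (int i = (int)a.size() - 1; i >= 0; i--)          // fill from the back => stable
            out[--count[a[i]]] = a[i];
        return out;
    }

    int main() {
        vector<int> v = {4, 1, 3, 4, 1, 0};
        for (int x : countingSort(v, 5)) cout << x << " ";    // 0 1 1 3 4 4
        cout << "\n";
    }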
  • Unstable Sort

    • Quick Sort O(nlogn)
      • (Divide and Conquer Algorithm),
      • Inplace, if no stack space is considered as used in recursion, otherwise not-inplace.
      • Not Stable: the partition function swaps equal elements across the array without regard to their original order (if equal elements a1, a2 appear with a1 first, the swaps around the pivot can place a2 before a1 in the output).
      • However, a stable version of quicksort exists; see Wikipedia.
      • In the worst case the call stack can hold up to n calls, so quicksort is O(n) space-wise (because of the stack space).
      • Tail recursive, hence more compiler-friendly: the compiler can do tail call elimination and optimize the code.
      • Worst case: Array already sorted and we are taking pivot as the first element.
        • T(n) = T(n-1)+ n = O(n^2)
      • Best case: when the pivot always lands in the middle.
        • T(n) = 2T(n/2)+ n = O(nlgn)
      • Almost best/worst case, i.e. the rest of the cases:
        • Suppose the partition splits like n/10 and 9n/10 [10% and 90% split].
        • T(n) = T(n/10) + T(9n/10) + n
        • Solving by recursion tree we get T(n) = O(n*logn) [logn with base 10/9].
        • Important: only in one case (the fully unbalanced split) does quicksort go to n^2.
        • Even with an n/100 and 99n/100 [1% and 99%] split, it is still nlogn.
      • Worst case O(n^2); the chance of hitting that case can be reduced using randomized quicksort (see the sketch at the end of this section).
        • Randomization can be added by picking the pivot at random rather than always taking the first element.
        • Alternatively, randomly shuffle the input array: the chance of the array remaining sorted after shuffling is close to 0.
        • But since there is still a very small chance of that scenario, the worst case remains O(n^2), just with much lower probability.
    • Heap Sort
      • θ(nlogn) in all cases (best, worst, average)
      • There is one case where it is O(n): when almost all or all elements are the same, heapify becomes O(1), hence the sort becomes O(n).
      • Space complexity is O(1)
    • Bucket Sort
      • O(n) best case; in general it depends on the algorithm used to sort the individual buckets.
  • Insertion Vs Selection Sort

    • Insertion is O(n^2) swaps in worst case
    • Selection is O(n) swaps in worst case
    • use selection sort over insertion sort where writing to memory (swaps) is the main concern.
  • Question: can there be any comparison-based sorting algorithm which is O(n)?

    • Ans: No. Since we have to make comparisons and the total number of arrangements possible is n!, build the binary decision tree of comparisons whose leaf nodes are all the n! arrangements. A binary tree with h levels has at most 2^h - 1 nodes and at most 2^(h-1) leaf nodes.
    • The tree must have at least n! leaves (one per arrangement), and at most 2^(h-1) leaves fit in h levels => n! <= 2^(h-1)
      => log(n!) <= h-1 => h >= log(n!)+1 => h = θ(nlogn) [as log(n!) = θ(nlogn)] => we can say h = Ω(nlogn)
      • Since to sort the array we have to follow at least one path down the decision tree, the total number of comparisons is at least h = Ω(nlogn).
      • Note: the minimum number of comparisons to sort an array with comparison-based sorting is ceil(log(n!)), the ceil coming from h >= log(n!)+1.
  • Prove that log(n!) = θ(nlogn)

    • Proof:
      • log(n!) = log1 + log2 + log3 + ... + log(n/2) + ... + logn <= logn + logn + logn + ... + logn = nlogn
        • => log(n!) = O(nlogn)
      • log(n!) = log1 + log2 + log3 + ... + log(n/2) + ... + logn >= log(n/2) + log(n/2+1) + log(n/2+2) + ... + log(n) = (n/2)log(n/2)
        • => log(n!) = Ω(nlogn)
      • => log(n!) = θ(nlogn). Proved.
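
A minimal sketch of randomized quicksort as described above: a random pivot is swapped to the end, then a Lomuto-style partition is applied (a sketch, not a tuned implementation):

    #include <bits/stdc++.h>
    using namespace std;

    mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());

    int lomutoPartition(vector<int>& a, int lo, int hi) {
        // random pivot makes the already-sorted worst case extremely unlikely
        int p = uniform_int_distribution<int>(lo, hi)(rng);
        swap(a[p], a[hi]);
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++)
            if (a[j] < pivot) swap(a[i++], a[j]);
        swap(a[i], a[hi]);            // pivot lands at its final position i
        return i;
    }

    void quickSort(vector<int>& a, int lo, int hi) {
        if (lo >= hi) return;
        int m = lomutoPartition(a, lo, hi);
        quickSort(a, lo, m - 1);
        quickSort(a, m + 1, hi);
    }

    int main() {
        vector<int> a = {5, 1, 4, 2, 3, 2};
        quickSort(a, 0, (int)a.size() - 1);
        for (int x : a) cout << x << " "; // 1 2 2 3 4 5
        cout << "\n";
    }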

Stack (infix/prefix algos and points)

  • Advantage of prefix and postfix expressions

    • can be easily evaluated by a program that traverses the given (prefix/postfix) expression exactly once, using a stack.
    • don't require parentheses, precedence rules, or associativity rules.
  • Infix to postfix algo (see the sketch at the end of this section)

  • If the current element has higher precedence than the stack top (it is the heavy one), push it onto the stack.

  • If the current element has lower precedence (the light one), or equal precedence (since these operators are left-associative), pop until the current element becomes heavy.

  • Note: we compare the current element with the top of the stack for precedence.

  • When evaluating a postfix expression for an integer result, the first popped element is the second operand and the second popped element is the first operand.

  • Infix to prefix algo

    • just traverse the infix string in reverse and return the reverse of what you get; the rest of the algo is the same as above.
    • When evaluating a prefix expression for an integer result, the first popped element is the first operand and the second popped element is the second operand (opposite of postfix).
  • Conclusion:

    • If it's heavier than the stack top, push it onto the stack, no questions asked. 😁
    • If it's lighter (or the same weight), keep popping until the light one becomes heavy. 😁
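
A minimal sketch of the infix-to-postfix algorithm above, assuming single-character operands and only + - * / and parentheses (all left-associative, so equal precedence pops):

    #include <bits/stdc++.h>
    using namespace std;

    int prec(char op) { return (op == '+' || op == '-') ? 1 : (op == '*' || op == '/') ? 2 : 0; }

    string infixToPostfix(const string& in) {
        string out;
        stack<char> st;
        for (char c : in) {
            if (isalnum(c)) out += c;          // operands go straight to the output
            else if (c == '(') st.push(c);
            else if (c == ')') {               // pop until the matching '('
                while (st.top() != '(') { out += st.top(); st.pop(); }
                st.pop();
            } else {
                // current operator is light: pop the heavier/equal ones first
                while (!st.empty() && prec(st.top()) >= prec(c)) { out += st.top(); st.pop(); }
                st.push(c);                    // now it is heavy: push
            }
        }
        while (!st.empty()) { out += st.top(); st.pop(); }
        return out;
    }

    int main() { cout << infixToPostfix("a+b*(c-d)") << "\n"; } // abcd-*+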

Fibonacci Series

  • Total number of function calls made in calculating fib(n) = 2 * fib(n + 1) - 1
  • Total number of additions performed to calculate fib(n) = fib(n + 1) - 1
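
A quick brute-force check of both identities (naive recursion with fib(0)=0, fib(1)=1; the counter names are illustrative):

    #include <bits/stdc++.h>
    using namespace std;

    long long calls = 0, adds = 0;

    long long fib(int n) {
        calls++;
        if (n <= 1) return n;
        adds++;
        return fib(n - 1) + fib(n - 2);
    }

    int main() {
        int n = 10;
        fib(n);
        long long c = calls, a = adds;
        calls = adds = 0;
        long long f1 = fib(n + 1);   // fib(n+1), needed by both identities
        cout << (c == 2 * f1 - 1) << " " << (a == f1 - 1) << "\n"; // prints: 1 1
    }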

Tower of Hanoi (for n disks)

  • Total number of function calls = 2^(n + 1) - 1 (counting the base-case calls with 0 disks)
  • Total number of disk movements = 2^n - 1

Tree

  • height of an empty tree (root = NULL) is taken as 0

  • height starts from 0, level from 1, depth from 0 (follow this convention) => the height of the tree is the maximum depth among all its nodes.

  • depth starts from the top: the root has depth 0, the root's children depth 1, and so on.

  • depth is the number of edges between the root and that node.

  • internal node = a node that is not a leaf node; the root is also an internal node.

  • Tree problems follow recursive structure. Mostly all the problems can be solved by dividing the problems into subproblems and making recursive calls on subtrees.

  • Types of binary tree

    • full/strict binary tree: every node has 0 or 2 children; need not be balanced.

      <!-- EXAMPLE -->
              O
             / \
            O   O
           / \
          O   O
         / \
        O   O
       / \
      O   O
      • In a full BT, if the number of leaf nodes is n, then the number of internal nodes is n-1.
      • In a full BT, if the number of internal nodes is n, then the number of leaf nodes is n+1.
      • LeafNodes = InternalNodes + 1 (leaf nodes always exceed internal nodes by exactly one)
      • Total number of nodes in a full BT with n internal nodes = 2n+1 (n + (n+1)).
      • Total number of nodes in a full BT with n leaf nodes = 2n-1 (n + (n-1)).
    • Complete binary tree(CBT)

      • except the last level, all levels are completely filled
      • the last level has nodes from left to right, no gaps
      • the number of internal nodes in a CBT is floor(n/2)
        <!-- EXAMPLE -->
            O
           / \
          O   O
         / \ /
        O  O O
    • perfect binary tree

      • every level is completely filled.
      • equivalently, every internal node has 2 children and all leaf nodes are at the same level.
        <!-- EXAMPLE -->
            O
           / \
          O   O
         / \ / \
        O  O O  O
    • balanced binary tree

      • for every node, the difference between its left subtree height and right subtree height is not more than 1.
    • To construct the unique BST from given preorder or postorder, Time Complexity = O(nlogn).

    • No unique BT is possible from given pre and post order alone.

      • Ex: pre = ABC, post = CBA. Two skew trees are possible: A-B-C left-skewed and A-B-C right-skewed.
    • To construct a unique BT, we need the inorder together with either the pre or the post order; Time Complexity = O(n^2).

    • (In + pre) or (In + post) = O(nlogn), if Inorder is sorted.

    • (In + pre) or (In + post) = O(n^2), if Inorder is not sorted.

    • Total number of BSTs possible with n nodes

      • Since each node takes a turn as the root,
      • the total number is f(n) = f(0)*f(n-1) + f(1)*f(n-2) + f(2)*f(n-3) + ... + f(n-1)*f(0) = (2n)Cn / (n+1) = the Catalan number, where f(0) = 1, f(1) = 1, f(2) = 2, f(3) = 5, f(4) = 14, f(5) = 42 (see the sketch at the end of this section).
    • Gate Formulas

      • I0 = L = #Leaf nodes
      • I1 = #nodes with 1 children
      • I2 = #nodes with 2 children
      • For binary tree, L = I2 + 1
      • Total number of Nodes in Binary Tree(N) = I0(or L) + I1 + I2 = 2I2 + I1 + 1
      • # BTs possible with n unlabelled nodes = Catalan Number = 2nCn / (n+1)
      • # BTs possible with n distinct labelled nodes = Catalan number * n! = **(2nCn * n!) / (n+1)**
      • # Nodes in BT on level L (level starts from 0)
        • Minimum : 1
        • Maximum : 2^L
      • # Nodes in BT of height H (height of single node = 0)
        • Minimum : H + 1
        • Maximum : 2^(H+1) - 1
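
A small sketch of the Catalan recurrence above for counting BSTs, filling f(i) with the "each key takes a turn as root" split:

    #include <bits/stdc++.h>
    using namespace std;

    int main() {
        int n = 10;
        vector<long long> f(n + 1, 0);
        f[0] = 1;                                 // empty tree
        for (int i = 1; i <= n; i++)              // f(i) = sum over roots j of f(j) * f(i-1-j)
            for (int j = 0; j < i; j++)
                f[i] += f[j] * f[i - 1 - j];
        for (int i = 0; i <= 5; i++) cout << f[i] << " "; // 1 1 2 5 14 42
        cout << "\n";
    }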

Hashing, Hash Function (hf), Load Factor and Hash Table

  • To analyze the searching time, we use the Load Factor = (number of keys stored)/(total slots) = average number of keys per slot = average length of chain per slot.

  • Simple uniform hashing: each key is equally likely to be hashed to any slot in the table, independently of the other keys. It is a requirement for a good hash function.

  • To avoid many collisions, we always make sure the load factor <= 1 (generally 0.7); when it grows beyond this, we create a new table of twice the size and re-insert, reducing the load factor again.

  • Average number of comparisons to be performed to search a key = average chain length = Load Factor

  • The successive slot inspections made while resolving a collision are called probes.

  • Collisions are resolved by :-

    • Open Addressing or Closed Hashing (see the sketch after this list)
      • Linear Probing: hf_i(n) = (hf(n) + i) mod m [hf_i is the ith probe; m is the table size]
        • Side effect of linear probing: produces primary clustering.
        • Primary Clustering: with linear probing, if two keys map to the same starting location in the hash table, both follow the same linear path unnecessarily, as a result of which search time increases. To avoid this problem, we use quadratic probing.
        • Time for a successful search = (1/LF)*ln(1/(1-LF)) on average.
        • Time for an unsuccessful search = 1/(1-LF) on average.
      • Quadratic Probing: hf_i(n) = (hf(n) + a*i*i + b*i) mod m, typically a=1 & b=0.
        • Side effect of quadratic probing: produces secondary clustering.
        • Secondary Clustering: #TODO - To avoid this problem, we use double hashing.
      • Double Hashing: hf_i(n) = (hf(n) + i*hf2(n)) mod m
    • Open Hashing or Separate Chaining
      • Time for a successful search = 1 + LF/2 on average.
      • Time for an unsuccessful search = 1 + LF on average.
      • Slots means table size.
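
A minimal open-addressing sketch of the linear probing scheme above (insert and search only, no deletion or resizing; hf(key) = key mod m is just an illustrative hash function):

    #include <bits/stdc++.h>
    using namespace std;

    struct LinearProbingTable {
        int m;                                  // table size
        vector<int> slot;                       // -1 marks an empty slot
        LinearProbingTable(int size) : m(size), slot(size, -1) {}

        int hf(int key) const { return key % m; }

        bool insert(int key) {
            for (int i = 0; i < m; i++) {       // hf_i(key) = (hf(key) + i) mod m
                int pos = (hf(key) + i) % m;
                if (slot[pos] == -1 || slot[pos] == key) { slot[pos] = key; return true; }
            }
            return false;                       // table full
        }

        bool search(int key) const {
            for (int i = 0; i < m; i++) {
                int pos = (hf(key) + i) % m;
                if (slot[pos] == -1) return false;  // reached an empty slot: key absent
                if (slot[pos] == key) return true;
            }
            return false;
        }
    };

    int main() {
        LinearProbingTable t(7);
        for (int k : {10, 17, 3}) t.insert(k);  // all three hash to slot 3: primary clustering
        cout << t.search(17) << " " << t.search(4) << "\n"; // 1 0
    }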

Backtracking

  • Used for Constrained Assignment problems like N-Queen, Rat in a maze, Sudoku Solver.
  • In these problems, the brute force solution tries all possible combinations and hence has high complexity.
  • But backtracking is being smart about trying the combinations: if, while following one of the combinations, we find out that it is going to be invalid, we simply reject that path, go back and try a new path (hence not going blindly in one direction, saving a lot of time; see the N-Queens sketch below).
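
A compact N-Queens sketch of this idea: place one queen per row, reject a column as soon as it conflicts with an earlier row, and backtrack otherwise:

    #include <bits/stdc++.h>
    using namespace std;

    int n = 8, solutions = 0;
    vector<int> col;                     // col[r] = column of the queen placed in row r

    bool safe(int r, int c) {
        for (int pr = 0; pr < r; pr++)   // conflict with any earlier row?
            if (col[pr] == c || abs(col[pr] - c) == r - pr) return false;
        return true;
    }

    void solve(int r) {
        if (r == n) { solutions++; return; }
        for (int c = 0; c < n; c++)
            if (safe(r, c)) {            // prune invalid paths early instead of trying everything
                col[r] = c;
                solve(r + 1);            // returning from this call is the "go back" step
            }
    }

    int main() {
        col.assign(n, -1);
        solve(0);
        cout << solutions << "\n";       // 92 for n = 8
    }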

DP

  • Exhaustively searching all possible paths, but also saving common subpaths along the way and never walking the same path again (see the sketch below).
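
The same idea as a minimal sketch, applied to Fibonacci: memoize the shared subpaths so each subproblem is computed only once:

    #include <bits/stdc++.h>
    using namespace std;

    vector<long long> memo;

    long long fib(int n) {
        if (n <= 1) return n;
        if (memo[n] != -1) return memo[n];  // common subpath already solved: reuse it
        return memo[n] = fib(n - 1) + fib(n - 2);
    }

    int main() {
        int n = 50;
        memo.assign(n + 1, -1);
        cout << fib(n) << "\n";             // O(n) work instead of O(2^n)
    }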

Greedy

  • Going toward the local optimal solution will eventually result in a global optimal solution.
  • Not possible for every real world problem.
  • Fractional Knapsack
    • Sort by val/wt => value of 1kg of item.
  • Huffman Coding
    • Start making the tree taking 2 least frequent characters.
  • Job Sequence with Deadlines
  • Optimal Merge Pattern
    • Merging n-sorted arrays
      • Two ways
        1. Perform one n-way merge
        2. Perform multiple 2-way merges (n-1 2-way merges in total; this is where the optimal merge pattern problem comes in)
        • What is the minimum number of operations required to perform multiple 2-way merges of n sorted arrays?
        • It looks like the matrix chain multiplication problem from DP,
        • but since the cost of merging is simply the sum of the lengths of the arrays (n1+n2), we can do it greedily:
        • by picking the 2 smallest arrays and 2-way merging them, every time (see the sketch after this list).
  • Prim's Algo for MST
  • Kruskal's Algo for MST
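
A sketch of the greedy optimal merge pattern above: keep the array lengths in a min-heap and always 2-way merge the two smallest (the example lengths are illustrative):

    #include <bits/stdc++.h>
    using namespace std;

    int main() {
        priority_queue<long long, vector<long long>, greater<long long>> pq;
        for (long long len : {4, 3, 2, 6}) pq.push(len); // lengths of the n sorted arrays

        long long cost = 0;
        while (pq.size() > 1) {                // n-1 two-way merges in total
            long long a = pq.top(); pq.pop();  // smallest
            long long b = pq.top(); pq.pop();  // second smallest
            cost += a + b;                     // merging two arrays costs n1 + n2
            pq.push(a + b);
        }
        cout << cost << "\n";                  // 29 for {4, 3, 2, 6}
    }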

Interactive Problems

  • Problems in which our solution interacts/talks with the online judge / problem grader.

  • This requires flushing the output as soon as we want to print something on screen.

  • Generally, people use fast IO, i.e. ios_base::sync_with_stdio(false); cin.tie(NULL); cout.tie(NULL);, which leads to storing the output in a buffer and printing it on screen all together when the program terminates.

  • That is fine for problems which do not interact with the judge, as flushing only at the end makes our program faster.

  • But in interactive problems we want to flush eagerly. Hence we use cout<<flush; or cout<<endl; (cout<<'\n'<<flush; is equivalent to cout<<endl;).

  • That's why in other problems we use '\n' instead of endl,

  • but here we need to use endl.

  • Conclusion: use endl instead of '\n' here and the solution will become interactive (see the sketch below).

    • just comment out -> # define endl '\n'.
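
A minimal shape of an interactive solution (the "? mid" / "! answer" guessing protocol here is hypothetical; the point is the endl flush after every query):

    #include <bits/stdc++.h>
    using namespace std;

    int main() {
        // deliberately no '\n'-instead-of-endl trick here: every query must reach the judge now
        int lo = 1, hi = 1000000;              // hidden number lies in [lo, hi]
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            cout << "? " << mid << endl;       // endl = '\n' + flush
            string ans;
            cin >> ans;                        // judge replies "YES" if hidden number <= mid
            if (ans == "YES") hi = mid; else lo = mid + 1;
        }
        cout << "! " << lo << endl;            // final answer, also flushed
    }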

Policy Based Data Structures

  • Declaration, namespace, Headers, typedef for easier use

    #include <ext/pb_ds/assoc_container.hpp>
    #include <ext/pb_ds/tree_policy.hpp>
    using namespace __gnu_pbds;
    typedef tree<int, null_type, less<int>, rb_tree_tag, tree_order_statistics_node_update> pbds;
  • Working & Important Function

    • It works like an ordered set, plus
    • a function returning an iterator to the kth smallest element (0-indexed): find_by_order(k) [O(logn) time], plus
    • a function returning the number of elements strictly less than k: order_of_key(k) [O(logn) time].
  • code

    pbds s;
    // insert in any order - always stored in sorted order (rb tree).
    s.insert(6);
    s.insert(5);
    s.insert(1);
    s.insert(3);
    s.insert(10);
    s.insert(10); // no duplicates are added (set semantics).

    // cannot iterate over a normal set by index,
    // but can with pbds:
    for (int i = 0; i < (int)s.size(); i++) {
        auto it = s.find_by_order(i); // iterator to the ith smallest element (0-indexed).
        cout << *it << " ";
    }
    // above output: 1 3 5 6 10
    cout << endl;

    for (int i = 0; i < 10; i++) {
        auto order = s.order_of_key(i); // number of elements strictly less than i.
        cout << order << " ";
    }
    // above output: 0 0 1 1 2 2 3 4 4 4 : - #elements smaller than i
    cout << endl;
  • Question: Inversion count

    int n;
    cin >> n;
    int arr[n];

    for (int i = 0; i < n; i++) {
        cin >> arr[i];
    }

    pbds St; // note: with less<int> duplicates are dropped; this assumes distinct elements
    int count = 0;
    for (int i = 0; i < n; i++) {
        // elements already inserted that are greater than arr[i] = inversions ending at i
        count += St.size() - St.order_of_key(arr[i]);
        St.insert(arr[i]);
    }

    cout << count << endl;

P, NP-hard and NP-complete Problems

  • P problems are those for which a solution exists that runs in polynomial time.
  • NP-hard problems (NP = Nondeterministic Polynomial, not "non-polynomial") are at least as hard as every problem in NP; we don't yet know whether any algorithm solves them in polynomial time.
  • NP-complete problems are those that are both in NP (a given solution can be verified in polynomial time) and NP-hard (every NP problem is reducible to them in polynomial time).

Is it true?

  • We cannot remove/find a specific element in a heap; we can only ever remove the min (if it is a minheap)?
    • Ans. True, priority_queue doesn't have find, begin, or erase functions.
  • If we want sorted order and also want to remove a specific element in the future, we use a set/multiset instead of a heap.
    • Ans. True, priority_queue doesn't have find, begin, or erase functions, but set does.
  • set (BST STL) vs heap
    • set and heap both store elements in order,
    • but in a heap we can only remove/erase the top element; there is no pq.find().
    • whereas in a set we can erase-and-add (i.e. update) any element irrespective of its position, using s.find(), s.erase(), s.insert().
    • That is why we use a set instead of a minheap in Dijkstra in C++.
    • In general, in C++, whenever we need heap functionality but also want to remove a non-min/max element: the min element is at set.begin() and the max at set.rbegin()
    • => always use set/multiset (see the sketch below).
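
A small example of the set-instead-of-heap pattern above: the min sits at begin(), the max at rbegin(), and we can erase an arbitrary element, which priority_queue cannot do:

    #include <bits/stdc++.h>
    using namespace std;

    int main() {
        multiset<int> s = {7, 3, 9, 3};

        cout << *s.begin() << " " << *s.rbegin() << "\n"; // min = 3, max = 9

        s.erase(s.begin());    // pop-min, like a min-heap
        s.erase(s.find(9));    // erase one specific element: impossible with priority_queue
        s.insert(4);           // Dijkstra-style update = erase old key + insert new one

        for (int x : s) cout << x << " "; // 3 4 7
        cout << "\n";
    }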

Cause of Errors / Types of Errors

  1. Time Limit Exceeded -> code a faster solution
  2. Compilation Error -> syntax error
  3. Runtime Error ->
  • Divide by zero
  • Segmentation fault
    • Possible causes
      • Declared array is too small
      • Accessing negative indices
      • Array size > 10^8: you can create an array of size at most 10^8 in global scope (or dynamically on the heap). In block scope (which goes on the stack), the max array size is 10^6.
  4. Wrong Answer
  • check for datatypes + uninitialized variables.