I. Greedy Algorithms

Greedy is an algorithmic paradigm that builds up a solution piece by piece, always choosing the next piece that offers the most obvious and immediate benefit. Greedy algorithms are used for optimization problems.

An optimization problem can be solved using Greedy if the problem has the following property:

However, it’s important to note that not all problems are suitable for greedy algorithms. They work best when the problem exhibits the following properties:

Greedy Algorithm solve optimization problems by making the best local choice at each step in the hope of finding the global optimum. It’s like taking the best option available at each moment, hoping it will lead to the best overall outcome.

Here’s how it works:

  1. Start with the initial state of the problem.
  2. Consider all the options available at that specific moment.
  3. Choose the option that seems best at that moment, regardless of future consequences. This is the “greedy” part - you take the best option available now, even if it might not be the best in the long run.
  4. Move to the new state based on your chosen option. This becomes your new starting point for the next iteration.
  5. Repeat steps 2-4 until you reach the goal state or no further progress is possible.

Example:

Let’s say you have a set of coins with values [1, 2, 5, 10] and you need to give minimum number of coin to someone change for 39.

The greedy algorithm for making change would work as follows:

  1. Step-1: Start with the largest coin value that is less than or equal to the amount to be changed. In this case, the largest coin less than or equal to 39 is 10.
  2. Step- 2: Subtract the largest coin value from the amount to be changed, and add the coin to the solution. In this case, subtracting 10 from 39 gives 29, and we add one 10-coin to the solution.
  3. Repeat steps 1 and 2 until the amount to be changed becomes 0.

Below is the illustration of above example:

  1. Step 1: We need to make change for 39 using couns of denominations {1, 2, 5, 10} using the fewest number of coins.

  2. Step 2: Start with the highest denomination (10). Since greedy algorithms works by taking the largest coin possible fist, we start with the coin of denomination 10.

  3. Step 3: Now take the next highest denimonation (5).

  4. Now thake the next highest denomination (2)

  5. Step 5: Successfully make the target sum (39) using a total of 6 coins:

    \[3 \times 10 + 1 \times 5 + 2 \times 2 = 30\]

Code demo:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
# Python Program to find the minimum number of coins
# to construct a given amount using greedy approach

def minCoins(coins, amount):
    n = len(coins)
    coins.sort()
    res = 0

    # Start from the coin with highest denomination
    for i in range(n - 1, -1, -1):
        if amount >= coins[i]:
          
            # Find the maximum number of ith coin we can use
            cnt = amount // coins[i]

            # Add the count to result
            res += cnt

            # Subtract the corresponding amount from the total amount
            amount -= cnt * coins[i]

        # Break if there is no amount left
        if amount == 0:
            break
    return res

if __name__ == "__main__":
    coins = [5, 2, 10, 1]
    amount = 39

    print(minCoins(coins, amount))

II. Greedy Algorithms General Structure

A greedy algorithm solves problems by making the best choice at each step. Instead of looking at all possible solutions, it focuses on the option that seems best right now.

Problem structure:

Most of the problems where greedy algorithms work follow these two properties:

  1. Greedy Choice Property:- This property states that choosing the best possible option at each step will lead to the best overall solution. If this is not true, a greedy approach may not work.
  2. Optimal Substructure:- This means that you can break the problem down into smaller parts, and solving these smaller parts by making greedy choices helps solve the overall problem.

Some algorithms

Sorting

Using Priority Queue or Heaps

III. Standard Greedy Algorithms

1. Huffman Coding

Huffman coding is a lossless data compression algorithm. The idea is to assign variable-length codes to input characters, lengths of the assigned codes are based on the frequencies of corresponding characters.

The variable-length codes assigned to input characters are Prefix Codes, means the codes (bit sequences) are assigned in such a way that the code assigned to one character is not the prefix of code assigned to any other character. This is how Huffman Coding makes sure that there is no ambiguity when decoding the generated bitstream. Let us understand prefix codes with a counter example. Let there be four characters a, b, c and d, and their corresponding variable length codes be 00, 01, 0 and 1. This coding leads to ambiguity because code assigned to c is the prefix of codes assigned to a and b. If the compressed bit stream is 0001, the de-compressed output may be “cccd” or “ccb” or “acd” or “ab”.

There are mainly two major parts in Huffman Coding

  1. Build a Huffman Tree from input characters.
  2. Traverse the Huffman Tree and assign codes to characters.

Steps to build Huffman Tree This algorithm builds a tree in bottom up manner using a priority queue (or heap). Input is an array of unique characters along with their frequency of occurrences and output is Huffman Tree.

  1. Step 1: Create a leaf node for each unique character and build a min heap of all leaf nodes (Min Heap is used as a priority queue. The value of frequency field is used to compare two nodes in min heap. Initially, the least frequent character is at root)
  2. Step 2: Extract two nodes with the minimum frequency from the min heap.
  3. Step 3: Create a new internal node with a frequency equal to the sum of the two nodes frequencies. Make the first extracted node as its left child and the other extracted node as its right child. Add this node to the min heap.
  4. Step 4: Repeat Steps#2 and #3 until the heap contains only one node. The remaining node is the root node and the tree is complete.

Let us understand the algorithm with an example:

1
2
3
4
5
6
7
character   Frequency
    a            5
    b           9
    c           12
    d           13
    e           16
    f           45

Now min heap contains 5 nodes where 4 nodes are roots of trees with single element each, and one heap node is root of tree with 3 elements

1
2
3
4
5
6
    character       Frequency
       c               12
       d               13
 Internal Node         14
       e               16
       f               45

Now min heap contains 4 nodes where 2 nodes are roots of trees with single element each, and two heap nodes are root of tree with more than one nodes

1
2
3
4
5
character           Frequency
Internal Node           14
        e               16
Internal Node           25
        f               45

Now min heap contains 3 nodes.

1
2
3
4
character          Frequency
Internal Node         25
Internal Node         30
      f               45 

Now min heap contains 2 nodes.

1
2
3
character     Frequency
       f         45
Internal Node    55

Now min heap contains only one node.

1
2
character      Frequency
Internal Node    100

Steps to print codes from Huffman Tree:

Traverse the tree formed starting from the root. Maintain an auxiliary array. While moving to the left child, write 0 to the array. While moving to the right child, write 1 to the array. Print the array when a leaf node is encountered.

The codes are as follows:

1
2
3
4
5
6
7
character   code-word
    f          0
    c          100
    d          101
    a          1100
    b          1101
    e          111

Code demo:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
# Python program for the above approach:
import heapq

# Class to represent huffman tree 
class Node:
	def __init__(self, x):
		self.data = x
		self.left = None
		self.right = None

	def __lt__(self, other):
		return self.data < other.data

# Function to traverse tree in preorder 
# manner and push the huffman representation 
# of each character.
def preOrder(root, ans, curr):
	if root is None:
		return

	# Leaf node represents a character.
	if root.left is None and root.right is None:
		ans.append(curr)
		return

	preOrder(root.left, ans, curr + '0')
	preOrder(root.right, ans, curr + '1')

def huffmanCodes(s, freq):
	# Code here
	n = len(s)

	# Min heap for node class.
	pq = []
	for i in range(n):
		tmp = Node(freq[i])
		heapq.heappush(pq, tmp)

	# Construct huffman tree.
	while len(pq) >= 2:
		# Left node 
		l = heapq.heappop(pq)

		# Right node 
		r = heapq.heappop(pq)

		newNode = Node(l.data + r.data)
		newNode.left = l
		newNode.right = r

		heapq.heappush(pq, newNode)

	root = heapq.heappop(pq)
	ans = []
	preOrder(root, ans, "")
	return ans

if __name__ == "__main__":
	s = "abcdef"
	freq = [5, 9, 12, 13, 16, 45]
	ans = huffmanCodes(s, freq)
	for code in ans:
		print(code, end=" ")

2. Huffman Decoding

Examples:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
Input Data: ShandyDemo
Frequencies: S: 1, h: 1, a: 1, n: 1, d: 1, y: 1, D: 1, e: 1, m: 1, o: 1
Encoded Data: 0000010100111001011100110111101111

Huffman Tree: '#' is the special character used for internal nodes as character field is not needed for internal nodes. 

                    #(10)
                  /       \
              #(4)         #(6)
             /     \       /     \
          #(2)     #(2)  #(2)    #(4)
         /    \    /   \  /  \   /   \
        S(1) h(1) a(1) n(1) d(1) y(1) D(1)  #(2)
                                                 / \
                                              e(1) #(2)
                                                   / \
                                                m(1) o(1)

Code of 'S' is '000', code of 'h' is '001', ..
Decoded Data: ShandyDemo

Follow the below steps to solve the problem:

Note: To decode the encoded data we require the Huffman tree. We iterate through the binary encoded data. To find character corresponding to current bits, we use the following simple steps:

The below code takes a string as input, encodes it, and saves it in a variable encoded string. Then it decodes it and prints the original string.

Below is the implementation of the above approach:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
import heapq
from collections import defaultdict

# to map each character its huffman value
codes = {}

# To store the frequency of character of the input data
freq = defaultdict(int)

# A Huffman tree node
class MinHeapNode:
    def __init__(self, data, freq):
        self.left = None
        self.right = None
        self.data = data
        self.freq = freq

    def __lt__(self, other):
        return self.freq < other.freq

# utility function to print characters along with
# there huffman value
def printCodes(root, str):
    if root is None:
        return
    if root.data != '$':
        print(root.data, ":", str)
    printCodes(root.left, str + "0")
    printCodes(root.right, str + "1")

# utility function to store characters along with
# there huffman value in a hash table
def storeCodes(root, str):
    if root is None:
        return
    if root.data != '$':
        codes[root.data] = str
    storeCodes(root.left, str + "0")
    storeCodes(root.right, str + "1")

# function to build the Huffman tree and store it
# in minHeap
def HuffmanCodes(size):
    global minHeap
    for key in freq:
        minHeap.append(MinHeapNode(key, freq[key]))
    heapq.heapify(minHeap)
    while len(minHeap) != 1:
        left = heapq.heappop(minHeap)
        right = heapq.heappop(minHeap)
        top = MinHeapNode('$', left.freq + right.freq)
        top.left = left
        top.right = right
        heapq.heappush(minHeap, top)
    storeCodes(minHeap[0], "")

# utility function to store map each character with its
# frequency in input string
def calcFreq(str, n):
    for i in range(n):
        freq[str[i]] += 1

# function iterates through the encoded string s
# if s[i]=='1' then move to node->right
# if s[i]=='0' then move to node->left
# if leaf node append the node->data to our output string
def decode_file(root, s):
    ans = ""
    curr = root
    n = len(s)
    for i in range(n):
        if s[i] == '0':
            curr = curr.left
        else:
            curr = curr.right

        # reached leaf node
        if curr.left is None and curr.right is None:
            ans += curr.data
            curr = root
    return ans + '\0'

# Driver code
if __name__ == "__main__":
    minHeap = []
    str = "ShandyDemo"
    encodedString, decodedString = "", ""
    calcFreq(str, len(str))
    HuffmanCodes(len(str))
    print("Character With there Frequencies:")
    for key in sorted(codes):
        print(key, codes[key])

    for i in str:
        encodedString += codes[i]

    print("\nEncoded Huffman data:")
    print(encodedString)

    # Function call
    decodedString = decode_file(minHeap[0], encodedString)
    print("\nDecoded Huffman Data:")
    print(decodedString)

Easy Problems

1. Fractional Knapsack

Given two arrays, val[] and wt[], representing the values and weights of item respectively, and an integer capacity representing the maximum weight a knapsack can hold, we have to determine the maximum total value that can be achieved by putting the items in the knapsack without exceeding its capacity. Items can also be taken in fractional parts if required.

Examples:

1
2
3
4
5
6
7
Input: val[] = [60, 100, 120], wt[] = [10, 20, 30], capacity = 50
Output: 240 
Explanation: We will take the items of weight 10kg and 20kg and 2/3 fraction of 30kg. 
Hence total value will be 60 + 100 + (2/3) * 120 = 240.

Input: val[] = [500], wt[] = [30], capacity = 10
Output: 166.667

Case 1: Picking the items with smaller weights first Example: val[] = [10, 10, 10, 100], wt[] = [10, 10, 10, 30], capacity = 30

So, choosing smaller weights first fails here.

Case 2: Picking items with larger value first Example: val[]= [10, 10, 10, 20], wt[] = [10, 10, 10, 30], capacity = 30

So, choosing higher values first also fails.


Steps to solve the problem:

  1. Calculate the ratio (value/weight) for each item.
  2. Sort all the items in decreasing order of the ratio.
  3. Iterate through items:
  4. Stop once the capacity becomes zero.

Code demo

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
def compare(a, b):
    a1 = (1.0 * a[0]) / a[1]
    b1 = (1.0 * b[0]) / b[1]
    return b1 - a1

def fractionalKnapsack(val, wt, capacity):
    n = len(val)

    # Create list to store value and weight
    # items[i][0] = value, items[i][1] = weight
    items = [[val[i], wt[i]] for i in range(n)]

    # Sort items based on value-to-weight ratio in descending order
    items.sort(key=lambda x: x[0]/x[1], reverse=True)

    res = 0.0
    currentCapacity = capacity

    # Process items in sorted order
    for i in range(n):

        # If we can take the entire item
        if items[i][1] <= currentCapacity:
            res += items[i][0]
            currentCapacity -= items[i][1]

        # Otherwise take a fraction of the item
        else:
            res += (1.0 * items[i][0] / items[i][1]) * currentCapacity

            # Knapsack is full
            break

    return res

if __name__ == "__main__":
    val = [60, 100, 120]
    wt = [10, 20, 30]
    capacity = 50

    print(fractionalKnapsack(val, wt, capacity))

2. Minimum cost to make array size 1 by removing larger of pairs

Given an array of n integers. We need to reduce size of array to one. We are allowed to select a pair of integers and remove the larger one of these two. This decreases the array size by 1. Cost of this operation is equal to value of smallest one. Find out minimum sum of costs of operations needed to convert the array into a single element.

**Examples: **

1
2
3
4
5
6
7
Input: arr[]= [4 ,3 ,2 ]
Output: 4
Explanation: Choose (4, 2) so 4 is removed, new array = {2, 3}. Now choose (2, 3) so 3 is removed.  So total cost = 2 + 2 = 4.

Input: arr[]=[ 3, 4 ]
Output: 3
Explanation: choose (3, 4) so cost is 3. 

The idea is to always pick the minimum value as part of the pair and remove larger value. This minimizes cost of reducing array to size 1.


Code demo:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
# Python program to find minimum 
# cost to reduce array size to 1

# function to calculate the 
# minimum cost
def cost(a, n):

    # Minimum cost is n-1 multiplied
    # with minimum element.
    return ( (n - 1) * min(a) )


# driver code
a = [ 4, 3, 2 ]
n = len(a)
print(cost(a, n))

# This code is contributed by
# Smitha Dinesh Semwal

3. Minimum rotations to unlock a circular lock

You are given a lock which is made up of n-different circular rings and each ring has 0-9 digit printed serially on it. Initially all n-rings together show a n-digit integer but there is particular code only which can open the lock. You can rotate each ring any number of time in either direction. You have to find the minimum number of rotation done on rings of lock to open the lock.

Examples:

1
2
3
4
5
6
7
8
9
10
11
12
13
Input : Input = 2345, Unlock code = 5432 
Output : Rotations required = 8
Explanation : 1st ring is rotated thrice as 2->3->4->5
              2nd ring is rotated once as 3->4
              3rd ring is rotated once as 4->3
              4th ring is rotated thrice as 5->4->3->2

Input : Input = 1919, Unlock code = 0000 
Output : Rotations required = 4
Explanation : 1st ring is rotated once as 1->0
              2nd ring is rotated once as 9->0
              3rd ring is rotated once as 1->0
              4th ring is rotated once as 9->0 

For a single ring we can rotate it in any of two direction forward or backward as:

But we are concerned with minimum number of rotation required so we should choose min (abs(a-b), 10-abs(a-b)) as a-b denotes the number of forward rotation and 10-abs(a-b) denotes the number of backward rotation for a ring to rotate from a to b. Further we have to find minimum number for each ring that is for each digit. So starting from right most digit we can easily the find minimum number of rotation required for each ring and end up at left most digit.


Code demo:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
# Python3 program for min rotation to unlock

# function for min rotation
def minRotation(input, unlock_code):

    rotation = 0;

    # iterate till input and unlock 
    # code become 0
    while (input > 0 or unlock_code > 0):

        # input and unlock last digit 
        # as reminder
        input_digit = input % 10;
        code_digit = unlock_code % 10;

        # find min rotation
        rotation += min(abs(input_digit - code_digit), 
                    10 - abs(input_digit - code_digit));

        # update code and input
        input = int(input / 10);
        unlock_code = int(unlock_code / 10);

    return rotation;

# Driver Code
input = 28756;
unlock_code = 98234;
print("Minimum Rotation =",
       minRotation(input, unlock_code));