k largest (or smallest) elements in an array | added Min Heap method
Asked by geek4u
Question: Write an efficient program for printing the k largest elements in an array. The elements in the array can be in any order.
For example, if the given array is [1, 23, 12, 9, 30, 2, 50] and you are asked for the largest 3 elements, i.e., k = 3, then your program should print 50, 30 and 23.
Method 1 (Use Bubble k times)
Thanks to Shailendra for suggesting this approach.
1) Modify Bubble Sort to run the outer loop at most k times.
2) Print the last k elements of the array obtained in step 1.
Time Complexity: O(nk)
Like Bubble sort, other sorting algorithms like Selection Sort can also be modified to get the k largest elements.
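The two steps above can be sketched in C as follows. This is only an illustrative sketch: the function name bubble_k_largest and its signature are assumptions, not part of the original post.

```c
#include <assert.h>
#include <stddef.h>

/* Run at most k passes of bubble sort. Each pass moves the largest
   remaining element to the end of the unsorted prefix, so afterwards
   arr[n-k..n-1] holds the k largest elements in ascending order.
   (bubble_k_largest is a hypothetical name used for illustration.) */
void bubble_k_largest(int arr[], size_t n, size_t k)
{
    assert(k <= n);
    for (size_t pass = 0; pass < k; pass++) {
        /* Indices >= n - pass are already in their final position. */
        for (size_t i = 0; i + 1 < n - pass; i++) {
            if (arr[i] > arr[i + 1]) {
                int tmp = arr[i];
                arr[i] = arr[i + 1];
                arr[i + 1] = tmp;
            }
        }
    }
}
```

After the call, printing arr[n-k] to arr[n-1] gives the k largest elements; the rest of the array is left only partially sorted.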
Method 2 (Use temporary array)
K largest elements from arr[0..n-1]
1) Store the first k elements in a temporary array temp[0..k-1].
2) Find the smallest element in temp[], let the smallest element be min.
3) For each element x in arr[k] to arr[n-1]
If x is greater than the min then remove min from temp[] and insert x.
4) Print final k elements of temp[]
Time Complexity: O((n-k)*k). If we want the output sorted then O((n-k)*k + klogk)
Thanks to nesamani1822 for suggesting this method. Please see this comment for example.
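A minimal C sketch of the steps above, assuming the caller supplies a temp[] buffer of size k. The names index_of_min and k_largest_temp are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the smallest element in temp[0..k-1]; O(k). */
static size_t index_of_min(const int temp[], size_t k)
{
    size_t m = 0;
    for (size_t i = 1; i < k; i++)
        if (temp[i] < temp[m])
            m = i;
    return m;
}

/* Fill temp[0..k-1] with the k largest elements of arr[0..n-1].
   The result is not sorted. (Hypothetical helper for illustration.) */
void k_largest_temp(const int arr[], size_t n, int temp[], size_t k)
{
    assert(k >= 1 && k <= n);
    /* Step 1: copy the first k elements. */
    for (size_t i = 0; i < k; i++)
        temp[i] = arr[i];
    /* Step 2: locate the current minimum of temp[]. */
    size_t m = index_of_min(temp, k);
    /* Step 3: replace the minimum whenever a larger element shows up. */
    for (size_t i = k; i < n; i++) {
        if (arr[i] > temp[m]) {
            temp[m] = arr[i];
            m = index_of_min(temp, k);
        }
    }
}
```

The minimum is rescanned only after a replacement, which is where the O((n-k)*k) bound comes from.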
Method 3 (Use Sorting)
1) Sort the elements in descending order in O(nLogn)
2) Print the first k numbers of the sorted array O(k).
Time complexity: O(nlogn)
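One way to sketch this in C is with the standard library's qsort and a descending comparator. The names cmp_desc and k_largest_sorted are assumptions for this example. Note that the C standard does not promise any particular complexity for qsort, so the O(nLogn) bound holds only if the library's implementation provides it.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparator for qsort: orders ints from largest to smallest. */
static int cmp_desc(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x < y) - (x > y);
}

/* Sort arr[0..n-1] in descending order, then copy the first k
   elements into out[0..k-1]. (Hypothetical helper for illustration.) */
void k_largest_sorted(int arr[], size_t n, int out[], size_t k)
{
    assert(k <= n);
    qsort(arr, n, sizeof arr[0], cmp_desc);
    for (size_t i = 0; i < k; i++)
        out[i] = arr[i];
}
```

The comparator avoids the common `return y - x;` shortcut, which can overflow for large values.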
Method 4 (Use Max Heap)
1) Build a Max Heap tree in O(n)
2) Use Extract Max k times to get k maximum elements from the Max Heap O(klogn)
Time complexity: O(n + klogn)
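The following C sketch shows one possible array-based version of these two steps. The helper names (swap_int, sift_down_max, k_largest_max_heap) are illustrative assumptions, and the input array is reordered and partly overwritten in the process.

```c
#include <assert.h>
#include <stddef.h>

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Restore the max-heap property for the subtree rooted at index i
   of heap[0..n-1]. */
static void sift_down_max(int heap[], size_t n, size_t i)
{
    for (;;) {
        size_t largest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < n && heap[l] > heap[largest]) largest = l;
        if (r < n && heap[r] > heap[largest]) largest = r;
        if (largest == i) return;
        swap_int(&heap[i], &heap[largest]);
        i = largest;
    }
}

/* Turn arr into a max heap, then extract the maximum k times.
   out[0..k-1] receives the k largest elements in descending order. */
void k_largest_max_heap(int arr[], size_t n, int out[], size_t k)
{
    assert(k <= n);
    /* Step 1: bottom-up heap construction, O(n). */
    for (size_t i = n / 2; i-- > 0; )
        sift_down_max(arr, n, i);
    /* Step 2: extract-max k times, O(k log n). */
    for (size_t j = 0; j < k; j++) {
        size_t last = n - 1 - j;      /* last index of the current heap */
        out[j] = arr[0];
        arr[0] = arr[last];           /* move the last leaf to the root */
        sift_down_max(arr, last, 0);  /* heap shrinks by one element */
    }
}
```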
Method 5 (Use Order Statistics)
1) Use an order statistic algorithm to find the kth largest element. Please see the topic selection in worst-case linear time. O(n)
2) Use the QuickSort Partition algorithm to partition around the kth largest number. O(n)
3) Sort the k-1 elements (elements greater than the kth largest element) O(kLogk). This step is needed only if sorted output is required.
Time complexity: O(n) if we don’t need the sorted output, otherwise O(n+kLogk)
Thanks to Shilpi for suggesting the first two approaches.
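As a rough illustration of steps 1 and 2, here is a C sketch that swaps in plain quickselect (Lomuto partition with the last element as pivot) instead of the worst-case linear selection algorithm mentioned above. Quickselect is linear only on average and can degrade to O(n^2); the function names are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Lomuto partition of arr[lo..hi] around the pivot arr[hi].
   Returns the pivot's final index; smaller elements end up on its left. */
static size_t partition(int arr[], size_t lo, size_t hi)
{
    int pivot = arr[hi];
    size_t store = lo;
    for (size_t j = lo; j < hi; j++) {
        if (arr[j] < pivot) {
            swap_int(&arr[store], &arr[j]);
            store++;
        }
    }
    swap_int(&arr[store], &arr[hi]);
    return store;
}

/* Rearrange arr so that arr[n-k] is the kth largest element and
   arr[n-k..n-1] holds the k largest elements in no particular order. */
void select_k_largest(int arr[], size_t n, size_t k)
{
    assert(k >= 1 && k <= n);
    size_t target = n - k;          /* final index of the kth largest */
    size_t lo = 0, hi = n - 1;
    while (lo < hi) {
        size_t p = partition(arr, lo, hi);
        if (p == target) return;
        if (p < target) lo = p + 1; /* target lies right of the pivot */
        else hi = p - 1;            /* target lies left of the pivot */
    }
}
```

If sorted output is required, the last k elements can be sorted afterwards, which corresponds to step 3.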
Method 6 (Use Min Heap)
This method is mainly an optimization of method 2. Instead of using the temp[] array, use a Min Heap.
Thanks to geek4u for suggesting this method.
1) Build a Min Heap MH of the first k elements (arr[0] to arr[k-1]) of the given array. O(k)
2) For each element, after the kth element (arr[k] to arr[n-1]), compare it with root of MH.
a) If the element is greater than the root then make it root and call heapify for MH
b) Else ignore it.
Step 2 takes O((n-k)*logk) time.
3) Finally, MH has k largest elements and root of the MH is the kth largest element.
Time Complexity: O(k + (n-k)Logk) without sorted output. If sorted output is needed then O(k + (n-k)Logk + kLogk)
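The steps of Method 6 might look like this in C. It is a sketch under the assumption that the caller provides a heap[] buffer of size k; sift_down_min and k_largest_min_heap are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Restore the min-heap property for the subtree rooted at index i
   of heap[0..n-1]. */
static void sift_down_min(int heap[], size_t n, size_t i)
{
    for (;;) {
        size_t smallest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < n && heap[l] < heap[smallest]) smallest = l;
        if (r < n && heap[r] < heap[smallest]) smallest = r;
        if (smallest == i) return;
        int t = heap[i];
        heap[i] = heap[smallest];
        heap[smallest] = t;
        i = smallest;
    }
}

/* Fill heap[0..k-1] with the k largest elements of arr[0..n-1],
   arranged as a min heap, so heap[0] is the kth largest element. */
void k_largest_min_heap(const int arr[], size_t n, int heap[], size_t k)
{
    assert(k >= 1 && k <= n);
    /* Step 1: build a min heap from the first k elements, O(k). */
    for (size_t i = 0; i < k; i++)
        heap[i] = arr[i];
    for (size_t i = k / 2; i-- > 0; )
        sift_down_min(heap, k, i);
    /* Step 2: a larger element replaces the root, then heapify. */
    for (size_t i = k; i < n; i++) {
        if (arr[i] > heap[0]) {
            heap[0] = arr[i];
            sift_down_min(heap, k, 0);
        }
    }
}
```

Because only heap[] of size k has to stay in memory, the same loop also works when the input is read one element at a time.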
All of the above methods can also be used to find the kth largest (or smallest) element.
Please write comments if you find any of the above explanations/algorithms incorrect, or find better ways to solve the same problem.
References:
http://jonah.cs.elon.edu/sduvall2/courses/csc331/2006spring/Lectures/Order_statistics.ppt
http://en.wikipedia.org/wiki/Selection_algorithm
http://net.pku.edu.cn/~course/cs101/resource/Intro2Algorithm/book6/chap10.htm


Another algorithm, which takes O(n(1+k)) time:
- Scan through the original array and build a temp array, “temp”, of k elements, such that the temp elements are in ascending order. This temp array will hold our k largest elements.
- Let the input be Arr[n].
- Make an array temp[k] with k (here 3) elements:
temp[i] = 0 for i = 0 to 1
max = temp[2] = Arr[0] = 0
temp[] = {0, 0, Arr[0]};
// Compare each element of Arr with temp[k-1] and place the
// larger one in temp[k-1], maintaining temp in ascending order.
for (count = 0; count < n; count++)
{ // n comparisons
    if (temp[k-1] < Arr[count])
    {
        // Shift temp left by one slot, overwriting the lowest
        // element in temp and keeping the array in order.
        count2 = 0;
        while (count2 < k-1) // (k-1) shifts
        {
            temp[count2] = temp[count2+1];
            count2++;
        }
        // Assign Arr[count] to temp[k-1].
        temp[count2] = Arr[count];
    }
}
Let us take an example of the worst case:
Arr[] = {0,1,2,3,4,5,6,7};
and k = 3
1) temp = 0,0,0
On comparing, temp[2] < Arr[1],
so the new temp will be
temp = 0,0,1
2) Now temp[2] < Arr[2], so the
new temp = 0,1,2
...
When temp = 4,5,6
and temp[2] < Arr[7],
the new temp will be
temp = 5,6,7, which are the largest 3 numbers of Arr.
But this is the worst case; the average case would probably be a little better.
Please do let me know of any issue with the algo.
@ Method 5
I am not sure why we need to sort the elements (step 3) once we have already found the kth largest element. The question says the elements larger than the kth largest element can be in any order.
In my opinion, the running time should be O(n).
Please correct me if I am wrong.
@RK: Thanks for sharing your thoughts, we have added a note for this.
Link in method 5 broken.
@Mahesh: Thanks for pointing this out. We have fixed it.
What about forming a binary tree (as a preprocessing step) with each node storing the number of elements in its left subtree? This is the solution I gave in my interview.
You can use the Median of Medians method for this problem to reduce the time complexity to O(n). Median of Medians modifies the partition method of quick sort to find a “good” pivot.
Explained here
I think the method that you are suggesting and Method 5 in the above post are the same.
Not really. No partitioning is needed, unlike what is mentioned under method 5.
Just pick the values which are greater than or equal to the k-th largest element. Simple O(n) solution.
Hi,
I am interested in knowing how we would approach this if the array contains, say, a billion integers.
Obviously, we cannot put them all in memory and sort them, due to memory constraints.
Any suggestions?
You can use method 2 or method 6 because these methods do not require all the billion integers to be present in memory.
Among these two methods, method 6 is a better choice.
We could apply Bubble Sort for K times so the largest/smallest k elements will be sorted.
Thanks for suggesting a new method. We have added it to the original post.
@ankit: This is quite interesting. Please see http://en.wikipedia.org/wiki/Binary_heap#Building_a_heap for proof that heap can be built in O(n) time.
How can you build a max heap tree in O(n) time? It should be O(nlogn).
@nesamani1822: Thanks for suggesting a new approach. We have added it to the original post. Keep it up!!
@Sandeep:
Here is the explanation for your example.
[1, 23, 12, 9, 30, 2, 50, 3]
1)Array creation
int *max=calloc(N,sizeof(int));
in your case N is 3
2) Initialize the array with first N elements
max[0] = 1
max[1] = 23
max[2] = 12
3) compare from the index 3 to 7
i=3
max[0] = 9
max[1] = 23
max[2] = 12
i=4
max[0] = 30
max[1] = 23
max[2] = 12
i=5
No change (Not greater than any element)
max[0] = 30
max[1] = 23
max[2] = 12
i=6
max[0] = 30
max[1] = 23
max[2] = 50
i=7
No change (Not greater than any element)
max[0] = 30
max[1] = 23
max[2] = 50
@nesamani1822: Could you please explain the approach with the example below?
Find 3 largest elements of the array [1, 23, 12, 9, 30, 2, 50, 3]
I could easily understand first two steps, just have doubts about the third step.
We can also do it another way.
Here is how to do it.
1) Create an array of size N, the number of largest elements requested.
2) Initialize that array with the first N elements of the original array.
3) Then compare each of the remaining elements of the original array with the new array; if it is greater than the lowest element of the new array, replace that lowest element with it.
@Madhav: Method 1 talks about same. i.e., sort the elements and get the k largest elements.
@duke87: Could you please explain how the given code finds the k largest elements? Also, there seem to be typos in the lines below.
for(h=p;hn) /* What is hn ?*/
search(arr,p,q-1,n,o);
else /* ---> Else without if*/
search(arr,q+1,r,n-q,o);
Run quick sort thrice and you get the 3 largest/smallest elements.
This is the algorithm called quick select, which you describe in method 3.
#include <stdio.h>
int partation(int [], int, int);
void search(int[], int, int, int, int);
int main()
{
    int n;
    printf("enter the n for nth largest element\n");
    scanf("%d",&n);
    int arr[]={5,2,7,1,8,9,6};
    search(arr,0,6,n-1,n-1);
    //printf("%d ",partation(arr,0,6));
    getchar();
    getchar();
    return 0;
}
void search(int arr[],int p,int r, int n,int o)
{
    if(r>p)
    {
        int h;
        int q=partation(arr,p,r);
        for(h=p;hn)
            search(arr,p,q-1,n,o);
        else
            search(arr,q+1,r,n-q,o);
    }
}
}
int partation(int a[],int p,int r)
{
    int i,j;
    j=p;
    i=j-1;
    while(j<=r)
    {
        if(a[j]<a[r])
        {
            i++;
            int temp=a[j];
            a[j]=a[i];
            a[i]=temp;
        }
        j++;
    }
    int temp=a[i+1];
    a[i+1]=a[r];
    a[r]=temp;
    return i+1;
}