自定义有容量限制的性能优良的PriorityQueue

时间：2017-02-04 19:02:10 阅读：410 评论：0 收藏：0 [点我收藏+]

JDK中有自带的PriorityQueue,但是没有容量限制，性能比较差。在特定的场合，比如solr自带的搜索智能提示公能，当构建完三叉树，前缀匹配查找出所有的节点之后，就要进行排序。如果业务场景要求只提示前五个，按优先级排序。假设查找出了很多个，这个问题可以抽象为：在海量数据中找出topK的算法。可以自定义一个优先级队列，同时要兼顾时间和空间复杂度。关于这个问题，前面的博客已经分析过了。今天，要定义一个数据结构，区别于JDK的priorityQueue，最坏的时间复杂度：(n-k)*lgk +lg(k-1)!今天下午，试着写了这个数据结构，调试花了很长时间，由于写代码时马虎，经常写错，或者由于某些地方考虑不周全。以后写代码时，一定要注意，保证思维缜密，尽量一次性搞定，不用调试。

package chinese.utility.utils;
/**
* 自定义有容量的优先级队列：构造容量为K(k值很小)的小根堆，每次取值的时候，从堆顶取，然后把最后一个元素置顶，调整小根堆，默认按升序取值。
* @author XueQiang Tong
* @version 1.0
* @since JDK 1.8
* @date 2017/02/04
*/
public abstract class PriorityQueue<T> {
   protected int size;
   protected int heapSize;
   protected int maxSize;
   protected T heap[];

   public PriorityQueue(){
       this.size = 0;
       heapSize = 5;
       this.heapSize = heapSize;
       Object h = new Object[heapSize];
       heap = (T[])h;
   }

   //constructor
   public PriorityQueue(int maxSize){
       this.size = 0;
       if(maxSize == 0){
           heapSize = 5;
       } else{
           if(maxSize >= (1 << 31) - 1){
               throw new IllegalArgumentException("max size must be <= (1 << 32) -1;got:" + heapSize);
           }

           this.heapSize = maxSize;
           Object h = new Object[heapSize];
           heap = (T[])h;
       }
   }
   /**
   * 比较节点属性值大小的方法
   * @param node
   * @param element
   * @return
   */
   public abstract boolean lessThan(T node,T element);

   /**
   * 向队列中添加元素，注意判断容量，如果容量已超，并且添加的元素属性值比堆顶的大，替换之，然后重新调整小根堆
   * @param element
   * @return
   */
   public T insertWithOverFlow(T element){
       if(element == null){
           return null;
       }
       if(this.size < heapSize){
           add(element);
           return element;
       } else {
           if(this.size > 0 && lessThan(heap[0],element)){
               Object o = heap[0];
               heap[0] = element;
               minify(0);
               return (T)o;
           }
       }
       return element;
   }

   /**
   * 调整小根堆
   * @param i
   */
   private void minify(int i) {
       int left = 2 * i + 1;
       int right = 2 * i + 2;
       int min;

       if(left <= this.size-1 && lessThan(heap[left],heap[i])){
           min = left;
       } else min = i;

       if(right <= this.size-1 && lessThan(heap[right],heap[min])){
           min = right;
       }

       if(min == i || min > size) return;
       swap(this.heap,i,min);
       minify(min);
   }
   /**
   * 交换对象
   * @param heap2
   * @param i
   * @param min
   */
   private void swap(T[] heap, int i, int min) {
       T tmp;
       tmp = heap[i];
       heap[i] = heap[min];
       heap[min] = tmp;
   }

   /**
   * 向队列中添加元素
   * @param element
   */
   private void add(T element) {
       this.heap[this.size++] = element;
       //从底部开始调整
       for(int i = this.size/2 - 1;i >= 0;--i){
           minify(i);
       }
   }

   /**
   * 返回堆顶的元素
   * @return
   */
   public T top(){
       return heap[0];
   }

   /**
   * 取堆顶的元素，然后把堆底部的元素置顶，重新调整小根堆
   * @return
   */
   public T pop(){
       if(this.size == 0){
           return null;
       }
       T result = this.heap[0];
       this.heap[0] = this.heap[this.size-1];
       this.heap[this.size-1] = null;
       this.size --;
       minify(0);
       return result;
   }
}

//自定义队列

package chinese.utility.utils;

public class MyPriorityQueue extends PriorityQueue0<Integer> {

   public MyPriorityQueue(){
       super();
   }

   public MyPriorityQueue(int heapSize){
       super(heapSize);
   }

   @Override
   public boolean lessThan(Integer node,Integer element) {
       return node < element;
   }
}
测试类：

package chinese.utility.utils;

import org.junit.Test;

public class PriorityQueueTest {
   PriorityQueue0<Integer> q;


   @Test
   public void test() {
       q = new MyPriorityQueue(5);
       q.insertWithOverFlow(99);
       q.insertWithOverFlow(78);
       q.insertWithOverFlow(109);
       q.insertWithOverFlow(123);
       q.insertWithOverFlow(23);
       q.insertWithOverFlow(45);
       q.insertWithOverFlow(56);
       Integer element = (Integer) q.pop();
       while(element != null){
           System.out.println(element);
           element = (Integer) q.pop();
       }
   }

}

运行结果：

56
78
99
109
123
输出前k个最大值，按升序排列。如果想按降序排列，可以构建大根堆，或者把结果加入到stack中都可以。传递的泛型对象，可以是自定义的节点对象，对象属性可以是词频，或者更加复杂的对象。

在进行算法研发的过程中，往往需要理论验证算法的正确性，同时还要保证算法是最优的。比较复杂的算法，比如机器学习算法，理论证明和推导更重要。最后的阶段是转化为代码。代码实施时，第一步就是设计自己的数据结构，加强数据结构的研究，可以让一个程序员产生质的飞跃！

佟氏出品，必属精品！在算法方面，兼顾理论和编程，立志成为业内高手！

自定义有容量限制的性能优良的PriorityQueue

原文：http://www.cnblogs.com/txq157/p/6365916.html

踩

(0)

评论一句话评论（0）

分享档案

更多>

2021年09月23日 (328)
2021年09月24日 (313)
2021年09月17日 (191)
2021年09月15日 (369)
2021年09月16日 (411)
2021年09月13日 (439)
2021年09月11日 (398)
2021年09月12日 (393)
2021年09月10日 (160)
2021年09月08日 (222)