Methods for Finding and Deleting Duplicate Records

Date: 2015-03-19 23:44:52

1. Find the redundant duplicate records in a table, where duplicates are determined by a single column (peopleId):
select * from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)
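Before looking at the full rows, it can help to see how often each key repeats. A minimal sketch against the same people table (the alias occurrences is only illustrative):

```sql
-- One row per duplicated peopleId, with the number of rows that share it
select peopleId, count(*) as occurrences
from people
group by peopleId
having count(*) > 1
order by occurrences desc;
```

Note that this uses count(*), so rows whose peopleId is NULL are counted as one group, whereas count(peopleId) in the query above ignores NULLs.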

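2. Delete the redundant duplicate records from a table, where duplicates are determined by a single column (peopleId), keeping only the record with the smallest rowid in each group: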
delete from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)
and rowid not in (select min(rowid) from people group by peopleId having count(peopleId )>1)
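Since the row with the smallest rowid of every group is kept anyway, the first IN condition is not strictly needed. The following sketch previews the affected rows and then deletes them in a shorter form. It assumes peopleId is never NULL; with NULL keys the two forms differ, because group by puts all NULLs into one group while the IN condition never matches them.

```sql
-- Preview: every row that is not the lowest-rowid row of its peopleId group
select p.*
from people p
where p.rowid not in (select min(rowid) from people group by peopleId);

-- Shorter delete with the same effect under the assumption above
delete from people
where rowid not in (select min(rowid) from people group by peopleId);

-- Inspect the result, then finish the transaction with commit or rollback
```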

3. Find the redundant duplicate records in a table (multiple columns):
select * from vitae a
where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)

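4. Delete the redundant duplicate records from a table (multiple columns), keeping only the record with the smallest rowid in each group: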
delete from vitae a
where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId,seq having count(*)>1)
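The same cleanup can also be written as a correlated delete, which some readers find easier to follow. This is only an alternative sketch, not a replacement for the statement above; it removes every row whose rowid is greater than the smallest rowid among rows with the same (peopleId, seq).

```sql
delete from vitae a
where a.rowid > (
  select min(b.rowid)
  from vitae b
  where b.peopleId = a.peopleId
    and b.seq = a.seq
);
```

Rows with a NULL in peopleId or seq are left untouched by this form, because the equality comparisons never match them.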

5. Find the redundant duplicate records in a table (multiple columns), excluding the record with the smallest rowid in each group:
select * from vitae a
where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId,seq having count(*)>1)
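An analytic function gives another way to list the same surplus rows. The sketch below numbers the rows inside each (peopleId, seq) group by rowid and keeps everything after the first; the column alias rn is only illustrative.

```sql
select *
from (
  select v.*,
         row_number() over (partition by v.peopleId, v.seq
                            order by v.rowid) as rn
  from vitae v
)
where rn > 1;  -- rn = 1 is the row with the smallest rowid in its group
```

Unlike the IN-based query, partition by treats NULL key values as equal, so rows with NULL keys can show up here as duplicates of each other.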

First, look at the table myemp:

[Image: contents of the myemp table; not preserved]

Find the records that have duplicates:

[Image: query and result; not preserved]
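One way to write such a query, assuming duplicates in myemp are defined over (userid, username, salary) as in the statements later in this post:

```sql
-- All rows whose (userid, username, salary) combination occurs more than once
select *
from myemp e
where (e.userid, e.username, e.salary) in (
  select userid, username, salary
  from myemp
  group by userid, username, salary
  having count(*) > 1
);
```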

Find the records that have no duplicates:

[Image: query and result; not preserved]
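A possible query for this case, again assuming the duplicate key is (userid, username, salary):

```sql
-- Combinations that occur exactly once in myemp
select userid, username, salary
from myemp
group by userid, username, salary
having count(*) = 1;
```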

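List the records with duplicates removed (one row per distinct combination):

[Image: query and result; not preserved]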
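The simplest form is probably a distinct select. This sketch assumes that userid, username and salary are the columns that define a duplicate:

```sql
-- One row per distinct (userid, username, salary) combination
select distinct userid, username, salary
from myemp;
```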

Or:

select * from myemp e where rowid = (select max(rowid) from myemp e2 where e.userid = e2.userid and e.username = e2.username and e.salary = e2.salary)

How to delete duplicate data

1. When there is a large amount of duplicate data and the columns userid, username, salary are indexed:

delete myemp where rowid not in (select max(rowid) from myemp group by userid,username,salary);
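If the index does not exist yet, it could be created along these lines, and the number of rows the delete would remove can be checked first. The index name myemp_dup_idx is hypothetical.

```sql
-- Composite index on the columns that define a duplicate (name is illustrative)
create index myemp_dup_idx on myemp (userid, username, salary);

-- How many rows the delete above would remove
select count(*)
from myemp
where rowid not in (select max(rowid)
                    from myemp
                    group by userid, username, salary);
```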

2. Suitable when there are only a few duplicates (very inefficient when the table holds a lot of data):

delete myemp e where rowid <> (select max(rowid) from myemp e2 where e.userid = e2.userid and e.username = e2.username and e.salary = e2.salary);
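One caveat with the correlated form: if any of the three columns is NULL, the equality comparisons match nothing, the subquery returns NULL, and the row is never deleted. A NULL-tolerant variant might look like the sketch below, followed by a check that should return no rows once the cleanup has worked.

```sql
delete myemp e
where e.rowid <> (
  select max(e2.rowid)
  from myemp e2
  where (e.userid = e2.userid or (e.userid is null and e2.userid is null))
    and (e.username = e2.username or (e.username is null and e2.username is null))
    and (e.salary = e2.salary or (e.salary is null and e2.salary is null))
);

-- Verification: any rows returned here are duplicates that remain
select userid, username, salary, count(*)
from myemp
group by userid, username, salary
having count(*) > 1;
```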

3. The exceptions-table method, suitable when there is a large amount of duplicate data

First, create the exceptions table:

[Image: statement creating the exceptions table; not preserved]
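As a rough sketch of this step: Oracle ships a script, utlexcpt.sql (under rdbms/admin in the Oracle home), that creates a table named exceptions with the layout below. The constraint name myemp_uk is hypothetical, and the column widths follow older releases; newer ones use longer name columns.

```sql
-- Same layout as the table created by Oracle's utlexcpt.sql
create table exceptions (
  row_id     rowid,
  owner      varchar2(30),
  table_name varchar2(30),
  constraint varchar2(30)
);

-- Trying to add a unique constraint fails with ORA-02299 while duplicates
-- exist, but the rowids of all offending rows are written to exceptions
alter table myemp
  add constraint myemp_uk unique (userid, username, salary)
  exceptions into exceptions;
```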