删除重复的行,保留最后一个,然后删除第一个

时间:2019-01-23 22:15:47

标签: excel vba

enter image description here

我正在尝试提供在D列中查找所有重复文本的代码,然后删除第一个重复文本所在的整行。行之间有空格,因此使用代码.End(xl)Up除非您能够将整个列作为目标,而不管两者之间是否有空格,否则它不会起作用  行不知何故。

到目前为止,我已经尝试了两种方法,但是都没有达到我的期望。

这是我的第一种方法,因为工作表有轮廓,所以该方法无效:

Sub test()

ActiveSheet.Range("D:D").RemoveDuplicates Columns:=1, header:=xlNo

End Sub

这是我从另一个运行了几分钟的站点获得的第二种方法,但似乎并没有达到我想要达到的目的。

Sub Row_Dupe_Killer_Keep_Last()
Dim lrow As Long

For lrow = Cells(Rows.Count, "D").End(xlUp).Row To 2 Step -1
    If Cells(lrow, "D") = Cells(lrow, "D").Offset(-1, 0) Then
       Cells(lrow, "D").Offset(-1, 0).EntireRow.Delete
    End If

Next lrow
End Sub

有人有任何建议或提示吗?我已经用有限的技能从事这项工作了几天,还没办法找到一种方法...在此先感谢您的时间。

3 个答案:

答案 0 :(得分:2)

编辑:现在忽略空格

编辑:修改后可以更改起始行

您可能想要做的是将数据拉入数组并在数组中搜索重复项。 Excel可以比处理每个单元格快得多的速度来处理数组。

以下代码就是这样做的。它将不理会D1(例如在您的示例代码中),并将删除所有重复项的整个行,仅保留最后一项。

要处理删除行,我们将所有重复项添加到名为rngDelete的范围对象中,并一次删除所有行。这将使它的运行比逐个删除的速度快得多。

Sub Row_Dupe_Killer_Keep_Last()
    Dim vData As Variant
    Dim rngDelete As Range
    Dim lrow As Long, lrowSearch As Long
    Dim lStartRow as long

    'Change this to the row you wish to start with (the top row)
    lStartRow = 22

    'Get all of the data from the cells into a variant array
    'Normally I would prefer to use usedrange, but this method is fine
    '(Note: Change the 2 to 1 if you want to include the entire column including Row number 1)
    vData = Range(Cells(lStartRow, "D").Address & ":" & Cells(Rows.Count, "D").End(xlUp).Address)

    'Search for duplicates
    'First, loop through backwards one by one
    For lrow = UBound(vData) To LBound(vData) Step -1
        'now loop through forwards (up to the point where we have already looked)
        For lrowSearch = LBound(vData) To lrow
            'Check if we have a duplicate
            If Not IsError(vData(lrow, 1)) And Not IsError(vData(lrowSearch, 1)) Then
                If lrow <> lrowSearch And vData(lrow, 1) = vData(lrowSearch, 1) And vData(lrow, 1) <> "" Then
                    'We have a duplicate! Let's add it to our "list to delete"
                    If rngDelete Is Nothing Then
                        'if rngDelete isn't set yet...
                        Set rngDelete = Range("D" & lrowSearch + lStartRow-1)
                    Else
                        'if we are adding to rngDelete...
                        Set rngDelete = Union(rngDelete, Range("D" & lrowSearch + lStartRow-1))
                    End If
                End If
            End If
        Next lrowSearch
    Next lrow

    'Delete all of the duplicate rows
    If Not rngDelete Is Nothing Then
        rngDelete.EntireRow.Delete
    End If
End Sub

答案 1 :(得分:1)

这应该可以帮助您。

    Option Explicit

    Const c_intMaxBlanks As Integer = 5
    Const c_AbsoluteMaxRowsInSheet As Integer = 5000

    Public Sub RunIt()
        Row_Dupe_Killer_Keep_Last ActiveSheet.Range("D:D")
    End Sub

    Public Sub Row_Dupe_Killer_Keep_Last(rngCells As Range)

        Dim iRow As Integer, iCol As Integer
        Dim intBlankCnt As Integer
        Dim intMaxBlanks As Integer
        Dim blnIsDone As Boolean
        Dim intSaveStartRow As Integer
        Dim blnStartCnt As Boolean
        Dim strTemp As String
        Dim strCheck As String
        Dim intI As Integer
        Dim intJ As Integer
        Dim intSaveEndRow As Integer


        'First, Count the consecutive blanks
        blnIsDone = False
        blnStartCnt = False
        intSaveStartRow = 0
        intSaveEndRow = 0
        intBlankCnt = 0
        iRow = 1
        iCol = rngCells.Column
        Do While (Not blnIsDone)
            'Check for blank Row using length of string
            If (Len(Trim(rngCells.Cells(iRow, 1).Value)) < 1) Then  
                If Not blnStartCnt Then
                    intSaveStartRow = iRow
                    blnStartCnt = True
                Else
                    If (intSaveStartRow + intBlankCnt) <> iRow Then
                        'restart
                        intSaveStartRow = iRow
                        intBlankCnt = 0
                    End If
                End If
                intBlankCnt = intBlankCnt + 1
            Else
                'restart
                blnStartCnt = False
                intBlankCnt = 0
            End If

            intSaveEndRow = iRow

            If intBlankCnt >= c_intMaxBlanks Then blnIsDone = True

            'Stop Loop: Maybe Infinite"
            If iRow > c_AbsoluteMaxRowsInSheet Then Exit Do
            iRow = iRow + 1
        Loop

        'Now, loop through each row in the column and check values.
        For intI = intSaveEndRow To 2 Step -1
            strTemp = LCase(Trim(rngCells.Cells(intI, 1).Value))
            For intJ = intSaveEndRow To 2 Step -1
                If intJ <> intI Then
                    strCheck = LCase(Trim(rngCells.Cells(intJ, 1).Value))
                    If strTemp = strCheck Then
                        'Found a dup, delete it
                        rngCells.Cells(intJ, 1).EntireRow.Delete
                    'ElseIf Len(strCheck) < 1 Then
                    '    'Delete the blank line
                    '    rngCells.Cells(intJ, 1).EntireRow.Delete
                    End If
                End If
            Next intJ
        Next intI

    End Sub

答案 2 :(得分:1)

此方法避免使用foo: bar: '{ "hey": "there''s cake" }' ,而众所周知,sed的速度很慢。清除内容,并对数据集进行排序以消除空白。

编辑:切换到“下一个”以启用从底部向上搜索;还清理了宏记录器生成的排序例程...当我需要它时,我似乎从未有过该例程:)。

注意:这也不会与大纲配合使用...无论您如何使其适用于其他答案,

我很好奇“清除/排序”方法是否对您有用,并且是否可以加快您的例程。

EntireRow.Delete