Question

In R, I want to replace nested if/else statements with a switch statement. I want to assign values to a new column, and my idea is:

## Create a function to separate the cases

Range <- function(x)
    if (CityData_Group_Copy$BadDebtNum[x] < 26)  
              { CityData_Group_Copy$BadDebtRange[x] <- "1~25"}

    else if(CityData_Group_Copy$BadDebtNum[x] > 25 && CityData_Group_Copy$BadDebtNum[x] < 51)  
              {CityData_Group_Copy$BadDebtRange[x] <- "26~50"}

    else if(CityData_Group_Copy$BadDebtNum[x] > 51 && CityData_Group_Copy$BadDebtNum[x] < 76)   
              {CityData_Group_Copy$BadDebtRange[x] <- "51~75"}

    else if(CityData_Group_Copy$BadDebtNum[x] > 75 && CityData_Group_Copy$BadDebtNum[x] < 101)  
              {CityData_Group_Copy$BadDebtRange[x] <- "76~100"}

    else if(CityData_Group_Copy$BadDebtNum[x] > 100)
              { CityData_Group_Copy$BadDebtRange[x] <- "100+"}


## Assign the result to the new column "CityData_Group_Copy$BadDebtRange" 

for(i in 1: nrow(CityData_Group_Copy) ){
  Range(i)
}

I also tried this solution:

Range <- function(x)
 switch (true) {
  case (CityData_Group_Copy$BadDebtNum[x] < 26): CityData_Group_Copy$BadDebtRange[x] <- "1~25"  break;
  case (CityData_Group_Copy$BadDebtNum[x] > 25 && CityData_Group_Copy$BadDebtNum[x] < 51): CityData_Group_Copy$BadDebtRange[x] <- "26~50"  break;
  case (CityData_Group_Copy$BadDebtNum[x] > 51 && CityData_Group_Copy$BadDebtNum[x] < 76): CityData_Group_Copy$BadDebtRange[x] <- "51~75"  break;
  case (CityData_Group_Copy$BadDebtNum[x] > 75 && CityData_Group_Copy$BadDebtNum[x] < 101): CityData_Group_Copy$BadDebtRange[x] <- "76~100"  break;
  case (CityData_Group_Copy$BadDebtNum[x] > 100): CityData_Group_Copy$BadDebtRange[x] <- "100+" break;
  }

But it seems R has no such syntax. I got an error:

Error: unexpected 'break' in "case (CityData_Group_Copy$BadDebtNum[x] > 101): CityData_Group_Copy$BadDebtRange[x] <- "100+" break"

So is there a simple way to implement my idea?

Answer 1

It looks like you are binning your data, which can be done with the cut function:

bad_debt_num = sample(1:120, 100, replace=TRUE)
cut(bad_debt_num, breaks=c(0, 25, 50, 75, 100, 1000))

See the question "Generate bins from a data frame" for more information on binning.
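As a minimal sketch of how the cut call could be applied to the ranges in the question: the version below passes a labels argument so the bins carry the asker's names, and uses Inf as the top break instead of 1000. The toy data frame and its values are invented for illustration; only the column names come from the question.

```r
# Toy stand-in for the question's data frame (values are invented).
CityData_Group_Copy <- data.frame(
  BadDebtNum = c(1, 25, 26, 50, 51, 75, 76, 100, 101, 500)
)

# cut() uses right-closed intervals by default: (0,25], (25,50], ...
CityData_Group_Copy$BadDebtRange <- cut(
  CityData_Group_Copy$BadDebtNum,
  breaks = c(0, 25, 50, 75, 100, Inf),
  labels = c("1~25", "26~50", "51~75", "76~100", "100+")
)

print(CityData_Group_Copy)
```

The new column is a factor; wrap the call in as.character() if plain strings are needed. With these breaks, values of 0 or below fall outside every bin and become NA.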

R's switch statement is rather limited.
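To illustrate the limitation: switch() dispatches on a single integer or string, not on conditions, so a range test has to be turned into a bin number first. The sketch below does that with findInterval(); the helper name range_label and the sample inputs are hypothetical, and this is only one possible workaround.

```r
# Hypothetical helper: map one number to its range label via switch().
range_label <- function(x) {
  # findInterval() counts how many cut points are <= x (0 to 4 here),
  # so adding 1 gives the position of the matching label.
  bin <- findInterval(x, c(26, 51, 76, 101)) + 1
  switch(bin, "1~25", "26~50", "51~75", "76~100", "100+")
}

# switch() handles one value at a time, so apply it element by element.
labels <- vapply(c(10, 26, 50.5, 75, 100, 101), range_label, character(1))
print(labels)
```

Because switch() is not vectorised, this still needs a loop or vapply() over the column, which is part of why cut is the more natural fit.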

Answer 2

Using [code not recoverable from the source]: no need for the ifelse function or switch.

Answer 3

First, why are you defining the logic twice in a chain of if / else if statements? All you need is:

iel = function(x){
  if(data[x] < 26) {
    return("<=25")
  } else if(data[x] < 51){
    return("26~50")
  } else if(data[x] < 76){
    return("51~75")
  } else if(data[x] < 101){
    return("76~100")
  } else {
    return("100+")
  }
}

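How does this compare with the other answers that use ifelse() statements? It is the same thing. Because the logic is nested, you can reduce the amount of checking: there is no need to say "if it is not < 26, then check to make sure it is > 25", since that is redundant.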

ieie = function(data){
  return(ifelse(data< 26, "<=25", 
         ifelse (data < 51,"26~50",
                 ifelse(data < 76, "51~75",
                        ifelse (data < 101,"76~100",
                                "100+")))))
}
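As a quick sanity check that the two versions agree, the sketch below repeats both definitions and compares their output on a small vector. The test values are invented and deliberately include the boundaries 26, 51, 76 and 101.

```r
# Invented test values, including the boundary cases.
data <- c(10, 25, 26, 50, 51, 75, 76, 100, 101, 150)

# Scalar version: looks up data[x] by index, one element per call.
iel <- function(x) {
  if (data[x] < 26) {
    return("<=25")
  } else if (data[x] < 51) {
    return("26~50")
  } else if (data[x] < 76) {
    return("51~75")
  } else if (data[x] < 101) {
    return("76~100")
  } else {
    return("100+")
  }
}

# Vectorised version: handles the whole vector in one call.
ieie <- function(data) {
  ifelse(data < 26, "<=25",
         ifelse(data < 51, "26~50",
                ifelse(data < 76, "51~75",
                       ifelse(data < 101, "76~100", "100+"))))
}

scalar_result <- sapply(seq_along(data), iel)
vector_result <- ieie(data)
print(identical(scalar_result, vector_result))
```

Note that iel depends on a global vector named data, while ieie takes the vector as an argument.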

How do these solutions compare in terms of speed? Your mileage may vary, but:

library(microbenchmark)
data = rnorm(1e6,50,15)
microbenchmark(sapply(1:length(data),iel),ieie(data), times=50L)

#> Unit: seconds
                        expr      min       lq     mean   median       uq      max neval
 sapply(1:length(data), iel) 1.710709 2.016842 2.243246 2.223891 2.376228 2.954147    50
                  ieie(data) 1.902938 2.094678 2.296946 2.220572 2.438968 3.929247    50

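By taking the traditional logic, even though it is not vectorized, and wrapping it in sapply (which returns a vector), I see a slight improvement over nested ifelse() in terms of min, mean, and max; the medians are essentially the same. This is based on only 50 repetitions (at about 2.5 seconds each on average, so roughly 5 seconds per repetition for the pair). The data never changed; the repetitions only measure how quickly the computer can process the data while averaging out the noise from other things happening on the machine.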

What if we scale this up to a vector of length 1e7?

data = rnorm(1e7,50,15)
microbenchmark(sapply(1:length(data),iel),ieie(data), times=5L)
#> Unit: seconds
                        expr      min       lq     mean   median       uq      max neval
 sapply(1:length(data), iel) 22.38624 27.42520 27.74565 27.85335 27.89591 33.16756     5
                  ieie(data) 17.52102 17.62965 18.90965 19.49140 19.89423 20.01194     5

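This is actually very interesting to me. I was always told, and assumed, that nested ifelse() statements were bad for performance, but as the vector size increases that turns out not to be the case: here the nested ifelse() version is clearly faster than the sapply version.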

Even so, the cut function is still far superior:

data6 = rnorm(1e6,50,15)
data7 = rnorm(1e7,50,15)
microbenchmark(cut(data6, breaks=c(0, 25, 50, 75, 100, 1000)),cut(data7, breaks=c(0, 25, 50, 75, 100, 1000)),times=10L)
#> Unit: milliseconds
                        expr       min        lq      mean    median        uq       max neval
 cut(data6, breaks = c(...))  204.1436  206.2564  224.1509  221.5659  232.8876  260.8075    10
 cut(data7, breaks = c(...)) 2059.5744 2118.6611 2213.9544 2210.8787 2271.1089 2407.6448    10

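Wow! That is in milliseconds: roughly 0.2 seconds for 1e6 elements and about 2.2 seconds for 1e7, versus about 2.2 seconds and 19 to 28 seconds for the approaches above. Built-in R functions that leverage other languages definitely pay off.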

So my answer does not offer a new solution, but hopefully it helps illustrate the processing speed of the different approaches.

In R, how to interpret a switch statement with greater than/less than

3 answers: