Question

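I'm a stubborn user who has always used = instead of <-, and apparently many R programmers frown on this. I wrote the formatR package, which can replace = with <- by relying on the parser package. As some of you may know, parser was orphaned on CRAN a few days ago. It is back now, but this made me hesitant to depend on it. I'm wondering whether there is another way to safely replace = with <-, because not every = means assignment, e.g. fun(a = 1). Regular expressions are unlikely to be reliable (see line 18 of the mask.inline() function in formatR), but I would certainly appreciate it if you can improve mine. Perhaps the codetools package can help?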

Some test cases:

# should replace
a = matrix(1, 1)
a = matrix(
  1, 1)

(a = 1)
a =
  1

function() {
  a = 1
}

# should not replace
c(
  a = 1
  )

c(
  a = c(
  1, 2))

Answer 1

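This answer uses regular expressions. There are a few edge cases where it fails, but it should be fine for most code. If you need perfect matching you'll need to use a parser, but the regexes can always be tweaked if you run into problems.

Watch out for: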

#quoted function names
`my cr*azily*named^function!`(x = 1:10)
#Nested brackets inside functions
mean(x = (3 + 1:10))
#assignments inside if or for blocks
if((x = 10) > 3) cat("foo")
#functions running over multiple lines will currently fail
#maybe fixable with paste(original_code, collapse = "\n")
mean(
  x = 1:10
)

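The code is based on an example on the ?regmatches page. The basic idea is: swap the function contents for a placeholder, do the replacement, then put the function contents back.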

#Sample code.  For real case, use 
#readLines("source_file.R")
original_code <- c("a = 1", "b = mean(x = 1)")

#Function contents are considered to be a function name, 
#an open bracket, some stuff, then a close bracket.
#Here function names are considered to be a letter or
#dot or underscore followed by optional letters, numbers, dots or 
#underscores.  This matches a few non-valid names (see ?make.names
#and warning above).
function_content <- gregexpr(
  "[[:alpha:]._][[:alnum:]._]*\\([^)]*\\)", 
  original_code
)

#Take a copy of the code to modify
copy <- original_code

#Replace all instances of function contents with the word PLACEHOLDER.
#If you have that word inside your code already, things will break.
copy <- mapply(
  function(pattern, replacement, x) 
  {
    if(length(pattern) > 0) 
    {
      gsub(pattern, replacement, x, fixed = TRUE) 
    } else x
  }, 
  pattern = regmatches(copy, function_content), 
  replacement = "PLACEHOLDER", 
  x = copy,
  USE.NAMES = FALSE
)

#Replace = with <-
copy <- gsub("=", "<-", copy)

#Now substitute back your function contents
(fixed_code <- mapply(
  function(pattern, replacement, x) 
  {
      if(length(replacement) > 0) 
      {
          gsub(pattern, replacement, x, fixed = TRUE) 
      } else x
  }, 
  pattern = "PLACEHOLDER", 
  replacement = regmatches(original_code, function_content), 
  x = copy,
  USE.NAMES = FALSE
))

#Write back to your source file
#writeLines(fixed_code, "source_file_fixed.R")

Answer 2

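Kohske sent a pull request to the formatR package that solves the problem using the codetools package. The basic idea is to set up a code walker that walks through the code; when it detects = as the symbol of a function call, that symbol is replaced with <-. This works because of R's "Lisp nature": x = 1 is actually `=`(x, 1) (which we replace with `<-`(x, 1)); the = in fun(x = 1), of course, is treated differently in the parse tree.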
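To make the parse-tree idea concrete, here is a minimal sketch in base R. It is not Kohske's pull request and does not use codetools; it just recurses over the parsed expression by hand. The names replace_assign and fix_assign are hypothetical, and deparse() reformats the code, so this illustrates the principle rather than a drop-in formatter.

```r
# Hypothetical helper: walk a parsed expression and turn every call whose
# head is the symbol `=` into a call to `<-`. Named arguments such as
# fun(x = 1) are stored as argument names, not as calls to `=`, so they
# are left untouched.
replace_assign <- function(e) {
  if (is.call(e)) {
    if (identical(e[[1]], as.name("="))) {
      e[[1]] <- as.name("<-")
    }
    # Recurse into the head and all arguments, then rebuild the call
    e <- as.call(lapply(as.list(e), replace_assign))
  }
  e
}

# Hypothetical convenience wrapper: parse one expression, rewrite, deparse
fix_assign <- function(code) {
  deparse(replace_assign(parse(text = code, keep.source = FALSE)[[1]]))
}

fix_assign("a = matrix(1, 1)")  # "a <- matrix(1, 1)"
fix_assign("c(a = 1)")          # "c(a = 1)"
```

Because the decision is made on the parse tree, multi-line input and nested calls need no special handling; the cost is that the original layout and comments are lost when the expression is deparsed.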

The formatR package (>= 0.5.2) has dropped its dependency on the parser package, and replace.assign should now be robust.
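For readers who want to check individual = signs without formatR, one option is the token table that base R's parser exposes through utils::getParseData(). The sketch below is not how replace.assign is implemented. It assumes ASCII source without tab characters, with one line of code per element of the input vector; the function name replace_eq_assign is hypothetical.

```r
# Hypothetical helper: replace only those "=" tokens that the R parser
# labels EQ_ASSIGN (assignment). Named arguments are labelled EQ_SUB and
# formals defaults EQ_FORMALS, so they are left alone.
# Assumes: ASCII text, no tabs, one source line per element of `code`.
replace_eq_assign <- function(code) {
  pd <- utils::getParseData(parse(text = code, keep.source = TRUE))
  hits <- pd[pd$token == "EQ_ASSIGN", c("line1", "col1")]
  # Work right to left within each line so earlier columns stay valid
  hits <- hits[order(hits$line1, -hits$col1), ]
  for (i in seq_len(nrow(hits))) {
    ln <- hits$line1[i]
    col <- hits$col1[i]
    code[ln] <- paste0(
      substr(code[ln], 1, col - 1),
      "<-",
      substr(code[ln], col + 1, nchar(code[ln]))
    )
  }
  code
}

original <- c("a = matrix(", "  1, 1)", "c(", "  a = 1", "  )")
replace_eq_assign(original)
```

Unlike a deparse-based rewrite, this edits the text in place, so comments and line breaks survive.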

Answer 3

The safest (and probably fastest) way to replace = with <- is to type <- directly rather than trying to replace it.

A reliable way to tell if = is for assignment in R code?

3 answers: