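I have written several thousand lines of code that use a special non-standard syntax. I need to be able to compile the code with a different compiler that does not support this syntax. I tried to automate the required changes, but I am not very good with regular expressions and the like, and I failed.
Here is what I am trying to achieve. Currently, my code calls methods and accesses variables of objects using the following possible syntaxes: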
call obj.method()
obj.method( )
obj.method( arg1, arg2, kwarg1=kwarg1 )
obj1.var = obj2.var2
Instead, I want it to be:
call obj%method()
obj%method( )
obj%method( arg1, arg2, kwarg1=kwarg1 )
obj1%var = obj2%var2
I want to make these changes without affecting the following other possible occurrences of ".":
Decimal numbers:
a = 1.0
b = 1.d0
Logical operators (note the possible spaces and method calls):
if (a.or.b) then
if ( a .and. .not.(obj.l1(1.d0)) ) then
Anything inside a comment (the exclamation mark "!" starts a comment):
!>I am a commented line.
! > I am.a commented line with..leading blanks and extra periods.1.
b=a1.var( 0.d0 ) !! I contain a commented version of this line: b=a1.var( 0.d0 )
Anything inside quotes (i.e. string literals):
c = "I am a string"
c= 'I am an obnoxious string: b=a1.var( 0.d0 ) ... '
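Does anyone know how to approach this? I am guessing regular expressions are the natural way to go, but I am open to anything. (In case anyone cares: the code is written in Fortran. ifort is happy with the "." syntax; gfortran is not.)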
Answer 0 (score: 2)
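Have you considered using flex for this? It also works with regular expressions, but it is more powerful: at each position it tries all the patterns and takes the longest match. The rules could look like this:
%% /* rule part of the program */
!.*\n ECHO; /* copy comments unchanged */
\".*\"|'.*' ECHO; /* copy strings unchanged */
[^A-Za-z_][0-9]+\. ECHO; /* copy numbers unchanged */
".and."|".or."|".not." ECHO; /* copy logical operators unchanged */
\. printf("%%"); /* now, replace the . by % */
[^\.] ECHO; /* copy everything else unchanged */
%% /* invoke the program */
int main() {
    yylex();
    return 0;
}
You may need to adjust the third rule. As written, it leaves alone any "." that follows a run of digits, provided the character before the digits is not a letter (a to z, A to Z) or _. If identifiers can contain other legal characters, add them to that character class.
If everything is right, you should be able to turn this into a program. Copy it into a file named lex.l and run:
$ flex -o lex.yy.c lex.l
$ gcc -o lex.out lex.yy.c -lfl
You then have the program lex.out, which you can use on the command line:
$ cat unreplaced.txt | ./lex.out > replaced.txt
This relies on the same principle as Ed Morton's suggestion, but with flex we can skip the bookkeeping. It still fails in some cases, for example when a string contains \".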
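For readers without flex at hand, the same rule-ordering idea can be sketched in Python. This is only an illustration under stated assumptions, not a port of the answer's lexer: the names OPS, TOKEN and convert_line are hypothetical, the list of dotted operators is my own guess at what the code base uses, and strings are assumed to start and end on one line.

```python
import re

# Assumed list of dotted operators/constants; extend it if your code uses others.
OPS = r"(?:and|or|not|eqv|neqv|eq|ne|lt|le|gt|ge|true|false)"

# Alternatives are tried left to right at each position, so their order plays
# the role of the rule order in the flex file.
TOKEN = re.compile(
    rf"""
    (?P<comment>!.*)                        # '!' up to the end of the line
  | (?P<string>"[^"\n]*"|'[^'\n]*')         # quoted string on a single line
  | (?P<logical>\.{OPS}\.)                  # .and. .or. .not. and friends
  | (?P<number>(?<![A-Za-z0-9_])            # not glued to an identifier
        (?:\d+\.(?!{OPS}\.)\d*|\.\d+))      # 1.0  1.  .5  (but not the 1 in 1.or.)
  | (?P<member>(?<=[A-Za-z0-9_])\.(?=[A-Za-z]))   # the '.' in obj.method
    """,
    re.VERBOSE | re.IGNORECASE,
)


def convert_line(line: str) -> str:
    """Replace member-access dots with '%'; every other token is copied as-is."""
    def repl(match):
        return "%" if match.lastgroup == "member" else match.group(0)
    return TOKEN.sub(repl, line)


if __name__ == "__main__":
    import sys
    for raw in sys.stdin:
        sys.stdout.write(convert_line(raw))
```

Run as a filter (python convert.py < unreplaced.txt > replaced.txt, with convert.py being whatever name you save it under). Like the flex rules, this only recognizes a "." that sits directly between identifier characters, so an access such as arr(1).x would be left untouched and would need an extra rule.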
Answer 1 (score: 1)
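You can't do this 100% robustly without a parser for the language (for example, the script below will fail in some cases if you have \" inside a double-quoted string; that is easy to handle, but it is just one of many possible failures not covered by your use cases), but this will handle what you have shown us so far. It uses GNU awk for gensub() and for the third argument to match().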
Sample input:
$ cat file
call obj.method()
obj.method( )
obj.method( arg1, arg2, kwarg1=kwarg1 )
obj1.var = obj2.var2
a = 1.0
b = 1.d0
if (a.or.b) then
if ( a .and. .not.(obj.l1(1.d0)) ) then
!>I am a commented line.
! > I am.a commented line with..leading blanks and extra periods.1.
b=a1.var( 0.d0 ) !! I contain a commented version of this line: b=a1.var( 0.d0 )
c = "I am a string"
c= 'I am an obnoxious string: b=a1.var( 0.d0 ) ... '
c="I am an exclaimed string!"; b=a1.var()
Expected output:
$ cat out
call obj%method()
obj%method( )
obj%method( arg1, arg2, kwarg1=kwarg1 )
obj1%var = obj2%var2
a = 1.0
b = 1.d0
if (a.or.b) then
if ( a .and. .not.(obj%l1(1.d0)) ) then
!>I am a commented line.
! > I am.a commented line with..leading blanks and extra periods.1.
b=a1%var( 0.d0 ) !! I contain a commented version of this line: b=a1.var( 0.d0 )
c = "I am a string"
c= 'I am an obnoxious string: b=a1.var( 0.d0 ) ... '
c="I am an exclaimed string!"; b=a1%var()
The script:
$ cat tst.awk
{
# give us the ability to use @<any other char> strings as a
# replacement/placeholder strings that cannot exist in the input.
gsub(/@/,"@=")
# ignore all !s inside double-quoted strings
while ( match($0,/("[^"]*)!([^"]*")/,a) ) {
$0 = substr($0,1,RSTART-1) a[1] "@-" a[2] substr($0,RSTART+RLENGTH)
}
# ignore all !s inside single-quoted strings
while ( match($0,/('[^']*)!([^']*')/,a) ) {
$0 = substr($0,1,RSTART-1) a[1] "@-" a[2] substr($0,RSTART+RLENGTH)
}
# Now we can separate comments from what comes before them
comment = gensub(/[^!]*/,"",1)
$0 = gensub(/!.*/,"",1)
# ignore all .s inside double-quoted strings
while ( match($0,/("[^"]*)\.([^"]*")/,a) ) {
$0 = substr($0,1,RSTART-1) a[1] "@#" a[2] substr($0,RSTART+RLENGTH)
}
# ignore all .s inside single-quoted strings
while ( match($0,/('[^']*)\.([^']*')/,a) ) {
$0 = substr($0,1,RSTART-1) a[1] "@#" a[2] substr($0,RSTART+RLENGTH)
}
# convert all logical operators like a.or.b to a@#or@#b so the .s won't get replaced later
while ( match($0,/\.([[:alpha:]]+)\./,a) ) {
$0 = substr($0,1,RSTART-1) "@#" a[1] "@#" substr($0,RSTART+RLENGTH)
}
# convert all obj.var and similar to obj%var, etc.
while ( match($0,/\<([[:alpha:]]+[[:alnum:]_]*)[.]([[:alpha:]]+[[:alnum:]_]*)\>/,a) ) {
$0 = substr($0,1,RSTART-1) a[1] "%" a[2] substr($0,RSTART+RLENGTH)
}
# Convert all @#s in the precomment text back to .s
gsub(/@#/,".")
# Add the comment back
$0 = $0 comment
# Convert all @-s back to !s
gsub(/@-/,"!")
# Convert all @=s back to @s
gsub(/@=/,"@")
print
}
Running the script, with its output:
$ awk -f tst.awk file
call obj%method()
obj%method( )
obj%method( arg1, arg2, kwarg1=kwarg1 )
obj1%var = obj2%var2
a = 1.0
b = 1.d0
if (a.or.b) then
if ( a .and. .not.(obj%l1(1.d0)) ) then
!>I am a commented line.
! > I am.a commented line with..leading blanks and extra periods.1.
b=a1%var( 0.d0 ) !! I contain a commented version of this line: b=a1.var( 0.d0 )
c = "I am a string"
c= 'I am an obnoxious string: b=a1.var( 0.d0 ) ... '
c="I am an exclaimed string!"; b=a1%var()