Question

I'm currently writing a Bash script that hashes each line of a text file and writes it to a new file in the format /^[a-z0-9]+$/i. The script I currently have for this is:

// True: only letters and digits
let sValue = "123abc"
console.log(/^[a-z0-9]+$/i.test(sValue))

sValue = "123"
console.log(/^[a-z0-9]+$/i.test(sValue))

sValue = "abc"
console.log(/^[a-z0-9]+$/i.test(sValue))

// False: contains a space or a comma
sValue = "123abc "
console.log(/^[a-z0-9]+$/i.test(sValue))

sValue = "123,hyg"
console.log(/^[a-z0-9]+$/i.test(sValue))

sValue = " 123"
console.log(/^[a-z0-9]+$/i.test(sValue))

// False: `+` requires at least one character, so the empty string does not match
sValue = ""
console.log(/^[a-z0-9]+$/i.test(sValue))

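I originally got this code from a very similar question, linked here. The script works exactly as advertised; the problem is that it takes a very long time, even for very small text files (<15KB).

I would be very grateful if anyone could suggest a script that achieves exactly the same result but much more efficiently.

Thanks in advance for any help,

Kind regards, John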

Answer 1

You could split the file into one file per line and then hash them all in a single call:

$ cat > words.txt << EOF
> foo
> bar
> baz
> EOF
$ split --lines=1 words.txt 
$ sha256sum x*
b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c  xaa
7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730  xab
bf07a7fbb825fc0aae7bf4a1177b2b31fcf8a3feeaf7092761e18c859ee52a9c  xac
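One thing worth checking with this approach: each piece produced by split still ends in a newline, so the digests above cover "foo\n" rather than "foo". A rough sketch that makes this visible, assuming GNU coreutils; the scratch directory and the "line." prefix are my own choices, not part of the answer:

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)                 # scratch directory, removed on exit
trap 'rm -rf "$workdir"' EXIT

printf 'foo\nbar\nbaz\n' > "$workdir/words.txt"

# One output file per input line: line.aa, line.ab, line.ac
split --lines=1 "$workdir/words.txt" "$workdir/line."

# A single sha256sum process hashes every piece.
sha256sum "$workdir"/line.*

# Compare the digest of the line with and without its trailing newline.
printf 'foo\n' | sha256sum           # same digest as line.aa
printf 'foo'   | sha256sum           # different digest: newline not included
```

If the hashes need to match a tool that strips the newline first, the pieces would have to be hashed without it.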

Answer 2

I'd be very wary of doing this in pure shell. The overhead of starting the hashing program once per line will make it very slow on large files.
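To make that overhead concrete, a pure-shell loop of the kind I mean might look roughly like this. It is only an illustration, not the asker's script: the function name hash_lines_slow, the choice of sha256sum, and the hash:line output format are assumptions on my part.

```shell
#!/usr/bin/env bash

# Hypothetical slow version: every line costs a subshell plus one
# sha256sum and one cut process, which dominates the run time.
hash_lines_slow() {
    local line hash
    # "|| [[ -n $line ]]" also handles a last line with no trailing newline
    while IFS= read -r line || [[ -n $line ]]; do
        # printf '%s' hashes the line without its newline
        hash=$(printf '%s' "$line" | sha256sum | cut -d' ' -f1)
        printf '%s:%s\n' "$hash" "$line"
    done < "$1"
}
```

Running something like `time hash_lines_slow words.txt > hashed.txt` on a file with a few thousand lines should show where the time goes.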

How about a short bit of Perl?

perl -MDigest::MD5 -nle 'print Digest::MD5::md5_hex($_), ":", $_' <"$originalfile" >>"$outputlocation"

Perl has a variety of Digest modules, so it is easy to use something less broken than MD5.

perl -MDigest::SHA -nle 'print Digest::SHA::sha256_hex($_), ":", $_' <"$originalfile" >>"$outputlocation"

If you want to use Whirlpool, you can install it from CPAN with:

cpan install Digest::Whirlpool

and use it with:

perl -MDigest -nle '$ctx = Digest->new("Whirlpool"); $ctx->add($_); print $ctx->hexdigest(), ":", $_' <"$originalfile" >>"$outputlocation"

Most efficient way to hash each line of a text file?
