Building a data structure - Hash of Hash of Hash of Array

Date: 2011-11-08 20:05:40

Tags: ruby parsing text transformation text-parsing

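This week at work I faced the challenge of parsing a specific file format that contains IP ranges grouped by site, area and region. Basically, I needed a script that loads all of this location information into a data structure from which I can easily get all the IPs of a site, area or region for later transformation.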

Required data structure:
 data[Region][Area][Site] -> IPs
       Hash   Hash  Hash   Array
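
For illustration, the layout above can be sketched with nested default blocks, so that intermediate hashes and the leaf array are created on first access. This is only a minimal sketch: the method name new_location_store is hypothetical and not part of the script further down.

```ruby
# Hypothetical helper: builds Region -> Area -> Site -> [IP ranges].
# Each level creates the next one the first time a missing key is read.
def new_location_store
  Hash.new do |regions, region|
    regions[region] = Hash.new do |areas, area|
      areas[area] = Hash.new { |sites, site| sites[site] = [] }
    end
  end
end

store = new_location_store
store['Asia']['Australia']['Altona'] << '192.168.1.192-192.168.1.255'
store['Asia']['Australia']['Altona'] << '192.168.2.192-192.168.2.255'

p store['Asia']['Australia']['Altona'].size  # 2
p store.keys                                 # ["Asia"]
```

One caveat of this style: merely reading a missing key creates it, so existence checks should use key? rather than [].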

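I would like to know whether the function "processLocations" can be optimized, or whether there is a simpler way to build the required data structure, especially when creating the "Hash of Hash of Hash of Array" region variable.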

I hope this also helps other people in the same situation, so here is my current working copy:

require 'pp'

# Function that processes the content of the locations file and returns
# the following structure:
#
# data[Region][Area][Site] -> IPs
#       Hash   Hash  Hash   Array
#
def processLocations (lines)
  sites = Hash.new{|h, k| h[k] = []} # HashOFArray
  area = Hash.new{|h,k| h[k]=Hash.new(&h.default_proc)} # HashOFHash
  region = Hash.new{|h,k| h[k]=Hash.new(&h.default_proc)} # HashOFHash

  lines.each do |line|
    next if line =~ /^#.*/ # skip comment lines

    # Process IPs range section
    if line =~ /(.*)=([\d|\-|\.]+)/
      #puts "IP: #{$1} - #{$2}"
      sites[$1.chomp.capitalize] << $2
    end

    # Process area section
    if line =~ /(.*)\.area=(.*)/i
      #puts "Area: #{$1} - #{$2}"
      if sites.has_key?($1.chomp.capitalize)

        if area.has_key?($2.chomp.capitalize) &&
           area[$2.chomp.capitalize].has_key?($1.chomp.capitalize)
          # The hash exists
          #puts "Adding more IP elements to the array of an existing hash key"
          area[$2.chomp.capitalize][$1.chomp.capitalize].concat(sites[$1.chomp.capitalize])
        else
          # The hash does not exist
          #puts "Adding new hash key with new array"
          area[$2.chomp.capitalize][$1.chomp.capitalize] = sites[$1.chomp.capitalize]
        end

        # Clean site hash
        sites = Hash.new{|h, k| h[k] = []} # HashOFArray
      end
    end

    # Process region section
    if line =~ /(.*)\.region=(.*)/i
      #puts "Region: #{$1} - #{$2}"
      if area.has_key?($1.chomp.capitalize)
        region[$2.chomp.capitalize][$1.chomp.capitalize] = area[$1.chomp.capitalize]
      end
    end
  end
  return region
end

##############
#  MAIN

# DATA is already an open IO positioned right after __END__
lines = DATA.readlines
data = processLocations(lines)

puts "+data---------------------------------------------------------"
pp data

puts "+data['Asia']-------------------------------------------------"
pp data['Asia']

puts "+data['Asia']['Australia']------------------------------------"
pp data['Asia']['Australia']

puts "+data['Europe-middle east-africa']['France']['Paris']---------"
pp data['Europe-middle east-africa']['France']['Paris']


__END__
Alexandria (ALH)=192.168.6.0-192.168.6.127
Alexandria (ALH).area=Australia
Australia.region=Asia

Altona=192.168.1.192-192.168.1.255
Altona=192.168.2.192-192.168.2.255
Altona.area=Australia

TOKYO VPN=192.168.3.192-192.168.3.255
TOKYO VPN.area=JAPAN
JAPAN.region=Asia

Paris=192.168.4.192-192.168.4.255
Paris.area=France

Rennes=192.168.5.192-192.168.5.255
Rennes.area=France
France.region=EUROPE-MIDDLE EAST-AFRICA

Sample output:

# ruby ruby_help.rb
+data---------------------------------------------------------
{"Asia"=>
  {"Australia"=>
    {"Alexandria (alh)"=>["192.168.6.0-192.168.6.127"],
     "Altona"=>["192.168.1.192-192.168.1.255",
"192.168.2.192-192.168.2.255"]},
   "Japan"=>{"Tokyo vpn"=>["192.168.3.192-192.168.3.255"]}},
 "Europe-middle east-africa"=>
  {"France"=>
    {"Paris"=>["192.168.4.192-192.168.4.255"],
     "Rennes"=>["192.168.5.192-192.168.5.255"]}}}
+data['Asia']-------------------------------------------------
{"Australia"=>
  {"Alexandria (alh)"=>["192.168.6.0-192.168.6.127"],
   "Altona"=>["192.168.1.192-192.168.1.255",
"192.168.2.192-192.168.2.255"]},
 "Japan"=>{"Tokyo vpn"=>["192.168.3.192-192.168.3.255"]}}
+data['Asia']['Australia']------------------------------------
{"Alexandria (alh)"=>["192.168.6.0-192.168.6.127"],
 "Altona"=>["192.168.1.192-192.168.1.255",
"192.168.2.192-192.168.2.255"]}
+data['Europe-middle east-africa']['France']['Paris']---------
["192.168.4.192-192.168.4.255"]
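
Since the goal is to get all the IPs of a site, area or region, a small recursive helper can flatten whatever sits below a given node. This is a sketch under assumptions: the name all_ips is hypothetical, and the sample hash simply copies part of the output shown above.

```ruby
# Hypothetical helper: collect every IP range stored below a node,
# whether the node is a region, an area, or a site's array.
def all_ips(node)
  case node
  when Hash  then node.values.flat_map { |child| all_ips(child) }
  when Array then node
  else []
  end
end

# Sample data copied from the output above
data = {
  'Asia' => {
    'Australia' => {
      'Alexandria (alh)' => ['192.168.6.0-192.168.6.127'],
      'Altona' => ['192.168.1.192-192.168.1.255', '192.168.2.192-192.168.2.255']
    },
    'Japan' => { 'Tokyo vpn' => ['192.168.3.192-192.168.3.255'] }
  }
}

p all_ips(data['Asia']['Japan'])  # ["192.168.3.192-192.168.3.255"]
p all_ips(data['Asia']).size      # 4
```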

Thanks in advance for any suggestions,

Sebastian YEPES

1 answer:

Answer 0 (score: 0)

I agree with Mu that you could build some classes, but I think this should work:

def processLocations (lines)
  sites     = Hash.new{|h, k| h[k] = []}
  areas     = Hash.new{|h, k| h[k] = {}}
  regions   = Hash.new{|h, k| h[k] = {}}

  lines.each do |line|
    case line
    # Comment lines are checked first so they never reach the other patterns
    when /^#/
      # do nothing
    # Process IPs range section
    when /(.*)=([\d|\-|\.]+)/
      sites[$1.chomp.capitalize] << $2
    # Process area section
    when /(.*)\.area=(.*)/i
      site_key, area = $1.chomp.capitalize, areas[$2.chomp.capitalize]
      area[site_key] = sites[site_key]
    # Process region section
    when /(.*)\.region=(.*)/i
      area_key, region = $1.chomp.capitalize, regions[$2.chomp.capitalize]
      region[area_key] = areas[area_key]
    else
      # error?
    end
  end
  regions
end
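
As for the idea of building some classes, one possible shape is sketched below. Everything here is an assumption made for illustration: the class name LocationIndex, its methods, and the choice to store flat site/area/region mappings and assemble the nested hash only on request. It also assumes that site names never contain "=".

```ruby
# Sketch only: LocationIndex and its method names are hypothetical.
class LocationIndex
  def initialize
    @ranges    = Hash.new { |h, k| h[k] = [] } # site => [IP ranges]
    @area_of   = {}                            # site => area
    @region_of = {}                            # area => region
  end

  # Reads lines of the form "Site=range", "Site.area=Area", "Area.region=Region"
  def parse(lines)
    lines.each do |raw|
      line = raw.strip
      next if line.empty? || line.start_with?('#')
      key, value = line.split('=', 2)
      next if value.nil?
      if key =~ /\A(.*)\.area\z/i
        @area_of[normalize($1)] = normalize(value)
      elsif key =~ /\A(.*)\.region\z/i
        @region_of[normalize($1)] = normalize(value)
      else
        @ranges[normalize(key)] << value.strip
      end
    end
    self
  end

  # Builds region => area => site => [IP ranges]; sites or areas whose
  # parent is never declared are left out.
  def to_h
    tree = {}
    @area_of.each do |site, area|
      region = @region_of[area]
      next unless region
      ((tree[region] ||= {})[area] ||= {})[site] = @ranges[site]
    end
    tree
  end

  private

  def normalize(name)
    name.strip.capitalize
  end
end

sample = [
  "Paris=192.168.4.192-192.168.4.255\n",
  "Paris.area=France\n",
  "France.region=EUROPE-MIDDLE EAST-AFRICA\n"
]
tree = LocationIndex.new.parse(sample).to_h
p tree['Europe-middle east-africa']['France']['Paris']
# ["192.168.4.192-192.168.4.255"]
```

Because the mappings are kept flat until to_h is called, the result does not depend on the order in which area and region lines appear in the file.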