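Reading files inside a zip archive without extracting the archive

Date: 2015-01-23 00:02:27

Tags: ruby

I have a directory containing more than 100 zip files, and I need to read the files inside them for some data processing, without extracting the archives.

Is there a Ruby library that can read the contents of files in a zip archive without extracting them?

Using rubyzip raises an error: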

require 'zip'

Zip::File.open('my_zip.zip') do |zip_file|
  # Handle entries one by one
  zip_file.each do |entry|
    # Extract to file/directory/symlink
    puts "Extracting #{entry.name}"
    entry.extract('here')

    # Read into memory
    content = entry.get_input_stream.read
  end
end 

It gives this error:

test.rb:12:in `block (2 levels) in <main>': undefined method `read' for Zip::NullInputStream:Module (NoMethodError)
    from .gem/ruby/gems/rubyzip-1.1.6/lib/zip/entry_set.rb:42:in `call'
    from .gem/ruby/gems/rubyzip-1.1.6/lib/zip/entry_set.rb:42:in `block in each'
    from .gem/ruby/gems/rubyzip-1.1.6/lib/zip/entry_set.rb:41:in `each'
    from .gem/ruby/gems/rubyzip-1.1.6/lib/zip/entry_set.rb:41:in `each'
    from .gem/ruby/gems/rubyzip-1.1.6/lib/zip/central_directory.rb:182:in `each'
    from test.rb:6:in `block in <main>'
    from .gem/ruby/gems/rubyzip-1.1.6/lib/zip/file.rb:99:in `open'
    from test.rb:4:in `<main>'

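2 answers:

Answer 0 (score: 15)

Zip::NullInputStream is returned when the entry is a directory rather than a file. Could that be the case here?

Here is a more robust variant of the code, which checks the type of each entry before reading it: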

#!/usr/bin/env ruby

require 'rubygems'
require 'zip'


Zip::File.open('my_zip.zip') do |zip_file|
  # Handle entries one by one
  zip_file.each do |entry|
    if entry.directory?
      puts "#{entry.name} is a folder!"
    elsif entry.symlink?
      puts "#{entry.name} is a symlink!"
    elsif entry.file?
      puts "#{entry.name} is a regular file!"

      # Read into memory
      content = entry.get_input_stream.read

      # Output
      puts content
    else
      puts "#{entry.name} is something unknown, oops!"
    end
  end
end

Answer 1 (score: 0)

I ran into the same issue. Checking if entry.file? before calling entry.get_input_stream.read resolved the problem:

require 'zip'

Zip::File.open('my_zip.zip') do |zip_file|
  # Handle entries one by one
  zip_file.each do |entry|
    # Extract to file/directory/symlink
    puts "Extracting #{entry.name}"
    entry.extract('here')

    # Read into memory
    if entry.file?
      content = entry.get_input_stream.read
    end
  end
end
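
For the use case in the question (a directory of many zip files, read without extracting anything), the file? check from both answers can be wrapped in a small helper. The following is only a sketch under assumptions: the method names read_file_entries and each_zip_in and the 'archives' directory are hypothetical, and it relies only on the rubyzip calls already shown above (Zip::File.open, each, file?, name, get_input_stream). It reads every entry fully into memory, which may not suit very large files.

```ruby
# Sketch only: method names and the 'archives' directory are hypothetical.

# Returns a Hash of entry name => contents (String) for every regular-file
# entry. `entries` is anything enumerable whose items respond to #name,
# #file? and #get_input_stream, for example an open Zip::File.
def read_file_entries(entries)
  contents = {}
  entries.each do |entry|
    # Directory entries hand back Zip::NullInputStream (the error in the
    # question), so skip everything that is not a regular file.
    next unless entry.file?

    contents[entry.name] = entry.get_input_stream.read
  end
  contents
end

# Yields (zip_path, contents_hash) for each *.zip file directly inside `dir`.
# Entries are read in memory; entry.extract is never called.
def each_zip_in(dir)
  require 'zip' # rubyzip gem; loaded here so the helper above has no dependency

  Dir.glob(File.join(dir, '*.zip')).sort.each do |zip_path|
    Zip::File.open(zip_path) do |zip_file|
      yield zip_path, read_file_entries(zip_file)
    end
  end
end

# Possible usage (assumes a directory named 'archives' exists):
#
#   each_zip_in('archives') do |zip_path, contents|
#     contents.each do |name, data|
#       puts "#{zip_path}: #{name} (#{data.bytesize} bytes)"
#     end
#   end
```

Keeping the entry filtering separate from the directory walk means the filtering logic can be exercised with simple stand-in objects, without needing real archives.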