Question

我希望存储大量（数亿到数千亿）任意嵌套的哈希结构（通常为4-6个级别），其中一些属性位于顶层。我不需要在嵌套哈希中查询，只需要在顶级属性上查询。必须可以在不编写代码的情况下进行查询，通常用于顶级属性的完全匹配。更新记录时，我希望能够仅更新已更改的子哈希结构部分，而不必读取/写入整个记录。 db必须具有C，Ruby和Python的绑定/驱动程序。

Mongodb似乎是理想的，除了对个别项目有4MB（很快就会是8MB或16MB）的限制。大多数这些项目都很小，但其中一些可能是100-200MB，可能更大。

是否有其他符合这些条件的数据库？

Answer 1

Redis不符合您的许多要求，但如果您愿意在其上面构建一些东西，它可以。

缺少两件关键的事情。

首先，Redis不支持嵌套哈希。但是，如果您愿意使用某种编码，则值可以指向具有另一个哈希的键。这将允许任意嵌套结构。有了这个hack，更新只需要更新更改的部分。你必须用C，Ruby和Python编写这个层。但这很简单。

其次，没有一个界面可以让你在不编写代码的情况下查询它。但这应该很容易写。你只需要写一次。

Answer 2

您可以进行后期处理。你将不得不为你的subhashes中的'id'键分别命名，但如果你这样做，那么这样的东西应该有用......到目前为止一直都很好：

给出一个像这样的存储哈希：

x => #<Company id: 16, name: "JRapid", markets: {"markets"=>"[{:market_id=>12, :market_name=>\"enterprise software\", :parents=>[{:parent_id=>12, :name=>\"enterprise software\", :grandparents=>{:parent_id=>12, :name=>\"enterprise software\"}}]}, {:market_id=>38, :market_name=>\"cloud computing\", :parents=>[{:parent_id=>38, :name=>\"cloud computing\", :grandparents=>{:parent_id=>38, :name=>\"cloud computing\"}}]}, {:market_id=>409, :market_name=>\"development platforms\", :parents=>[{:parent_id=>409, :name=>\"development platforms\", :grandparents=>{:parent_id=>409, :name=>\"development platforms\"}}]}, {:market_id=>1132, :market_name=>\"developer tools\", :parents=>[{}]}]"}, locations: {"locations"=>"[{:location_id=>1624, :location_name=>\"california\", :parents=>[{}]}, {:location_id=>1703, :location_name=>\"sunnyvale\", :parents=>[{}]}]"}, follower_count: 8, high_concept: "Rapid development Java cloud platform", product_desc: "JRapid is a Platform as a Service and is the fastes...", urls: {"blog_url"=>"http://www.jrapid.com/blog", "logo_url"=>"https://angel.co/images/icons/startup-nopic.png", "thumb_url"=>"https://angel.co/images/icons/startup-nopic.png", "company_url"=>"http://www.jrapid.com", "twitter_url"=>"http://www.twitter.com/JRapid", "angellist_url"=>"https://angel.co/jrapid"}, status: nil, created_at_or_updated_at: {"created_at"=>"2010-07-21T18:48:32Z", "updated_at"=>"2011-05-07T20:00:37Z"}, screenshots: {"screenshots"=>"[[nil]]"}, created_at: "2012-08-07 05:37:54", updated_at: "2012-08-07 05:37:54">

你可以这样做：

x = x.locations
x = x['locations']
x = eval(x)
x[0][:id]
 #=> 1624

警告：在给定字符串上运行eval（）几乎可以使用任何内容。所以这可能不是一种“生产模式”解决方案。事实上并非如此。但是在您学习如何使用一些真正的Document-DB解决方案之前，它将适用于临时工作。再次：警告！运行eval可能很危险！

（如果这对你有所帮助，请一次性 - 因为提出太多问题而需要更多代表才能再次提出问题，我被禁止参加SO）

我应该使用什么数据库来存储大量可能很大的嵌套哈希结构？

2 个答案: