Question

Here is what my rows look like:

{"id": x,

"data": [
  {
  "someId": 1,
  "url": "https://twitter.com/HillaryClinton?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor"
  },

  {
  "someId": 2,
  "url": "http://nymag.com/daily/intelligencer/2016/05/hillary-clinton-candidacy.html"
  }
]}

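I have created a secondary index on data.url, so finding the document is easy, but what is the most efficient way to update that specific nested object?

I may add new keys to it or just update existing ones (newField and anotherField in the example below).

The end result should look like this: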

  "data": [
  {
  "someId": 1,
  "url": "https://twitter.com/HillaryClinton?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor",
  "newField": 5,
  "anotherField": 12
  }
...

Edit: I got it working like this (Python):

a = r.db("lovli").table("KeyO").get_all("https://www.hillaryclinton.com/", index="url").update(
    lambda doc: {
        # Rebuild the data array, merging the new fields into the matching element only
        "data": doc["data"].map(
            lambda singleData: r.branch(
                singleData["url"] == "https://www.hillaryclinton.com/",
                singleData.merge({"status_tweet": 3, "pda": 1}),
                singleData
            )
        )
    }
).run(conn)

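Can this be improved? Also, I will be updating a large number of URLs at the same time... is there any way to improve performance further by doing this in bulk?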

Answer 1

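As general advice: if you plan to perform many operations on embedded documents inside an array, you may want to unroll and flatten your data model.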

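Updating a single document inside an array is awkward and difficult. Here are the steps in one possible approach:

Get the value of the original embedded document you want to edit
Get the index of the embedded document
Modify the embedded document
Update the entire outer document with the modified embedded document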
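
The four steps above can be sketched in plain Python on an in-memory dict. This is only an illustration of the logic, not a ReQL query: update_embedded is a hypothetical helper name, and the row literal is a trimmed copy of the example row with "x" standing in for the id. On the server side, ReQL's offsets_of and change_at array terms could play the roles of steps 2 and 4, but that variant is not shown here.

```python
from copy import deepcopy

def update_embedded(doc, url, new_fields):
    """Return a copy of doc with new_fields merged into the data element whose url matches."""
    data = doc["data"]
    # Steps 1 and 2: find the embedded document and its index.
    index = next((i for i, item in enumerate(data) if item["url"] == url), None)
    if index is None:
        return doc  # no embedded document has this url, nothing to update
    # Step 3: build the modified embedded document.
    modified = {**data[index], **new_fields}
    # Step 4: rebuild the whole outer document around the modified element.
    updated = deepcopy(doc)
    updated["data"][index] = modified
    return updated

TWITTER_URL = "https://twitter.com/HillaryClinton?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor"
NYMAG_URL = "http://nymag.com/daily/intelligencer/2016/05/hillary-clinton-candidacy.html"

row = {"id": "x", "data": [
    {"someId": 1, "url": TWITTER_URL},
    {"someId": 2, "url": NYMAG_URL},
]}
result = update_embedded(row, TWITTER_URL, {"newField": 5, "anotherField": 12})
```

Note that even though only one element changes, the whole outer document is rewritten, which is what makes this pattern expensive at scale.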

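Alternatively, what you did is:

Narrow the documents down to some domain, in this case "hillaryclinton.com"
Map over the N tweets in the data array and check whether each one matches, updating it if it does.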
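
To see what this map-and-merge pass costs, it can be mirrored in plain Python: every element of every matched document is visited, whether or not it matches. The sketch below also accepts a dict covering several URLs at once, which is one way a bulk variant might be shaped. merge_matching and updates_by_url are hypothetical names, and this is an in-memory illustration rather than a ReQL query.

```python
def merge_matching(doc, updates_by_url):
    """Plain-Python mirror of data.map(branch(matches, merge, keep)) for many URLs."""
    return {
        **doc,
        "data": [
            # Merge the new fields when the url is in the update set, else keep the element.
            {**item, **updates_by_url[item["url"]]} if item["url"] in updates_by_url else item
            for item in doc["data"]
        ],
    }

updates = {"https://www.hillaryclinton.com/": {"status_tweet": 3, "pda": 1}}
row = {"id": "x", "data": [
    {"someId": 1, "url": "https://www.hillaryclinton.com/"},
    {"someId": 2, "url": "http://nymag.com/daily/intelligencer/2016/05/hillary-clinton-candidacy.html"},
]}
result = merge_matching(row, updates)
```

Looking each url up in a dict keeps the per-element check constant-time, but the pass still walks all N elements of each document.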

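In this case, in the worst case you will touch M filtered documents, each with N embedded documents (roughly M × N checks). You may want to check out this other answer I gave to someone with a similar design issue. However, I think the following would give superior performance.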

If you instead store your data like this:

{
  "secondary_id": x,
  "data": {
    "someId": 1,
    "url": "https://twitter.com/HillaryClinton?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor"
  }
}, {
  "secondary_id": x,
  "data": {
    "someId": 2,
    "url": "http://nymag.com/daily/intelligencer/2016/05/hillary-clinton-candidacy.html"
  }
}
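
Converting existing rows to this flat layout could look roughly like the sketch below. It is plain Python on an in-memory dict; flatten is a hypothetical helper name, and it assumes the old row's id becomes secondary_id, with each flat row's own primary key left for the database to assign.

```python
def flatten(doc):
    """Turn one nested row into one flat row per embedded document."""
    # The old row id is carried along as secondary_id so rows can be regrouped later.
    return [{"secondary_id": doc["id"], "data": item} for item in doc["data"]]

nested = {"id": "x", "data": [
    {"someId": 1, "url": "https://twitter.com/HillaryClinton?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor"},
    {"someId": 2, "url": "http://nymag.com/daily/intelligencer/2016/05/hillary-clinton-candidacy.html"},
]}
flat_rows = flatten(nested)
```

Each element of flat_rows could then be inserted as its own document, so an update touches one small row instead of rewriting a whole array.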

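You can then create a compound index based on the value your url index is built on and the value of data.url, which could significantly reduce the number of operations.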
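
As a rough in-memory analogy of why a compound key helps, the sketch below keys flat rows by the pair (secondary_id, url), so one lookup reaches exactly the row to update with no array scan. This is not RethinkDB code: build_compound_index is a hypothetical helper, and it assumes secondary_id is the first half of the key. In RethinkDB itself the index would presumably be declared with index_create over a list of two fields.

```python
def build_compound_index(rows):
    """Map (secondary_id, url) -> row, loosely analogous to a compound index."""
    return {(row["secondary_id"], row["data"]["url"]): row for row in rows}

NYMAG_URL = "http://nymag.com/daily/intelligencer/2016/05/hillary-clinton-candidacy.html"
rows = [
    {"secondary_id": "x", "data": {"someId": 2, "url": NYMAG_URL}},
    {"secondary_id": "y", "data": {"someId": 7, "url": NYMAG_URL}},
]
index = build_compound_index(rows)

# A single keyed lookup finds the one row to change; the other row is untouched.
index[("x", NYMAG_URL)]["data"].update({"newField": 5, "anotherField": 12})
```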

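Without knowing what you access most often in your dataset it is hard to give more direction, but I think this should perform better. If you want to recreate the original data model, it would look like this:

r.db("lovli").table("Key0").get_all(SEARCH_URL, index="url").group("secondary_id")

You would get something like this, using x as an example: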

{
  "group": x,
  "reduction": [{
      "secondary_id": x,
      "data": {
        "someId": 1,
        "url": "https://twitter.com/HillaryClinton?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor"
      }
    }, {
      "secondary_id": x,
      "data": {
        "someId": 2,
        "url": "http://nymag.com/daily/intelligencer/2016/05/hillary-clinton-candidacy.html"
      }
    }]
}
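
The grouping step, and the trip back to the original nested shape, can be sketched in plain Python as follows. This mimics the group/reduction structure shown above on in-memory rows; it is not driver output, and group_by_secondary_id and to_nested are hypothetical helper names.

```python
from collections import defaultdict

def group_by_secondary_id(rows):
    """Collect flat rows into {"group", "reduction"} entries, one per secondary_id."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["secondary_id"]].append(row)
    return [{"group": key, "reduction": members} for key, members in groups.items()]

def to_nested(group):
    """Rebuild the original {"id", "data": [...]} row from one grouped entry."""
    return {"id": group["group"], "data": [row["data"] for row in group["reduction"]]}

flat_rows = [
    {"secondary_id": "x", "data": {"someId": 1, "url": "https://twitter.com/HillaryClinton?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor"}},
    {"secondary_id": "x", "data": {"someId": 2, "url": "http://nymag.com/daily/intelligencer/2016/05/hillary-clinton-candidacy.html"}},
]
grouped = group_by_secondary_id(flat_rows)
nested = to_nested(grouped[0])
```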

RethinkDB update field in nested array -> obj
