Should I return the dataset directly, or should I use a one_shot iterator?

Time: 2019-02-04 16:11:45

Tags: python tensorflow iterator pipeline tensorflow-datasets

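I am building a data pipeline with the Dataset API, but when I train on multiple GPUs and return dataset.make_one_shot_iterator().get_next() from the input function, I get ValueError: dataset_fn() must return a tf.data.Dataset when using a tf.distribute.Strategy. I can follow the error message and return the dataset directly, but I don't understand what iterator().get_next() is for, or how it behaves when training on a single GPU versus multiple GPUs.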

...

    dataset = dataset.repeat(num_epochs)
    dataset = dataset.batch(batch_size = batch_size)
    dataset = dataset.cache()

    dataset = dataset.prefetch(buffer_size=None)

    return dataset.make_one_shot_iterator().get_next()

return _input_fn

1 answer:

Answer 0: (score: 0)

When you use tf.Estimator together with a distribution strategy (which can also be used with Keras and tf.data), your input function should return a tf.data.Dataset.

For more on distribution strategies, see the documentation.

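    def input_fn():
        # ... (dataset construction elided) ...
        dataset = dataset.repeat(num_epochs)
        dataset = dataset.batch(batch_size=batch_size)
        dataset = dataset.cache()
        dataset = dataset.prefetch(buffer_size=None)
        return dataset

    ...use input_fn...

dataset.make_one_shot_iterator() is useful outside of distribution strategies and higher-level libraries, for example when you are working with lower-level libraries or debugging/testing a dataset. For instance, you can use it to iterate over all the elements of a dataset.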
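A minimal sketch of iterating over a dataset with a one-shot iterator in graph mode. It assumes TensorFlow 1.13 or later (including 2.x), where the tf.compat.v1 aliases are available. The names collect_all and toy_dataset are hypothetical helpers introduced only for illustration, and the toy dataset stands in for the real pipeline.

```python
import tensorflow as tf

# Assumption: TF 1.13+ or 2.x, where the tf.compat.v1 aliases exist.
tf1 = tf.compat.v1


def collect_all(dataset_fn):
    """Build a dataset in a fresh graph and return all of its elements."""
    with tf.Graph().as_default():
        dataset = dataset_fn()
        # Same idea as dataset.make_one_shot_iterator().get_next() in the question.
        next_element = tf1.data.make_one_shot_iterator(dataset).get_next()
        elements = []
        with tf1.Session() as sess:
            while True:
                try:
                    elements.append(sess.run(next_element))
                except tf.errors.OutOfRangeError:
                    break  # raised once the dataset is exhausted
        return elements


def toy_dataset():
    # Hypothetical stand-in for the real pipeline: 0..4 in batches of two.
    return tf.data.Dataset.range(5).batch(2)


batches = collect_all(toy_dataset)
print([b.tolist() for b in batches])
```

Here get_next() only creates a tensor that yields the next batch on each sess.run call. That suits manual inspection, whereas a distribution strategy needs the dataset object itself, as the error message in the question states.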