Question

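NetCDF allows (at least in its HDF5-based version 4 format) the creation of compound data types, which are very similar to C structs. Each component has a label, a type, and a position within the compound type. For example, for a statistical dataset we could use a compound type defined by [('min', 'float'), ('max', 'float'), ('avg', 'float'), ('std', 'float')], in which the second component is a float labelled max.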
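To make "label, type, and position" concrete, here is a minimal sketch that uses Python's standard ctypes module as a stand-in for a C struct. It is only an analogy, not the netCDF API: the class name Stats and the helper describe are made up for illustration, and the byte offsets assume the usual 4-byte C float.

```python
import ctypes


class Stats(ctypes.Structure):
    # Same four members as the compound type in the question, in the same order.
    _fields_ = [
        ("min", ctypes.c_float),
        ("max", ctypes.c_float),
        ("avg", ctypes.c_float),
        ("std", ctypes.c_float),
    ]


def describe(struct_type):
    """Return (label, type name, byte offset) for every member of a ctypes struct."""
    return [
        (name, ctype.__name__, getattr(struct_type, name).offset)
        for name, ctype in struct_type._fields_
    ]


layout = describe(Stats)
for label, type_name, offset in layout:
    print(f"{label}: {type_name} at byte offset {offset}")
```

The second entry of layout describes max, which matches the "second component" wording above.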

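netCDF also allows metadata to be added. Metadata usually follows a convention, such as the NetCDF Climate and Forecast (CF) Metadata Conventions. This is useful because other users of the resulting netCDF files can then easily understand the metadata.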

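However, I have not found a convention that deals specifically with compound data types, for example one that provides metadata for a single component of the compound data.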

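Is there such a convention?
If not, what is used in practice instead?
If nothing is in use, what would you suggest, and why? (I am considering multi-line attributes, i.e. lines separated by \n, with each line starting with a component-specific tag such as [avg] or #avg.)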
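A rough sketch of how the multi-line attribute idea could be read back, using only the Python standard library. This is not an existing convention: the [label] syntax is just the tentative proposal above, and the function name parse_member_metadata and the sample attribute text are hypothetical.

```python
import re

# One attribute line looks like "[avg] some text"; the bracketed tag names the member.
_TAGGED_LINE = re.compile(r"^\[(?P<member>\w+)\]\s*(?P<text>.*)$")


def parse_member_metadata(attribute_value):
    """Split a newline-separated attribute into a {member label: text} dict.

    Lines without a leading [label] tag are ignored in this sketch.
    """
    per_member = {}
    for line in attribute_value.split("\n"):
        match = _TAGGED_LINE.match(line.strip())
        if match:
            per_member[match.group("member")] = match.group("text")
    return per_member


# Hypothetical attribute value for a variable of the compound type.
long_name = "\n".join([
    "[min] minimum over the time interval",
    "[max] maximum over the time interval",
    "[avg] mean over the time interval",
    "[std] standard deviation over the time interval",
])

parsed = parse_member_metadata(long_name)
print(parsed["avg"])
```

One drawback of such a scheme is that generic tools would see a single opaque string, so every reader of the file needs a parser like this one.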

Answer 1

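To stay within the CF conventions, you can create a separate variable for each member of the compound type and use the ancillary_variables attribute to indicate that they are related: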

netcdf test {
dimensions:
    time = 3 ;
    lat = 36 ;
    lon = 36 ;
variables:
    double time(time) ;
        time:long_name = "Time" ;
        time:standard_name = "time" ;
        time:units = "Days since 1970-01-01 00:00" ;
        time:calendar = "standard" ;
    float ctp(time, lat, lon) ;
        ctp:_FillValue = -999.f ;
        ctp:long_name = "Cloud Top Pressure" ;
        ctp:standard_name = "air_pressure_at_cloud_top" ;
        ctp:units = "Pa" ;
        ctp:cell_methods = "time: mean" ;
        ctp:ancillary_variables = "ctp_std ctp_min ctp_max" ;
    float ctp_std(time, lat, lon) ;
        ctp_std:_FillValue = -999.f ;
        ctp_std:long_name = "Cloud Top Pressure Standard Deviation" ;
        ctp_std:units = "Pa" ;
        ctp_std:cell_methods = "time: standard_deviation" ;
    float ctp_min(time, lat, lon) ;
        ctp_min:_FillValue = -999.f ;
        ctp_min:long_name = "Cloud Top Pressure Minimum" ;
        ctp_min:units = "Pa" ;
        ctp_min:cell_methods = "time: minimum" ;
    float ctp_max(time, lat, lon) ;
        ctp_max:_FillValue = -999.f ;
        ctp_max:long_name = "Cloud Top Pressure Maximum" ;
        ctp_max:units = "Pa" ;
        ctp_max:cell_methods = "time: maximum" ;
}

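You can then add metadata as usual through the variables' attributes. For example, the cell_methods attribute can be used to describe the statistic that was applied.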
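If the set of statistics is known up front, the per-variable attributes can be generated rather than typed by hand. The following is a minimal sketch in plain Python that only builds dictionaries and does not write a file. The function name split_compound and the member labels are assumptions for illustration; the cell_methods strings are the ones used in the listing above.

```python
# Map each compound member label to the CF cell_methods string from the listing.
CELL_METHODS = {
    "avg": "time: mean",
    "std": "time: standard_deviation",
    "min": "time: minimum",
    "max": "time: maximum",
}


def split_compound(base, members, units, primary="avg"):
    """Plan one variable per member and return {variable name: attributes}.

    The primary member keeps the base name; the others become base_<member>
    and are listed in the primary variable's ancillary_variables attribute.
    """
    names = {m: base if m == primary else f"{base}_{m}" for m in members}
    plan = {
        names[m]: {"units": units, "cell_methods": CELL_METHODS[m]}
        for m in members
    }
    plan[base]["ancillary_variables"] = " ".join(
        names[m] for m in members if m != primary
    )
    return plan


plan = split_compound("ctp", ["min", "max", "avg", "std"], units="Pa")
for variable, attributes in plan.items():
    print(variable, attributes)
```

Each attribute dictionary could then be handed to whatever library writes the file, for example via setncatts on a netCDF4-python variable.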

If you want to stick with compound data types, there is a ticket about vectors that may be relevant (although it is old): https://cf-trac.llnl.gov/trac/ticket/79
