
Using NcML to reduce the number of dimensions in a netCDF file

  •  3
  • JimP  · 12 years ago

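    I'm trying to use NcML to "convert" a CF-1.4 file to CF-1.6. In particular, I'd like to know how to 1) remove dimensions and then 2) change the dimensions of a variable. For example, below are the tops (from ncdump) of two netCDF files. The first is CF-1.4 with dimensions time, z, lat, and lon. In this file, variables (e.g. temp) are functions of all four: temp(time,z,lat,lon). I'd like to convert this via NcML to a CF-1.6 file like the second one, where z/lat/lon are no longer dimensions and the variables are functions of time only. Thanks.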

    File 1:

    netcdf wqb_1.4 {
    dimensions:
            time = UNLIMITED ; // (109008 currently)
            z = 1 ;
            lat = 1 ;
            lon = 1 ;
    variables:
            float time(time) ;
                    time:long_name = "Time" ;
                    time:standard_name = "time" ;
                    time:short_name = "time" ;
                    time:axis = "T" ;
                    time:units = "minutes since 2008-01-01 00:00:00 -10:00" ;
            float z(z) ;
                    z:long_name = "depth below mean sea level" ;
                    z:standard_name = "depth" ;
                    z:short_name = "depth" ;
                    z:axis = "z" ;
                    z:units = "meters" ;
            float lat(lat) ;
                    lat:long_name = "Latitude" ;
                    lat:standard_name = "latitude" ;
                    lat:short_name = "lat" ;
                    lat:axis = "Y" ;
                    lat:units = "degrees_north" ;
            float lon(lon) ;
                    lon:long_name = "Longitude" ;
                    lon:standard_name = "longitude" ;
                    lon:short_name = "lon" ;
                    lon:axis = "X" ;
                    lon:units = "degrees_east" ;
            float temp(time, z, lat, lon) ;
                    temp:long_name = "Temperature" ;
                    temp:standard_name = "sea_water_temperature" ;
                    temp:short_name = "temp" ;
                    temp:units = "Celsius" ;
                    temp:coordinates = "time lat lon alt" ;
                    temp:valid_range = 10., 35. ;
                    temp:_FillValue = -999.f ;
                    temp:observation_type = "measured" ;
    

    File 2:

    netcdf wqb_1.6 {
    dimensions:
            time = UNLIMITED ; // (109008 currently)
            name_strlen = 5 ;
    variables:
            char station_name(name_strlen) ;
                    station_name:long_name = "wqbaw" ;
                    station_name:cf_role = "timeseries_id" ;
            float time(time) ;
                    time:long_name = "Time" ;
                    time:standard_name = "time" ;
                    time:short_name = "time" ;
                    time:axis = "T" ;
                    time:units = "minutes since 2008-01-01 00:00:00 -10:00" ;
            float z ;
                    z:long_name = "depth below mean sea level" ;
                    z:standard_name = "depth" ;
                    z:short_name = "depth" ;
                    z:axis = "z" ;
                    z:units = "meters" ;
            float lat ;
                    lat:long_name = "Latitude" ;
                    lat:standard_name = "latitude" ;
                    lat:short_name = "lat" ;
                    lat:axis = "Y" ;
                    lat:units = "degrees_north" ;
            float lon ;
                    lon:long_name = "Longitude" ;
                    lon:standard_name = "longitude" ;
                    lon:short_name = "lon" ;
                    lon:axis = "X" ;
                    lon:units = "degrees_east" ;
            float temp(time) ;
                    temp:long_name = "Temperature" ;
                    temp:standard_name = "sea_water_temperature" ;
                    temp:short_name = "temp" ;
                    temp:units = "Celsius" ;
                    temp:coordinates = "time lat lon alt" ;
                    temp:valid_range = 10., 35. ;
                    temp:_FillValue = -999.f ;
                    temp:observation_type = "measured" ;
    
    5 answers  |  last activity 12 years ago
        1
  •  3
  •   Rich Signell    12 years ago

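    UPDATE: The solution below appears to work, but it does not: extracting data from it fails, as John M. discovered (see the other answer). We thought we had found that keeping a singleton dimension was the solution, but going from four dimensions to one ultimately results in errors. As Sean A. points out, you cannot use NcML to change the shape of a variable.

    Original "solution" (does not actually work):

    If your goal is to make your data CF-1.6 compliant, you can keep a station dimension with a length of 1. So you could do this: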

    <?xml version="1.0" encoding="UTF-8"?>
    <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" location="/usgs/data/file1.nc">
      <remove type="dimension" name="lon"/>
      <remove type="dimension" name="lat"/>
      <remove type="dimension" name="z"/>
      <dimension name="station" length="1"/>
      <dimension name="name_strlen" length="20" />
      <variable name="lat" shape="station"/>
      <variable name="lon" shape="station"/>
      <variable name="z" shape="station"/>
      <variable name="temp" shape="time station"/>
      <variable name="site" shape="station name_strlen" type="char">
        <attribute name="standard_name" value="station_id" />
        <attribute name="cf_role" value="timeseries_id" />
        <values> my_station_001 </values>
      </variable>
      <attribute name="Conventions" value="CF-1.6" />
      <attribute name="featureType" value="timeSeries" />
    </netcdf>
    
        2
  •  3
  •   Sean A.    12 years ago

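    Rich's solution works, after a fashion, for this very specific case, but for the wrong reasons. In NcML you can remove dimension objects, but you cannot reshape data variables. In this particular case, where you are trying to remove singleton dimensions (size 1), things appear to work because nothing really changes about how the data are laid out on disk. For example, if you use Unidata's ToolsUI to do an ncdump of the temp variable through the NcML from Rich's answer, you will see that the singleton dimensions are still there and have not really been removed. I'm not sure what effect this will have on reading the file; I think it will depend on the client. If you tried to remove non-singleton dimensions, however, this would fail outright.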
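
    The point about on-disk layout can be illustrated with a small sketch. It assumes the usual row-major ordering of a netCDF variable's values (last dimension varying fastest); the helper names flat_offset and can_drop are made up for this illustration and are not part of any netCDF library.

```python
def flat_offset(index, shape):
    """Row-major offset of `index` within a logical array of `shape`."""
    offset = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError(f"index {i} out of range for length {n}")
        offset = offset * n + i
    return offset


def can_drop(shape, axes):
    """Axes can be dropped without moving any value only if each has length 1."""
    return all(shape[a] == 1 for a in axes)


# File 1 stores temp(time, z, lat, lon) with z = lat = lon = 1.
shape_4d = (109008, 1, 1, 1)
shape_1d = (109008,)

# Each 4-D element sits at the same offset as its 1-D counterpart.
same_layout = all(
    flat_offset((t, 0, 0, 0), shape_4d) == flat_offset((t,), shape_1d)
    for t in (0, 1, 500, 109007)
)
print(same_layout, can_drop(shape_4d, axes=(1, 2, 3)))

# Hypothetical non-singleton case: two depth levels cannot simply be dropped.
print(can_drop((109008, 2, 1, 1), axes=(1, 2, 3)))
```

    This only shows why the values themselves need not move when the dropped dimensions have length 1. It says nothing about how a particular client or server treats the rank recorded in the file.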

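    If you really want to reshape your data properly, you have to rewrite your netCDF file. Unfortunately, as far as I know, there is no shortcut. For example, if you use the NcML from Rich's answer in Unidata's ToolsUI and try to write out a new file based on it, you will get an error such as "Error: Number of ranges in section (1) must be = 0" for variable z. This is because the singleton dimension still exists in the netCDF file, while the NcML tries to force the number of ranges to 0. If you know Python, though, writing a script to rewrite the netCDF file should be fairly straightforward.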
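
    A rough sketch of such a rewrite script, assuming the third-party netCDF4 Python package is installed. The function names and the default list of dimensions to drop are illustrative choices, and the script has not been run against the files in the question.

```python
def kept_dims(var_dims, drop):
    """Dimension names of a variable once the dropped ones are removed."""
    return tuple(d for d in var_dims if d not in drop)


def rewrite_without_dims(src_path, dst_path, drop=("z", "lat", "lon")):
    """Copy src_path to dst_path, removing the length-1 dimensions in `drop`."""
    import netCDF4  # third-party package, imported here so kept_dims works without it

    with netCDF4.Dataset(src_path) as src, \
            netCDF4.Dataset(dst_path, "w", format=src.data_model) as dst:
        src.set_auto_maskandscale(False)  # copy raw stored values unchanged

        for name in drop:
            if len(src.dimensions[name]) != 1:
                raise ValueError(f"dimension {name!r} is not a singleton")

        # Global attributes and the dimensions that are kept.
        dst.setncatts({k: src.getncattr(k) for k in src.ncattrs()})
        for name, dim in src.dimensions.items():
            if name not in drop:
                dst.createDimension(name, None if dim.isunlimited() else len(dim))

        for name, var in src.variables.items():
            dims = kept_dims(var.dimensions, drop)
            attrs = {k: var.getncattr(k) for k in var.ncattrs()}
            fill = attrs.pop("_FillValue", None)  # can only be set at creation
            out = dst.createVariable(name, var.dtype, dims, fill_value=fill)
            out.set_auto_maskandscale(False)
            out.setncatts(attrs)
            # Dropping length-1 axes changes the shape, not the values.
            new_shape = tuple(len(src.dimensions[d]) for d in dims)
            out[:] = var[:].reshape(new_shape)


print(kept_dims(("time", "z", "lat", "lon"), ("z", "lat", "lon")))
```

    A call would look like rewrite_without_dims("wqb_1.4.nc", "wqb_1.6.nc"), where both file names are placeholders. The sketch only removes dimensions; the station_name variable and the Conventions and featureType global attributes shown in File 2 would still have to be added separately.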

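    Note that the ability to reshape variables with NcML is a feature request we hear regularly; you can send a request to support-netcdf-java@unidata.ucar.edu. Also note that Unidata is a community-driven organization and that Rich is a member of our Users Committee, which meets next month. I suggest he bring up this feature request at the meeting as well.

    Cheers,

    Sean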

        3
  •  2
  •   JimP    12 years ago

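    Rich is correct: the purpose here is to bring our data up to CF-1.6 so that we can serve it through SOS. More specifically, we want to use ncSOS (built on TDS), and this particular flavor of SOS requires CF-1.6. In that respect, the modifications via NcML appear to work (with a few extra changes; see below).

    I would rather not have to modify the datasets, some of which go back several years. Sean's point about client tools is also well taken, since many of our use cases involve tools that need the variables to have lat/lon dimensions. Our solution is therefore to serve one dataset through TDS with two NcML "wrappers": one for ncSOS, and another for the specific clients that access the data through OPeNDAP and need lat/lon.

    In addition to Rich's suggestions above, to get this working in ncSOS we had to:

    1. add the CF-1.6 "coordinates" attribute ("time lat lon")
    2. add the global attribute "featureType=timeSeries"
    3. add the station_name variable
    4. change the data type from Point to Station

    The result is as follows: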

        <attribute name="featureType" value="timeSeries" />
        <remove type="dimension" name="lon"/>
        <remove type="dimension" name="lat"/>
        <remove type="dimension" name="z"/>
        <dimension name="name_strlen" length="4"/>
        <variable name="lat" shape=""/>
        <variable name="lon" shape=""/>
        <variable name="z" shape=""/>
        <variable name="station_name" shape="name_strlen" type="char">
          <attribute name="long_name" value="NS01" />
          <attribute name="cf_role" value="timeseries_id" />
          <values>NS01</values>
        </variable>
        <variable name="temp" shape="time">
          <attribute name="coordinates" value="time lat lon z" />
        </variable>
    
        4
  •  2
  •   John Maurer    12 years ago

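    To follow up on Jim's post above: although Rich's NcML solution initially appeared to work, attempts to fetch data through OPeNDAP or ncSOS did not succeed, which confirms Sean's skepticism above.

    The catalog came up fine, and the OPeNDAP form showed the new CF-1.6 dimensions and the reshaped variables. The ncSOS GetCapabilities document also came up fine.

    However, trying to download data with the OPeNDAP form is where things go wrong. I cannot subset a variable on the OPeNDAP form. For example: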

    http://oos.soest.hawaii.edu/thredds-test/dodsC/hioos/nss/ns01/ns01_2012_02_23.nc.html

    If I try to get the first temp value with this URL:

    http://oos.soest.hawaii.edu/thredds-test/dodsC/hioos/nss/ns01/ns01_2012_02_23.nc.ascii?temp[0:1:0]

    it gives me an error:

    Error {
        code = 500;
        message = "NcSDArray InvalidRangeException=Number of ranges in section (1) must be = 4";
    };
    

    The only thing that succeeds is requesting all values:

    http://oos.soest.hawaii.edu/thredds-test/dodsC/hioos/nss/ns01/ns01_2012_02_23.nc.ascii?temp[0:1:359]
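
    The error message is consistent with the underlying variable still having four dimensions while the request supplies a single range. The sketch below only illustrates that count mismatch using the DAP2 hyperslab syntax [start:stride:stop]; dap_constraint is a hypothetical helper, and the four-range form was not tested against this server.

```python
def dap_constraint(var, ranges):
    """Build a DAP2 projection from (start, stride, stop) triples, one per dimension."""
    return var + "".join(f"[{start}:{stride}:{stop}]" for start, stride, stop in ranges)


first = (0, 1, 0)  # select only index 0 along a dimension

# One range, as in the failing request for the reshaped temp(time).
one_range = dap_constraint("temp", [first])

# Four ranges, one per dimension of the original temp(time, z, lat, lon).
four_ranges = dap_constraint("temp", [first] * 4)

print(one_range)
print(four_ranges)
```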

    Fetching data through an ncSOS GetObservation request also fails. Trying the following URL:

    http://oos.soest.hawaii.edu/thredds-test/sos/hioos/nss/ns01agg.ncml?service=SOS&version=1.0.0&request=GetObservation&responseFormat=text%2Fxml%3Bsubtype%3D%22om%2F1.0.0%22&offering=NS01&observedProperty=temp&procedure=urn:ioos:station:org.pacioos:NS01

    results in the following error messages in threddsServlet.log:

    2013-10-02T09:03:44.844 -1000 [1288472818][    2602] INFO  - threddsServlet - Remote host: 128.171.151.240 - Request: "GET /thredds-test/sos/hioos/nss/ns01agg.ncml?service=SOS&version=1.0.0&request=GetObs
    ervation&responseFormat=text%2Fxml%3Bsubtype%3D%22om%2F1.0.0%22&offering=NS01&observedProperty=temp&procedure=urn:ioos:station:org.pacioos:NS01 HTTP/1.1"
    2013-10-02T09:03:44.845 -1000 [1288472819][    2602] INFO  - com.asascience.ncsos.controller.SosController - Handling SOS metadata request.
    2013-10-02T09:03:45.614 -1000 [1288473588][    2602] ERROR - ucar.nc2.Structure - Structure.IteratorRank1.readNext()
    ucar.ma2.InvalidRangeException: Number of ranges in section (1) must be = 4
        at ucar.ma2.Section.fill(Section.java:144)
        at ucar.nc2.Variable.read(Variable.java:673)
        at ucar.nc2.Variable.read(Variable.java:647)
        at ucar.nc2.ncml.AggregationOuterDimension$DatasetOuterDimension.read(AggregationOuterDimension.java:774)
        at ucar.nc2.ncml.AggregationOuterDimension.reallyRead(AggregationOuterDimension.java:293)
        at ucar.nc2.dataset.VariableDS._read(VariableDS.java:533)
        at ucar.nc2.Variable.read(Variable.java:673)
        at ucar.nc2.dataset.VariableDS.reallyRead(VariableDS.java:553)
        at ucar.nc2.dataset.VariableDS._read(VariableDS.java:533)
        at ucar.nc2.Variable.read(Variable.java:673)
        at ucar.nc2.Variable.read(Variable.java:647)
        at ucar.nc2.dataset.StructurePseudoDS.reallyRead(StructurePseudoDS.java:193)
        at ucar.nc2.Variable._read(Variable.java:861)
        at ucar.nc2.Variable.read(Variable.java:673)
        at ucar.nc2.Variable.read(Variable.java:619)
        at ucar.nc2.Structure.readStructure(Structure.java:378)
        at ucar.nc2.Structure$IteratorRank1.readNext(Structure.java:464)
        at ucar.nc2.Structure$IteratorRank1.next(Structure.java:447)
        at ucar.nc2.ft.point.PointIteratorFromStructureData.nextStructureData(PointIteratorFromStructureData.java:103)
        at ucar.nc2.ft.point.PointIteratorFromStructureData.hasNext(PointIteratorFromStructureData.java:68)
        at ucar.nc2.ft.point.PointCollectionImpl.calcBounds(PointCollectionImpl.java:128)
        at com.asascience.ncsos.util.DatasetHandlerAdapter.calcBounds(DatasetHandlerAdapter.java:122)
        at com.asascience.ncsos.cdmclasses.TimeSeries.setData(TimeSeries.java:243)
        at com.asascience.ncsos.getobs.SOSGetObservationRequestHandler.setCDMDatasetForStations(SOSGetObservationRequestHandler.java:193)
        at com.asascience.ncsos.getobs.SOSGetObservationRequestHandler.<init>(SOSGetObservationRequestHandler.java:138)
        at com.asascience.ncsos.service.SOSParser.enhanceGETRequest(SOSParser.java:197)
        at com.asascience.ncsos.controller.SosController.handleSOSRequest(SosController.java:80)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.springframework.web.bind.annotation.support.HandlerMethodInvoker.invokeHandlerMethod(HandlerMethodInvoker.java:176)
        at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.invokeHandlerMethod(AnnotationMethodHandlerAdapter.java:440)
        at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.handle(AnnotationMethodHandlerAdapter.java:428)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:925)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:856)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:936)
        at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:827)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:812)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at thredds.servlet.filter.RequestPathFilter.doFilter(RequestPathFilter.java:102)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at thredds.server.RequestBracketingLogMessageFilter.doFilter(RequestBracketingLogMessageFilter.java:48)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:579)
        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:307)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
    2013-10-02T09:03:45.616 -1000 [1288473590][    2602] ERROR - com.asascience.ncsos.util.DatasetHandlerAdapter - Could not calculate the bounds of the PointFeatureCollection NS01
    Structure.Iterator.readNext()
    2013-10-02T09:03:45.616 -1000 [1288473590][    2602] ERROR - com.asascience.ncsos.cdmclasses.baseCDMClass - TimeSeries - setData; exception:
    java.lang.NullPointerException
    2013-10-02T09:03:45.616 -1000 [1288473590][    2602] ERROR - com.asascience.ncsos.service.SOSParser - java.lang.NullPointerException
    2013-10-02T09:03:45.617 -1000 [1288473591][    2602] ERROR - com.asascience.ncsos.controller.SosController -
    2013-10-02T09:03:45.817 -1000 [1288473791][    2602] INFO  - threddsServlet - Request Completed - 200 - -1 - 973:1
    
        5
  •  2
  •   John Caron    10 years ago

    NcML now (since version 4.4) has an operation for removing dimensions of length 1, for example:

    <variable name="temp">
      <logicalReduce dimNames="lat lon" />
    </variable>
    

    See

    http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/ncml/AnnotatedSchema4.html#logicalReduce
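
    For the file in the question, a complete wrapper might be generated as sketched below. This is an untested assumption about how the fragment above would be embedded: the namespace and location are taken from Rich's answer, dimNames is extended to "z lat lon", and logical_reduce_ncml is a hypothetical helper. Whether the z, lat, and lon coordinate variables need the same treatment is not covered here.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"


def logical_reduce_ncml(location, reductions):
    """Return NcML text applying logicalReduce to each variable in `reductions`.

    `reductions` maps a variable name to the length-1 dimension names to remove.
    """
    ET.register_namespace("", NCML_NS)  # serialize with a default xmlns
    root = ET.Element(f"{{{NCML_NS}}}netcdf", {"location": location})
    for var_name, dim_names in reductions.items():
        var = ET.SubElement(root, f"{{{NCML_NS}}}variable", {"name": var_name})
        ET.SubElement(
            var, f"{{{NCML_NS}}}logicalReduce", {"dimNames": " ".join(dim_names)}
        )
    return ET.tostring(root, encoding="unicode")


ncml = logical_reduce_ncml("/usgs/data/file1.nc", {"temp": ["z", "lat", "lon"]})
print(ncml)
```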