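Validating an Avro schema that references another schema

Time: 2021-06-09 10:38:38

Tags: python python-3.x avro

I am using the Python 3 avro_validator library.

The schema I want to validate references other schemas that live in separate Avro files. All of the files are in the same folder. How can I get the library to compile all of the referenced schemas?

The Python code is as follows: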

from avro_validator.schema import Schema

schema_file = 'basketEvent.avsc'

schema = Schema(schema_file)
parsed_schema = schema.parse()

data_to_validate = {"test": "test"}

parsed_schema.validate(data_to_validate)

The error I get back:

ValueError: Error parsing the field [contentBasket]: The type [ContentBasket] is not recognized by Avro

And the sample Avro files are below:

basketEvent.avsc

{
  "type": "record",
  "name": "BasketEvent",
  "doc": "Indicates that a user action has taken place with a basket",
  "fields": [
    {
      "default": "basket",
      "doc": "Restricts this event to having type = basket",
      "name": "event",
      "type": {
        "name": "BasketEventType",
        "symbols": ["basket"],
        "type": "enum"
      }
    },
    {
      "default": "create",
      "doc": "What is being done with the basket. Note: create / delete / update will always follow a product event",
      "name": "action",
      "type": {
        "name": "BasketEventAction",
        "symbols": ["create","delete","update","view"],
        "type": "enum"
      }
    },
    {
      "default": "ContentBasket",
      "doc": "The set of values that are specific to a Basket event",
      "name": "contentBasket",
      "type": "ContentBasket"
    },
    {
      "default": "ProductDetail",
      "doc": "The set of values that are specific to a Product event",
      "name": "productDetail",
      "type": "ProductDetail"
    },
    {
      "default": "Timestamp",
      "doc": "The time stamp for the event being sent",
      "name": "timestamp",
      "type": "Timestamp"
    }
  ]
}

contentBasket.avsc

{
  "name": "ContentBasket",
  "type": "record",
  "doc": "The set of values that are specific to a Basket event",
  "fields": [
    {
      "default": [],
      "doc": "A range of details about product / basket availability",
      "name": "availability",
      "type": {
        "type": "array",
        "items": "Availability"
      }
    },
    {
      "default": [],
      "doc": "A range of care plans applicable to the basket",
      "name": "carePlan",
      "type": {
        "type": "array",
        "items": "CarePlan"
      }
    },
    {
      "default": "Category",
      "name": "category",
      "type": "Category"
    },
    {
      "default": "",
      "doc": "Unique identifier for this basket",
      "name": "id",
      "type": "string"
    },
    {
      "default": "Price",
      "doc": "Overall pricing info about the basket as a whole - individual product pricings will be dealt with at a product level",
      "name": "price",
      "type": "Price"
    }
  ]
}

availability.avsc

{
  "name": "Availability",
  "type": "record",
  "doc": "A range of values relating to the availability of a product",
  "fields": [
    {
      "default": [],
      "doc": "A list of offers associated with the overall basket - product level offers will be dealt with on an individual product basis",
      "name": "shipping",
      "type": {
        "type": "array",
        "items": "Shipping"
      }
    },
    {
      "default": "",
      "doc": "The status of the product",
      "name": "stockStatus",
      "type": {
        "name": "StockStatus",
        "symbols": ["in stock","out of stock",""],
        "type": "enum"
      }
    },
    {
      "default": "",
      "doc": "The ID for the store when the stock can be collected, if relevant",
      "name": "storeId",
      "type": "string"
    },
    {
      "default": "",
      "doc": "The status of the product",
      "name": "type",
      "type": {
        "name": "AvailabilityType",
        "symbols": ["collection","shipping",""],
        "type": "enum"
      }
    }
  ]
}

maxDate.avsc

{
  "type": "record",
  "name": "MaxDate",
  "doc": "Indicates the timestamp for latest day a delivery should be made",
  "fields": [
    {
      "default": "Timestamp",
      "doc": "The time stamp for the delivery",
      "name": "timestamp",
      "type": "Timestamp"
    }
  ]
}

minDate.avsc

{
  "type": "record",
  "name": "MinDate",
  "doc": "Indicates the timestamp for earliest day a delivery should be made",
  "fields": [
    {
      "default": "Timestamp",
      "doc": "The time stamp for the delivery",
      "name": "timestamp",
      "type": "Timestamp"
    }
  ]
}

shipping.avsc

{
  "name": "Shipping",
  "type": "record",
  "doc": "A range of values relating to shipping a product for delivery",
  "fields": [
    {
      "default": "MaxDate",
      "name": "maxDate",
      "type": "MaxDate"
    },
    {
      "default": "MinDate",
      "name": "minDate",
      "type": "MinDate"
    },
    {
      "default": 0,
      "doc": "Revenue generated from shipping - note, once a specific shipping object is selected, the more detailed revenue data sits within the one of object in pricing - this is more just to define if shipping is free or not",
      "name": "revenue",
      "type": "int"
    },
    {
      "default": "",
      "doc": "The shipping supplier",
      "name": "supplier",
      "type": "string"
    }
  ]
}

timestamp.avsc

{
  "name": "Timestamp",
  "type": "record",
  "doc": "Timestamp for the action taking place",
  "fields": [
    {
      "default": 0,
      "name": "timestampMs",
      "type": "long"
    },
    {
      "default": "",
      "doc": "Timestamp converted to a string in ISO format",
      "name": "isoTimestamp",
      "type": "string"
    }
  ]
}

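1 answer:

Answer 0: (score: 0)

I'm not sure whether that library supports what you are trying to do, but fastavro should.

If you put the first schema in a file named BasketEvent.avsc and the second schema in a file named ContentBasket.avsc, then you can do the following: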

from fastavro.schema import load_schema
from fastavro import validate

schema = load_schema("BasketEvent.avsc")
validate({"test": "test"}, schema)

Note that when I tried this, I got a fastavro._schema_common.UnknownType: Availability error, because it seems you have not posted the other referenced schemas here.
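
An UnknownType error like that usually means a referenced type has no matching file. fastavro's documentation describes load_schema as recursively loading referenced schemas from files in the same directory that are named after the type (<full_name>.avsc), so the file name has to match the type name exactly, including case. The following is a minimal, standard-library-only sketch for checking a folder before calling load_schema. The helper names referenced_names and missing_schema_files are hypothetical (they are not part of fastavro or avro_validator), and the sketch assumes schemas without namespaces.

```python
import json
from pathlib import Path

# Avro primitive type names; any other string in a type position is
# treated as a reference to a named type (record, enum or fixed).
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


def referenced_names(schema, defined):
    """Yield type names that a schema refers to but has not defined so far.

    Simplification: namespaces are ignored, names are compared as written.
    """
    if isinstance(schema, str):
        if schema not in PRIMITIVES and schema not in defined:
            yield schema
    elif isinstance(schema, list):  # a union is a JSON array of schemas
        for branch in schema:
            yield from referenced_names(branch, defined)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind in ("record", "error", "enum", "fixed"):
            defined.add(schema["name"])  # inline definition of a named type
        if kind in ("record", "error"):
            for field in schema.get("fields", []):
                yield from referenced_names(field["type"], defined)
        elif kind == "array":
            yield from referenced_names(schema["items"], defined)
        elif kind == "map":
            yield from referenced_names(schema["values"], defined)
        elif kind not in ("enum", "fixed"):
            # e.g. {"type": "Timestamp"} or a nested {"type": {...}}
            yield from referenced_names(kind, defined)


def missing_schema_files(schema_path):
    """Return referenced type names that have no <Name>.avsc next to schema_path."""
    root = Path(schema_path)
    folder = root.parent
    # Compare against the directory listing so the check stays case-sensitive
    # even on case-insensitive file systems.
    present = {p.name for p in folder.iterdir()}
    missing, seen, todo = [], set(), [root]
    while todo:
        schema = json.loads(todo.pop().read_text())
        for name in referenced_names(schema, set()):
            if name in seen:
                continue
            seen.add(name)
            file_name = f"{name}.avsc"
            if file_name in present:
                todo.append(folder / file_name)  # follow the reference
            else:
                missing.append(name)
    return sorted(missing)
```

Calling missing_schema_files("BasketEvent.avsc") in the schema folder lists every type that still needs its own file. With the file names shown in the question (contentBasket.avsc, availability.avsc, and so on), the lower-case first letters would be reported as missing, because the types are named ContentBasket, Availability, and so on. Once the list comes back empty, the load_schema call above should be able to resolve every reference.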