I recently worked on a project where event-driven architecture was used to distribute events to multiple consumers. During the project we spent some time thinking about how to document events easily and how to create a centralized repository for event schemas. In this blog post I share some thoughts about building an event catalog.
What is event-driven architecture?
In an event-driven architecture, components communicate with each other through events. An event is triggered when a significant change has happened in the application state. Event-driven architecture enables near real-time processing, decoupling of components and component-specific scaling.
Open-source tool for Documentation Site
The open-source project EventCatalog is a free documentation tool for event-driven architectures. Its built-in features let you show domain boundaries, event schemas, event samples and dependencies. Markdown syntax and Mermaid diagrams are also supported, so you can enrich the event documentation as much as you want.
EventCatalog is an advanced event documentation tool, but it doesn't provide an API for sharing event schemas with publishers and consumers. You need to create the API layer on your own.
You can find comprehensive instructions on how to install and configure EventCatalog here. In Azure you can host the EventCatalog static web site e.g. in Static Web Site, Blob Storage or App Service.
Some screenshots from EventCatalog
Event Schema viewer
Event sample viewer
Thoughts about Event Catalog implementation in our project
- One centralized repository for event schemas is required
- API or Client SDK is required to enable event publishers and consumers to fetch event schemas programmatically
- A documentation site is required to present event schemas, samples and diagrams of event publishers and consumers
Event documenting tool
The EventCatalog documentation tool provided so many great features and matched our needs so well that it was clear we would use it. Our plan is to host the EventCatalog site in Azure Blob Storage (Static Web Site). The capability to share event schemas via an API was the only thing missing, so we needed to consider other ways to fulfill this requirement.
Next I'll present the concepts we considered for the implementation during the discovery work.
Event Hubs is a fundamental part of the event-driven architecture in our project. During the project we noticed that a schema registry (repository) is actually already part of Event Hubs. This was great because we only needed to solve how to update schemas automatically in EventCatalog (documentation site) when a schema changed in the Event Hubs schema registry (master data source).
Note that the schema registry is available only in the Standard or higher pricing tiers of Event Hubs; the Basic tier does not include it.
EventCatalog uses the folder structure below, so schema files should be fairly easy to update into this structure.
│ ├── FeedbackCreated
│ │ └──versioned
│ │ │ └──0.0.1
│ │ │ └──Examples
│ │ │ │ └──example.json
│ │ │ └──index.md
│ │ │ └──schema.json
│ │ └──index.md
│ │ └──schema.json
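To illustrate how an updater could write files into this layout, here is a minimal Python sketch. The helper name `write_event_version` and its parameters are hypothetical, a local folder stands in for the real hosting location, and placing an `Examples` folder next to every `schema.json` is my reading of the tree above rather than something EventCatalog mandates.

```python
import json
from pathlib import Path
from typing import Optional


def write_event_version(catalog_root: Path, event_name: str, schema: dict,
                        example: dict, version: Optional[str] = None) -> Path:
    """Write schema.json and Examples/example.json for one event.

    With version=None the files go to the event's top-level folder;
    otherwise they go under versioned/<version>/ (hypothetical helper).
    """
    target = catalog_root / event_name
    if version is not None:
        target = target / "versioned" / version
    examples_dir = target / "Examples"
    examples_dir.mkdir(parents=True, exist_ok=True)

    (target / "schema.json").write_text(
        json.dumps(schema, indent=2), encoding="utf-8")
    (examples_dir / "example.json").write_text(
        json.dumps(example, indent=2), encoding="utf-8")
    return target
```

Calling it once without a version and once with `version="0.0.1"` would produce both the top-level files and the `versioned/0.0.1` copies for an event such as FeedbackCreated.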
Event Hubs has a great Client SDK / API for the schema registry, which solves the requirement of fetching schemas programmatically. You can find samples of how to use it here.
Our first thought was that perhaps the schema registry in Event Hubs supports Event Grid for distributing events, e.g. when a schema is updated. An Azure Function could then subscribe to those events, fetch the new schema from the schema registry via the Client SDK and update it in the EventCatalog site's folder structure. Unfortunately, CaptureFileCreated was the only supported event type.
So this wasn't a feasible solution.
In this second iteration the main goal was still to use the schema registry of Event Hubs as the master source for schemas, but schemas were updated to EventCatalog (documentation site) periodically with an Azure Function. The Azure Function was responsible for fetching the schema data from the Schema Registry via the Client SDK and then updating the schemas in Blob Storage, where EventCatalog (documentation site) is hosted.
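The periodic update could look roughly like the Python sketch below. It is only an outline under several assumptions: `sync_schemas` and `fetch_schema` are hypothetical names, `fetch_schema` stands in for a call to the Schema Registry Client SDK, a local folder stands in for Blob Storage, and the list of schema names has to be supplied by the caller.

```python
from pathlib import Path
from typing import Callable, Iterable


def sync_schemas(schema_names: Iterable[str],
                 fetch_schema: Callable[[str], str],
                 catalog_root: Path) -> list:
    """Copy each named schema into <catalog_root>/<name>/schema.json.

    Returns the names whose file was created or changed in this run,
    so an unchanged schema does not trigger a rewrite.
    """
    updated = []
    for name in schema_names:
        definition = fetch_schema(name)  # stand-in for the registry client
        target = catalog_root / name / "schema.json"
        current = target.read_text(encoding="utf-8") if target.exists() else None
        if definition != current:
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(definition, encoding="utf-8")
            updated.append(name)
    return updated
```

Running this on a schedule keeps the documentation site eventually consistent with the registry; the returned list could be logged to see which schemas changed between runs.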
Pros
- Schema Registry is a built-in feature of Event Hubs, so custom development is not required
- No need to implement a separate schema registry API because the Event Hubs Client SDK / API enables programmatic access to schemas
- The schema reader / updater could potentially create event samples automatically while updating schemas
Cons
- The Schema Registry client in the Azure.Data.SchemaRegistry NuGet package currently doesn't support fetching all schemas at once. You need to know the name of a schema to retrieve it. This is not a complete showstopper but requires some extra work.
- Schema Registry supports the Avro schema format, but JSON support is still in preview.
- If you don't use Event Hubs in your system architecture, you need to provision it separately to get access to Schema Registry, which generates some small extra costs per month.
- The schema reader / updater Azure Function requires logic to determine which publisher / consumer service is using the event. This is defined in the event-specific index.md file.
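The automatic sample creation mentioned in the list above could be approached roughly like this. The Python sketch walks a parsed Avro schema and fills in placeholder values; the function name and the placeholder choices are my own, and it deliberately ignores named type references and the `fixed` type, so treat it as a starting point rather than a complete Avro implementation.

```python
# Placeholder values per Avro primitive type (illustrative choices).
_PLACEHOLDERS = {
    "null": None, "boolean": False, "int": 0, "long": 0,
    "float": 0.0, "double": 0.0, "string": "string", "bytes": "",
}


def sample_from_avro(schema):
    """Build a placeholder value for a parsed Avro schema (dict, list or str)."""
    if isinstance(schema, str):      # primitive type name
        return _PLACEHOLDERS[schema]
    if isinstance(schema, list):     # union: use the first non-null branch
        branches = [s for s in schema if s != "null"]
        return sample_from_avro(branches[0]) if branches else None
    kind = schema["type"]
    if kind == "record":
        return {f["name"]: sample_from_avro(f["type"]) for f in schema["fields"]}
    if kind == "array":
        return [sample_from_avro(schema["items"])]
    if kind == "map":
        return {"key": sample_from_avro(schema["values"])}
    if kind == "enum":
        return schema["symbols"][0]
    return sample_from_avro(kind)    # wrapped type, e.g. {"type": "string"}
```

The result can be serialized with `json.dumps` and stored as the event's example file next to its schema.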
This is a possible solution, but let's iterate a bit more.
Combining the Event Hubs schema registry and the open-source documentation site EventCatalog was a bit too complex, as stated in the previous iteration, so we decided to iterate further.
In this approach the Event Hubs schema registry was removed and EventCatalog in Blob Storage is the master data source for event schema data. This lets us use the Blob Storage Client SDK / API to fetch schema data from Blob Storage. To make fetching schemas as easy as possible for publishers and consumers, a custom library should be developed. Our publishers and consumers were different kinds of systems, so this would require extra work.
With this approach you can maintain event schemas and event samples in the source control of EventCatalog, and a CI/CD pipeline deploys everything to Blob Storage.
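A custom reader library for this approach might be as small as the Python sketch below. `SchemaReader` and its methods are hypothetical names, the `download` callable stands in for whatever actually reads a blob (for example a thin wrapper around the Blob Storage Client SDK), and the paths are assumed to be relative to the folder that holds the event folders.

```python
import json
from typing import Callable, Dict, Optional


class SchemaReader:
    """Resolves and caches event schemas stored in the EventCatalog layout."""

    def __init__(self, download: Callable[[str], bytes]):
        self._download = download           # injected blob reader (assumption)
        self._cache: Dict[str, dict] = {}

    @staticmethod
    def blob_path(event_name: str, version: Optional[str] = None) -> str:
        """Latest schema lives at the event root, older ones under versioned/."""
        if version is None:
            return f"{event_name}/schema.json"
        return f"{event_name}/versioned/{version}/schema.json"

    def get_schema(self, event_name: str, version: Optional[str] = None) -> dict:
        path = self.blob_path(event_name, version)
        if path not in self._cache:         # download each schema only once
            self._cache[path] = json.loads(self._download(path))
        return self._cache[path]
```

Injecting the download function keeps the path and caching logic testable without a storage account, and the same small surface would have to be reimplemented for each publisher or consumer platform.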
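Since Blob Storage becomes the master source here, it may be worth letting the pipeline check the schemas before deploying them. The Python sketch below is one possible check, not something we had in place: the function name is hypothetical, and it assumes every event folder directly under the given root should contain a `schema.json` that parses as JSON.

```python
import json
from pathlib import Path


def find_schema_problems(events_root: Path) -> list:
    """Return one message per event folder with a missing or invalid schema.json."""
    problems = []
    for event_dir in sorted(p for p in events_root.iterdir() if p.is_dir()):
        schema_file = event_dir / "schema.json"
        if not schema_file.is_file():
            problems.append(f"{event_dir.name}: schema.json is missing")
            continue
        try:
            json.loads(schema_file.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(
                f"{event_dir.name}: schema.json is not valid JSON ({exc.msg})")
    return problems
```

A pipeline step could call this on the events folder and fail the build when the returned list is not empty.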
Sample event specific index.md file
FeedbackProcessed event contains raw event data and enriched sentiment data provided by Azure Cognitive Service.
- Feedback Processor
- Feedback Subscriber
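To show how such a file could be produced by a script, here is a small Python sketch that renders an index.md with YAML front matter. Several things are assumptions: the front matter field names (`name`, `version`, `summary`, `producers`, `consumers`) are my recollection of EventCatalog's format and should be checked against its documentation, the function name is hypothetical, and which service is listed as producer or consumer is up to the caller.

```python
from typing import Iterable


def render_index_md(name: str, version: str, summary: str,
                    producers: Iterable[str], consumers: Iterable[str],
                    body: str = "") -> str:
    """Render an event index.md: YAML front matter followed by Markdown body."""
    def yaml_list(items):
        return "".join(f"  - {item}\n" for item in items)

    # A block scalar keeps colons and line breaks in the summary safe.
    summary_block = "".join(f"  {line}\n" for line in summary.splitlines())
    return (
        "---\n"
        f"name: {name}\n"
        f"version: {version}\n"
        "summary: |\n" + summary_block +
        "producers:\n" + yaml_list(producers) +
        "consumers:\n" + yaml_list(consumers) +
        "---\n" + body
    )
```

For the sample above, the summary would be the FeedbackProcessed description and the two services would go into the producer and consumer lists.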
Pros
- You can maintain event schemas and event samples in the source control of EventCatalog
- Simpler architecture. No need to replicate event schemas from another place because everything is in one place (Blob Storage)
- You don't need Event Hubs if it's not used in your architecture
Cons
- Requires more development effort
- Different types of publisher and consumer systems each require their own Schema Reader library, which increases maintenance
It was pretty difficult to fully automate the Event Hubs Schema Registry to work with the EventCatalog documentation site. Of these options I would choose the solution presented in iteration three, where the schema registry (repository) is in Blob Storage, because the solution is simpler. If you don't need advanced event documentation, then the Schema Registry of Event Hubs is a good option for you.