
Choosing the right Azure Integration Service based on Performance Benchmarking

Performance benchmarking of Azure integration services (Event Hub, Service Bus, Data Factory) to help you choose the right tool for bulk operations.

12 February 2020 | 4 min read | azure, performance, service-bus, data-factory, event-hub

When it comes to integrating different applications on the Azure platform, Microsoft provides many different tools (such as Event Hub, Event Grid, Service Bus, Storage Queues, IoT Hub, Azure Data Factory, and API Management). Their purposes are often confusing because the names and terminology seem very similar. Sometimes it's hard even for me to remember which service does what, so I decided to take matters into my own hands and build a POC with each of them. I evaluated them on the following two criteria.

  1. Exploration of Features and Limitations
  2. Performance benchmarking

In today's post I am only going to cover performance benchmarking (point 2) to keep the discussion short. I will be posting a series of YouTube videos to cover point 1.

Case Study Background

As part of the performance benchmarking case study, I am going to present the statistics for transferring 50,000 Sales Order records from one place to another.

Services Compared

To limit the size of the case study, I am benchmarking the following 4 services.

  • Azure Event Hub
  • Azure Service Bus
  • Azure Data Factory (with CopyData Pipeline)
  • Azure Data Factory (with Data Flow)

Case Study Method

The test loads 50,000 Sales Order records and, on the output end, saves them in an Azure SQL Database (General Purpose Serverless with a minimum of 0.5 vCores and a maximum of 4 vCores).
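The measurement itself can be sketched as a small timing harness. This is only a minimal sketch and not the code behind the numbers in this post: `make_orders`, `time_transfer`, the record fields, and the in-memory sink are all hypothetical stand-ins for the real producer and the Azure SQL table.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple


def make_orders(count: int) -> List[Dict]:
    """Build `count` synthetic sales orders (the fields are assumed, not the real schema)."""
    return [
        {"order_id": i, "customer_id": i % 1000, "amount": round(10 + (i % 500) * 0.25, 2)}
        for i in range(1, count + 1)
    ]


def time_transfer(records: Iterable[Dict], send: Callable[[Dict], None]) -> Tuple[int, float]:
    """Send every record through `send` and return (records sent, elapsed seconds)."""
    start = time.perf_counter()
    sent = 0
    for record in records:
        send(record)
        sent += 1
    return sent, time.perf_counter() - start


# Stand-in sink: a plain list instead of Event Hub, Service Bus, or Azure SQL.
sink = []
sent, elapsed = time_transfer(make_orders(50_000), sink.append)
print(f"Transferred {sent} records in {elapsed:.3f} seconds")
```

Swapping `sink.append` for a function that actually publishes or inserts a record would time a real transfer in the same way.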

Case Study Results

Service Name               | Input                     | Output          | Time Taken
---------------------------|---------------------------|-----------------|------------------------------------------
Event Hub                  | Custom .NET Core producer | Azure SQL Table | 18 seconds
Service Bus                | Custom .NET Core producer | Azure SQL Table | 135 seconds
Data Factory with CopyData | CSV on Storage Blob       | Azure SQL Table | 6 to 10 minutes (varied every time)
Data Factory with DataFlow | CSV on Storage Blob       | Azure SQL Table | 13 seconds (with optimised partitioning)
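To compare the services on a common scale, the times above can be converted into records per second. The short sketch below does only that arithmetic on the reported figures; the `throughput` helper and the dictionary layout are my own illustration.

```python
RECORDS = 50_000

# Time taken per service in seconds, as (best, worst), from the results table.
# Copy Data varied between 6 and 10 minutes, so it is kept as a range.
timings = {
    "Event Hub": (18, 18),
    "Service Bus": (135, 135),
    "Data Factory with CopyData": (6 * 60, 10 * 60),
    "Data Factory with DataFlow": (13, 13),
}


def throughput(records, seconds):
    """Average records transferred per second."""
    return records / seconds


for service, (best, worst) in timings.items():
    fastest = throughput(RECORDS, best)
    slowest = throughput(RECORDS, worst)
    if best == worst:
        print(f"{service}: about {fastest:.0f} records/second")
    else:
        print(f"{service}: about {slowest:.0f} to {fastest:.0f} records/second")
```

That works out to roughly 2,778 records/second for Event Hub, 370 for Service Bus, 83 to 139 for Copy Data, and 3,846 for Data Flow.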

Case Study Conclusion

As mentioned above, the case study was designed with bulk operations in mind.

Event Hub: If the nature of your integration is event-like, you should certainly go for Event Hub, as it's built to handle events at huge scale. Keep in mind, though, that the consumers are serial in nature and it's hard to have multiple consumers or subscribers on the receiving end.

Service Bus: Not as fast as Event Hub, but Service Bus is very useful when you need multiple subscribers (you can use Topics); that way you can increase the number of subscribers to improve parallelism. It's also a good idea to use Service Bus if your use case involves workflows such as sagas.

Data Factory with Copy Data: If you are simply dumping data from source to destination, Data Factory is built for the job, and the scale of data it can manage is huge, as it's optimized for Big Data and data warehouse applications. Note, however, that in this 50,000-record test Copy Data was the slowest option, taking 6 to 10 minutes.

Data Factory with Data Flow: This is the best option when you need customisation in your data pipelines, for example if you want to filter, sort, merge, or cleanse data. That's why it's called Data Flow: you can define complex ETL-like flows and pipelines of data.
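To make the cleanse, filter, and sort steps concrete, here is a rough sketch in plain Python. It is not Data Factory code and only mimics the kind of transformations a Data Flow chains together; the column names, sample rows, and the `cleanse` and `run_flow` helpers are made up for illustration.

```python
import csv
import io

# Hypothetical CSV input; the column names and rows are made up for illustration.
RAW_CSV = """order_id,customer,amount
3, alice ,120.50
1,BOB,75.00
2,carol,
4,dave,-10.00
"""


def cleanse(row):
    """Trim whitespace and normalise the customer name to title case."""
    return {
        "order_id": int(row["order_id"]),
        "customer": row["customer"].strip().title(),
        "amount": row["amount"].strip(),
    }


def run_flow(raw_csv):
    """Cleanse, filter, and sort the rows, in that order."""
    rows = csv.DictReader(io.StringIO(raw_csv))
    cleansed = (cleanse(row) for row in rows)
    # Filter: drop rows with a missing or non-positive amount.
    valid = [
        {**row, "amount": float(row["amount"])}
        for row in cleansed
        if row["amount"] and float(row["amount"]) > 0
    ]
    # Sort: order the surviving rows by order_id.
    return sorted(valid, key=lambda row: row["order_id"])


for order in run_flow(RAW_CSV):
    print(order)
```

Only orders 1 and 3 survive here: order 2 has no amount and order 4 has a negative one.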

Ending Remarks

It goes without saying that these benchmarks are very preliminary. Before choosing one of these services, you should also consider other factors (client capability, performance requirements, data requirements, cost, etc.). Still, they give you an initial comparison.