With traditional architectures, it’s quite hard to counter challenges imposed by real-time streaming data – one such use case is joining streams of data from disparate sources. For example, think about a system that accepts processed orders from customers (real time, high velocity data source) and the requirement is to enrich these “raw” orders with additional customer info such as name, email, location etc. A possible solution is to build a service that fetches customer data for each customer ID from an external system (for example, a database), perform a join (in-memory) and stores the enriched data in another database perhaps (materialized view). This has several problems though and one of them is not being able to keep up (process with low latency) with a high volume data.
[Read More]