
Fundamentals of Software Architecture for Big Data

The course is intended for individuals looking to understand the basics of software engineering as they relate to building large software systems that leverage big data. You will be introduced to the software engineering concepts necessary to build and scale large, data-intensive, distributed systems. Starting with software engineering best practices and loosely coupled, highly cohesive data microservices, the course takes you through the evolution of a distributed system over time.


Download the codebase

The milk problem continued

An example architecture for managing product inventory that highlights event collaboration with RabbitMQ.

History

The milk problem first surfaced while we were working with a well-known grocery chain to track product inventory in real time. The choice of database was largely driven by a non-trivial performance requirement. The initial solution used an eventually consistent database that was available and partition tolerant. Read about the CAP theorem to learn more about the relationship between consistency, availability, and partition tolerance.

The challenge is that, when the network partitions, high availability comes at the cost of consistency. Highly available databases are often eventually consistent, and without transactional isolation they are prone to dirty reads: a read in one transaction observes uncommitted changes made by another transaction. As a result, the grocery chain was unable to produce an accurate count of milk on the shelves.
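To make the dirty read concrete, here is a minimal in-memory sketch in Python. It is not the workshop's code and does not use a real database. The `Shelf` class and its methods are hypothetical names invented for illustration, and the model tracks only a single open transaction.

```python
class Shelf:
    """Toy single-row store (hypothetical) with one committed and one pending value."""

    def __init__(self, quantity):
        self.committed = quantity  # value visible to isolated readers
        self.pending = None        # uncommitted write from an open transaction

    def write(self, quantity):
        # An open transaction stages a change without committing it.
        self.pending = quantity

    def commit(self):
        # Make the staged change durable and visible to everyone.
        if self.pending is not None:
            self.committed = self.pending
            self.pending = None

    def rollback(self):
        # Discard the staged change.
        self.pending = None

    def dirty_read(self):
        # No isolation: the reader sees uncommitted data if any exists.
        return self.pending if self.pending is not None else self.committed

    def committed_read(self):
        # Isolated read: the reader sees only committed data.
        return self.committed


shelf = Shelf(quantity=10)

# Transaction A starts a purchase of 3 cartons but has not committed yet.
shelf.write(shelf.committed_read() - 3)

# Transaction B reads while A is still open.
dirty = shelf.dirty_read()
clean = shelf.committed_read()

# Transaction A fails and rolls back, so the purchase never happened.
shelf.rollback()
actual = shelf.committed_read()

print(f"dirty read: {dirty}, committed read: {clean}, actual: {actual}")
```

The dirty reader reports a count that never existed on the shelf, because the purchase it observed was rolled back. The committed reader agrees with the final state.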

The exercise below introduces transactions while highlighting the challenges of dirty reads. We then move to event collaboration with RabbitMQ and the challenges that come with messaging systems.
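As a rough illustration of event collaboration and one common messaging challenge, the sketch below uses Python's standard-library `queue.Queue` as a stand-in for a RabbitMQ queue. The event shape (`id`, `product`, `quantity`) and the `apply_events` function are assumptions made for this example and are not taken from the workshop codebase. The challenge shown, duplicate delivery, is typical of at-least-once messaging and may or may not be the one the exercise targets.

```python
import queue


def apply_events(events, inventory, seen_ids):
    """Drain the queue, applying each purchase event at most once."""
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            break
        if event["id"] in seen_ids:
            # A redelivered message: skip it so the count is not decremented twice.
            continue
        seen_ids.add(event["id"])
        inventory[event["product"]] -= event["quantity"]
    return inventory


# Producers publish purchase events instead of updating the count directly.
events = queue.Queue()
events.put({"id": "e1", "product": "milk", "quantity": 2})
events.put({"id": "e2", "product": "milk", "quantity": 1})
events.put({"id": "e1", "product": "milk", "quantity": 2})  # simulated redelivery

# A single consumer applies the events one at a time.
inventory = apply_events(events, {"milk": 10}, set())
print(inventory)  # {'milk': 7}
```

Because a single consumer applies events sequentially, updates to the count do not interleave. Tracking the event ids already seen keeps the consumer idempotent when the same message arrives twice.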

The exercise

Start with the TODO - DIRTY READS items, then get the tests to pass!

Once you're done, continue to the TODO - MESSAGING items and get the tests to pass.

Here are a few links to supporting documentation.

Quick start

The steps below walk through the environment setup necessary to run the application in both local and production environments.

Install dependencies

  1. Install and start Docker.

  2. Run Docker Compose.

    docker-compose up
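Before running migrations, it can help to confirm that the containers are accepting connections. The following Python sketch is one hedged way to do that. The `port_is_open` helper is hypothetical. Port 5432 comes from the Flyway URLs used in the migration steps. Port 5672 is RabbitMQ's default AMQP port and is assumed here, since the Compose file may map it differently.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 5432 matches the Flyway URLs; 5672 is an assumption (RabbitMQ's default).
    for name, port in {"postgres": 5432, "rabbitmq": 5672}.items():
        state = "up" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

An open port only shows that something is listening. It does not guarantee that the database has finished initializing.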
    

Run migrations

  1. Migrate the test database with Flyway.

    FLYWAY_CLEAN_DISABLED=false flyway -user=milk -password=milk -url="jdbc:postgresql://localhost:5432/milk_test" -locations=filesystem:databases/milk clean migrate
    
  2. Migrate the development database with Flyway.

    FLYWAY_CLEAN_DISABLED=false flyway -user=milk -password=milk -url="jdbc:postgresql://localhost:5432/milk_development" -locations=filesystem:databases/milk clean migrate
    
  3. Populate development data with a product scenario.

    PGPASSWORD=milk psql -h'127.0.0.1' -Umilk -f applications/products-server/src/test/resources/scenarios/products.sql milk_development
    

Run tests

Use Gradle to run tests. You'll see a few failures at first.

    ./gradlew build

Run apps

  1. Use Gradle to run the products server.

    ./gradlew applications:products-server:run
    
  2. Use Gradle to run the simple client.

    ./gradlew applications:simple-http-client:run
    

Hope you enjoy the exercise!

Thanks,

The IC Team

© 2023 by Initial Capacity, Inc. All rights reserved.

A workshop by

Initial Capacity