In my last post on cost optimization, we discussed a few approaches. Here is the link: http://bitly.ws/D6Pc
In this post, we will dig deeper into one of the most popular cost estimation techniques, known as "back-of-the-envelope calculations".
Most modern systems we design have the following major components:
1. Applications -> run on VMs or a cluster such as GKE
2. Databases -> SQL, NoSQL, etc.
3. Cache layer -> Redis, Memcached, etc.
4. Message queues -> Kafka, Google Pub/Sub, etc.
5. Networking layer
So how do we do these estimates? What are the prerequisites?
Let's start with the basic refreshers.
Remember "BKMGTP" -- these prefixes will be used for data size estimations.
B : Byte : the base unit
K : Kilo : Thousand : 10^3 -> ~2^10
M : Mega : Million : 10^6 -> ~2^20
G : Giga : Billion : 10^9 -> ~2^30
T : Tera : Trillion : 10^12 -> ~2^40
P : Peta : Quadrillion : 10^15 -> ~2^50
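To see how these prefixes are used in practice, here is a minimal sketch of a data-size estimate. The workload (10 million users, ~2 KB of profile data each) is a hypothetical example, not from the post:

```python
# Back-of-envelope data-size estimate using the prefixes above.
# Assumed workload: 10 million users, ~2 KB of profile data each.
KB = 10**3
GB = 10**9

users = 10_000_000          # 10 million (the "M" prefix)
bytes_per_user = 2 * KB     # ~2 KB per user profile

total = users * bytes_per_user
print(total // GB, "GB")    # 10 million x 2 KB = 20 GB
```

Multiplying a count by a per-item size and reducing to the nearest prefix is the core move in almost every storage estimate.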
Latency numbers:
1 ms -> 10^-3 seconds
1µs -> 10^-6 seconds
1ns -> 10^-9 seconds
L1 cache reference -> 0.5 ns
Branch mispredict -> 5 ns
L2 cache reference -> 7 ns
Mutex lock/unlock -> 100 ns
Main memory reference -> 100 ns
Compress 1 KB with Zippy -> 10 µs
Send 2 KB over 1 Gbps network -> 20 µs
Read 1 MB sequentially from memory -> 250 µs
Round trip within the same data center -> 500 µs
Disk seek -> 10 ms
Read 1 MB sequentially from the network -> 10 ms
Read 1 MB sequentially from disk -> 30 ms
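These per-MB numbers scale linearly, which is exactly what makes them useful for quick estimates. As a sketch (assuming 1 GB ~= 1000 MB for rough arithmetic), here is the time to read 1 GB sequentially from memory versus disk:

```python
# Back-of-envelope: time to read 1 GB sequentially,
# scaling the per-MB latency numbers from the table above.
READ_1MB_MEMORY_US = 250    # 250 µs per MB from memory
READ_1MB_DISK_MS = 30       # 30 ms per MB from disk

mb_per_gb = 1000            # rough: 1 GB ~= 1000 MB

memory_ms = mb_per_gb * READ_1MB_MEMORY_US / 1000   # µs -> ms
disk_s = mb_per_gb * READ_1MB_DISK_MS / 1000        # ms -> s

print(f"1 GB from memory: ~{memory_ms:.0f} ms")     # ~250 ms
print(f"1 GB from disk:   ~{disk_s:.0f} s")         # ~30 s
```

The takeaway: sequential disk reads are roughly two orders of magnitude slower than memory, which is why a cache layer shows up in almost every design.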
In the next post, we will discuss component-wise estimation. In the meantime, please review the numbers above.
Happy learning :)