How an API was redesigned to improve concurrency by ~50x.
⚠ ⚠ Warning : Unlike my regular blogs, this one is quite technical.
Intro
Concurrency : How many users can access your application at the same time.
In the technology world, concurrency is $$. Here is my guess …
DMarts, Myntra, Jabong would typically serve ~30,000 requests/second.
Payment Gateways would serve ~150,000 requests/second.
Google, Facebook, Amazon would serve ~500,000 requests/second.
See the correlation between higher concurrency and revenue? Concurrency also works as a show-off token for nerds. But that's for another time.
Fudr faced a problem beyond a certain concurrency level. Here is how we redesigned some APIs to scale it up.
The Need
A system where each day starts with a new counter, for each restaurant
Food ordering applications need to ‘reset’ the order count each day. For example, at a restaurant:
Day 1 :
Order 1, Order 2 ….…. Order 52
Day 2 :
Order 1, Order 2 ….…. Order 109
Day 3 :
Order 1, Order 2 ….…. Order 39
Therefore, the application needs a system where the “Restaurant Day” counter stays consistent under concurrent orders, and resets each day.
While the need is straightforward, the implementation is not.
Restaurant Day refers to each business day of every individual restaurant, e.g.
Cafe Coffee
Day 1 : New day for orders
Day 2 : New day for orders
Chai ki Dukaan
Day 1 : New day for orders
Day 2 : New day for orders
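The requirement boils down to a counter keyed by (restaurant, day). Here is a minimal in-memory Java sketch of the desired behaviour (class and method names are my own, for illustration only):

```java
import java.util.HashMap;
import java.util.Map;

// In-memory sketch of a per-restaurant, per-day order counter.
// Keyed by "restaurantId|date" so each restaurant's count resets daily.
class RestaurantDayCounter {
    private final Map<String, Integer> counters = new HashMap<>();

    public synchronized int nextOrderNumber(String restaurantId, String date) {
        String key = restaurantId + "|" + date;
        int next = counters.getOrDefault(key, 0) + 1;
        counters.put(key, next);
        return next;
    }
}
```

Of course, in-memory state is useless for a real multi-instance deployment; the counter has to live in the database, which is where the interesting problems begin.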
Tech requirements:
The counter must be atomic under concurrent requests, scoped to a single restaurant, and reset every day. The only place all of this can be guaranteed together is the DB.
Meaning a customized solution needs to be figured out.
Possible Solutions
A customized sequence can be generated in any one of the following ways. Let’s look at the pros and cons of each.
DB Sequences:
From Postgres’ documentation:
Sequence objects are special single-row tables. They are commonly used to generate unique identifiers for rows of a table. The sequence functions provide simple, multiuser-safe methods for obtaining successive sequence values from sequence objects
A DB sequence would be a good option. But our system uses MySQL, and MySQL does NOT support sequence objects. Hence, NOT an option.
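For contrast, the guarantee a sequence provides can be sketched in-process with Java’s AtomicLong (an analogy of my own, standing in for Postgres’ nextval(), not a DB feature):

```java
import java.util.concurrent.atomic.AtomicLong;

// What a DB sequence guarantees, sketched in-process: every caller gets a
// distinct, successive value, even under concurrent access.
class SequenceSketch {
    private final AtomicLong value = new AtomicLong(0);

    public long nextval() {
        return value.incrementAndGet(); // atomic: no two callers see the same value
    }
}
```

Note that even with real sequences, one sequence per (restaurant, day) would be needed to satisfy the daily-reset requirement.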
Increment + Re-calibrate :
This approach suggests that we:
Day 1 :
Order 1, Order 2 ….…. Order 52
Day 1 ended on 52.
Day 2 :
Since Day 1 ended on raw number 52, subtract an offset of 52 from each subsequent raw number. The order number still increments:
53 – 52 = Order 1
54 – 52 = Order 2
….
160 – 52 = Order 108
Day 3 :
Since Day 2 ended on raw number 160 (Order 108), subtract 160 from each raw number:
161 – 160 = Order 1
162 – 160 = Order 2
Meaning: this approach adds a never-ending computational ‘tech debt’. Every single order, forever, pays the recalibration cost, and a daily job must record each restaurant’s closing count. Therefore it is a poor approach.
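The recalibration scheme above can be sketched as follows (an illustrative sketch of mine, not production code), which makes the recurring cost visible:

```java
import java.util.HashMap;
import java.util.Map;

// Increment + re-calibrate, sketched: an ever-growing raw counter per
// restaurant, plus an offset recorded at the close of each day.
class RecalibratedCounter {
    private final Map<String, Long> rawCounters = new HashMap<>();
    private final Map<String, Long> offsets = new HashMap<>();

    // A recurring job must record each restaurant's closing count every
    // day, forever -- the accumulating maintenance cost described above.
    public synchronized void closeDay(String restaurantId) {
        offsets.put(restaurantId, rawCounters.getOrDefault(restaurantId, 0L));
    }

    public synchronized long nextOrderNumber(String restaurantId) {
        long raw = rawCounters.getOrDefault(restaurantId, 0L) + 1;
        rawCounters.put(restaurantId, raw);
        return raw - offsets.getOrDefault(restaurantId, 0L);
    }
}
```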
Database Triggers :
Triggers : A trigger is a stored program in the database. It is automatically invoked whenever a specified event in the database occurs.
We could create a BEFORE INSERT trigger, meaning one that fires whenever someone tries to insert a row into our table. The trigger finishes its processing before the row is actually written.
The trigger would compute the next order number as COUNT(*) + 1 over the my_order table. But this design read-locks my_order for the COUNT(*), and that lock blocks concurrent INSERTs into the same table, reducing concurrency.
A concurrency issue would look like this: two orders arrive at the same moment, both triggers run the COUNT(*) before either row has been inserted, and both compute the same value. The trigger executions are not serialized with respect to each other, so two simultaneous orders could get the same OrderNumber.
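The race can be simulated deterministically: two “transactions” both read the row count before either inserts (a sequential Java simulation of mine, not the actual trigger code):

```java
import java.util.ArrayList;
import java.util.List;

// Simulates the COUNT(*)-then-INSERT trigger logic with two interleaved
// "transactions": both read the same count, so both assign the same number.
class TriggerRaceDemo {
    private final List<Integer> orderNumbers = new ArrayList<>();

    private int countRows() {
        return orderNumbers.size(); // stands in for SELECT COUNT(*) FROM my_order
    }

    public static List<Integer> simulateConcurrentInserts() {
        TriggerRaceDemo db = new TriggerRaceDemo();
        // Both transactions run the "trigger" (read the count) before either inserts.
        int numberSeenByTx1 = db.countRows() + 1;
        int numberSeenByTx2 = db.countRows() + 1;
        db.orderNumbers.add(numberSeenByTx1);
        db.orderNumbers.add(numberSeenByTx2);
        return db.orderNumbers; // [1, 1] -- a duplicate order number
    }
}
```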
Due to both the functional and the concurrency issues, this is not a good solution.
Upsert :
Next we try UPSERTs.
Databases provide combined INSERT + UPDATE (“upsert”) functionality. In MySQL this looks like:
```sql
INSERT INTO item VALUES (1, 'Random') ON DUPLICATE KEY UPDATE `id` = `id` + 1;
```
While this is a good, simple solution, it has an inherent computational-debt problem. Processing goes like this: every request first attempts the INSERT; only the very first request for a key succeeds, and every subsequent request hits the duplicate-key conflict before falling through to the UPDATE. That conflict path is the problem. Imagine a system getting 1000 requests per day: 999 duplicate-key failures plus their handling are wasted CPU clocks. This process, while functional, again carries loads of computational tech debt!
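A quick simulation of mine makes the waste concrete: out of 1000 requests for the same key, 999 travel through the duplicate-key failure path:

```java
import java.util.HashMap;
import java.util.Map;

// Simulates the insert-first upsert flow: every request after the first hits
// the duplicate key, and the "update" path is reached via a failed insert.
class UpsertCostDemo {
    private final Map<Integer, Long> table = new HashMap<>();
    public int duplicateKeyFailures = 0;

    public void upsert(int id) {
        if (table.containsKey(id)) {
            // Stand-in for the duplicate-key conflict the INSERT raises internally.
            duplicateKeyFailures++;
            table.put(id, table.get(id) + 1); // the ON DUPLICATE KEY UPDATE path
        } else {
            table.put(id, 1L); // the one successful INSERT
        }
    }
}
```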
Custom Upsert :
Let us move the upsert logic into our own control. Once again, consider a system getting 1000 requests per day. This time we have ZERO exception handling: unlike the previous approach, the ‘else’ branch (creating the first row of the day) is not the norm but the edge case, so it runs only once per day. Hence CPU time is saved.
The code would look like this:
Table:
```sql
-- Column definitions reconstructed from the application code below; the
-- original snippet was truncated.
CREATE TABLE `order_number` (
  `prefix` VARCHAR(64) NOT NULL,
  `number` SMALLINT NOT NULL,
  PRIMARY KEY (`prefix`)
);
```
Application:
```java
// Service method (signature and branching reconstructed; the original
// snippet was flattened onto one line):
@Transactional
public short nextOrderNumber(String prefix) {
    OrderNumber newOrder = orderNumberRepository.findByPrefix(prefix);
    if (newOrder == null) {
        // Edge case: first order of the day for this prefix
        newOrder = new OrderNumber(prefix, (short) 1);
        orderNumberRepository.save(newOrder);
    } else {
        // The norm: lock the row and increment it
        orderNumberRepository.incrementOrderNumber(prefix);
        newOrder = orderNumberRepository.findByPrefix(prefix);
    }
    return newOrder.getNumber();
}
```
```java
// Repository method; the locking annotations are from the original, the
// JPQL query is reconstructed:
@Modifying
@Lock(LockModeType.PESSIMISTIC_WRITE)
@Query("UPDATE OrderNumber o SET o.number = o.number + 1 WHERE o.prefix = :prefix")
void incrementOrderNumber(@Param("prefix") String prefix);
```
This is the method we used.
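The essence of the design, a row created once per prefix and then incremented atomically, can be sketched in-process (AtomicInteger stands in for the pessimistic row lock; all names here are mine):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory analogue of the chosen design: a row per prefix, created on
// first use (the edge case), then incremented atomically (the norm).
class CustomUpsertDemo {
    private final ConcurrentHashMap<String, AtomicInteger> rows = new ConcurrentHashMap<>();

    public int nextOrderNumber(String prefix) {
        // computeIfAbsent runs the creation path only once per prefix.
        AtomicInteger row = rows.computeIfAbsent(prefix, p -> new AtomicInteger(0));
        return row.incrementAndGet();
    }
}
```

No exceptions are thrown anywhere on the hot path, which is exactly the property that distinguishes this from the native-upsert approach.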
Results
We saw immediate results. Here is the JMeter performance report.
And this was in our lowest environment; production instances with more resources should handle even higher load.
PS : The 14% failures are all valid, expected ones, not system errors.
And this is how we got a ~50x concurrency improvement. 🏆 💯
Wrap up
A good design is like a faithful friend – helping us without us asking. 🤟 🤟
I hope this redesign achieved something of the sort!