Why I Built This
In modern distributed architectures, microservices require durable and high-speed data pipelines to handle event logs. Traditional relational databases introduce high transaction overhead for sequential write-heavy tasks, while database-less services often store logs in memory, risking complete data loss during system crashes.
Scanning large transaction files to retrieve specific event entries becomes a critical bottleneck as the log grows and disk I/O increases. Backend developers are forced to choose between complex, resource-heavy database solutions and fragile, custom file-scanning scripts that are prone to encoding issues with multi-byte characters such as emojis.
My Approach
I built Dilamme Event Store, a lightweight database engine that separates transaction-log durability from in-memory lookups. The database appends serialized events directly to a single, continuous on-disk log file in newline-delimited JSON format.
An in-memory map index tracks the exact byte offset and length of each event, keyed by event ID. A read is an O(1) index lookup followed by a single positional file read of that byte range, so no scanning is needed. On startup, the engine streams the log file line by line with Node's readline module to rebuild the in-memory index, so events survive restarts and recovery never needs to load the whole file into memory.
Key Features
- Crash-safe durability via an append-only event log persistence layer
- Fast random-access reading in O(1) time using an in-memory byte offset index
- Automatic index recovery and reconstruction from the transaction log upon system restart
- Multi-byte Unicode/Emoji handling with byte-accurate offset tracking
- Low-overhead Express API exposing health, append, query, and statistics endpoints
- Robust test coverage validating persistence reliability and crash recovery scenarios
How the System Works
The lifecycle of the event store runs from durable appends on disk to index-based reads at known byte offsets.
- A client sends a POST request containing custom event data to the /events endpoint.
- The system generates a unique UUID and a createdAt timestamp, then appends the serialized object onto the transaction log.
- The system calculates the exact UTF-8 byte length using Buffer.byteLength and updates the in-memory index map with the event's byte offset and size.
- When a client requests a specific event ID via /events/:id, the server looks up the entry in the index map and performs a single positional read of exactly that offset and size.
- Upon process crashes or server restarts, the initialization task streams the log file line-by-line to recreate the in-memory index map, resuming normal operation without data loss.
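The append and read steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `TinyEventStore` class, its method names, and the temporary log path are hypothetical, and the Express layer is left out. It assumes one JSON object per line terminated by a single `\n`.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { randomUUID } = require('node:crypto');

// Hypothetical minimal store; names and structure are illustrative only.
class TinyEventStore {
  constructor(logPath) {
    this.index = new Map(); // event id -> { offset, length } in bytes
    // 'a+' opens for reading and appending, creating the file if missing.
    this.fd = fs.openSync(logPath, 'a+');
    // New events start at the current end of the file.
    this.nextOffset = fs.fstatSync(this.fd).size;
  }

  append(data) {
    const event = { id: randomUUID(), createdAt: new Date().toISOString(), data };
    const json = JSON.stringify(event);
    // Byte length, not string length, so multi-byte characters are counted correctly.
    const length = Buffer.byteLength(json, 'utf8');
    fs.writeSync(this.fd, json + '\n');
    this.index.set(event.id, { offset: this.nextOffset, length });
    this.nextOffset += length + 1; // +1 for the trailing newline
    return event;
  }

  get(id) {
    const entry = this.index.get(id);
    if (!entry) return null;
    const buf = Buffer.alloc(entry.length);
    // One positional read of exactly the indexed byte range.
    fs.readSync(this.fd, buf, 0, entry.length, entry.offset);
    return JSON.parse(buf.toString('utf8'));
  }

  close() {
    fs.closeSync(this.fd);
  }
}

// Usage: write two events to a temporary log and read the first back by ID.
const logPath = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'events-')), 'events.log');
const store = new TinyEventStore(logPath);
const saved = store.append({ message: 'deploy finished 🚀' });
store.append({ message: 'second event' });
console.log(store.get(saved.id).data.message);
```

The sketch uses synchronous file calls to stay short. A real server would more likely use asynchronous I/O and decide explicitly when to flush writes to disk.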
Engineering Challenges & Lessons Learned
The first challenge was keeping seek-and-read offsets accurate for multi-byte Unicode text (e.g. emojis or non-Latin scripts), where a single code point can take up to 4 bytes in UTF-8. Using JavaScript string lengths caused index offsets to drift; computing and caching byte lengths with Buffer.byteLength fixed it.
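The drift is easy to reproduce in isolation. The snippet below is an illustrative sketch with made-up records, not code from the project: it builds a two-line log in memory and compares an offset derived from string length with one derived from Buffer.byteLength.

```javascript
// Two newline-delimited records; the first contains an emoji (4 bytes in UTF-8).
const first = JSON.stringify({ id: 'a', note: 'shipped 🚀' });
const second = JSON.stringify({ id: 'b', note: 'ok' });
const log = Buffer.from(first + '\n' + second + '\n', 'utf8');

// String length counts UTF-16 code units, so the emoji counts as 2.
const charOffset = first.length + 1;
// Byte length counts UTF-8 bytes, so the emoji counts as 4.
const byteOffset = Buffer.byteLength(first, 'utf8') + 1;

const secondBytes = Buffer.byteLength(second, 'utf8');
// Slicing at the string-length offset starts 2 bytes too early.
const wrong = log.subarray(charOffset, charOffset + secondBytes).toString('utf8');
// Slicing at the byte offset returns the second record exactly.
const right = log.subarray(byteOffset, byteOffset + secondBytes).toString('utf8');

console.log({ charOffset, byteOffset, matches: right === second, drifted: wrong !== second });
```

Each emoji like this one shifts every later offset by two bytes, so the error accumulates as the log grows.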
The second challenge was keeping startup index reconstruction from overloading memory. Instead of loading the entire log file at once, the store uses createReadStream from node:fs and createInterface from node:readline/promises to scan the file line by line.
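A streaming rebuild along these lines might look like the sketch below. The function name `rebuildIndex` and the `{ offset, length }` entry shape are assumptions for illustration. It also assumes every record ends with a single `\n`, and it skips lines that fail to parse (for example a partially written last line), which is a design choice of this sketch and not necessarily what the project does.

```javascript
const fs = require('node:fs');
const readline = require('node:readline/promises');

// Hypothetical helper: rebuilds an id -> { offset, length } map from the log.
async function rebuildIndex(logPath) {
  const index = new Map();
  let offset = 0; // byte position of the current line's first byte

  const rl = readline.createInterface({
    input: fs.createReadStream(logPath, { encoding: 'utf8' }),
    crlfDelay: Infinity,
  });

  // Lines arrive one at a time, so memory use does not grow with file size.
  for await (const line of rl) {
    const length = Buffer.byteLength(line, 'utf8');
    if (line.trim() !== '') {
      try {
        const { id } = JSON.parse(line);
        index.set(id, { offset, length });
      } catch {
        // Assumption: an unparseable line is a torn write and is ignored.
      }
    }
    offset += length + 1; // +1 for the '\n' that readline strips
  }
  return index;
}
```

Calling `await rebuildIndex('events.log')` at startup (the path is a placeholder) would yield the map that later reads consult. Memory still grows with the number of events, since the index holds one entry per event, but not with the size of the payloads.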
What I'd Improve Next
- Introduce log rotation and segments to prevent a single event log file from growing infinitely.
- Add Webhook support to broadcast events to active subscribers immediately upon transaction log commit.
- Implement a Redis caching layer to offload hot reads from disk seeking entirely.
