rfc(decision): Batch multiple files together into single large file to improve network throughput #98

Open
wants to merge 21 commits into
base: main
Fix flowchart
cmanallen committed May 24, 2023
commit 721c28cb66616800ca085e26902f583d2c6e5c20
9 changes: 6 additions & 3 deletions text/0098-store-multiple-replay-segments-in-a-single-blob.md
@@ -39,12 +39,15 @@ This table will need to support, at a minimum, one write per segment. Currently,
Second, the Session Replay recording consumer will not _commit_ blob data to GCS for each segment. Instead it will buffer many segments and flush them all together as a single blob to GCS. In this step it will also make a bulk insertion into the database.

```mermaid
- A[Message Received] --> B[Message Processing];
- B --> C[Message Buffer];
- C --> D{Has the insert trigger been reached?};
+ flowchart
+ A[Wait For New Message] --> B[Process];
+ B --> C[Push to Buffer];
+ C --> D{Buffer Full?};
  D -- No --> A;
  D -- Yes --> E[Write Single File to GCS];
  E --> F[Bulk Insert Byte Ranges];
+ F --> G[Clear Buffer];
+ G --> A;
```
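
The buffer-and-flush loop in the flowchart could be sketched roughly as follows. This is a minimal illustration, not the consumer's actual code: `SegmentBuffer`, `ByteRange`, the `max_bytes` trigger, and the column names are all hypothetical, and the RFC does not specify what makes the buffer "full".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ByteRange:
    # Hypothetical row shape for the "recording_byte_range" table.
    replay_id: str
    segment_id: int
    filename: str
    start: int  # offset of the segment's first byte in the blob (inclusive)
    stop: int   # offset of the segment's last byte in the blob (inclusive)


class SegmentBuffer:
    def __init__(self, max_bytes):
        # Assumed flush trigger: total buffered payload size.
        self.max_bytes = max_bytes
        self.segments = []
        self.size = 0

    def push(self, replay_id, segment_id, payload):
        """Push to Buffer."""
        self.segments.append((replay_id, segment_id, payload))
        self.size += len(payload)

    def is_full(self):
        """Buffer Full?"""
        return self.size >= self.max_bytes

    def flush(self, filename):
        """Build the single blob and its byte-range rows, then clear the buffer.

        The caller would write the blob to storage and bulk insert the rows.
        Payloads are assumed to be non-empty.
        """
        blob = bytearray()
        rows = []
        for replay_id, segment_id, payload in self.segments:
            start = len(blob)
            blob.extend(payload)
            rows.append(ByteRange(replay_id, segment_id, filename, start, len(blob) - 1))
        self.segments.clear()
        self.size = 0
        return bytes(blob), rows
```

Recording inclusive offsets keeps each row directly usable as an HTTP byte range later, at the cost of needing `stop + 1` when slicing in Python.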

Third, when a client requests recording data we will look it up in the "recording_byte_range" table. From its response, we will issue as many fetch requests as there are rows in the response. These requests may target a single file or many files. The files will be fetched with a special header that instructs the service provider to respond with only a subset of the bytes: specifically, the bytes that relate to our replay.
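
The "special header" is presumably the standard HTTP `Range` header, whose byte ranges are inclusive on both ends. A small sketch of turning looked-up rows into one request per row, assuming hypothetical `filename`, `start`, and `stop` columns (the RFC does not name them here) and leaving the actual storage client out:

```python
def range_header(start, stop):
    """Build an HTTP Range header; both offsets are inclusive."""
    return {"Range": f"bytes={start}-{stop}"}


def plan_fetches(rows):
    """Return one (filename, headers) pair per byte-range row.

    Rows may point at the same file or at different files.
    """
    return [(row["filename"], range_header(row["start"], row["stop"])) for row in rows]


def slice_like_server(blob, start, stop):
    """Mimic locally what a ranged response would contain for the given offsets."""
    return blob[start:stop + 1]
```

For example, a row with `start=5` and `stop=10` yields `{"Range": "bytes=5-10"}`, which selects six bytes of the stored file.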
cmanallen marked this conversation as resolved.