Name: PlikShare
Author: Damian Krychowski

Moving a file from someone's browser into storage sounds like it should be one step: take the bytes, put them somewhere. For an average file it more or less is. The work goes into the cases at the extremes, the single huge file and the folder with tens of thousands of small ones, and into not making the average case pay for them. This is a walk through how it all fits together, in plain language.

Two kinds of upload

Upload performance isn't really one problem. It's two, and they pull in opposite directions. In the first case, someone drops a folder with 40,000 tiny files: config files, photos, a checked-out git tree. The total size is small; the cost is all in the per-file overhead. Forty thousand requests to start an upload, forty thousand to finish one, forty thousand rows to write. Here the thing slowing you down is latency.

In the second case, someone uploads one 50 GB video. There's basically no per-file overhead now, because there's only one file. The cost is throughput: pushing 50 GB through a single HTTP request that might fail at 80% and have to start over. Here the things slowing you down are bandwidth and reliability.

One upload strategy can't do well at both, so PlikShare uses three and picks one per file. The rest of the work goes into making sure the heaviest part of the system, the file bytes themselves, travels through as few hops as possible.

The main idea: keep the file bytes off the server when you can

The most expensive thing an upload can do is route the file body through the application server. If it does, the server has to receive all those bytes, hold them, and send them back out to object storage. You pay for the bandwidth twice, you hold memory the whole time, and every concurrent upload adds load to one process. So PlikShare avoids it where it can.

Every upload splits into a control plane and a data plane. The control plane is small JSON and protobuf messages the app handles: "start this upload", "here is the ETag", "finish it". The data plane is the file body, and it takes one of two routes depending on the storage:

Unencrypted object storage (S3, Azure): direct. The bytes go straight from the browser to the bucket using a pre-signed URL. The server hands out the URL but the bytes never pass through it. This is the good case, and it's the one that handles the large files: anything above the 1 MB threshold on an unencrypted bucket.
Encrypted storage, local disk, or cross-origin embeds: proxied. The bytes stream through the app server. There's no way around it: the server needs to see plaintext in order to encrypt it, and local disk has no endpoint a browser can PUT to directly. PlikShare keeps this path streaming (more on that at the end), but the bytes do go through the server here.

The diagram below shows the direct case, which is the one most worth optimizing because it carries the big files:

Browser (holds the file)
→ control: initiate / complete, small JSON & protobuf → PlikShare app server (control messages only on this path)
→ data: PUT bytes to a pre-signed URL → unencrypted object storage, S3 / Azure (receives the file directly)

A pre-signed URL is a time-limited, signed link that authorizes exactly one operation: "PUT this object into this bucket before this timestamp". The browser does a plain fetch PUT with the file as the body, the bytes land in the bucket, and on this path the server's bandwidth cost stays flat no matter how big the file is. These direct-to-storage URLs are valid for 15 minutes; the server generates them on demand and doesn't keep them around.

Which route a file takes isn't a runtime guess: it follows from the storage. Only S3 and Azure can hand out pre-signed URLs, so only they can route bytes directly. Local disk has nothing a browser can PUT to, and an encrypted store deliberately stays out of the direct route because the server has to touch the bytes. Anything that can't go direct goes proxied.
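The routing rule above can be written as a small pure function. This is a minimal sketch, not PlikShare's code: the type names, the `UploadContext` shape, and `resolveRoute` are all hypothetical. It covers files that are eligible for the direct route in the first place; the bundling of tiny files is a separate, deliberate exception.

```typescript
// Hypothetical types: none of these names appear in the article.
type StorageKind = "s3" | "azure" | "local-disk";
type EncryptionMode = "none" | "managed" | "full";
type Route = "direct" | "proxied";

interface UploadContext {
  storage: StorageKind;
  encryption: EncryptionMode;
  crossOriginEmbed: boolean;
}

function resolveRoute(ctx: UploadContext): Route {
  // Only S3 and Azure can hand out pre-signed URLs a browser can PUT to.
  const canPresign = ctx.storage === "s3" || ctx.storage === "azure";
  // On encrypted storage the server has to see plaintext to encrypt it.
  const serverMustTouchBytes = ctx.encryption !== "none";

  if (canPresign && !serverMustTouchBytes && !ctx.crossOriginEmbed) {
    return "direct";
  }
  // Anything that can't go direct goes proxied.
  return "proxied";
}
```

Because the answer depends only on the storage configuration, it can be decided once on the backend and expressed purely through where the returned URL points.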

Three algorithms, chosen per file

When the client asks to upload a batch, the server looks at each file and assigns it one of three algorithms. The decision lives in ResolveUploadAlgorithm on the storage client, and it comes down to two things: how many parts the file needs, and whether the storage is encrypted.

A file in the batch (size + storage encryption mode) is assigned as follows:

1 part & ≤ 1 MB → DirectUpload (packed into a shared bundle)
1 part, > 1 MB, unencrypted → SingleChunkUpload (one direct PUT to the bucket)
> 1 part (or encrypted) → MultiStepChunkUpload (parallel multipart)

For unencrypted storage each part is 10 MB, so the part count is just ceil(fileSize / 10 MB). A 45 MB file is five parts; a 9 MB file is one. The other number that matters is the micro-file threshold of 1 MB. Together they define the three lanes:

A 45 MB file, unencrypted - five parts: PART 1 10 MB · PART 2 10 MB · PART 3 10 MB · PART 4 10 MB · PART 5 5 MB

≤ 1 MB, single part, unencrypted → DirectUpload. Too small to deserve its own round-trip. It gets packed into a bundle with other tiny files, and the whole bundle is POSTed to the app in one request. This is the one case where small bytes do pass through the server, and it's on purpose; see below.
> 1 MB but still one part, unencrypted → SingleChunkUpload. One pre-signed PUT, straight to the bucket. The bytes never touch the server: these files are always meant to go direct.
More than one part → MultiStepChunkUpload. Full multipart: each part is initiated, PUT, and confirmed independently, and then the upload is completed. This is where the 50 GB video goes.

Encryption narrows the choices. On managed or full encryption there's no SingleChunkUpload at all: a single-part file becomes a (proxied) DirectUpload, and anything larger becomes MultiStepChunkUpload. The server has to see plaintext to encrypt it, so a direct-to-bucket PUT isn't an option. More on that route at the end.
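The lanes described in this section can be sketched as a single function. This is an illustration rather than the real ResolveUploadAlgorithm, which lives on the server. It assumes binary megabytes, treats an empty file as one part, and takes the part size as a parameter, because the article only states the 10 MB part size for unencrypted storage.

```typescript
type Algorithm = "DirectUpload" | "SingleChunkUpload" | "MultiStepChunkUpload";

const MB = 1024 * 1024; // assumption: binary megabytes
const MICRO_FILE_THRESHOLD = 1 * MB;
const UNENCRYPTED_PART_SIZE = 10 * MB;

function partCount(fileSize: number, partSize: number = UNENCRYPTED_PART_SIZE): number {
  // Assumption: an empty file still counts as a single part.
  return Math.max(1, Math.ceil(fileSize / partSize));
}

function resolveUploadAlgorithm(
  fileSize: number,
  encrypted: boolean,
  partSize: number = UNENCRYPTED_PART_SIZE, // the encrypted part size is not stated in the text
): Algorithm {
  if (partCount(fileSize, partSize) > 1) {
    return "MultiStepChunkUpload";
  }
  if (encrypted) {
    // No SingleChunkUpload on encrypted storage: a single-part file is a proxied DirectUpload.
    return "DirectUpload";
  }
  return fileSize <= MICRO_FILE_THRESHOLD ? "DirectUpload" : "SingleChunkUpload";
}
```

With these assumptions, a 45 MB unencrypted file resolves to five parts and MultiStepChunkUpload, a 9 MB one to SingleChunkUpload, and a 200 KB one to DirectUpload.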

Small files: one request for the whole pile

Back to the folder with 40,000 files. If each one took an initiate call, a PUT, and a complete call, the upload would be three round-trips times forty thousand: minutes of latency before a single useful byte moved. The answer is to stop handling tiny files one at a time.

The browser keeps two queues, split at 1 MB: one for small files, one for large files. Small files are bin-packed (Best-Fit, largest first) into bundles of about 10 MB each, and each bundle is appended into a single multipart/form-data body and POSTed to the app in one request. A single HTTP round-trip can carry dozens of files. On the server side, that one request becomes many file rows via a single set-based SQL insert (more on that shortly).

This is the deliberate exception to the keep-bytes-off-the-server rule. For a 200 KB file, setting up a direct-to-S3 handshake costs more than just streaming the bytes through the app. So PlikShare amortizes: it pays one request's overhead for a whole pile of small files instead of three round-trips each. Tiny-file uploads are dominated by request count, and this brings that count way down.
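The bin-packing step can be sketched as a Best-Fit Decreasing pass. This is a minimal version under stated assumptions: `SmallFile`, `Bundle`, and `packBestFitDecreasing` are hypothetical names, and the capacity is taken as exactly 10 MB, where the article only says "about 10 MB".

```typescript
interface SmallFile {
  name: string;
  size: number; // bytes
}

interface Bundle {
  files: SmallFile[];
  used: number; // bytes already packed into this bundle
}

const BUNDLE_CAPACITY = 10 * 1024 * 1024; // assumption: "about 10 MB" taken literally

function packBestFitDecreasing(
  files: readonly SmallFile[],
  capacity: number = BUNDLE_CAPACITY,
): Bundle[] {
  const bundles: Bundle[] = [];
  // Largest first, without mutating the caller's array.
  const sorted = [...files].sort((a, b) => b.size - a.size);

  for (const file of sorted) {
    // Best fit: among bundles with room, pick the fullest one,
    // i.e. the one that leaves the least space after adding the file.
    let best: Bundle | undefined;
    for (const bundle of bundles) {
      const fits = bundle.used + file.size <= capacity;
      if (fits && (best === undefined || bundle.used > best.used)) {
        best = bundle;
      }
    }
    if (best === undefined) {
      best = { files: [], used: 0 };
      bundles.push(best);
    }
    best.files.push(file);
    best.used += file.size;
  }
  return bundles;
}
```

Each resulting bundle would then become one multipart/form-data request, so the request count tracks the number of bundles rather than the number of files.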

Big files: parallel parts

The large-files queue runs the multipart algorithm: each part of the file goes through its own small lifecycle, and several parts are in flight at once. The lifecycle below is the unencrypted object-storage case, where parts go direct to the bucket. On encrypted or local storage the same shape applies, except each part's PUT is proxied through the server instead of going to a bucket URL. The parallelism and the part bookkeeping are the same either way.

Initiate part N: POST …/uploads/<id>/parts/N/initiate
→ server returns a pre-signed URL + the exact byte range
→ Slice the bytes: Blob.slice(start, end), a lazy view, zero copy
→ PUT directly to storage: object storage holds part N, and the bytes never touched the app
→ read the ETag from the response header
→ Report the ETag back: POST …/parts/N/complete
→ after every part is done, complete the upload: storage stitches the parts into one object

Three design choices here matter for speed and resilience.

The server owns the part math; the client just follows it. The browser never decides where to cut. It asks to initiate part N, and the server replies with startsAtByte and endsAtByte; the client slices exactly that range and PUTs it. That keeps the adaptive logic (staying under S3's 10,000-part limit, accounting for the larger first part on encrypted files where the header lives) in one place, on the backend, where it can change without shipping a new frontend.
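To make the part math concrete, here is one way the backend could compute a part's byte range while staying under S3's 10,000-part limit. It is a sketch under stated assumptions, not the actual server logic: the growth rule for the part size is a guess, `endsAtByte` is treated as inclusive, megabytes are binary, and the larger first part on encrypted files is left out.

```typescript
// Hypothetical constants and helpers; only startsAtByte / endsAtByte come from the article.
const DEFAULT_PART_SIZE = 10 * 1024 * 1024; // 10 MB, assuming binary megabytes
const S3_MAX_PARTS = 10000; // S3's multipart part limit

interface PartRange {
  partNumber: number;
  startsAtByte: number;
  endsAtByte: number; // assumption: inclusive
}

function choosePartSize(fileSize: number): number {
  // Grow the part size only when 10 MB parts would exceed the part limit.
  return Math.max(DEFAULT_PART_SIZE, Math.ceil(fileSize / S3_MAX_PARTS));
}

function totalParts(fileSize: number): number {
  return Math.max(1, Math.ceil(fileSize / choosePartSize(fileSize)));
}

function partRange(fileSize: number, partNumber: number): PartRange {
  if (fileSize <= 0) {
    throw new RangeError("fileSize must be positive");
  }
  const count = totalParts(fileSize);
  if (!Number.isInteger(partNumber) || partNumber < 1 || partNumber > count) {
    throw new RangeError(`part ${partNumber} is outside 1..${count}`);
  }
  const size = choosePartSize(fileSize);
  const startsAtByte = (partNumber - 1) * size;
  // The last part is usually shorter than the others.
  const endsAtByte = Math.min(startsAtByte + size, fileSize) - 1;
  return { partNumber, startsAtByte, endsAtByte };
}
```

Under the inclusive assumption, the client side would reduce to `file.slice(range.startsAtByte, range.endsAtByte + 1)`, with no size logic of its own.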

Concurrency is capped globally, not per file. At most five storage requests are in flight at any time, shared across all parts of one big file and across every file uploading at once. A single shared pool enforces it: before launching a part, the uploader checks the pool, and if it's full it waits until a slot opens up. Five is intentionally modest: enough to saturate most uplinks, but low enough that a tab uploading a decompressed zip doesn't spawn a hundred concurrent requests and run out of memory. A second cap of thirty limits how many files exist in the pipeline at all, so the queue feeds in new work only as in-flight uploads finish.
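A shared pool like this is essentially a counting semaphore. The sketch below shows one way to build it; `RequestPool` and its members are hypothetical names, not PlikShare's actual class.

```typescript
class RequestPool {
  private readonly limit: number;
  private active = 0;
  private readonly waiters: Array<() => void> = [];

  constructor(limit: number) {
    this.limit = limit;
  }

  get inFlight(): number {
    return this.active;
  }

  get queued(): number {
    return this.waiters.length;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.limit) {
      // Pool is full: park until a finishing task hands over its slot.
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiters.shift();
      if (next !== undefined) {
        next(); // the slot passes straight to the waiter, so `active` is unchanged
      } else {
        this.active--; // nobody is waiting: free the slot
      }
    }
  }
}
```

A single instance, for example `new RequestPool(5)`, would be shared by every part of every file, which is what makes the cap global rather than per file. Handing the slot directly to the next waiter means a newly arriving request can never slip in ahead of one that is already queued.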

Interrupted uploads resume instead of restarting. Because every part is confirmed independently, the server knows which parts already arrived. On resume the client calls getUploadDetails, gets back the set of part numbers it already has, and uploads only the missing ones. A 50 GB upload that died at 90% sends the last 5 GB, not the whole thing again.
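The resume step boils down to a set difference. A minimal sketch, assuming the response from getUploadDetails can be reduced to a list of 1-based part numbers (its exact shape isn't given here, and `missingParts` is a hypothetical helper):

```typescript
function missingParts(totalParts: number, alreadyUploaded: Iterable<number>): number[] {
  const done = new Set(alreadyUploaded);
  const missing: number[] = [];
  // Part numbers are assumed to be 1-based.
  for (let n = 1; n <= totalParts; n++) {
    if (!done.has(n)) {
      missing.push(n);
    }
  }
  return missing;
}
```

For the 50 GB example with 10 MB parts (5,120 parts in binary units), an upload that stopped after 4,608 confirmed parts would have 512 parts left to send, which is the remaining 5 GB.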

Initiating in bulk: the hot path

Every upload, large or small, starts with the same call: bulk initiate. The client gathers a batch of up to 30 files and asks the server to set them all up at once. This is the busiest call in the whole flow, so it's where most of the setup-time optimization lives.

Protobuf and gzip on the wire. A batch describing dozens of files, with their names, sizes, folder ids, and content types, is a lot of repetitive structure. The bulk-initiate request and response are encoded with protobuf and gzip-compressed rather than sent as JSON. It's the only upload endpoint that does this, because it's the only one whose payload grows with the file count. The per-part calls stay plain JSON; they're tiny and frequent, and protobuf wouldn't buy anything there.

The slow remote calls run concurrently, in batches of ten. Files that need a pre-signed link or a multipart-initiate against S3 are processed in groups of ten at a time, so the network latency of talking to the object store overlaps instead of stacking up. DirectUpload files skip this entirely, since they need no remote call at initiate time.
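One reading of "groups of ten at a time" is fixed batches: start ten remote calls together, wait for all of them, then start the next ten. The sketch below follows that reading. The real server code isn't shown in this article and may use a different scheme, and `chunkInto` and `presignInBatches` are hypothetical names.

```typescript
function chunkInto<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

async function presignInBatches<T, R>(
  files: readonly T[],
  presign: (file: T) => Promise<R>, // stands in for the remote call to the object store
  batchSize: number = 10,
): Promise<R[]> {
  const results: R[] = [];
  for (const group of chunkInto(files, batchSize)) {
    // Calls inside a group overlap; groups run one after another.
    const groupResults = await Promise.all(group.map((file) => presign(file)));
    results.push(...groupResults);
  }
  return results;
}
```

With a batch of up to 30 files, this means at most three rounds of remote latency instead of thirty.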

One bulk insert, not N inserts. However many uploads are in the batch, they hit the database as a single statement. The whole list is passed as one JSON parameter and expanded server-side with SQLite's json_each:

INSERT INTO fu_file_uploads (...)
SELECT json_extract(value, '$.externalId'),
       json_extract(value, '$.fileName'),
       ...
FROM   json_each($fileUploads)
RETURNING fu_id;

One round-trip to the writer, one transaction, N rows. A custom SQLite function converts JSON byte arrays to BLOBs inline, so even binary key material rides along in the same statement.

Cache pre-warming. As soon as an upload is initiated, its metadata is seeded into a cache. Every per-part call after that (initiate part, complete part) reads the upload from cache instead of the database. For a file with hundreds of parts, that turns hundreds of database lookups into zero.

Fast paths for the common case. The batch's total size is checked against the workspace quota once, as a single aggregate, rather than per file. And folder validation has a special case: when every file targets the same folder (which is what usually happens) it runs a single-row check that skips the JSON serialization the many-folders path needs.

Finishing without waiting: the completion queue

Telling S3 or Azure to stitch a multipart upload together is a slow, network-bound call that can fail. Making the user's browser sit and wait for it (and retry it inline if the cloud hiccups) would be the wrong place to spend that time, so PlikShare doesn't.

When the client posts "complete", the server verifies that every part arrived, inserts the file row with is_upload_completed = FALSE, and enqueues a completion job, all inside one SQLite transaction. The file row and the job row commit together, atomically, and the HTTP request returns right away.

Client: POST …/complete
→ one SQLite transaction: INSERT file row (is_upload_completed = FALSE) + ENQUEUE completion job (committed with the file row)
→ commit → request returns now
→ Background worker (later): CompleteMultiPartUpload → flip is_upload_completed = TRUE

A background worker picks the job up, calls the storage's CompleteMultiPartUpload, and flips the file to completed. If the cloud call fails temporarily, the queue retries it without the user being involved. If the storage was deleted out from under it, the job treats that as success and moves on. So the user's request returns immediately, and the slow, retry-prone call to the cloud happens out of band. Direct uploads skip all of this: their bytes are already whole in storage, so the file row is written as complete in one shot, with no job needed.
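The flow can be modelled in a few lines. This is an in-memory illustration only: `UploadStore`, `StitchResult`, and the method names are hypothetical, the "transaction" is just two writes in one method, and the stitch call is synchronous here for brevity where the real one is a network call.

```typescript
type StitchResult = "ok" | "transient-error" | "storage-deleted";

interface FileRow {
  id: number;
  isUploadCompleted: boolean;
}

interface CompletionJob {
  fileId: number;
  attempts: number;
}

class UploadStore {
  readonly files = new Map<number, FileRow>();
  readonly jobs: CompletionJob[] = [];

  // Stands in for the single SQLite transaction: file row and job appear together.
  completeRequest(fileId: number): void {
    this.files.set(fileId, { id: fileId, isUploadCompleted: false });
    this.jobs.push({ fileId, attempts: 0 });
  }

  // One worker step: take a job, attempt the stitch, act on the outcome.
  workOnce(stitch: (fileId: number) => StitchResult): void {
    const job = this.jobs.shift();
    if (job === undefined) {
      return;
    }
    const result = stitch(job.fileId);
    if (result === "transient-error") {
      // Retry later, with no involvement from the user's request.
      this.jobs.push({ fileId: job.fileId, attempts: job.attempts + 1 });
      return;
    }
    if (result === "ok") {
      const row = this.files.get(job.fileId);
      if (row !== undefined) {
        row.isUploadCompleted = true;
      }
    }
    // "storage-deleted": nothing is left to stitch, so the job simply ends.
  }
}
```

The property that matters is that `completeRequest` never calls the cloud, so the HTTP request's latency is independent of how the stitch behaves.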

One writer, by design

PlikShare runs on SQLite, which allows exactly one writer at a time. Rather than fight that, PlikShare leans into it: every hot-path write goes through a single background thread, the DbWriteQueue. Callers submit a write and await a completion signal, and one consumer thread drains the queue. There's no write-lock contention because there's only ever one writer trying to write.

That single thread is then made about as fast as a single thread can be:

One long-lived connection, reused. The writer opens its connection once and keeps it. The writer thread auto-stops after a second of idleness and restarts on demand, so there's no parked thread sitting around when nobody is uploading.
Prepared statements, cached by SQL text. A command pool keeps each distinct statement compiled and reuses it, clearing only the parameters between runs. The upload path fires the same handful of statements millions of times, so compiling them once is a real win.
WAL journal mode, synchronous = NORMAL, connection pooling. The standard fast-but-safe durability setting for WAL, plus pooled connections so reads run on their own connections and never block the writer.
Set-based SQL everywhere. Bulk insert, bulk convert, bulk delete, and folder validation all use json_each over a single JSON parameter, with RETURNING to get generated ids back in the same statement. N rows, one statement, one trip past the writer.

The result is that the writer's critical section stays small. Even the new workspace size is computed before the write transaction opens, so the lock is held only for the insert itself and never for arithmetic.

When encryption changes the route

Everything above assumes the bytes can go straight to the bucket. On an encrypted storage they can't, because the server has to see plaintext to encrypt it. So for managed or full encryption (and for the local-disk backend, and for cross-origin embeds), the part's pre-signed URL points at PlikShare's own endpoint rather than at S3. The bytes stream through the server, get encrypted, and go out to storage from there.

Those app-level "pre-signed" URLs aren't S3 signatures at all. They're stateless tokens sealed with ASP.NET Data Protection: encrypted, authenticated, bound to a content type and an owner, and short-lived (one minute for a proxied part). Validating one needs no database lookup.

The nice part of having PlikShare issue its own pre-signed URLs is that the frontend can stay dumb. It asks the server to initiate a part and gets back a URL, and that's all it needs to know. It never decides whether the bytes should go straight to S3 or through the server: that choice lives entirely on the backend, baked into where the URL points. The client just calls the URL it was handed. The same upload code in the browser works for a plain S3 bucket, an encrypted store, and local disk, with no branching, because the only thing that changes between them is an address the client never inspects. It's the same idea as the server owning the part math: keep the decisions on the backend, and let the client just carry them out.

Even on this proxied path, the work stays light:

Streaming, never fully buffered. The request body flows straight into the storage write as it arrives, so the whole file is never held in memory.
Rate-limited toward the cloud. A rate limiter guards the server's own calls to S3/Azure, so a burst of uploads doesn't trip provider throttling and set off a retry storm.

The encryption design itself (the key hierarchy, the streaming AES-256-GCM frame, the segment-per-part layout that makes both chunked upload and ranged download possible) is its own topic, covered in managed encryption and full encryption. The point here is just that encryption changes which path the bytes take, and the browser doesn't have to know or care which path that is.

Memory discipline in the browser

The last place performance matters is the browser tab itself. A naive uploader reads files into memory to send them, and with a 40,000-entry zip that's how you crash a browser. PlikShare's client is built to hold almost nothing in memory.

Zero-copy slicing. A part is just Blob.slice(start, end): a lazy view, not a copy. The file is streamed off disk into the fetch body as the request consumes it, and it's never fully read into memory.
Streaming decompression for zips. Bulk uploads from a zip pipe each entry through a native DecompressionStream and read it sequentially, buffering only the small remainder that overflows a part rather than the whole archive.
Lazy slicer factories. The object that holds a file's stream is created only when its upload actually starts, and disposed the moment it finishes. As the code notes, for a bulk zip with 100k entries this is the difference between 100k live streams and about 30 at any given time.
Back-pressure on the feeder. A large zip isn't fully expanded into pending-upload objects upfront; the feeder blocks until the queue drains below a threshold, so memory stays bounded regardless of how many entries the archive has.

The whole toolbox

None of these is exotic on its own; what matters is that they work together. The full list:

Direct-to-storage uploads on unencrypted object storage - the browser PUTs bytes straight to S3/Azure, so the heavy files skip the app server entirely.
Three per-file algorithms - tiny files bundled, medium files one direct PUT, large files parallel multipart.
Small-file bundling - many sub-1 MB files in one batched request, bin-packed into ~10 MB groups.
Parallel parts with a global 5-request cap - shared across all files, so the tab never overcommits.
Server-computed part ranges - the adaptive chunk math lives in one place, on the backend.
Resumable multipart - only the missing parts are re-sent after an interruption.
Protobuf + gzip on bulk initiate - a compact wire format where the payload scales with file count.
Concurrent presign/initiate in batches of ten - overlapping cloud latency at setup time.
Set-based SQL with json_each - N rows per statement, one trip past the writer.
Cache pre-warming - per-part calls never touch the database.
Async queue-job completion - the slow, unreliable cloud stitch moves off the request path, with retries.
Single-writer SQLite - WAL, synchronous = NORMAL, a reused connection, cached prepared statements, small critical sections.
Streaming proxy path - for encrypted/local storage: the file streams through the server without being fully buffered, with rate limiting toward the cloud.
Browser memory discipline - zero-copy slices, streaming decompression, lazy slicers, back-pressure.

I spent a lot of time on the upload path, and most of it went into the details, the kind of thing nobody notices when it works and everybody notices when it doesn't. The part I'm happiest with is bulk uploads, which I think are one of the most useful things in the app. You can pick a zip and upload it with its whole folder structure preserved, files landing where they belong instead of in one flat heap. I haven't come across many other tools that let you do that, and it's the feature I reach for most myself.

Damian Krychowski