Learn with O.J. - Internal

I wanted to analyze my own AI chat history but the platform gave me zero export tools.

So I reverse-engineered it.

The problem: The app stores messages in Firestore and encrypts them client-side before writing to the database. No export button, no API, no way to get your own data out.

The extraction: I built a browser interceptor using Playwright that injects JavaScript to tap into Firestore's ReadableStream responses. As the app streams data, my script captures the raw chunks in real time.
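
A minimal sketch of this kind of interceptor, not the exact script from this project. It assumes the app fetches Firestore data through `window.fetch` from `firestore.googleapis.com`, and that you log in by hand in the opened browser window. `CAPTURE_JS`, `__captureChunk`, and `capture` are hypothetical names chosen for illustration.

```python
import json

# Injected before any page script runs. It wraps window.fetch, reads a clone
# of each Firestore response as it streams, and forwards the text chunks.
# The URL filter and the __captureChunk name are assumptions.
CAPTURE_JS = """
(() => {
  const originalFetch = window.fetch;
  window.fetch = async (...args) => {
    const response = await originalFetch(...args);
    const url = String(args[0] && args[0].url ? args[0].url : args[0]);
    if (!url.includes('firestore.googleapis.com') || !response.body) {
      return response;
    }
    // Read the clone so the app still consumes the original body untouched.
    const reader = response.clone().body.getReader();
    const decoder = new TextDecoder();
    (async () => {
      for (;;) {
        const { done, value } = await reader.read();
        if (done) break;
        window.__captureChunk(decoder.decode(value, { stream: true }));
      }
    })();
    return response;
  };
})();
"""


def capture(app_url, out_path, wait_ms=60_000):
    """Open the app, record raw stream chunks to a JSONL file, then exit."""
    # Third-party dependency, imported lazily: pip install playwright
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p, open(out_path, "a", encoding="utf-8") as out:

        def record(chunk):
            # One JSON object per line keeps chunk boundaries visible later.
            out.write(json.dumps({"chunk": chunk}) + "\n")

        browser = p.chromium.launch(headless=False)
        page = browser.new_page()
        page.expose_function("__captureChunk", record)
        page.add_init_script(CAPTURE_JS)
        page.goto(app_url)
        page.wait_for_timeout(wait_ms)  # time to log in and scroll the history
        browser.close()
```

Calling `capture("https://example.invalid/app", "chunks.jsonl")` would append every captured chunk to `chunks.jsonl`; the URL here is a placeholder.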

The raw data comes in a length-prefixed wire format, so I wrote a two-pass parser. First pass uses the length framing, second pass uses regex with brace-matching to recover documents that got split across chunk boundaries.
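
The two passes can be sketched roughly like this. The exact framing is an assumption: here each frame is a decimal length, a newline, then that many characters of JSON, and the `"document"` key used to find candidates in the second pass is a hypothetical marker. All function names are illustrative.

```python
import json
import re

FRAME_HEADER = re.compile(r"(\d+)\n")
DOC_START = re.compile(r'\{\s*"document"')  # hypothetical document marker


def parse_frames(buffer):
    """Pass 1: follow the length framing until a frame fails to parse."""
    payloads, pos = [], 0
    while pos < len(buffer):
        header = FRAME_HEADER.match(buffer, pos)
        if header is None:
            break
        start = header.end()
        end = start + int(header.group(1))
        try:
            payloads.append(json.loads(buffer[start:end]))
        except json.JSONDecodeError:
            break  # truncated or misaligned frame: leave it for pass 2
        pos = end
    return payloads, buffer[pos:]


def match_braces(text, start):
    """Return the index just past the '}' that closes the '{' at start."""
    depth, in_string, escaped = 0, False, False
    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i + 1
    return None  # object never closes in this text


def recover_documents(leftover):
    """Pass 2: regex finds candidate starts, brace matching finds the ends."""
    docs, pos = [], 0
    while True:
        found = DOC_START.search(leftover, pos)
        if found is None:
            return docs
        end = match_braces(leftover, found.start())
        if end is None:
            pos = found.start() + 1
            continue
        try:
            docs.append(json.loads(leftover[found.start():end]))
        except json.JSONDecodeError:
            pass  # balanced braces but not valid JSON: skip it
        pos = end
```

The brace matcher tracks string state so that a `}` inside a message body does not end the object early.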

Together, the two passes get about a 99.75% recovery rate.

The decryption: The messages are AES-encrypted using CryptoJS with the user ID as the passphrase. Once I figured that out, I had the full archive: 20,000+ messages, fully decrypted, in clean JSONL with timestamps, speaker labels, and metadata.
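
A sketch of how that decryption could look in Python, assuming the app uses CryptoJS's default passphrase mode. That mode produces OpenSSL-compatible output: Base64 of `Salted__`, an 8-byte salt, then AES-256-CBC ciphertext with PKCS#7 padding, with the key and IV derived by a single-round MD5 `EVP_BytesToKey`. If the app sets a custom key, IV, or mode, this will not apply. The function names are illustrative, and the AES step relies on the third-party `cryptography` package.

```python
import base64
import hashlib


def evp_bytes_to_key(passphrase, salt, key_len=32, iv_len=16):
    """OpenSSL-style key derivation: chained MD5 over passphrase and salt."""
    derived, block = b"", b""
    while len(derived) < key_len + iv_len:
        block = hashlib.md5(block + passphrase + salt).digest()
        derived += block
    return derived[:key_len], derived[key_len:key_len + iv_len]


def split_salted(ciphertext_b64):
    """Split a Base64 'Salted__' payload into (salt, ciphertext)."""
    raw = base64.b64decode(ciphertext_b64)
    if raw[:8] != b"Salted__":
        raise ValueError("not an OpenSSL-style salted payload")
    return raw[8:16], raw[16:]


def decrypt_message(ciphertext_b64, user_id):
    """Decrypt one message, using the user ID as the passphrase."""
    # Third-party dependency, imported lazily: pip install cryptography
    from cryptography.hazmat.primitives import padding
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    salt, body = split_salted(ciphertext_b64)
    key, iv = evp_bytes_to_key(user_id.encode("utf-8"), salt)
    decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
    padded = decryptor.update(body) + decryptor.finalize()
    unpadder = padding.PKCS7(128).unpadder()
    return (unpadder.update(padded) + unpadder.finalize()).decode("utf-8")
```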

The exporter is safe to rerun: it deduplicates against a SQLite database, so I can capture new messages daily without duplicates.
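
One simple way to get that behavior, sketched under the assumption that every message carries a stable `id` field: make the ID the primary key and let SQLite ignore rows it has already seen. The table layout and function names are hypothetical.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the message store; the ID column enforces uniqueness."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    return conn


def save_new(conn, messages):
    """Insert messages, skipping known IDs. Returns how many were new."""
    added = 0
    with conn:  # one transaction per export run
        for message in messages:
            cursor = conn.execute(
                "INSERT OR IGNORE INTO messages (id, payload) VALUES (?, ?)",
                (message["id"], json.dumps(message, sort_keys=True)),
            )
            added += cursor.rowcount  # 1 if inserted, 0 if ignored
    return added
```

Running the exporter twice over overlapping data then only adds the messages that were not already stored.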

The takeaway: If a platform won't give you access to your own data, that doesn't mean you don't have options. A browser interceptor, some decryption work, and a weekend of building can turn a locked-down app into a dataset.

Next week I'll post what I built to analyze those 20k+ messages, and what I learned from it.

#TechTuesday #LearnWithOJ #SoftwareEngineering #DevOps #SRE #ReverseEngineering #Python #Playwright #DataOwnership #BuildInPublic
