Miro Doporto PRO
Save time learning: use an easy AI of all AIs, and consider a friend. James Murdza's worldly, hospitable attitude and his YouTube series will catch you up fast.
Today is no different, as he's introduced me
to BackGrounder.dev
https://youtu.be/KFu0GTrV31g?si=jdM7DY9q49EM5FYA
A built-in sandbox plus multiple free chat models for code creation makes sense
and saves money, and you can BYO API. Great for quick sandbox dev checks, or just for safety. No regrets. Code insider knowledge.
My bad, I thought it was chat.
Yes, it's both sides solving the same problem with timed benchmarks: the harder the problem, the longer it takes and the more inaccuracies appear. The issue is that this is a major problem in AI, since it's wasteful and inaccurate, but my math is the difference. Thanks for the insight, though; I should do a third run using regular NumPy.
Those are not pre-programmed; they're logistics-computation, predeterministic calculations. Don't mix up seconds with microseconds: for instance, Qwen is going 60 mph while T3Boost is at 24 million mph. I can do this for all design, graphics, and audio. I already have this model; it's old, but it benchmarks mathematics, which also proves Navier-Stokes, as it's a RoC 100 memory.
Hope this helps; the toolbox can uncover some useful info. I also got hacked for my Gmails; it's been 4 months and the data was sold to Nvidia. https://toolbox.googleapps.com/apps/recovery/ownership?email=admin%40phix.earth&domain=phix.earth&case=70573062&flow=contested https://www.hostinger.com/report-abuse
We are very much on the same wavelength. I think we may be able to do way better if we combined tech.
Just saying, I've been able to reduce debugging by 95% and build out predeterministic builds in language design, with all-encompassing math turned into 3-dimensional code I call Auqqua. Now, if you look at utilizing that RAM technique in a parallel hold state, you've also just reduced debugging. If you use my math (Reimiro Miro), it's 400,000x faster; the key is implementing resonance in a customized way that seems like it would be slower, but it's definitely not.
Show HN: I compressed a 160GB KV cache to 640MB at 0.9994 fidelity on a $300 GPU
Title: Show HN: DenseMem — 256x KV cache compression, 0.9994 fidelity, runs on consumer hardware
---
A 72B model at 32K context needs 160GB of KV cache. That's an H100 and $32,000 in HBM3e memory.
I built a protocol that stores the same KV cache in 640MB of DDR5 RAM — on a consumer RTX 4090 and Core i9.
256x compression. 0.9994 cosine similarity. 1.95ms average fetch latency. Verified.
**How:**
Transformer KV cache activations are highly structured and correlated. SVD at rank=64 exploits that structure. Random noise compresses to 0.12 fidelity. Real KV cache activations compress to 0.9994. The math works because the data isn't random — it has geometry.
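The structured-vs-random contrast is easy to reproduce. A minimal sketch, not the DenseMem code: the shapes, the rank-16 "structured" generator, and the per-token cosine metric are my own assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
tokens, dim, rank = 1024, 128, 64

# "Structured" data: low intrinsic dimensionality plus a little noise,
# standing in for correlated KV-cache activations.
basis = rng.normal(size=(16, dim))
structured = rng.normal(size=(tokens, 16)) @ basis + 0.01 * rng.normal(size=(tokens, dim))
noise = rng.normal(size=(tokens, dim))  # no structure to exploit

def svd_compress(X, r):
    """Rank-r SVD reconstruction: keep only the top r singular directions."""
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * S[:r]) @ Vt[:r]

def fidelity(X, Y):
    """Mean per-token (row-wise) cosine similarity."""
    num = (X * Y).sum(axis=1)
    den = np.linalg.norm(X, axis=1) * np.linalg.norm(Y, axis=1)
    return float((num / den).mean())

print(fidelity(structured, svd_compress(structured, rank)))  # close to 1.0
print(fidelity(noise, svd_compress(noise, rank)))            # noticeably lower
```

The structured matrix survives rank-64 truncation almost losslessly because its information lives in far fewer than 64 directions; pure noise spreads energy across all directions and loses fidelity.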
The system manages a two-tier hierarchy: VRAM is the hot tier, DDR5 is the warm tier. An attention-weighted evictor (0.5 attn + 0.3 recency + 0.2 freq) decides what stays hot. A prefetcher using layer lookahead and token prediction pre-positions pages before they're needed. Average fetch latency: 1.95ms. Max under load: 3.96ms.
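The eviction weights (0.5 / 0.3 / 0.2) come straight from the description above; everything else in this sketch, the `Page` fields, the recency horizon, and the pick-lowest-score victim rule, is my own assumption about how such an evictor could be wired.

```python
from dataclasses import dataclass

@dataclass
class Page:
    attn: float         # normalized cumulative attention mass, 0..1
    last_access: float  # timestamp of the last fetch (seconds)
    freq: float         # normalized access frequency, 0..1

def hot_score(page, now, horizon=60.0):
    """0.5*attention + 0.3*recency + 0.2*frequency, as in the post."""
    recency = max(0.0, 1.0 - (now - page.last_access) / horizon)
    return 0.5 * page.attn + 0.3 * recency + 0.2 * page.freq

def pick_victim(pages, now):
    """Evict the coldest page: the one with the lowest hot score."""
    return min(range(len(pages)), key=lambda i: hot_score(pages[i], now))

now = 100.0
pages = [Page(attn=0.9, last_access=99.0, freq=0.8),   # hot page
         Page(attn=0.1, last_access=10.0, freq=0.05)]  # cold page
print(pick_victim(pages, now))  # 1: the cold page gets demoted to DDR5
```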
Current hit rate is 25% — bottlenecked by my i9's 2-channel DDR5 bandwidth (~38 GB/s). On an 8-channel Threadripper PRO (~224 GB/s) I'm projecting 65-75%.
**Running live:**
- Qwen2.5-7B on RTX 4090 at 32K context (was 4K)
- Every inference tick compressed INT8 via PCA → DDR5
- 2.4s cold start
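The "compressed INT8 via PCA" step can be sketched as: project activations onto a low-rank PCA basis, then quantize the coefficients to int8 with a symmetric scale. The shapes, rank, and per-tensor scale below are my assumptions, not the repo's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-in activations with low intrinsic rank (32), like correlated KV pages.
X = rng.normal(size=(512, 32)) @ rng.normal(size=(32, 128))

# PCA basis: top-64 right singular vectors of the centered data.
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
V = Vt[:64].T                                   # (128, 64) projection

coeffs = (X - mean) @ V                         # low-rank coefficients
scale = np.abs(coeffs).max() / 127.0            # symmetric per-tensor scale
q = np.clip(np.round(coeffs / scale), -127, 127).astype(np.int8)

# The warm tier would store q (int8), scale, mean, and V. Reconstruction:
X_hat = (q.astype(np.float32) * scale) @ V.T + mean
err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(q.dtype, round(err, 4))  # int8 payload, small relative error
```

Because the stand-in data's rank fits inside the 64-component basis, the only loss here is int8 rounding; real activations would add truncation error on top.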
**The cost math:**
- Uncompressed 72B KV cache: $32,000 in HBM3e
- Compressed with DenseMem: $1.88 in DDR5
- 99.4% cost reduction. Verified on consumer hardware.
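The size figure checks out arithmetically; one note on the dollar figures as quoted above: $1.88 versus $32,000 works out closer to a 99.99% reduction than 99.4%.

```python
# Checking the headline numbers quoted above.
kv_gb, ratio = 160, 256
compressed_mb = kv_gb * 1024 / ratio
print(compressed_mb)                      # 640.0 MB

hbm3e_cost, ddr5_cost = 32_000, 1.88
savings_pct = (1 - ddr5_cost / hbm3e_cost) * 100
print(round(savings_pct, 2))              # 99.99
```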
GitHub: https://github.com/thorshammerztp-arch/densemem-protocol
Patent Pending: US 64/045,595
Solo developer. Navy veteran. No funding. Consumer hardware.
As in Kimi K2's case, it's got jaws dropping.

