Memory & Context in AI

September 29, 2025

"One thing that should be learned from the bitter lesson is the great power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great. The two methods that seem to scale arbitrarily in this way are search and learning."
Rich Sutton's The Bitter Lesson

With systems and products increasingly becoming AI-native, personalization is seemingly becoming the moat. But this personalization cannot safely scale if it relies on the context window alone.

Although the "needle in a haystack" method for measuring long-context retention and quality degradation produces fairly consistent results as context windows grow, it is far too simple a probe for practical use cases.
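To make the critique concrete, here is a minimal sketch of what a needle-in-a-haystack harness does: it buries one "needle" sentence at a chosen depth inside filler text and then (in a real evaluation) asks the model to retrieve it. The function name, filler text, and needle are all hypothetical; a real harness would send the prompt to a model and score the answer.

```python
import random

def build_haystack(needle: str, n_filler: int, depth: float, seed: int = 0) -> str:
    """Place a 'needle' sentence at a relative depth (0.0 = start, 1.0 = end)
    inside a haystack of filler sentences."""
    rng = random.Random(seed)
    filler = [f"Filler sentence {i} about nothing in particular." for i in range(n_filler)]
    rng.shuffle(filler)
    pos = int(depth * len(filler))
    return " ".join(filler[:pos] + [needle] + filler[pos:])

needle = "The secret launch code is 4812."
prompt = build_haystack(needle, n_filler=200, depth=0.5)
# A real eval would append "What is the secret launch code?" and check
# whether the model's answer contains "4812" across many depths and lengths.
```

Note how little this tests: the needle is a single verbatim string, so success only requires lexical copying, not the reasoning over dispersed, paraphrased facts that real applications demand.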

Recent experiments have shown that non-lexical retrieval and text replication degrade in performance as input and context length grow, and those lengths naturally increase in more complex real-world scenarios and applications. As such, short-term memory via ever-larger context windows doesn't seem to be a safe bet (even the newer 1M+-token models are prisoners of this).
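Degradation aside, context-only memory has a blunter failure mode: once the budget is exhausted, the oldest information is simply evicted. A toy sketch (my own illustration, with a crude whitespace tokenizer standing in for a real one):

```python
from collections import deque

class ContextWindow:
    """Short-term memory as a fixed token budget: the oldest turns are
    evicted when the budget is exceeded, a stand-in for a model's
    context limit."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.turns: deque[str] = deque()

    def _tokens(self, text: str) -> int:
        return len(text.split())  # crude whitespace tokenizer

    def add(self, turn: str) -> None:
        self.turns.append(turn)
        while sum(self._tokens(t) for t in self.turns) > self.max_tokens:
            self.turns.popleft()  # oldest context silently lost

    def prompt(self) -> str:
        return "\n".join(self.turns)

ctx = ContextWindow(max_tokens=10)
ctx.add("My name is Ada.")                # 4 tokens
ctx.add("I prefer dark mode.")            # 4 tokens
ctx.add("Book a table for two tonight.")  # 6 tokens, pushes out the first turn
```

After the third turn, the user's name is gone from the prompt entirely, and no amount of context-window growth changes the shape of this problem; it only delays it.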

Instead of one-dimensional memory retention via context, a hybrid model, combining in-context short-term memory with long-term storage in vector databases and recall via RAG, could prove to be the key to the deeply personalized AI-native experience that feels hard to let go of. supermemory and Mem0 stand out as long-term memory layers for LLM applications, with interesting case studies on how their memory layers were leveraged for exactly this kind of personalization.
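The hybrid idea can be sketched in a few lines. This is not how supermemory or Mem0 are implemented; it is a minimal illustration of the architecture, with a toy bag-of-words "embedding" and cosine similarity standing in for a learned embedding model and a real vector database.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system uses a learned model."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class HybridMemory:
    """Recent turns stay verbatim in context (short-term); every turn is
    also indexed in a long-term store and recalled by similarity (RAG-style)."""

    def __init__(self, window: int = 2):
        self.window = window
        self.store: list[tuple[str, Counter]] = []  # long-term memory
        self.recent: list[str] = []                 # short-term context

    def add(self, turn: str) -> None:
        self.store.append((turn, embed(turn)))
        self.recent = (self.recent + [turn])[-self.window:]

    def build_prompt(self, query: str, k: int = 1) -> str:
        q = embed(query)
        # Rank only turns that have already fallen out of the context window.
        candidates = [te for te in self.store if te[0] not in self.recent]
        candidates.sort(key=lambda te: cosine(q, te[1]), reverse=True)
        recalled = [t for t, _ in candidates[:k]]
        return "\n".join(recalled + self.recent + [query])

mem = HybridMemory(window=2)
mem.add("User's name is Ada and she is vegetarian.")
mem.add("User asked about flights to Lisbon.")
mem.add("User wants a dinner reservation.")
prompt = mem.build_prompt("Suggest a vegetarian restaurant for the user.")
```

The vegetarian fact has scrolled out of the two-turn context window, yet similarity recall pulls it back into the prompt when the query needs it. That is the property pure context cannot give you: relevance-driven retention instead of recency-driven eviction.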