Is There A Non-Persistent Middle Ground For Offline Browsing?

David Berlind, Chief Content Officer, UBM TechWeb

November 5, 2008

Redmonk analyst Stephen O'Grady and I have been batting the challenges of persistence in Web browsers back and forth over the last week. Obviously exasperated with something, he <a href="http://twitter.com/sogrady/status/982393888">tweeted</a> that he'd like to see a browser with the ability to recall recently visited Web pages from a local cache of some sort. I immediately replied, thinking that what he was really looking for was offline persistence of a Web application. After Twitter proved unable to carry the dialog, we took to the blogs (him, me) to continue the conversation.

Offline persistence of anything in the browser -- pages, applications, data, etc. -- is an incredibly tricky problem. So much so that we'll be having an open conversation about it at Mashup Camp in less than two weeks (there's still room to join us).

O'Grady makes it clear that he's not looking for the whole enchilada -- the ability to interact with Web pages while offline as though he were online. He's after something that's not quite so ambitious -- just get him the old page(s) back (with whatever data he may have last filled into them) regardless of the browser's state of connectivity. Presumably, that information is either in the browser's cache already or something can be done to make sure it's there and easily retrieved. I agree. If this is easily done, then someone please do it. It would go a long way toward improving the overall Web experience for a lot of us.

But is it easily done? Or, better put, is it trivial enough to warrant the effort, or is it so non-trivial that shooting for the moon -- offering the same capability as part of a Google Gears-style offline persistence of Web apps -- makes more sense? O'Grady blogged my question and his response:

[DB:] "thanks in part to Gears and similar technologies from Adobe, Apache, Oracle, and Sun, we know more is possible. So, why accept less?" - [SO:] because it's likely to arrive far more quickly.

I'm not so sure, and I said as much directly on his blog with the following comment:

Regarding your answer "because it's likely to arrive far more quickly", is that really true? In this case, the science of caching form data in a way that's recoverable during some future session is nearly as difficult as solving the persistence problem in general. A modicum of structure would be required and, as you know, the majority of form-based HTML pages lack structure.

With such a lack of structure to these pages, the only approach might be something like a plug-in that's the equivalent of an autosave (like what Gmail does), but one that saves to the local hard drive, where the data can be retrieved later. But I suspect that even that is almost as complicated as the persistence problem.

For example, go to any page with a form on it (even the one for leaving a comment on your blog), fill some of the form out, and then choose File > Save to save the page to your local hard drive. Now, open that file with File > Open and you'll notice that whatever you filled into the form is no longer there.
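A rough way to see why the saved file comes back empty: what lands on disk is the page's markup, and the markup only carries the default values its author declared. What was typed lives in the browser's in-memory state. The Python sketch below is only an illustration under that assumption; the page, the field names, and the `declared_values` helper are all hypothetical, and real browsers differ in exactly what they write out when saving.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the default values that a page's markup declares for its fields."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._textarea = None  # name of the <textarea> currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if not name:
            return
        if tag == "input":
            # Only the value attribute is visible here, never the typed text.
            self.fields[name] = attrs.get("value") or ""
        elif tag == "textarea":
            self._textarea = name
            self.fields[name] = ""

    def handle_data(self, data):
        if self._textarea is not None:
            self.fields[self._textarea] += data

    def handle_endtag(self, tag):
        if tag == "textarea":
            self._textarea = None


def declared_values(html):
    """Return {field name: default value} as declared in the markup."""
    parser = FormFieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields


# Hypothetical saved copy of a blog comment form.
SAVED_PAGE = """
<form action="/comment" method="post">
  <input type="text" name="author" value="">
  <textarea name="comment"></textarea>
</form>
"""

# Hypothetical in-memory state: what the user had typed before saving.
typed_by_user = {"author": "David", "comment": "Is that really true?"}

saved = declared_values(SAVED_PAGE)
lost = {name: text for name, text in typed_by_user.items() if saved.get(name) != text}
print("recoverable from the saved file:", saved)
print("lost on save:", lost)
```

Anything that wants to bring the typed data back has to capture that in-memory state separately, which is where the structure problem starts.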

State in combination with lack of structure (even though the form seems structured) is most definitely an issue. The more I think it through, the more I realize how it's really a thorny problem.

Process-per-tab browsers (like Chrome) might be a part of the answer in that each tab could run in its own shell and those shells in turn could be capable of running some code against the currently loaded page in a way that doesn't interfere with the HTTP server's understanding of the page's state. I'm thinking "screen scraping" tech that, independently of the Web server, creates its own last known state for every tab.
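The "last known state for every tab" idea could be modeled along the following lines. This is a minimal sketch, not how any browser actually works: the `SnapshotStore` class, the tab IDs, and the URL are hypothetical, and the scraping of field values from the live page is assumed to happen elsewhere. The point is only that the record is kept on the client and never touches the server.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Snapshot:
    """Last known state of one tab: the page address plus scraped field values."""
    url: str
    fields: dict


class SnapshotStore:
    """Keeps one snapshot per tab, entirely on the client side."""

    def __init__(self):
        self._by_tab = {}

    def record(self, tab_id, url, fields):
        # Each new scrape overwrites the previous snapshot for that tab.
        self._by_tab[tab_id] = Snapshot(url=url, fields=dict(fields))

    def last_known(self, tab_id):
        return self._by_tab.get(tab_id)

    def dump(self):
        # Serialize to text so the state could be written to the local disk.
        return json.dumps(
            {tab_id: asdict(snap) for tab_id, snap in self._by_tab.items()},
            sort_keys=True,
        )

    @classmethod
    def load(cls, text):
        store = cls()
        for tab_id, raw in json.loads(text).items():
            store.record(tab_id, raw["url"], raw["fields"])
        return store


store = SnapshotStore()
store.record("tab-1", "http://example.com/comment", {"author": "David"})
store.record(
    "tab-1",
    "http://example.com/comment",
    {"author": "David", "comment": "Is that really true?"},
)

# Simulate a later session: write the state out, then read it back.
restored = SnapshotStore.load(store.dump())
print(restored.last_known("tab-1"))
```

Even this toy version has to decide what counts as a field and how to name it, which is the structure the pages themselves often lack.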

One question if this were working: would ordinary users expect the cached page to be able to inject the recovered information into the "real pages" when they're available? You're a power user. The idea that you might be able to pull the cached page back and copy and paste some information somewhere so it doesn't get lost isn't exactly a great user experience. Most people would get to that recovered page and ask "Now what?" There would be an expectation that the recovered page could inject the data back into the real page (when it's available), in which case a significant amount of flexible structure and intelligence would have to be incorporated into the solution: an architecture that's remarkably close to being a persistence mechanism.
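To make the injection problem concrete, here is a small sketch that matches recovered values to the live page by field name alone. The `reinject` function and the field names are hypothetical; a real solution would need far more intelligence than an exact-name match, which is rather the point.

```python
def reinject(live_field_names, recovered):
    """Split recovered values into those the live page can accept and those it cannot.

    Matching is by exact field name only, the weakest structure one could rely on.
    """
    live = set(live_field_names)
    restored = {name: value for name, value in recovered.items() if name in live}
    orphaned = {name: value for name, value in recovered.items() if name not in live}
    return restored, orphaned


# Hypothetical case: the site renamed its "comment" field to "body"
# between the time the page was cached and the time it came back online.
recovered = {"author": "David", "comment": "Is that really true?"}
restored, orphaned = reinject(["author", "email", "body"], recovered)
print("injected:", restored)
print("left for copy and paste:", orphaned)
```

Whatever ends up in `orphaned` is exactly the "Now what?" data: recovered, but with nowhere obvious to go.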

To be honest, I'm pretty sure most of what I'm saying is technically accurate (I mistakenly left out my concerns about security with such an architecture). But, by the same token, things under the hoods of browsers may have progressed to a point where I need to be taken out to the woodshed for a technical spanking. I've only managed to keep half an eye on the situation over the last year and know things have improved dramatically.

Or perhaps the answer is a bit simpler. For example, why, when I hit my browser's back button, am I sometimes returned to a Web form that still has all my user-entered data in it, while at other times hitting the back button takes me to a completely blank form (as though I never entered anything)? I most often notice this in e-commerce situations. Then again, when the browser's back button is involved, the implication is that the original form was submitted (thereby updating the page's state).

If you have thoughts on this, please do share them with us.
