I’ve been getting madder and madder about the increasing use of dorky web links; for example, twitter.com/timbray has become twitter.com/#!/timbray. Others have too; see Breaking the Web with hash-bangs and Going Postel. It dawns on me that a word of explanation might be in order for those who normally don’t worry about all the bits and pieces lurking inside a Web address.

How It Works · Suppose I point my browser at http://twitter.com/timbray. What happens is:

  1. The browser connects to twitter.com over the Internet and sends a query whose payload is the string /timbray.

  2. Twitter’s server knows what /timbray means and sends back the HTML which represents my tweetstream.

  3. The browser soaks up that HTML and displays it. The HTML will contain links to all sorts of graphics and chunks of JavaScript, which the browser uses to decorate and enhance the display.

On the other hand, when I point the browser at http://twitter.com/#!/timbray, here’s what happens:

  1. The browser pulls apart that address, saving the part after the “#”, namely !/timbray, locally. This part is technically called the “fragment identifier”, but let’s say “hashbang”.

  2. It connects to twitter.com and sends a more-or-less empty query, because the address was all hashbang.

  3. The server doesn’t know whose tweetstream is being asked for because that information was in the hashbang. But what it does send includes a link to a (probably large and complex) chunk of JavaScript.

  4. The browser fetches the JavaScript and runs that code.

  5. The JavaScript fishes the hashbang (!/timbray, remember?) out of browser memory and, based on its value, pulls down a bunch more bits and pieces from the server and constructs something that looks like my tweetstream.
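
To make the difference between the two sequences concrete, here is a minimal Python sketch of the split the browser performs before it sends anything. It uses the standard library's `urllib.parse.urlsplit`; the helper name `request_target` is invented for illustration and is not part of any browser or library.

```python
from typing import Tuple
from urllib.parse import urlsplit


def request_target(url: str) -> Tuple[str, str]:
    """Return (what is sent to the server, what stays in the browser)."""
    parts = urlsplit(url)
    # The request line carries only the path and query, never the fragment.
    sent = parts.path or "/"
    if parts.query:
        sent += "?" + parts.query
    return sent, parts.fragment


print(request_target("http://twitter.com/timbray"))
print(request_target("http://twitter.com/#!/timbray"))
```

For the first address the server is asked for /timbray; for the second it is asked only for /, and !/timbray never leaves the client.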

How To Be Found · Web pages are retrieved by lots of things that aren’t browsers. The most obvious examples are search-engine crawlers like Google’s. These things also peel off the hashbang, and since they typically don’t run JavaScript, if you’re not careful, your site just vanished from the world’s search engines.

It turns out that you can arrange to do this and still have a way to be indexed and searched; Google provides instructions. I would describe this process as a hideous kludge; not because the Googlers who cooked it up went off the road, but because the misguided rocket scientists drove these links so far off the road that there wasn’t an unkludgey way back onto it.
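
For the curious, the core of Google's scheme is a URL rewrite: a crawler that sees a `#!` address asks the server for the same address with the hashbang moved into an `_escaped_fragment_` query parameter, and the server is expected to answer with a static snapshot. The Python sketch below illustrates that mapping only. The function name `crawler_url` and the example.com address are made up for illustration, and the set of characters left unescaped is an approximation from memory, so check Google's documentation for the exact rules.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left unescaped (assumption, see above). Everything else outside
# letters, digits and "_.-~" is percent-encoded, which covers space, "#",
# "%", "&" and "+".
SAFE = "!\"$'()*,/:;<=>?@[\\]^`{|}"


def crawler_url(url: str) -> str:
    """Map a '#!' URL to the '_escaped_fragment_' form a crawler would fetch."""
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url  # not a hashbang URL; left unchanged
    param = "_escaped_fragment_=" + quote(parts.fragment[1:], safe=SAFE)
    query = parts.query + "&" + param if parts.query else param
    # The fragment is dropped; its content now travels in the query string.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(crawler_url("http://twitter.com/#!/timbray"))
print(crawler_url("http://example.com/page?x=1#!k=a&b"))
```

The rewrite puts the identifying string back into the part of the address the server actually receives, which is exactly the information the hashbang removed from it.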

Contracts · Before events took this bad turn, the contract represented by a link was simple: “Here’s a string, send it off to a server and the server will figure out what it identifies and send you back a representation.” Now it’s along the lines of: “Here’s a string, save the hashbang, send the rest to the server, and rely on being able to run the code the server sends you to use the hashbang to generate the representation.”

Do I need to explain why this is less robust and flexible? This is what we call “tight coupling” and I thought that anyone with a Computer Science degree ought to have been taught to avoid it.

History · There was a time when there was a Web but there weren’t search engines. As you might imagine, the Web was much less useful. Because everything was tied together using simple links, when people first had the idea of crawling and indexing, there was nothing getting in the way of building the first search engines. I know because I was one of the people who first had those ideas and built one.

So, my question is: what is the next great Web app that nobody’s built yet that depends on the simple link-identifies-a-resource paradigm, but that now we won’t be seeing?

Why? · There is no piece of dynamic AJAXy magic that requires beating the Web to a bloody pulp with a sharp-edged hashbang. Please stop doing it.



Contributions

From: Tony Fisk (Feb 09 2011, at 21:31)

Dumb question [after that tirade]: what is a hashbang [hashbong?] meant to achieve in the first place?

[link]

From: Tom Malaher (Feb 09 2011, at 21:39)

In point 5: "pulls down a bunch more bits and pieces from the browser" should be "...from the server"

And I've noticed that jquerymobile does exactly this. All links to pages are hijacked by javascript that fetches and renders the desired page content and then updates the browser URL, without a full page reload, replacing http://site/dir/page with http://site/dir#page in the address bar.

This is bookmarkable, and can get you back to the same place, but at the expense of two page loads as far as I can tell (one to load site/dir into the browser, from which the javascript then loads site/dir/page). Doesn't feel right, and doesn't seem to work reliably in some less-capable browsers (e.g. Blackberry, Opera Mobile)

[link]

From: Ted Wood (Feb 09 2011, at 22:27)

I ask that, too... What's the point? To control crawling?

The recently-built Gawker.com uses the hashbang approach and now page links into their site from search engines no longer work correctly. It's a real mess. What was the point?

[link]

From: Ted Wood (Feb 09 2011, at 22:49)

All I can say is Wow! I read that Isolani article you linked to and it brought up some excellent points that seem to be completely ignored by the brainiacs behind this change.

- How does this impact referrers? They're now officially broken.

- How does this impact web analytics? Does Google Analytics track hash-bang pages properly? How about other stats tools?

- Was the 5-hour outage really worth it?

I can't wait for them to revert this brilliant move. Twitter doing this, fine... but not a blog or news site, that's just ridiculous.

[link]

From: Harish Mallipeddi (Feb 09 2011, at 22:57)

The hashbang became popular in AJAX applications as a way to represent current application state. There are apps like GMail which basically load one URI and almost everything else is just a bunch of AJAX calls.

In GMail's case, if you were viewing your Inbox, it would add a hashbang like #folder/Inbox. If you move to another label/folder, it updates the hashbang. That way if the user bookmarks the page or just copy-pastes the URI somewhere else, you can restore the original state.

If used well like in GMail, this technique actually has some value. So it's not all bad as Tim Bray points out here but Twitter took it too far.

[link]

From: James Pearce (Feb 09 2011, at 23:06)

The recent outbreak of hashbang-phobia seems to belie an underlying fear of the web app as it encroaches on the web document's world.

Life would indeed be simple if, as it once was, the web simply comprised synchronous documents.

But life moves on. I welcome the emergence of web-based applications and the functionality they bring to our lives. That a single URL can now map me to an application, rather than a document, is nothing to be scared about.

So far, so benign. But doesn't the author of the application deserve to be allowed to track state within it? Indeed it's a feature that you can deep link into the Twitter application's state machine. And hashbangs appear to be one of the few reasonable ways to do that.

Hashbangs are the command line arguments of the modern web app.

The fact that search engines and decade-old crawlers are caught in the headlights though is not a reason not to push the boundaries of what web technology can be made to do within the realm of a rich client application.

(Indeed, I consider curated 'web app stores' as an admission of defeat - hopefully only temporary - while search providers figure out how to augment or replace document-centric keyword algorithms with something more suitable for indexing the rich functionality of a web of apps)

Whether or not Twitter should have built an app at all (rather than sticking with documents) or whether Gawker's code was any good or not are valid - but entirely different - questions.

[link]

From: Mark Jaquith (Feb 09 2011, at 23:06)

Re: "What's the point?"

Older browsers (which at this point is pretty much everything that isn't Chrome or Safari) don't have the ability to change the whole URL in the address bar on the fly with JavaScript. They do, however, have the ability to change the fragment... the # and everything after it. So when you have a state change due to XHR ("AJAX"), and want to update the URL, the only option in older browsers is to use these hacky URLs. For instance, if you're on page 1 and you use AJAX to load page 2, you don't want the URL to still say page 1, or someone is going to copy that and send it to someone and it won't work as expected.

So it's a kludge that is going to be pointless in a year or two, as Firefox (and hopefully IE) gain support for HTML5's pushState/replaceState/popState functionality, which allows for updating the entire URL (on the same domain) to a REAL URL.

Our approach in WordPress, for some of the admin screens in the next version, is going to skip XHR ("AJAX") loading if your browser doesn't support the HTML5 stuff. No way in hell am I going to see WordPress spitting out hashbang URLs.

[link]

From: Amir Shimoni (Feb 09 2011, at 23:39)

There are benefits. Once the javascript code is loaded (maybe even from cache), because you only need to load small parts of the page as you navigate the site, the site can be quite a bit faster, and less bandwidth intensive (better for mobile). And as mentioned before, you can bookmark the page, and search engines that understand #! can still crawl the pages.

[link]

From: Michael Stillwell (Feb 10 2011, at 00:41)

Chrome at least, supports history.pushState(), which changes the URL visible in the browser without triggering a real page reload.

http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#dom-history-pushstate

This helps with some of the problems mentioned above. (The only visible/bookmarkable URLs are those without a #!; initial loads can be fulfilled by a single request to the server.)

[link]

From: Bruno (Feb 10 2011, at 01:05)

Of course the URL should be a robust identifier of specific content, and of course twitter has taken the technique too far by applying it to their entire website, but there is a definite force for websites to load more and more content with Ajax and then I don't know any better solution. At least the fragment makes the URL bookmarkable.

In other words, Ajax was added to HTML and HTTP after the facts, and the technique is a complete kludge, but what can we do now?

[link]

From: IBBoard (Feb 10 2011, at 01:10)

Re: hashbang (or shebang) it's a Unix scripting thing: http://en.wikipedia.org/wiki/Hashbang

In places like Trac's source code browser then I've seen the benefit of abusing the fragment in this way - I can expand a tree (which will be populated with JS), it alters the fragment, I view a file by clicking its link, I press back and it auto-expands the tree to where I was. Before they included it then I'd have to re-expand the whole tree manually as it would forget where I was.

That said, I agree that Twitter has gone too far. Fine, maybe they could do it with some sub-features (e.g. paging) but why break the old structure, which gave you proper URLs in referrers? If I get a referral from Twitter then I don't know who it was as the fragment isn't transferred. Given the number of tweets and the preponderance of shorteners then I can't easily find it either.

My only guess on the referrers is that Twitter are going to use their t.co links to provide some of that info - piggy-back some JS on the link in the web version and you've got a way of modifying things to catch the fragment, allowing you to sell bit.ly-style link shortener stats that no-one else can get.

[link]

From: Bradley Wright (Feb 10 2011, at 01:33)

The answer to "why people do this" is this:

http://code.google.com/web/ajaxcrawling/docs/getting-started.html

Google has recommended the #! URL construct as a way of indexing pure-JS URLs like they were regular URLs, so it has since become a de facto standard.

The fact is, with the new pushState API in the HTML5 DOM, that #!-only client-side routing is already the lazy solution. You can make "web apps" (exactly like Gawker isn't--it's a bunch of blogs) using regular old path-based URLs and update those URLs dynamically.

[link]

From: Matthew Todd (Feb 10 2011, at 01:43)

As for a piece of dynamic AJAXy magic that doesn't require beating the Web to a bloody pulp, consider the approach of https://github.com/blog/760-the-tree-slider

[link]

From: Dominic Pettifer (Feb 10 2011, at 02:51)

To those who don't understand what the point of the hashbang #! is, it's to make Ajax heavy sites (single HTML page with tonnes of JavaScript to load content in dynamically) bookmarkable. People like to save bookmarks, post in Twitter or Digg etc. You could say it's storing the current state of the page in the URL e.g. "example.com/#!/products/32gb-toaster" tells the JavaScript code I want to look at a product called a '32GB toaster' so it knows to load that content in.

My personal opinion is that hashbangs #! are a necessary evil until something called history.pushState is more broadly supported. history.pushState is a JavaScript API that allows URLs to be dynamically updated via script alone (without a full page refresh). Basically, JavaScript can't update the URL in the address bar except to the right of the '#' hash symbol, but with the history.pushState API it can.

[link]

From: Bill (Feb 10 2011, at 03:00)

This mechanism was created by google, proposed by google, recommended by google, and endorsed by google for many years.

I dare say you should be mad at google!

I think it should be quite possible to build an ajax app where each page of data fetched from the backend is done with a parameter that says "send the json data, rather than rendering out an html page"... at the same url which, if visited, would present a normal html page. After fetching the new data, the app re-writes the url in the toolbar to represent that page.

So from ajax you can pull www.tbray.org/rant1?f=json and get json data, but rewrite the url to say www.tbray.org/rant1 and if a user later follows that link they get the fully rendered html page. Then if the user navigates to rant2, the url is changed to rant2 while /rant2?f=json is pulled in the background, without a page load, but the link is still valid for search engines and sharing.

This solves both problems to my eye. But google recommended a different way, and consequently created a standard.
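
The server side of the idea described above can be sketched in a few lines of Python. This is a hedged illustration only: the `PAGES` store, the `respond` function, and the page contents are hypothetical stand-ins, and only the `?f=json` convention comes from the comment itself.

```python
import json
from html import escape
from typing import Tuple
from urllib.parse import parse_qs, urlsplit

# Hypothetical content store standing in for a real backend.
PAGES = {
    "/rant1": {"title": "Rant 1", "body": "First rant."},
    "/rant2": {"title": "Rant 2", "body": "Second rant."},
}


def respond(url: str) -> Tuple[int, str, str]:
    """Return (status, content type, body) for a request to the given URL."""
    parts = urlsplit(url)
    page = PAGES.get(parts.path)
    if page is None:
        return 404, "text/plain", "Not found"
    # Same path, two representations: JSON for the script, HTML for everyone else.
    if parse_qs(parts.query).get("f") == ["json"]:
        return 200, "application/json", json.dumps(page)
    html = "<h1>{}</h1><p>{}</p>".format(escape(page["title"]), escape(page["body"]))
    return 200, "text/html", html


print(respond("http://www.tbray.org/rant1"))
print(respond("http://www.tbray.org/rant1?f=json"))
```

Because the plain URL always yields a full HTML page, crawlers and shared links keep working whether or not any script runs.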

[link]

From: Jeff Rose (Feb 10 2011, at 03:21)

Your URL looks like a bunch of garbled nonsense as I type this comment, but who cares since I would never type it in by hand anyways. I've never understood why people get so worked up about beautifying their URLs. It seems like an obsessive compulsive tendency amongst web developers, or some kind of bragging right for hackers that couldn't find something better to spend their time on. Does a legible URL (beyond the domain name) really provide much value to anyone? I seriously doubt it.

[link]

From: rtp (Feb 10 2011, at 03:56)

HTML 5 History API seems rather promising. The site could be designed to support the history API if the browser supports it, and fall back to normal old-school behaviour if the browser doesn't. No broken links if the user tries to open a "hashbang" link, as there'll be no hashbang links, yet users equipped with a reasonable browser will have access to the more "interactive" functionality. Also, there won't be a need for two requests when the page loads, as the pages load as usual -- you merely put a layer on top of it that makes every in-site request use ajax.

Too bad the support for the history API is poor at this moment, but progressive enhancement and graceful degradation are key to making usable sites. Also, progressive enhancement is becoming more relevant by the day as there are more and more kinds of browsers available (from mobile device browsers to desktop browsers to TV browsers to... you name it).

[link]

From: Stefan Tilkov (Feb 10 2011, at 04:01)

Why did Google come up with this kludge in the first place? If it hadn't, there'd be a great incentive for everyone to just behave. Now we get this mess as a result.

[link]

From: David Radcliffe (Feb 10 2011, at 04:12)

Facebook has been doing this for a long time, much longer than twitter or any other site. No one seemed to have a problem with it then. Or maybe they just didn't notice? Interesting...

[link]

From: jmarranz (Feb 10 2011, at 04:43)

Tim I don't understand your complaints against using hash-bangs (or the name you like more).

The hash-bang is Google's effort to standardize a de facto reality: **AJAX is massively used**, and the more AJAX is used to load new data/markup, the more data is not going to be indexed by web crawlers unless you are conscious of this problem and provide the necessary hacks to get your AJAX-intensive web site indexed.

The hash fragment is the trick to get your AJAX-intensive web site indexed. Google's specification of hash-bangs makes it easier for us to build indexable AJAX sites, nothing more (there is another technique to get the same with other search engines; of course this technique is not so pretty).

We know the web was invented to link static documents, right, but this is no longer the *actual* web; the actual web is more and more an ecosystem of APPLICATIONS, and I'm sure you know how awful the "page paradigm" is for developing pure (intranet) web applications, and you must recognize the line dividing web sites and web applications is becoming more and more blurred, and this line has much to do with how AJAX is compatible with SEO, bookmarking, back/forward buttons, visit counters, accessibility etc.

I maintain that the future of web sites is the Single Page Interface, that is, AJAX-intensive sites with no reloads (or only occasional ones).

This is already here; the list is short but growing (Twitter, Facebook, Google Search, Lifehacker, Gizmodo...). In my opinion SPI adoption, like AJAX adoption, is a matter of time.

The state of art of web development says that we can build AJAX-intensive sites:

- SEO compatible (any web crawler)

- Bookmarking of "states" (state=page in an AJAX intensive web site)

- Working with JavaScript disabled (accessible without WAI-ARIA)

- Back/Forward buttons (history navigation in general) working as usual

- Visit counters used to monitor "states" instead of pages.

- No site duplication (AJAX/page based)

More info at The Single Page Interface Manifesto

http://itsnat.sourceforge.net/php/spim/spi_manifesto_en.php

[link]

From: stand (Feb 10 2011, at 05:49)

It also de-commodifies the web server to some extent, no? It's not clear to me that it is possible for me to deploy a caching proxy, for example against one of these hashbang sites, though maybe the magic javascript libraries are smart about setting http headers [I wouldn't take that bet though].

We could be inching in a direction where only the Facebooks and Twitters of the world will have the resources necessary to scale a website. This concerns me.

[link]

From: wetterfrosch (Feb 10 2011, at 05:56)

"Plus one."

[link]

From: Simon St.Laurent (Feb 10 2011, at 06:01)

Just wondered if you'd seen "Repurposing the Hash Sign for the New Web", at http://www.w3.org/2001/tag/2011/01/HashInURI-20110115 .

There's a lot of ugliness to this even before you get to the bang!

[link]

From: Ed Davies (Feb 10 2011, at 06:28)

And while we're bashing Twitter over the back of the head for encouraging messed up URIs, can we also tell people that the circumstances where shortened URIs make any contribution are also pretty limited?

But then I think sticking "www." on the front of a domain name is a silly and unnecessary affectation. How many organizations have a more important service running on example.com and relegate their main web presence to a separate machine/IP address mapped only from www.example.com?

[link]

From: Nathan (Feb 10 2011, at 06:30)

I'll save duplicating text here, and merely point to something I wrote earlier on the topic:

http://lists.w3.org/Archives/Public/www-tag/2011Feb/0095.html

IMHO, "hashbang" is orthogonal to the /real/ problem(s).

[link]

From: Adam Fields (Feb 10 2011, at 07:05)

You point out the specification that's endorsed by Google as the right way to do this, but you miss the obvious conclusion: Google must retract this specification and figure out another way to index these pages that doesn't break URL specificity. As long as Google recommends this as the right way to generate urls for dynamic pages, that's how people will do it.

I've also noticed twitter doesn't seem to honor the '?_escaped_fragment_=' form of the snapshot urls. Where's the static snapshot version of those pages?

[link]

From: davide (Feb 10 2011, at 07:18)

Agreed (although if used correctly like in GMail it's a useful technique).

In the matter of broken links, I'll also throw in the URL shorteners, which I just hate! On one hand, we make a lot of effort to have human-meaningful, permanent links. On the other, we make them cryptic and with an expiration date. Dumb.

[link]

From: Andrew Wahbe (Feb 10 2011, at 07:26)

Yes but the current path that HTML5 is on is only going to make this worse. The guiding design principle is to put as much functionality into Javascript APIs as possible. This means that HTML is becoming a JAR format for Javascript and not a declarative representation of the rendered page and its embedded controls. Development flexibility is being maximized at the cost of all other concerns. The Principle of Least Power has been abandoned. The Web is being transformed into a Mobile-Object architecture.

Development follows the path of least resistance provided by the languages, protocols and platforms. We've dug a big hole in the sand; we can't get mad that the water is now pouring in.

[link]

From: Big M (Feb 10 2011, at 07:40)

Amir is the only one who gets it. Typical bandwidth savings of switching to a scheme that uses client side caching is around 85%. For mobile devices, this also translates to power savings.

Stay tuned. More sites will start using this and they'll start using it correctly.

[link]

From: Zach Beane (Feb 10 2011, at 07:42)

Google Groups uses #! pervasively, and it's pretty annoying.

[link]

From: Mark Beeson (Feb 10 2011, at 07:43)

Tim, after looking at the new Gawker's source, I'm convinced I know why they're doing what they're doing. If you look through the HTML source, they're effectively putting _all_ their HTML into one file. We can debate whether or not that's a good idea, but my feeling is that they're _attempting_ to cache the layout of the entire site on your browser, all in one request.

The theory being that once I've cached the view layer of the site, the amount of data required to view individual "pages" is much smaller. I haven't broken out Firebug on gawker.com yet but I'd guess that they're simply returning fragments of JSON for each article.

Unfortunately, HTML was never meant to do this, and you're seeing the result; really hacky links, and--as it turns out-- the site isn't any faster at all. Twitter sort-of behaves the same way.

Luckily, there _is_ a way to cache the entire view layer on the client, while still respecting pages and still offering AJAX in correct places without monstrous URLS. Check out the source of skechers.com -- we're using client-side XSLT, which gets cached in the browser, and the one xsl file describes the layout of the entire site. That way, when navigating from page to page, you're only downloading a small amount of XML. We found that this speeds up our app immensely, and allows us to pull off a massive, massive amount of caching.

[link]

From: Farmer Bob (Feb 10 2011, at 07:51)

@James Pearce:

A lot of sites using this technique are clearly not web apps. They deliver content. Mostly text with some pictures. Like documents.

I wouldn't call this "pushing the boundaries" of the web, at least not in any useful sense. People have been doing this sort of thing with rich web apps for years. So, not only is this technique not new, it's absolutely being widely misused and abused. This is kicking the web in the face.

Your flip dismissal of "decade-old crawlers" seems pretty hasty and might be construed as fighting words by scores of web engineers. There are rules on the web. They're what make it work. Abusing something like this is like repainting the stripes on the road in purple and blue and waving away those who warn that they were painted the way they were for a reason. We have standards for a reason.

URLs are supposed to represent resources. A web app can be a resource, and there are techniques for managing state within those. Hashbangs might be one of these. But when large web properties are converting all their links to _articles_ and other _bits of text_ (tweets/twits/whatever) into these monstrosities, it's not innovation. It's a huge mistake that ought to be regretted now and will certainly be regretted in the future.

[link]

From: Adam Fields (Feb 10 2011, at 08:18)

By the way, there's an obvious-but-not-ideal answer to dealing with this. It's pretty simple - "#!" is not the same as "#" when parsing a url. Treat plain "#" as a fragment, and treat "#!" as part of the path component.

I forked Addressable to do this, and the change to the regexp was minimal:

https://github.com/fields/addressable
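
As a rough illustration of that parsing rule (not the actual change in the Addressable fork, which is Ruby and regexp-based), a Python sketch might look like this. The function name `split_hashbang` and the way the two pieces are joined are assumptions made for the example.

```python
from typing import Tuple
from urllib.parse import urlsplit


def split_hashbang(url: str) -> Tuple[str, str]:
    """Return (path, fragment), folding a '#!' fragment into the path."""
    parts = urlsplit(url)
    if parts.fragment.startswith("!"):
        # Assumed joining rule: '/dir' plus '#!/page' becomes '/dir/page'.
        folded = parts.path.rstrip("/") + "/" + parts.fragment[1:].lstrip("/")
        return folded, ""
    # A plain '#' fragment keeps its usual meaning.
    return parts.path, parts.fragment


print(split_hashbang("http://twitter.com/#!/timbray"))
print(split_hashbang("http://site/dir#page"))
```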

[link]

From: Tom (Feb 10 2011, at 08:38)

Something I haven't seen mentioned is how the hash-bang breaks the way I surf the web.

Since getting my HTC Incredible Android, I almost never surf on a "real" PC any more. I use Dolphin on Android to read my Google Reader feeds.

Lifehacker is now broken for me. Every link I follow to read an actual article lands me on the home page.

Great. Not that Lifehacker has/had much(any?) original content, but it was a really nice aggregator of a certain type of content.

Now, the Lifehacker RSS feed is useless to me. Maybe I should be thanking lifehacker - much of their content is about gaining/finding/managing/making time for important things. With this change, they have shown me how to gain back a good 15 minutes of my life every day - unsubscribe from lifehacker.

Well played, lifehacker. Well played.

(P.S. - I would like to see more discussion about how this breaks the mobile web experience in other ways, lifehacker just struck a nerve. I didn't mean to focus solely on them.)

[link]

From: Andy Baird (Feb 10 2011, at 08:48)

I'm confused on why you are so frustrated.

Everything after the .com is a query. Who cares whether that query is processed client side or server side, so long as it gets processed? It doesn't make URLs that much (if any) harder to read, especially since you can take out the #! when sending around the links if you really want.

[link]

From: onebyzero (Feb 10 2011, at 09:01)

+ 1.

Creates more problems. They've taken it too far.

Gmail employs the # wonderfully well. Even Wikipedia uses the # in the URL to jump to a section.

Sites like gizmodo though make a hash of it. Unnecessary and they've taken it too far without gaining much.

[link]

From: Reed Hedges (Feb 10 2011, at 09:24)

We are still dealing with the legacy of HTTP servers that stopped at "map URLs to files on disks and return the files". A generation and a half of web developers have grown up with this idea fundamentally stuck in our idea of how the web is supposed to work.

There is no good abstract reason that http://twitter.com/timbray can't return a generic twitter stream view that does some Javascript AJAX thing to get your specific data, if this is how you want it to work for whatever reasons. (though I acknowledge the downsides to this in terms of searching and other semantic-web stuff).

[link]

From: KevBurnsJr (Feb 10 2011, at 09:29)

Actually that first part is only half true.

If you're logged in, the first request returns:

HTTP/1.1 302 Found

Location: http://twitter.com/#!/timbray

[link]

From: Kevin H (Feb 10 2011, at 09:37)

Nathan has it right, and so I'm going to repost his link in case you skimmed past it the first time:

http://lists.w3.org/Archives/Public/www-tag/2011Feb/0095.html

Web applications are a "Good Thing" and should not be discouraged. What we need is some good guidelines to be pushed forward that teach how to create a proper web-friendly representation of your data FIRST, and then to build an application on top of that. Twitter actually got this mostly right - I can still wget http://twitter.com/timbray/status/35569597440598019 and when I do, the representation returns proper links, such as: http://twitter.com/timbray

Gawker definitely got this part wrong, as a wget to any of their old URLs will bounce you to /#!some/path (is a hash even ALLOWED in an HTTP Location header?)

What I'd like to see is for sites like Gawker and Twitter to publish their web applications under something like an app.lifehacker.com subdomain and dedicate their www subdomain for normal web content. Of course the question then becomes, should they direct all capable browsers over to app automatically or should it be opt-in?

And at what ratio of app traffic versus total traffic does it stop making sense to expend resources on maintaining www?

[link]

From: Scott Jehl (Feb 10 2011, at 10:08)

Excellent post! I couldn't agree more. It's especially discouraging to find that Twitter feels much LESS responsive on modern browsers because of this direction.

@Tom Malaher

Several people have commented on jQuery Mobile's use of URL fragments, but there's an important difference between the Google-recommended, Twitter-applied use discussed in this article and what we're doing in jQuery Mobile. In the former case, the application is entirely javascript-dependent; even Twitter's link hrefs that point from one page to another use that JS-dependent "hashbang" syntax. Turn off JS and you'll get a blank page! In jQuery Mobile, there's real HTML under the hood in every page you visit, and the hrefs in that HTML point to real server locations (with no # to be found, unless they're truly a local anchor). This means that from an SEO standpoint, a jQuery Mobile site works as you'd expect from any web-based site or app: real HTML, real links, and progressive enhancement.

Now, of course we do use location.hash to always maintain an addressable URL while "pages" are brought into a single DOM through Ajax; all in order to create mobile-friendly transition effects between pages, but it's an optional feature. To my knowledge, we're the only framework that attempts to play well with the browser's native history out-of-the-box, but we do it because users expect that age-old features like the back button, bookmarks, etc will continue to just work regardless of whether a "page" was generated through Ajax or HTTP, and whether that site happens to be an "app" or not.

Non-JS users never see # URLs when navigating a jQM site - each page is a fresh new load (turn off JS and browse our docs to see). Yes, these users may end up clicking a few times to return to a hash-dependent link sent from a user in a JS-capable browser, but that's far better than getting a blank page, and we're currently working on history.pushState support to lessen the likelihood of that occurring at all. I hope that clears this up a bit!

[link]

From: Communibus Locis (Feb 10 2011, at 10:15)

I'm not a huge visitor of Gawker but it seems slower, like there is a second delay before the content spins up.

[link]

From: John Haugeland (Feb 10 2011, at 10:27)

The reason they do this, of course, is SEO. Because the hash separates the URI from the URN, all twitter links appear to be pointing directly at twitter, which has a dramatic effect on their pagerank and similar metrics.

Tim, you work at Google. The way to fix this isn't blogging. The way to fix this is getting the ranking team to penalize this sort of obvious manipulation.

For the health of the web, it's time to hit the "punish" button.

[link]

From: jmarranz (Feb 10 2011, at 10:51)

"It's especially discouraging to find that Twitter feels much LESS responsive on modern browsers because of this direction"

"I'm not a huge visitor of Gawker but it seems slower, like there is a second delay before the content spins up."

I'm not sure what the two of them are doing, but in my opinion a well-done Single Page Interface web site is ALWAYS more performant than its page-based counterpart.

Try with this demo

http://itsnatsites.appspot.com

Play with the AJAX-intensive page, then disable JavaScript; it now works the same way, but with a full page reload on every click.

Do you really feel the same performance?

The trick is to use innerHTML as much as possible for partial changes of the page.

[link]

From: seutj (Feb 10 2011, at 11:23)

Is it just me or did half the commenters completely miss the point?

Ajax-driven applications aren't evil; those that require it and completely break without it are. As with any JavaScript "enhancement", it should first work without, and then you add the optional fancypants.

[link]

From: PENIX (Feb 10 2011, at 11:26)

Usage of hashbang can be very useful, but the scope is limited, and it should only be used in moderation. Gawker's implementation of hashbang is just one more glaring mistake they are adding to their list.

[link]

From: Vanessa Fox (Feb 10 2011, at 11:45)

Someone linked earlier to the Google documentation on this. I wrote a bunch about this (and went into great detail) when Google was proposing it and when it was launched.

http://searchengineland.com/google-proposes-to-make-ajax-crawlable-27408

http://searchengineland.com/googles-proposal-for-crawling-ajax-may-be-live-34411

http://searchengineland.com/its-official-googles-proposal-for-crawling-ajax-urls-is-live-37298

Basically, it's to make sure Google can crawl the page and extract the content (as Google rightfully generally drops everything in the URL after the #). (It's a lot more complicated than that, of course. There's a whole headless browser that serves back a static version of the page, etc.)
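As a rough illustration of the crawling scheme those articles describe: the crawler rewrites a `#!` address into one carrying an `_escaped_fragment_` query parameter, so that the state reaches the server. The sketch below is a simplification under stated assumptions: `crawlerUrl` is a hypothetical name, and the set of characters it escapes is a small subset of what the scheme specifies.

```typescript
/**
 * Rewrite a "#!" address into the form a crawler would request.
 * Returns null when the address has no hashbang, since the scheme
 * does not apply to ordinary fragments.
 */
function crawlerUrl(prettyUrl: string): string | null {
  const hashAt = prettyUrl.indexOf("#");
  if (hashAt === -1 || prettyUrl.charAt(hashAt + 1) !== "!") return null;

  const base = prettyUrl.slice(0, hashAt);
  const fragment = prettyUrl.slice(hashAt + 2);

  // Simplified escaping (assumption): only %, #, &, + and space.
  const escaped = fragment.replace(
    /[%#&+ ]/g,
    (c: string) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );

  // Append to an existing query string if there is one.
  const separator = base.indexOf("?") === -1 ? "?" : "&";
  return base + separator + "_escaped_fragment_=" + escaped;
}
```

The server then has to answer that rewritten address with a static snapshot of the page, which is where the headless browser comes in.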

[link]

From: Dave K (Feb 10 2011, at 13:43)

Finally the browser has become a RAD. Which means you'll now get sites that are essentially vertical GUIs (vertical for the browser/OS combos it supports instead of the OS platform now) and like most GUI developers they aren't going to bother to create a nice CLI to drive the same functionality. There's no way to stop it at this point, the web is and will remain broken. As evidence see the comments above who think this is fine behavior for a web page and have no idea why it isn't. Bye-bye web sites, it was nice knowing you. Back to the BBS I guess.

[link]

From: Robert Young (Feb 10 2011, at 16:02)

Just out of curiosity, doesn't this all put the REST movement in the Tower of London?

[link]

From: Scott Sauyet (Feb 10 2011, at 17:55)

@Andy Baird

"Everything after the .com is a query. Who cares whether that query is processed client side or server side, so long as it gets processed?"

But there are many more potential consumers of a URL than a browser. In most of those environments, there is no facility to process the hash-bang; there is no client side. So one of the basic factors that made the Web so popular, resource addressability, is sacrificed. What jQuery Mobile does (see Scott Jehl's comment) is an entirely different idea which does not suffer from the problem here. It simply uses the hash within the confines of the browser and doesn't publish it.
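To make the "no client side" point concrete, here is a minimal sketch of what any HTTP client can put on the wire for a given address. The function names are hypothetical; the standard `URL` parser does the splitting.

```typescript
/** The part of an address that is actually sent to the server. */
function requestTarget(address: string): string {
  const url = new URL(address);
  // The fragment is never part of the request, only path and query are.
  return url.pathname + url.search;
}

/** The part that stays behind in the client (empty if there is none). */
function fragmentKeptLocally(address: string): string {
  return new URL(address).hash;
}
```

For `http://twitter.com/timbray` the request target is `/timbray`. For `http://twitter.com/#!/timbray` it is just `/`, and `#!/timbray` never leaves the client, so a consumer that does not run the page's JavaScript has nothing to act on.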

[link]

From: Ian McKellar (Feb 10 2011, at 18:17)

So, I work at Rdio (http://www.rdio.com/) and I feel like we have a reasonable excuse for doing this. We are a streaming music site, so we want to keep music playing as you browse around. To do this we have an HTML page that does the playback and then dynamically load content into a DIV for the browsing. It's unfortunate that we need to do that but it's really the only way to keep the music playing.

[link]

From: Benjamin Lupton (Feb 10 2011, at 21:28)

#! (the hashbang) should not be used with the HTML5 History API; only the # (hash), without the ! (bang), should be. This is because using #! will provide Google with two sets of URLs to index: the #! URL and the non-hashed URL. Also, Google is the only search engine to support the #!.

For those who are still wanting to use hashes in their urls regardless - what about the use case of having a HTML4 js-enabled user share a link with a js-disabled user - the link won't work as it relies on the hash. That is why the HTML5 History API (pushState, replaceState) is the successor - the links stay the same.

The issue then is that people need to code their AJAX properly. Instead of doing custom ajax controller actions, they just need to support ajax requests on the original urls - if they want to follow DRY and not duplicate code.
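A minimal sketch of "support ajax requests on the original urls", assuming the client marks Ajax requests with the `X-Requested-With: XMLHttpRequest` header (as jQuery does) and that header names arrive lower-cased. The page table, types and `render` function are hypothetical:

```typescript
interface Page {
  title: string;
  body: string;
}

interface PageRequest {
  path: string;
  headers: { [name: string]: string };
}

// Hypothetical content store, keyed by the one real URL of each page.
const pages: { [path: string]: Page } = {
  "/about": { title: "About", body: "<p>About us</p>" },
};

/** One handler per URL: same content, wrapped differently per caller. */
function render(req: PageRequest): string | null {
  const page = pages[req.path];
  if (!page) return null; // the caller would turn this into a 404

  const isAjax = req.headers["x-requested-with"] === "XMLHttpRequest";
  if (isAjax) return page.body; // just the part that changes

  // Plain request (no JS, crawler, shared link): the full document.
  return (
    "<html><head><title>" + page.title + "</title></head>" +
    "<body>" + page.body + "</body></html>"
  );
}
```

Since both callers hit the same path, a link copied from either kind of browser works in the other, and no separate Ajax controller has to duplicate the page logic.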

For users who want to use the HTML5 History API, there are a fair few bugs between the browsers (that cause artifacts in Google Chrome, and cause Safari 5 and iOS Safari to not behave like other browsers). History.js provides cross-compatibility for all the HTML5 browsers and provides backwards and forwards compatibility for HTML4 browsers. For users who do not even want to support HTML4 browsers (hashes) at all (to prevent that use case above), a new version will be released soon that will allow the HTML4 support to be disabled, though by default History.js will support HTML5 and HTML4 browsers.

History.js is essential for all HTML5 History API developers, and anyone interested in a stateful web http://j.mp/fTGcBW

[link]

From: Jamon Terrell (Feb 10 2011, at 23:04)

You approach this as if the "traditional" web is more efficient and faster. That's dead wrong.

In the traditional model, the server assembles bits and pieces of data and templates into a document, which it sends over the wire. Referenced JavaScript is loaded, parsed and executed, and images are loaded. When you click a link, a lot of the same bits and pieces are re-gathered and re-assembled into a document that is 90% the same as the previous one. The server does unnecessary work on every request by retrieving the data and building the repetitive parts of the document, the network does extra work by carrying the same information on every single request, and the browser does extra work by starting over from scratch to re-parse and re-render the document and JavaScript code.

With the new model of using ajax for everything, you can do the templating on the client side. You can also avoid doing a full page refresh when you're only changing 10% of the page content. That saves server resources, network resources, and client resources, and in typical scenarios will reduce latency in page-to-page transitions from 1-2 seconds down to <100ms.

Even more important is that the user experience is greatly improved, and more capabilities are available. As the rdio.com developer mentioned, you are free to have long running audio, video, chat, or other content on the page while the user still browses around.

It's a massive improvement over AJAX with no URL, and that's what you should be comparing it against. Unless of course you're arguing against ajax entirely, which you may be, under the guise of blaming the #! url.

Furthermore, by developing your application in this way, you can completely separate your UI code from your application code. Your UI code can run entirely in the browser and be served as static html/js/css, and it can consume the same API as your mobile applications, thick desktop client, etc.

The web is the only UI platform that has relied on shipping documents back and forth instead of event driven requests for additional information. This natural progression of the web is a result of the web platform being full featured enough to be an application development platform instead of just a document renderer with links to other documents. You can still use it as the latter, but you can expect that more and more content will be provided in the former.

[link]

From: damien (Feb 10 2011, at 23:17)

Hmm, on one level I can sympathise, but on another level, this whole thing is overblown as an overreaction to the use of one character over another.

The use of hashbang over slash really makes little difference.

Most webservers preprocess urls anyway, so you could just as easily use $ or @ or * for your path delimiters, as you can use #! or any other character or set of characters.

It is, however, a break from a tradition, which will make it harder to understand the structure of links.

As for the use of JavaScript, yes, it's a nuisance. Turning the page-at-a-time web into something more like an application has been a fundamental conflict in the use of the web. Applications don't necessarily have well-defined pages, and are probably best described as a confederation of stateful containers, which in a RESTful world, would require a single URL describing the state of all the containers.

[link]

From: Dan Brickley (Feb 11 2011, at 02:49)

Twitter will move into TV and radio, as part of their mission to become an 'information network' rather than just a social hangout. The single page model makes it somewhat easier to embed media that keeps playing as users click around within the site/app.

I'd call this a prediction but it's just a guess. When they made the change I couldn't think why they'd bother, and avoiding page reloads seems the only plausible concrete benefit.

[link]

From: Tom Gibara (Feb 11 2011, at 14:10)

The broad adoption of hashbangs is a concern.

However, I found time to read all of the comments on this post, and nothing that the authors of the cited posts, Tim, or any of the commenters have written addresses the problem of what to do if your webapp needs the functionality that hashbangs provide.

My opinion is closest to that of commenter James Pearce and I posted a fuller response here:

http://blog.tomgibara.com/post/3214368343/hashbang-boom

[link]

From: Bill Higgins (Feb 13 2011, at 16:26)

Though I agree that there are many cases when #! is used stupidly, when I read:

"There is no piece of dynamic AJAXy magic that requires beating the Web to a bloody pulp with a sharp-edged hashbang. Please stop doing it."

I hear:

"You damned kids! Get off my lawn!"

As the web continues to expand beyond its document sharing origins, things like this are inevitable. It would be more helpful to talk about positive and negative consequences in the context of particular scenarios (e.g. "apps" vs. "news sites").

[link]

From: Jay Carlson (Feb 15 2011, at 22:43)

Don't forget the coming wave of push technology which will render the legacy web obsolete.

[link]

From: Johnny Lee (Feb 16 2011, at 22:32)

Doesn't this scheme also break the wayback machine @ archive.org?

[link]

From: micha (Feb 19 2011, at 14:54)

Some of the issues of making web applications accessible while still programming the app to run completely clientside are addressed in Golf: http://golf.github.com

[link]

From: Paul Connolley (Feb 21 2011, at 03:27)

I think it's worth mentioning that Google's use of the hashbang within the current Google Groups beta is failing to function properly also.

Looking through the console and at the XHR file list, it appears that it is suffering from the fatal data-didn't-get-through issue. So when something goes wrong, I can't do anything on Google Groups.

The bitterest pill is when I try to file a bug, _to help and benefit others_, I'm told:

> The feedback submission does not work for your browser.

> Google Feedback is only available for Google Chrome, Firefox 3.5+ and Internet Explorer 7+ with Adobe Flash 9+.

Even the super-engineers at Google can wander off the enlightened path of progressive enhancement

[link]

From: Maxx Daymon (Mar 07 2011, at 11:07)

This is another symptom of the tension developing between web _sites_ and web _applications_. Nothing about HTML, HTTP or web browsers was designed with application development in mind, but that's what people are demanding. These techniques make sense for web applications (think Google Docs), but horribly break web sites. The web stack is a horrible application platform. It's as if people decided to start writing applications using Word as the platform.

Look at what Google Docs is doing behind the scenes and you can see that building a rich browser-based application requires some serious contortions, and the problems aren't at all the same as those of a site that is largely content.

In the case of Twitter, the web site is becoming richer, but not enough to convince people to leave native rich clients like Twitterific. Of course, this begs the question, why not just write a rich native application and let the web site do what it does best, serve linked documents?

[link]

February 09, 2011

I am an employee
of Amazon.com, but
the opinions expressed here
are my own, and no other party
necessarily agrees with them.

A full disclosure of my
professional interests is
on the author page.