A recent upgrade to the BlackBerry wireless system seems to have been the cause of a three-hour service disruption affecting millions of users in North America, the maker of the handheld device said on Tuesday.
Research in Motion said the routine upgrade was part of its efforts to increase system capacity for anticipated growth. A preliminary investigation of Monday's outage points to "a problem with an internal data routing system" that had been recently upgraded, the Waterloo, Ont., company said in a statement.
"The upgrade was part of RIM's routine and ongoing efforts to increase overall capacity for longer term growth," the company said.
Similar upgrades have been successful, but there was a problem with this specific upgrade, which led to the outage, it added.
It also repeated earlier assurances that no messages were lost because of the outage and said the system is now operating normally.
"RIM has made significant investments to improve its system recovery infrastructure and processes over the last year, which enabled service levels to return to normal quickly."
Outages could affect RIM's reputation
Outages have been few in the last nine years, but the BlackBerry system went down last April and also had a minor outage in September.
While the BlackBerry remains popular, another crash could tarnish its reputation and send users looking for other smart phones, say analysts.
The outage affected millions of users of the device jokingly dubbed the "CrackBerry" for its addictive nature, cutting them off from the e-mail, Internet and text-messaging services that have become essential to many.
Analyst Carmi Levy said another crash could damage the brand with users.
"From the CEO all the way to IT managers and the average person walking into a wireless store at the mall, they are all going to ask the question: 'Isn't that the device that's always going down?'" said Levy, senior vice-president of strategic consulting at Toronto's AR Communications.
"At some point, it is potentially damaging to the brand and RIM wants to squelch that now before it gets worse."
Consumers may consider alternatives
Technology industry watcher Iain Grant, of the Seaboard Group, said it would be "very awkward" if another crash happened any time soon.
It would prompt users to start looking for alternative smart phones, such as the iPhone, which isn't officially available in Canada but can be purchased, said Grant, who uses an iPhone.
For now, Research in Motion hasn't been hurt by the latest crash because "everyone cuts them some slack," Grant said.
"The user experience is so empowering. That's why the absence of it is so frustrating because people really feel that RIM has put its finger on a really human need, or at least the business person's need."
Critics have also belted Research in Motion for taking a centralized approach to its network. The concentration of RIM's BlackBerry service at a single network operation centre in the Ontario city of Waterloo, through which traffic such as e-mail is routed, exacerbates such problems and leaves it open to more crashes, said Levy.
"Clearly an architecture where all of your traffic is routed into a relatively small choke point is not sufficient when you are responsible for servicing tens of millions of customers," Levy said.
Centralized network draws criticism
"RIM needs to look at distributing what is essentially a vulnerable, centralized architecture. It needs to decentralize that to reduce those vulnerabilities," said Levy.
He cited the example of Google, which rose from obscure search engine to one of the hottest Internet companies in the world in just a few years. Google's infrastructure is decentralized, with multiple so-called server farms located in different geographical areas. If the main system fails for whatever reason, traffic can simply be rerouted to another server and processed there.
Decentralizing RIM's network would take longer than the nine or 10 months that have elapsed since the last outage in April, Levy added, and would also be very expensive.
Jesse Hirsh, a technology analyst who operates the website JesseHirsh.com, said RIM's approach means its whole network is affected if there's a problem.
"The Internet was invented to provide a resiliency and a type of redundancy that really prevents any single point of failure," he said.
"But because BlackBerry is really using a proprietary network approach, all it takes is their servers to go off line, and the entire network crawls to a halt."
No need for outages, says analyst
Hirsh said that while BlackBerry users will tolerate system outages to some extent, there's no need for such outages.
"The technology exists today to implement redundant systems that ensure that no matter what, everything stays flowing, and the Internet has demonstrated this."
Grant noted that such server farms are expensive. He also said RIM needs to speak up about what happened.
"Someone senior should step up to the plate and say, 'There will be no more outages — I guarantee it.' "