There's a stark dynamic framing in the telecoms Operations Support Systems (OSS)
market. Until recently networks were expensive while the price tags for the OSS systems used to assure the services running across them were, by comparison, puny. Today that's all changed - not because OSS systems have become significantly more costly, but because network components are a fraction of the capital cost they were 15 years ago. The result is an apparent cost disparity that may be causing some operators to swallow hard and think about putting off their OSS investments, Thomas Sutter, CEO of Nexus Telecom tells Ian Scales. That would be a huge mistake, he says, because next generation networks actually need more OSS handholding than their predecessors, not less
Naturally, Thomas has an interest. Nexus Telecom specializes in data collection, passive monitoring and network and service investigation systems and, while Nexus Telecom's own sales are still on a healthy upswing (the company is growing in double figures), he's growing increasingly alarmed at some of the questions and observations he's hearing back from the market. "There is a whole raft of issues that need exploring around the introduction of IP and what that can and can't do," he says. "And we need to understand those issues in the light of the fundamental dynamics of computer technology. I think what's happening in our little area of OSS is the same as what tends to happen right across the high technology field. As the underlying hardware becomes ten times more powerful and ten times as cheap, it changes the points of difference and value within competing product sets." If you go back and look at the PC market, says Thomas, as you got more powerful hardware, the computers became cheaper but more standard and the real value and product differentiation was, and still is, to be found in the software. "And if you look at the way the PC system itself has changed, you see that when microcomputers were still fairly primitive in the early 1980s all the processor power and memory tended to be dedicated to the actual application task - you know, adding up figures in a spreadsheet, or shuffling words about in a word processor. But as PC power grew, the excess processing cycles were put to work at the real system bottleneck: the user interface. Today my instincts tell me that 90 per cent of the PC's energy is spent on generating the graphical user interface. Well I think it's very similar in our field. In other words, the network infrastructure has become hugely more efficient and cost effective and that's enabled the industry to concentrate on the software. And the industry's equivalent of the user interface, from the telco point of view at least, is arguably the OSS. "You could even argue that the relative rise in the cost of OSS is a sign that the telecoms market as a whole is maturing." That makes sense, but if that's the case what are these other issues that make the transformation to IP and commodity network hardware so problematical from an OSS point of view?
"There's a big problem over perceptions and expectations. As the networks transform and we go to 'everything over IP', the scene starts to look different and people start to doubt whether the current or old concepts of service assurance are still valid. "So for example, people come to our booth and ask, 'Do you think passive probe monitoring is still needed? Or even, is it still feasible? Can it still do the job?' After all, as the number of interfaces decrease in this large but simplified network, if you plug into an interface you're not going to detect immediately any direct relationships between different network elements doing a telecom job like before, all you'll see is a huge IP pipe with one stream of IP packets including traffic from many different network elements and what good is that? "And following on from that perception, many customers hope that the new, big bandwidth networks are somehow self-healing and that they are in less danger of getting into trouble. Well they aren't. If anything, while the topological architecture of the network is simplifying things (big IP pipes with everything running over them), the network's operating complexity is actually increasing." As Thomas explains, whenever a new technology comes along it seems in its initial phases to have solved all the problems associated with the last, but it's also inevitably created new inefficiencies. "If you take the concept of using IP as a transport layer for everything, then the single network element of the equation does have the effect of making the network simpler and more converged and cost effective. But the by-product of that is that the network elements tend to be highly specialized engines for passing through the data - no single network element has to care about the network-wide service." So instead of a top-down, authoritarian hierarchy that controls network functions, you effectively end up with 'networking by committee'. And as anyone who has served on a committee knows, there is always a huge, time-consuming flow of information between committee members before anything gets decided. So a 'flat' IP communications network requires an avalanche of communications in the form of signaling messages if all the distributed functions are to co-ordinate their activities. But does that really make a huge difference; just how much extra complexity is there? "Let's take LTE [Long Term Evolution], the next generation of wireless technology after 3G. On the surface it naturally looks simpler because everything goes over IP. But guess what? When you look under the bonnet at the signaling it's actually much more complicated for the voice application than anything we've had before. "We thought it had reached a remarkable level of complexity when GSM was introduced. Back then, to establish a call we needed about 11 or 12 standard signaling messages, which we thought was scary. Then, when we went into GPRS, the number of messages required to set up a session was close to 50. When we went to 3G the number of messages for a handover increased to around 100 to set up a standard call. Now we run 3GPP Release 4 networks (over IP) where in certain cases you need several hundred signaling messages (standard circuit switching signaling protocol) to perform handovers or other functions; and these messages are flowing between many different logical network element types or different logical network functions. "So yes of course, when you plug in with passive monitoring you're probably looking at a single IP flow and it all looks very simple, but when you drill down and look at the actual signaling and try to work out who is talking to who, it becomes a nightmare. Maybe you want to try to draw a picture to show all this with arrows - well, it's going to be a very complex picture with hundreds of signaling messages flying about for every call established. "And if you think that sort of complexity isn't going to give you problems: one of my customers - before he had one of our solutions I hasten to add - took three weeks using a protocol analyzer to compile a flow chart of signaling events across his network. You simply can't operate like that - literally. And by the way, keep in mind that even after GSM networks became very mature, all the major operators went into SS7 passive monitoring to finally get the last 20 per cent of network optimization and health keeping done. So if this was needed in the very mature environment of GSM, what is the driver of doubting it for less mature but far more complex new technologies? ''
Underpinning a lot of the questions about OSS from operators is the cost disparity between the OSS and the network it serves, says Thomas. "Today our customers are buying new packet switched network infrastructure and to build a big network today you're probably talking about 10 to 20 million dollars. Ten or 15 years ago they were talking about 300 to 400 million, so in ten years the price of network infrastructure has come down by a huge amount while network capacity has actually risen. That's an extraordinary change.
"But here's the big problem from our point of view. Ten years ago when you spent $200 million on the network you might spend $3 million on passive probe monitoring. Today it's $10 million on the network and $3 million on the passive probing solution. Today, also, the IP networks are being introduced into a hybrid, multiple technology network environment so during this transition the service assurance solution is getting even more complex. "So our customers are saying, ‘Hey! Today we have to pay a third of the entire network budget on service assurance and the management is asking me, 'What the hell's going on?' How can it be that just to get some quality I need to invest a third of the money into service assurance?' "You can see why those sorts of conversations are at the root of all the doubts about whether they'll now need the OSS - they're asking: 'why isn't there a magic vendor who can deliver me a self-healing network so that I don't have to spend all this money?" Competitive pressures don't help either. "Today, time-to-market must be fast and done at low cost," says Thomas, "so if I'm a shareholder in a network equipment manufacturing company and they have the technology to do the job of delivering a communication service from one end to the other, I want them to go out to the market. I don't want them to say, 'OK, we now have the basic functionality but please don't make us go to the market, first can we build self-healing capabilities, or built-in service assurance functionality or built-in end-to-end service monitoring systems - then go to the market?' This won't happen." The great thing about the 'simple' IP network was the way it has commoditized the underlying hardware costs, says Thomas. "As I've illustrated, the 'cost' of this simplicity is that the complexity has been moved on rather than eliminated - it now resides in the signaling chatter generated by the ad hoc 'committees' of elements formed to run the flat, non-hierarchical IP network. "From the network operator's point of view there's an expectation problem: the capital cost of the network itself is being vastly reduced, but that reduction isn't being mirrored by similar cost reductions in the support systems. If anything, because of the increased complexity the costs of the support systems are going up. "And it's always been difficult to sell service assurance because it's not strictly quantitative. The guy investing in the network elements has an easy job getting the money - he tells the board if there's no network element there's no calls and there's no money. But with service assurance much more complicated qualitative arguments must be deployed. You've got to say, 'If we don't do this, the probability is that 'x' number of customers may be lost. And there is still no exact mathematical way to calculate what benefits you derive from a lot of OSS investment."
The problem, says Thomas, is as it's always been. That is, that building the cloud of network elements - the raw capability if you like - is always the priority and what you do about ensuring there's a way of fixing the network when something goes wrong is always secondary. "When you buy, you buy on functionality. And to be fair it's the same with us when we're developing our own products. We ask ourselves, what should we build first? Should we build new functionality for our product or should we concentrate on availability stability, ease of installation and configuration. If I do too much of the second I'll have less features to sell and I'll lose the competitive battle. "The OSS guy within the operators organization knows that there's still a big requirement for investment, but for the people in the layer above it's very difficult to decide - especially when they've been sold the dream of the less complex architecture. It's understandable that they ask: 'why does it need all this investment in service assurance systems when it was supposed to be a complexity-buster?" So on each new iteration of technology, even though they've been here before, service providers have a glimmer of hope that 'this time' the technology will look after itself. We need to look back at our history within telecoms and take on board what actually happens.

