Packet Switch Architecture
Networks and the Internet are key to
all computer and communication systems. Routers form the basic infrastructure
for all IP networks. Switches are essential components of every router,
as well as the basic building block for any high-performance (non-shared-medium)
network. The CARV Laboratory has been highly active in Packet
Switch and Router Architecture since 1985.
Current research and development activities of the CARV Laboratory are guided by the vision of Commodity Switches becoming a reality in the forthcoming years. Commodity switches must be low-cost, high-performance, universal building blocks for switching and routing across the whole spectrum ranging from WAN to MAN, LAN, system area, storage area, embedded system networks, (multi-) processor-memory interconnects, and networks-on-a-chip (NoC). New markets for switches are emerging with the recent expansion of switched architectures from WAN to LAN, and their expansion to SAN, I/O, embedded and NoC. The volume of these markets is expected to become substantially higher than that of telecommunication switches and routers. As switches enter this wider market, their architecture needs to be adapted accordingly while their cost should be reduced. This economy-of-scale effect may then alter the telecommunication (WAN) router market, in the same manner that PCs and workstations affected the supercomputer market: clusters of inexpensive, mass-produced, generic commodity components replaced expensive, special-purpose machines.
Contemporary switch architectures vary widely, evolve rapidly, oftentimes suffer from excessive complexity, and have not yet met a number of objective goals, especially if one considers the different requirements imposed by different application domains. Therefore, one of the important challenges in the present context is to discover the unifying and simplifying concepts for the switches at all of the above scales. This will allow the reuse of components and designs, leading to great cost savings.
Multi-Gigabit Switching Fabrics
Buffered Crossbars. Crossbar switches are internally non-blocking, but require complex centralised schedulers and only work with fixed-size cells. However, by including small buffers at each crosspoint, operation with variable-size packets becomes feasible and scheduling is dramatically simplified, because each input and each output can be scheduled independently. The CARV Laboratory has shown (2001-02) that distributed WFQ scheduling of this kind approximates the ideal weighted max-min fair allocation very well, and has studied the factors affecting the convergence time. Since 2003, the CARV Laboratory has been active in the design of a variable-size packet buffered crossbar switch (http://archvlsi.ics.forth.gr/bufxbar/).
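As a point of reference for the "ideal weighted max-min fair allocation" mentioned above, the following is a minimal sketch of the standard water-filling computation for a single shared link. It is a simplification: it does not model a crossbar or the CARV scheduler, and the function name and parameters are illustrative assumptions.

```python
def weighted_max_min(capacity, demands, weights):
    """Weighted max-min fair shares of one link (illustrative sketch).

    Flows that ask for less than their weighted share get their full
    demand; the leftover capacity is split among the rest by weight.
    """
    alloc = [0.0] * len(demands)
    active = set(range(len(demands)))
    remaining = float(capacity)
    while active:
        total_w = sum(weights[i] for i in active)
        # Capacity per unit of weight if all active flows were bottlenecked.
        level = remaining / total_w
        satisfied = {i for i in active if demands[i] <= level * weights[i]}
        if not satisfied:
            # Every remaining flow is bottlenecked: split by weight.
            for i in active:
                alloc[i] = level * weights[i]
            break
        for i in satisfied:
            alloc[i] = float(demands[i])
            remaining -= demands[i]
        active -= satisfied
    return alloc
```

For example, with a capacity of 10, demands of 2, 8 and 8, and weights of 1, 1 and 2, the first flow receives its full demand of 2 and the remaining 8 units are split 1:2 between the other two flows.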
Backpressure in Buffered Switching Fabrics: multi-stage switches scale to very large numbers of ports. Scalability requires distributed packet scheduling which, in turn, implies internal buffering in the switching elements. Multilane backpressure (credit-based flow control) in the fabric allows the switching elements to use only on-chip buffer memory, while the majority of the packets are buffered at the inputs, in virtual-output queues (VOQ), thus greatly reducing the cost of the fabric. In 1987, the CARV Laboratory proposed the use of backpressure, and subsequently applied it to the development of the Telegraphos (1993-95) and ATLAS I (1995-98) switches.
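The idea of multilane credit-based flow control can be sketched as follows. This is a minimal software model of one hop, assuming one credit per downstream buffer slot; the class and method names are hypothetical and do not describe the Telegraphos or ATLAS I hardware.

```python
from collections import deque


class CreditLink:
    """One hop with per-lane credits (illustrative names throughout)."""

    def __init__(self, lanes, slots_per_lane):
        # The sender starts with one credit per downstream buffer slot.
        self.credits = [slots_per_lane] * lanes
        self.buffers = [deque() for _ in range(lanes)]

    def try_send(self, lane, packet):
        """Send only if the lane has a credit; otherwise hold upstream."""
        if self.credits[lane] == 0:
            return False  # backpressure: the packet stays in its input queue
        self.credits[lane] -= 1
        self.buffers[lane].append(packet)
        return True

    def drain(self, lane):
        """Downstream forwards one packet and returns a credit upstream."""
        if not self.buffers[lane]:
            return None
        packet = self.buffers[lane].popleft()
        self.credits[lane] += 1
        return packet
```

Because credits are kept per lane, a lane whose downstream buffer is full does not stop traffic on the other lanes, and a buffer can never receive more packets than it has slots.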
ATLAS I, a 10 Gb/s single-chip 16x16 ATM switch with backpressure: this 6-million-transistor 0.35-micron CMOS chip, a general-purpose building block for gigabit networking, was designed at CARV (1995-98) and was fabricated by ST Microelectronics. It provided credit-based flow control (multilane backpressure) with 32,000 virtual channels, sub-microsecond cut-through latency, logical output queues in a shared buffer, 3 priority levels, multicasting, and load monitoring (http://archvlsi.ics.forth.gr/atlasI/).
Benes Fabrics with Internal Backpressure: the Benes topology is a multi-stage fabric known to yield, for large N, the lowest-cost NxN non-blocking switches. CARV applied its buffered fabric architecture to this topology (2001-2002) by combining per-flow backpressure, multipath routing (inverse multiplexing), and cell resequencing. Flow merging was needed to bring the cost of backpressure down to O(N) per switching element (http://archvlsi.ics.forth.gr/bpbenes/).
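To illustrate why the Benes topology is attractive for large N, the sketch below compares element counts under the textbook construction from 2x2 switching elements, with N a power of two. The text above does not state the element radix used in the CARV design, so the 2x2 assumption and the function names are illustrative only.

```python
def benes_cost(n):
    """Return (stages, elements) for an n x n Benes network of 2x2 elements."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    log_n = n.bit_length() - 1        # exact log2 for a power of two
    stages = 2 * log_n - 1            # 2*log2(n) - 1 stages
    elements = stages * (n // 2)      # n/2 elements in every stage
    return stages, elements


def crossbar_crosspoints(n):
    """Crosspoint count of a single-stage n x n crossbar."""
    return n * n
```

For n = 1024 this gives 19 stages and 9728 two-by-two elements, against 1048576 crosspoints for a single-stage crossbar, which shows the O(N log N) versus O(N^2) growth.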
Pipelined Memory is a novel organization (USA patent 5,774,653, owned by FORTH) that CARV designed (1993-95) for the shared buffer and associated switching and cut-through functions in a switch or router. The advantage of this organization is that it is both simpler and smaller than the alternatives (http://archvlsi.ics.forth.gr/sw_arch/pipeMem.html).
Per-Flow Queueing. Providing guarantees for Quality of Service (QoS) through modern, advanced-architecture network systems requires the decomposition of traffic into multiple flows and the provision of a separate queue for each of them. Managing a large number of queues (hundreds or thousands to possibly millions) at high speed typically requires the assistance of specialised hardware. CARV has been working on such multi-queue management implementations, which have varying cost and performance characteristics (http://archvlsi.ics.forth.gr/muqpro/queueMgt.html).
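A common way to keep many queues in one shared buffer is to thread them as linked lists through a pool of blocks, with a free list for unused blocks. The sketch below is a software model of that general technique, not of any specific CARV implementation; all names are illustrative, and it assumes at least one buffer block.

```python
NIL = -1  # marks "no block"


class MultiQueue:
    """Many FIFO queues sharing one block pool via linked lists."""

    def __init__(self, num_queues, num_blocks):
        # Assumes num_blocks >= 1. Initially all blocks form the free list.
        self.next = list(range(1, num_blocks)) + [NIL]
        self.data = [None] * num_blocks
        self.free_head = 0
        self.head = [NIL] * num_queues
        self.tail = [NIL] * num_queues

    def enqueue(self, q, item):
        """Append item to queue q; return False if the pool is full."""
        b = self.free_head
        if b == NIL:
            return False
        self.free_head = self.next[b]
        self.data[b] = item
        self.next[b] = NIL
        if self.head[q] == NIL:
            self.head[q] = b
        else:
            self.next[self.tail[q]] = b
        self.tail[q] = b
        return True

    def dequeue(self, q):
        """Remove and return the oldest item of queue q, or None if empty."""
        b = self.head[q]
        if b == NIL:
            return None
        self.head[q] = self.next[b]
        if self.head[q] == NIL:
            self.tail[q] = NIL
        item = self.data[b]
        self.data[b] = None
        # Return the block to the free list.
        self.next[b] = self.free_head
        self.free_head = b
        return item
```

Each operation touches a constant number of pointers regardless of how many queues exist, which is what makes this structure a natural candidate for hardware assistance.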
Weighted-Round-Robin Scheduling. After the competing flows have been isolated using per-flow queueing, fair allocation of the available bandwidth requires a weighted-round-robin scheduler. In 1986, CARV initiated a detailed investigation of various methods to perform this at different cost and performance levels (IEEE JSAC Oct. 1987). Current related activities include the development of a pipelined heap manager (ICC’2001) for weighted fair queueing (WFQ) at the rate of 20 to 40 Gbps, and the development of a fast parallel comparator tree for WFQ at 40 Gbps and beyond, under fast changes to the set of eligible flows (http://archvlsi.ics.forth.gr/muqpro/wrrSched.html).
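The role of a heap in this kind of scheduling can be sketched in software. The example below is a simplified model, assuming flows that are always backlogged, unit-size packets, and positive integer weights; it serves flows in order of a virtual finish time kept in a heap, and it is not the pipelined heap manager or the comparator tree described above.

```python
import heapq
from fractions import Fraction


def wfq_order(weights, num_slots):
    """Service order for always-backlogged flows with unit-size packets.

    weights: positive integers, one per flow (an assumption of this sketch).
    """
    # Heap entries are (virtual finish time, flow id); ties go to the
    # lower flow id. Fractions keep the arithmetic exact.
    heap = [(Fraction(1, w), f) for f, w in enumerate(weights)]
    heapq.heapify(heap)
    order = []
    for _ in range(num_slots):
        finish, f = heapq.heappop(heap)
        order.append(f)
        # The flow's next packet finishes 1/weight later in virtual time.
        heapq.heappush(heap, (finish + Fraction(1, weights[f]), f))
    return order
```

With weights 2, 1 and 1 over 8 slots, flow 0 is served four times and flows 1 and 2 twice each, matching the 2:1:1 weight ratio. Every scheduling decision costs one heap removal and one insertion, which is the operation that a hardware heap manager would need to perform at line rate.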
Wormhole IP over ATM. Inspired by the wormhole-routing multiprocessor interconnection networks of the 1980s, CARV proposed (1998) this technique to turn existing ATM networks into gigabit IP routers with the mere addition of low-cost wormhole-IP devices. An FPGA-based prototype for a 155 Mbps link was built and successfully tested in 1999 (http://archvlsi.ics.forth.gr/wormholeIP.html).
Telegraphos - High-Speed Network Interfaces. CARV built (1993-95) the Telegraphos prototype for workstation clustering. The novel features of Telegraphos included protected, user-level network access for low-latency communication, user-level DMA, and fast notification of message arrival (http://archvlsi.ics.forth.gr/telegraphos.html).
Other topics of past research and development in CARV include branch penalty reduction (1990), parallel supercomputer architectures (1991-94), interleaved Rambus memory controller (1994), JPEG entropy encoder chip (1994), consulting for a Silicon-Valley high-tech company (1999-2001) and five sub-system prototypes for commercial networking products that have been developed under contract with three companies (1998-2003). Current work includes network processor applications and architectures, network security, and home networking.
Past work in hardware
- parallel supercomputer architecture (1991-94);
- high-speed UART macrocell (1991, chip & board implemented);
- Sbus-to-TAXI interface (1992, chip design);
- interleaved Rambus memory controller (1994, chip design);
- Telegraphos I switch (1995, multi-FPGA board implemented);
- Telegraphos II switch (1996, chip & test board implemented);
- pipelined memory demonstrator (1995, full-custom chip implemented);
- PCI/i960 based systems, and device drivers for them (1997-98);
- SDRAM high-throughput buffer for switches (1998, FPGA board implemented);