InternetInABox
From RAD Lab
Internet in a Box
Students
- Zhangxi Tan
- Jonathan Ellithorpe
Faculty
- Randy Katz
- David Patterson
Overview
Distributed systems such as the Internet are at the core of information technology. However, it is extremely difficult to study such large-scale systems systematically in a controlled environment with realistic workloads and repeatable behavior. Most existing distributed systems research addresses this problem with cluster-based simulation or emulation, which reaches a scale of only O(100) nodes. Besides being costly and power-inefficient, such clusters make it very hard to observe the system as a whole. At the same time, much research on new Internet architectures and applications requires a fair amount of modification to both the data plane and the control plane of routers. Traditional simulation and emulation testbeds (e.g. Emulab and PlanetLab) often fall short of such flexibility requirements.
Our work uses a multi-board FPGA-based system called RAMP (Research Accelerator for Multiple Processors) to build a reconfigurable testbed that accelerates the development of distributed systems. With the RAMP infrastructure, we can provide a highly controllable environment that scales to 1000 nodes and can be single-stepped one message at a time, offering great observability and reproducibility. By employing reconfigurable technologies, we can maximize programmability when emulating routers and switches within the “box”, thus providing a customized data/control path. This will allow significant advances in areas such as new Internet architectures, distributed hash tables, routing algorithms, and distributed middleware. Although nodes in our system run at a clock speed of only 200 MHz, which is 10 to 20 times slower than custom hardware, with over 1000 nodes it is still plausible to run real operating systems and applications. Moreover, many high-level functions can be implemented directly in hardware, which will further narrow the performance gap with off-the-shelf commercial clusters.
Putting over 1000 nodes in a box raises many challenging research issues. The first is to find an appropriate internal architecture, including the level of abstraction used for different functional blocks, for building programmable data/control paths and for efficiently mapping a high-level network description to hardware resources. This internal architecture must also support hardware virtualization in order to reduce FPGA resource consumption. Implementing an efficient logging and checkpointing subsystem will also be a research challenge, because dumping all event information in a system with thousands of nodes can be quite inefficient.
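The single-stepping idea described in this overview can be illustrated with a small software toy. The sketch below is not the RAMP implementation; the `SteppedNetwork` class and `run` helper are hypothetical names chosen for illustration. It only shows why delivering exactly one message per step, in a fixed order, makes a run observable and repeatable.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    src: int
    dst: int
    payload: str


@dataclass
class SteppedNetwork:
    """Toy network that delivers exactly one queued message per step."""
    pending: deque = field(default_factory=deque)
    log: list = field(default_factory=list)

    def send(self, src, dst, payload):
        # Queue the message; nothing is delivered until step() is called.
        self.pending.append(Message(src, dst, payload))

    def step(self):
        """Deliver the oldest pending message; return it, or None if idle."""
        if not self.pending:
            return None
        msg = self.pending.popleft()
        self.log.append(msg)  # every delivery is recorded for later inspection
        return msg


def run(scenario):
    """Replay a list of (src, dst, payload) tuples and return the delivery log."""
    net = SteppedNetwork()
    for src, dst, payload in scenario:
        net.send(src, dst, payload)
    while net.step() is not None:
        pass
    return net.log
```

Because delivery order depends only on the order of `send` calls, replaying the same scenario yields an identical log. Recording every delivery, as this toy does, is exactly the approach the overview expects to be too expensive at the scale of thousands of nodes.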
Version 0
Version 0 was released in June 2006. It runs on Xilinx XUP boards with 256 MB of PC2100 DDR memory at 100 MHz. Each FPGA contains four 32-bit Xilinx MicroBlaze processors, one RS232 serial port and a 10/100 Mbps Ethernet port. The four processors are connected through a point-to-point TCP/IP network and run uClinux 2.4.32. The core area requires 11,700 LUTs on a Xilinx Virtex-II Pro XC2VP30.
A small cluster of 12 nodes has been built from 3 XUP boards and was successfully demonstrated at our summer retreat. As the first application, we have ported a full version of Berkeley i3, a new DHT-based Internet architecture.
Key processor features
- Separate 16 KB I-Cache and 16/32 KB D-Cache
- 50 BogoMIPS (measured from uClinux kernel)
- 64 MB DDR memory at 100 MHz for each processor (separate address space)
- Hardware multiplier and divider
- One-cycle hardware shifter
- Unaligned memory access exception
- 32×512-bit high-speed FIFO link (3.2 Gbps throughput, 1-cycle access latency) as inter-processor connection
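The figures above are mutually consistent, as a quick calculation shows. This is a sketch under stated assumptions: the FIFO is read as 32 bits wide and 512 entries deep, the link is assumed to transfer one word per cycle at the 100 MHz system clock, and board memory is assumed to be split evenly among the four processors. None of these readings is stated explicitly on this page.

```python
# Figures quoted in the Version 0 description.
BOARDS = 3
PROCESSORS_PER_FPGA = 4        # four MicroBlaze cores per FPGA, one FPGA per XUP board
BOARD_MEMORY_MB = 256          # DDR memory on each XUP board

# Assumptions (not stated on the page).
FIFO_WIDTH_BITS = 32           # "32*512 bit" read as 32 bits wide ...
FIFO_DEPTH_ENTRIES = 512       # ... and 512 entries deep
CLOCK_HZ = 100_000_000         # link clocked at the 100 MHz system clock

cluster_nodes = BOARDS * PROCESSORS_PER_FPGA
memory_per_processor_mb = BOARD_MEMORY_MB // PROCESSORS_PER_FPGA
link_throughput_gbps = FIFO_WIDTH_BITS * CLOCK_HZ / 1e9   # one word per cycle
fifo_capacity_bytes = FIFO_WIDTH_BITS * FIFO_DEPTH_ENTRIES // 8

print(f"cluster nodes:        {cluster_nodes}")
print(f"memory per processor: {memory_per_processor_mb} MB")
print(f"link throughput:      {link_throughput_gbps:.1f} Gbps")
print(f"FIFO capacity:        {fifo_capacity_bytes} bytes")
```

Under these assumptions the script reproduces the 12-node cluster, the 64 MB per processor, and the 3.2 Gbps link throughput quoted above.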
Software Supported
| Category | Programs |
|---|---|
| System tools | agetty, basename, crond, crontab, date, dmesg, echo, env, expand, flatfsd, free, hostname, init, insmod, kill, killall, login, passwd, ps, uname, version, whoami |
| Shell programming | egrep, false, find, grep, msh, null, sed, sh, true, xargs |
| Networking | arp, dhclient, dhcpd, dhcrelay, ftp, ftpd, ftpget, ftpput, ifconfig, ifdown, ifup, inetd, iptables, nslookup, ping, portmap, rdate, telnet, telnetd, ttcp, tftp, traceroute, wget |
| File system | cat, chmod, cmp, cp, dd, df, du, gunzip, gzip, hd, head, ln, ls, lsmod, mkdir, modprobe, more, mount, mv, pwd, rm, rmdir, rmmod, tail, touch, umount, which, zcat |
| Monitoring and debugging | netstat, rsyslogd, tcpdump, time, top, uptime, vmstat |
| Interpreters | Python (2.0, no math libraries) |
| Package management | dpkg, dpkg-deb |
| Web server | BOA, thttpd (with CGI support) |
| Editor | vi |
| Research Prototypes | i3serverd and a number of sample applications |
Source Code Download and Instructions
All released source code is under the very liberal BSD license. If you have problems, send email to <username> at cs dot berkeley dot edu (replace <username> with xtan). If you have timing issues, run the system at a clock below 100 MHz.
Version 0 hardware source code
The source requires Xilinx EDK 8.1i SP2 and ISE 8.1i SP3.
Prebuilt Kernel Images
- Board #1:
- Board #2:
- Board #3:
Test Suite
The test suite requires the Ruby interpreter and runs on a Windows machine.
RAMP NIC driver source code
uClinux NIC driver for on-chip Ethernet emulation.
Instructions
The topology of the cluster is shown in the following figure. On each board, one CPU serves as the gateway and performs IP forwarding between any two nodes in the cluster. You can use the 'route' command to add more routes to external networks. The internal networks that connect the processors within a chip are emulated Ethernet. Each gateway can also be accessed through its serial port console (115200 8/N/1).
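The role of the per-board gateways can be sketched in code. The addressing plan here is an assumption: the example addresses used in the steps below (192.168.2.2, 192.168.3.2) suggest that board N owns the subnet 192.168.N.0/24, but this page does not state it. `board_subnet` and `crosses_boards` are hypothetical helpers, not part of the release.

```python
import ipaddress


def board_subnet(board):
    """Assumed plan: board N owns 192.168.N.0/24 (not confirmed by the page)."""
    return ipaddress.ip_network(f"192.168.{board}.0/24")


def crosses_boards(src, dst, boards=(1, 2, 3)):
    """Return True when src and dst sit on different board subnets,
    i.e. when traffic would have to be forwarded between board gateways."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    for board in boards:
        net = board_subnet(board)
        if src_ip in net:
            return dst_ip not in net
    raise ValueError(f"{src} is not in any assumed board subnet")


print(crosses_boards("192.168.3.2", "192.168.2.2"))  # sender and receiver from the steps below
```

If the real addressing differs, only `board_subnet` needs to change.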
1. Program the chip and download the kernel
Run bing.bat in your working directory (e.g. c:\IIAB). Put the kernel image and downloader files in the same directory.
bing
2. Upload i3 application and configuration files through ftp
ruby basic_tests.rb
To upload files to a specific node only, modify the hotlist variable in the script file.
3. Select any node as the receiver node of i3 (e.g. node 192.168.2.2). Telnet to the receiver node and log in as 'root' (the password is also 'root').
telnet 192.168.2.2
Insert a public trigger
#/usr/bin/recv_public_trigger /etc/config/smallcfg.xml 111
4. Send a message to the ID generated by recv_public_trigger from a sender node (e.g. node 192.168.3.2). Telnet to the sender node.
telnet 192.168.3.2
Send a message to the receiver using its ID (generated in step 3)
#/usr/bin/send_public_id /etc/config/smallcfg.xml a3a3a3a3a3a3a3a3482a92589ee4be660eaca93038362bd8a3a3a3a3a3a3a3a3
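An ID copied from a terminal window can pick up a stray space or line break where the output wrapped. The helper below is a minimal sketch for cleaning a pasted ID before passing it to send_public_id. It assumes that i3 identifiers are 256 bits, i.e. 64 hexadecimal digits, which matches the length of the example ID above. `normalize_i3_id` is a hypothetical name, not part of the i3 distribution.

```python
import re

I3_ID_BITS = 256  # assumption: i3 identifiers are 256 bits (64 hex digits)


def normalize_i3_id(text):
    """Remove whitespace from a pasted ID and check that it is 64 hex digits."""
    compact = re.sub(r"\s+", "", text).lower()
    expected = I3_ID_BITS // 4
    if len(compact) != expected or not re.fullmatch(r"[0-9a-f]+", compact):
        raise ValueError(
            f"expected {expected} hex digits, got {len(compact)} characters"
        )
    return compact


# Example: an ID that was broken by a space when copied from the console.
pasted = "a3a3a3a3a3a3a3a3482a92589ee4be660eaca93 038362bd8a3a3a3a3a3a3a3a3"
print(normalize_i3_id(pasted))
```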
5. Use another node to insert the same receive trigger to perform multicast.
6. Use middlebox and send_middlebox to perform middlebox functions using i3.
i3 is a receiver-based architecture that supports features such as unicast, multicast, anycast and middleboxes. The booting i3 server is on node 192.168.1.35. For more information about i3, please refer to here.
7. Run real-time CPU statistics for the cluster. This feature requires Microsoft Excel.
ruby execel.rb
Reference
Version 0 was built while we were taking two graduate courses (CS252 and CS268). Our project report includes some preliminary results of the system. Note: the memory controller described in the report is different from the one used in Version 0.
- Internet in a Box, Zhangxi Tan, Wei Xu, Xiaofan Jiang. Course project report for CS252 and CS268. May 2006. pdf
Version 0.1
This version of IIAB is based on the open-source LEON3 (SPARC V8) processor.
- Here is an area report for LEON3 on Xilinx FPGAs.
- Apply this patch to add Xilinx XUP board support and sample design to GRLIB IP Library 1.0.10.
Check out the latest emulator we created.
Disk and Thermal Emulation in Data Center using RAMP
We use the Debian GNU/Linux distribution as the userland for LEON3. See our guide on how to install Debian on LEON.
