Power, weight, and cost considerations often mean that robots lack the onboard hardware needed to run large-scale multi-core CPU-based, graphics processing unit (GPU)-based, field-programmable gate array (FPGA)-based, or tensor processing unit (TPU)-based algorithms. For example, a lightweight drone with an attached gripper that uses a GPU-based grasp-planning module to compute grasp points for picking up objects [backus2014design] or perching [ramon2019autonomous] requires access to a GPU that the drone would not have onboard. While nearby computers can provide the necessary computing capabilities, this practice can be complex to set up, difficult to scale, and prone to over-provisioning. Instead, we propose a framework based on the Fog Robotics [ichnowski2020fog, tian2017cloud, tanwani2019fog] idea of balancing between the compute available at the edge and in the cloud. This framework, FogROS, is an extension of the Robot Operating System (ROS) [quigley2009ros] that, with minimal effort, allows researchers to deploy components of their software to the cloud, and correspondingly gain access to additional computing cores, GPUs, FPGAs, and TPUs, as well as predeployed software made available by other researchers.
ROS, at its core, is a platform in which software components (nodes) communicate with each other via a publication/subscription (pub/sub) system. Individual nodes can publish messages to named topics and subscribe to other named topics to get messages published by other nodes. In practice, these nodes all run on the robot and perhaps a nearby computer. For example, on a robot, a sensor node publishes to a sensor topic, a planning node subscribes to the sensor topic and computes a plan based on the sensor messages, and then publishes messages that another node uses to execute the plan (Fig. 1).
With FogROS, a researcher can use the same code and make a small change to a configuration file to select components of the edge computer software to deploy to cloud-based computers. On launch, FogROS provisions a cloud-based computer, deploys the nodes to it, and then transparently passes the pub/sub communication between the edge computer and the cloud. The only observable differences are: (1) the pub/sub latency increases, and (2) the cloud-deployed components can compute faster given the additional computing resources. The increased latency means that not all components will benefit from being deployed to the cloud. In particular, any component with real-time requirements (e.g., a motor controller) or any component that requires little computing power should not be deployed to the cloud. On the other hand, for many applications, the increased computation speed may enable new robot capabilities, speed up tasks, and allow for higher accuracy in tasks such as object detection or segmentation due to the use of larger models.
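The latency-versus-compute trade-off described above can be illustrated with a back-of-the-envelope check. The sketch below assumes a simple additive model (cloud compute time plus network round trip, compared with edge compute time). The function name and all numbers are illustrative assumptions, not measurements from this paper.

```python
def offload_is_beneficial(edge_compute_s, cloud_compute_s, round_trip_latency_s):
    """Return True when cloud compute plus the network round trip beats edge compute.

    Assumed model: total cloud time = cloud compute time + round-trip latency.
    """
    return cloud_compute_s + round_trip_latency_s < edge_compute_s


# Illustrative numbers only (hypothetical, not taken from the paper's tables).
heavy_planner = offload_is_beneficial(
    edge_compute_s=10.0, cloud_compute_s=0.5, round_trip_latency_s=0.1
)
motor_controller = offload_is_beneficial(
    edge_compute_s=0.001, cloud_compute_s=0.0005, round_trip_latency_s=0.05
)
print(heavy_planner, motor_controller)  # True False
```

Under this model, a compute-heavy planner gains from offloading, while a lightweight control loop loses more to latency than it gains in compute.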
FogROS also supports launching pre-built automation container images. These container images contain all the software and dependencies required to run a program. To date, many academic and industrial open-source communities leverage container services, such as Docker [docker], to distribute their applications. FogROS can deploy robot automation containers to the cloud without requiring the user to explicitly configure the environment and hardware, which makes containerized software easier to reuse.
This paper makes three contributions: (1) FogROS, an open-source extension to ROS that allows user-friendly and adaptive deployment of software components to cloud-based computers; (2) a method to pre-deploy containerized FogROS software that allows commonly-used software to be quickly integrated into applications; and (3) application examples evaluating the performance of FogROS deployment.
I-A Design Principles
FogROS aims to adhere to the following design principles:
Transparent to software
FogROS should preserve ROS abstractions and interfaces. Applications should notice no difference between cloud-deployed and on-board nodes (other than the latency of message processing).
Flexible computing resources
Different nodes require different computing capabilities. Some nodes benefit from additional computing cores, while others benefit from access to a GPU. FogROS should make selecting the appropriate configuration simple.
Minimal configuration required
Running software nodes on a cloud-based computer should be as easy as running them on the edge computer.
Some useful nodes require extensive setup, installation of a dependency structure, and may have conflicting dependency versions. FogROS should make it possible to use pre-deployed containerized software through configuration.
Different networking options may have different availability, performance, setup time, and costs associated. FogROS should allow the user to select the networking options best suited to their application.
Security and Isolation
The communication between cloud and on-board nodes should be secure, and FogROS should close ports that expose software to compromise.
II Related Work
Cloud computing has emerged as an attractive and economically viable [ichnowski2020economic] resource to offload computation for robot automation systems with minimal onboard computing power. kehoe2015survey survey the capabilities, research potential, and challenges of cloud robotics, as well as applications such as grasp planning, motion planning, and collective robot learning, that might benefit from the computational power of the cloud. Grasp planning and motion planning have both been shown to be amenable to cloud computation. kehoe2013cloud, tian2017cloud, and li2018dex generate robot grasp poses in the cloud by implementing parallelizable Monte-Carlo sampling of grasp perturbations [kehoe2012estimating, kehoe2012toward, kehoe2014cloud], while mahler2016privacy explore cloud grasp pose computation that maintains privacy of proprietary geometries. In motion planning, lam2014path introduce path planning as a service (PPaaS) for on-demand path planning in the cloud and use Rapyuta to share plans among robots. bekris2015cloud and ichnowski2016cloud both devise methods for splitting motion planning computation between the cloud and the edge computer [ichnowski2020fog, anand2021serverless]. In addition to providing computing resources, the cloud can also facilitate sharing and benchmarking of algorithms and models between edge computers [tanwani2020rilaas] for grasping, motion planning, or computer vision.
To leverage cloud resources, many in academia and industry endeavor to connect edge computers to the cloud. Example approaches include using SSH port forwarding [hajjaj2017establishing] or VPN-based proxying [lim2019cloud] to allow unmodified ROS applications to share a single ROS master. FogROS builds on these approaches and adds automation of ROS node deployment to the cloud and a virtual private cloud (VPC), saving time over prior approaches that require manual configuration of network access rules and IP addresses. For example, setting up VPN-based proxying requires more than 12 steps for configuration and 37 steps for verification [hajjaj2017establishing]. Such complex manual configurations scale poorly and are error-prone. ROSRemote [pereira2019rosremote] and MSA [xu2020cloud] replace the ROS communication stack with custom pub/sub designs. Although edge computers can then communicate with nodes owned by other ROS masters, these systems require heavy code changes to ROS applications. FogROS, as an option, leverages rosbridge [crick2012rosbridge], an open-source webserver that enables an edge computer to interact with another ROS environment through JSON queries. Given the diverse attempts to connect edge computers to the cloud, wan2016cloud and saha2018comprehensive call for a unified and standardized framework to handle cloud-robot data interactions. FogROS aims to be a painless solution to this open issue by allowing unmodified ROS applications to be launched on the cloud with minimal additional configuration.
Sharing a similar vision to FogROS, RoboEarth [waibel2011roboearth] is a successful example where edge computers share information on the cloud. However, in their use cases, edge computers mainly use the shared database on the cloud, and do not benefit from the powerful cloud computing resources. Rapyuta [mohanarajah2014rapyuta] and AWS Greengrass [greengrass] provide pipelines to deploy pre-built ROS nodes to edge computers or robots. Both platforms build ROS nodes or Docker images on the cloud, and push the built images to robots that are registered with their platforms. FogROS works in the reverse direction: instead of pushing computation from the cloud to robots, it pushes computation from the edge to the cloud. FogROS is a lightweight platform that allows developers to rapidly prototype applications and gain quick access to extensive computing resources, without conforming to an additional framework.
In this section, we provide a brief background on the building blocks of FogROS, including (A) cloud-based computing, (B) ROS and its pub/sub system, and (C) how ROS-based robotic systems are configured and launched.
III-A Cloud Computing
Cloud-based computing services, such as Amazon Web Services (AWS), Google Cloud, and Microsoft Azure, offer network-accessible computers of various specifications that can be rented on a per-time-unit basis. Setting up a service typically requires a one-time registration and a credit card. Registered users can set up, reconfigure, turn on, turn off, and tear down virtual computers in the cloud. This can be done either through a web-browser interface, or programmatically through a network-based application programming interface (API). Computer configuration options include: amount of memory, number of processing cores, type and number of GPUs, and inclusion of custom processing hardware such as field-programmable gate arrays (FPGAs) and tensor processing units (TPUs). FogROS uses the AWS cloud service API to set up a cloud-based computer, deploy ROS and the code, secure network communications, and then run the node.
III-B ROS and Pub/Sub
In ROS, nodes (software components) communicate with each other using a pub/sub (publication and subscription) system. Nodes register as publishers and/or subscribers to named communication channels called topics. Each topic has a message type that determines what data is sent over the channel. For example, a ROS node that monitors the joint state (e.g., angles) through sensors would publish messages of type JointState on an appropriately named topic, and that topic would only contain JointState messages. When a node publishes a sequence of messages to a topic, all registered subscribers receive the messages in the same sequence in which they were published.
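The topic semantics described above (named channels, multiple subscribers, in-order delivery) can be illustrated with a toy in-process model. This is a minimal sketch and not the ROS API: the `TopicBus` class, the `/joint_states` topic name, and the dictionary messages are assumptions made for illustration.

```python
from collections import defaultdict


class TopicBus:
    """Toy in-process stand-in for ROS topics (hypothetical, not the ROS API)."""

    def __init__(self):
        # Maps a topic name to the list of subscriber callbacks.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver the message to every subscriber registered on the topic.
        for callback in self._subscribers[topic]:
            callback(message)


bus = TopicBus()
received_a, received_b = [], []
bus.subscribe("/joint_states", received_a.append)
bus.subscribe("/joint_states", received_b.append)

# Publish a sequence of joint-angle messages on one topic.
for angles in ([0.0, 0.1], [0.0, 0.2], [0.1, 0.3]):
    bus.publish("/joint_states", {"position": angles})

# Both subscribers observe the same messages in publication order.
print(received_a == received_b)
```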
Coordination of the publishers and subscribers to topics is maintained by the ROS Master [ROSMaster]. The ROS Master exposes a network API that allows nodes to connect over a network and register/unregister themselves as publishers and subscribers to topics. During the registration process, publishers get the current list of subscribers, and subscribers get the current list of publishers. Publishers may then connect directly to already-registered subscribers, and subscribers may connect directly to already-registered publishers.
Once publishing and subscribing nodes are directly connected to each other, publishing nodes serialize the message-specific data structure to a sequence of bytes and send the bytes over the connection. (As an implementation optimization, ROS nodes on the same machine can communicate via a shared-memory queue instead of using a network.) When subscribing nodes receive the sequence of bytes, they deserialize the bytes to the message-specific data structure and process the message.
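The serialize/send/deserialize round trip can be sketched with Python's standard `struct` module. The length-prefixed layout below (a 32-bit count followed by 64-bit floats) is an assumption chosen for illustration and is not claimed to be the exact ROS wire format. Both function names are hypothetical.

```python
import struct


def serialize_joint_angles(angles):
    """Pack floats as a little-endian uint32 count followed by float64 values."""
    return struct.pack("<I%dd" % len(angles), len(angles), *angles)


def deserialize_joint_angles(payload):
    """Inverse of serialize_joint_angles: read the count, then that many floats."""
    (count,) = struct.unpack_from("<I", payload, 0)
    return list(struct.unpack_from("<%dd" % count, payload, 4))


angles = [0.0, 0.25, -1.5]
payload = serialize_joint_angles(angles)   # bytes that would be sent over the connection
restored = deserialize_joint_angles(payload)
print(len(payload), restored == angles)
```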
However, existing ROS pub/sub communication has several limitations: (1) all the nodes have to share the same master to communicate (inter-master communication is not supported by the ROS pub/sub protocol stack, and one has to use out-of-band protocols for communicating across masters); and (2) although it is possible to join nodes from multiple machines to share a single master, ROS communication is not secured, and users must configure security protocols themselves.
III-C ROS Launch Scripts
Robot systems can be comprised of a complex graph of nodes communicating with each other via pub/sub. To consolidate an automation system deployment into a single file, ROS supports a launch configuration file. This file specifies which nodes are to be launched by code entry point, and allows for optional remapping of topic names (e.g., so that code written to process a standard message type can produce it from a topic with a name not known/specified when the code was written). Listing 1 is an example launch script that launches a client node and a server node from the mpt_ros package locally.
FogROS extends the launch script capabilities to allow the specification of which nodes to deploy in the cloud and on what machine type.
To meet the design principles, FogROS (1) extends ROS launch scripts to include an option of where to deploy and run a ROS node, the only place that requires user configurations; (2) provisions cloud-based computers, securely pushes the code or containers to them, and runs the code; (3) sets up one of two networking options (VPC or Proxy) to transparently and automatically proxy the pub/sub communication between the edge computer and the cloud; (4) provides introspection infrastructure for monitoring network conditions; and (5) supports launching containerized FogROS nodes from pre-built Docker images.
IV-A Launch Script Extensions
FogROS uses standard ROS launch scripts as the user interface. Users specify which nodes are to be deployed and what type of cloud computing instance is used in the same launch file as the nodes that users want to deploy locally. They can push multiple nodes to the cloud at the same time by providing the path to a separate launch script. FogROS parses the launch script, finds and collects all the packages in the script, and pushes them to the cloud computer. As part of the configuration process, users can optionally specify a bash script that installs dependencies outside of the FogROS launch process (e.g., mirroring the steps to install dependencies on the edge computer).
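The package-collection step can be sketched as a small XML traversal. The example below is a sketch under stated assumptions rather than FogROS's actual parser: it uses the standard roslaunch `<node pkg=... type=... name=...>` element, while the launch file contents (executable names `client` and `server`) and the `collect_packages` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical launch file: two nodes from the mpt_ros package.
LAUNCH_XML = """
<launch>
  <node pkg="mpt_ros" type="client" name="client" />
  <node pkg="mpt_ros" type="server" name="server" />
</launch>
"""


def collect_packages(launch_xml):
    """Return the sorted, de-duplicated package names used by <node> elements."""
    root = ET.fromstring(launch_xml)
    return sorted({node.attrib["pkg"] for node in root.iter("node")})


packages = collect_packages(LAUNCH_XML)
print(packages)  # ['mpt_ros']
```

A deployment tool could then copy each listed package directory to the cloud computer.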
Listing 2 provides an example of a FogROS launch script that serves the same functionality as Listing 1, but with the server node running on a cloud computer. Local ROS nodes, such as client, are launched as before. With FogROS, the user specifies the launch file that contains the server node (server.launch), the type of cloud computer (c5.24xlarge), and optionally, a setup script (init.bash).
IV-B Cloud-Computer ROS Nodes
When FogROS launches cloud-based nodes, it performs the following sequence of steps that result in ROS nodes running on a cloud computer with messages being transparently proxied between the edge computer and the cloud computer:
1) Provision and start a cloud computer with the capabilities from the launch file and pre-loaded with ROS
2) Push code for ROS nodes to the cloud computer
3) Run the environment setup script
4) Set up secure networking (via Proxy or VPC)
5) Launch the pushed code
Before FogROS provisions a cloud computer, it uses the cloud service provider API to create security rules to set up a secure computing infrastructure suitable for ROS application configuration. It closes network ports not needed for communication between nodes. Then it provisions the cloud computer with a specified location and type. To speed up the launching process, FogROS specifies an image pre-loaded with the core ROS libraries to run on the cloud computer. As part of the launch process, FogROS generates and installs secure credentials on the cloud computer, and gets its public internet protocol (IP) address.
Once the computer is started, using the IP address and secure credentials, FogROS recursively copies the ROS code to the cloud computer securely over a secure shell (SSH) [rfc4251] connection, optionally runs the user-specified setup script, and builds the code on the cloud. With the code ready to run, FogROS then starts the configured secure networking components for VPC (Sec. IV-C) or proxying (Sec. IV-D), and runs the ROS nodes in the cloud.
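The copy, setup, build, and launch sequence can be sketched as an ordered list of shell commands. This is a hedged illustration, not FogROS's implementation: the `deployment_commands` helper, the remote user `ubuntu`, the `~/catkin_ws` workspace path, and the assumption that the setup script already resides on the remote machine are all hypothetical. The networking step is omitted.

```python
import shlex


def deployment_commands(ip, key_path, workspace, launch_file,
                        setup_script=None, user="ubuntu"):
    """Return ordered shell commands: copy code, optional setup, build, launch."""
    remote = f"{user}@{ip}"
    key = shlex.quote(key_path)
    ssh = f"ssh -i {key} {remote}"
    # Recursively copy the ROS code over SSH (assumed remote workspace layout).
    commands = [f"scp -i {key} -r {shlex.quote(workspace)} {remote}:~/catkin_ws/src/"]
    if setup_script is not None:
        # Optional user-specified setup script (assumed to exist remotely).
        commands.append(f"{ssh} {shlex.quote('bash ' + setup_script)}")
    # Build the code on the cloud computer, then launch the nodes.
    commands.append(f"{ssh} {shlex.quote('cd ~/catkin_ws && catkin_make')}")
    commands.append(f"{ssh} {shlex.quote('roslaunch ' + launch_file)}")
    return commands


# 203.0.113.10 is a documentation-only IP address used as a placeholder.
for command in deployment_commands("203.0.113.10", "key.pem", "mpt_ros",
                                   "server.launch", setup_script="init.bash"):
    print(command)
```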
IV-C Networking: Virtual Private Cloud
To allow the edge computer and cloud-based computers to communicate securely with each other, FogROS automates the setup of a Virtual Private Cloud (VPC). A VPC secures point-to-point communication between cloud computers by assigning private IP addresses that are only accessible to other nodes within the VPC. FogROS creates a Virtual Private Network (VPN) between the edge computer and the VPC. A VPN is a secure network communication channel provided by the operating system. With this setup, from the perspective of a ROS node, all nodes appear as though they are on the same private network.
FogROS automates the setup of the VPC and the VPN when it provisions the cloud computers to run the ROS nodes, by using the cloud service provider's API to: (1) create a VPC instance and a security group for it, (2) establish credentials for the cloud computers that will participate in the VPC, (3) configure the cloud computers to use the VPC for cloud-to-cloud communication, and (4) set up a VPN endpoint to which the edge computer will connect. Once set up, the cloud service provider manages the VPC, while FogROS manages the VPN. As part of the setup process, FogROS sets a unique private IP address for each of the participating computers, so that the ROS nodes can establish connections between computers.
IV-D Networking: Pub/Sub Proxying
In addition to VPC networking, FogROS also supports a proxied-network option that enables communication between the edge computer and the cloud. This option is available for cases where the VPC solution may be unavailable due to service provider restrictions or costs, or when an additional level of isolation between the edge computer and the cloud is desired. There are also performance differences (see Section V) between the network options, and a user of FogROS may wish to measure performance in their application before choosing a suitable option.
In FogROS, a proxy consists of two ROS nodes, one running on the edge computer and one running on the cloud. These nodes connect directly to each other via a secure network connection, and register as publishers and subscribers to topics on the ROS Master running on each computer. When a proxy node receives a message from a subscription, it sends it to the other proxy node, which then publishes it to the subscribers registered on its ROS Master.
There are two options for FogROS to identify topics to proxy: (1) user-specified in the configuration file, or (2) automated. If topics are specified by the user in the configuration file, FogROS subscribes and publishes to the topics specified. If the user does not specify topics, FogROS communicates with the ROS Master on each end and identifies which topics have registered subscribers and publishers. When a topic has a publisher on one end and a subscriber on the other, the ROS proxy nodes coordinate with each other to proxy the associated topic (see Fig. 3). While the automated process is simpler for the developer to set up, it may result in increased setup time as the proxy nodes coordinate the setup of proxied topics, or in wasted bandwidth on topics that do not need proxying.
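The automated matching rule (a publisher on one end and a subscriber on the other) reduces to two set intersections. The sketch below is a minimal illustration. The `topics_to_proxy` helper and the topic names are hypothetical, and querying each ROS Master for its registered publishers and subscribers is assumed to have happened already.

```python
def topics_to_proxy(edge_pubs, edge_subs, cloud_pubs, cloud_subs):
    """Return (edge_to_cloud, cloud_to_edge) sets of topics that need proxying."""
    # Published on the edge and consumed in the cloud.
    edge_to_cloud = set(edge_pubs) & set(cloud_subs)
    # Published in the cloud and consumed on the edge.
    cloud_to_edge = set(cloud_pubs) & set(edge_subs)
    return edge_to_cloud, cloud_to_edge


# Hypothetical topic registrations on each ROS Master.
edge_to_cloud, cloud_to_edge = topics_to_proxy(
    edge_pubs={"/camera/image", "/odom"},
    edge_subs={"/grasp", "/odom"},
    cloud_pubs={"/grasp"},
    cloud_subs={"/camera/image"},
)
print(edge_to_cloud, cloud_to_edge)
```

Here `/odom` is both published and consumed on the edge only, so it is not proxied and consumes no bandwidth on the edge-to-cloud link.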
IV-E Network Monitoring
With the proxying network option, FogROS also provides interfaces to monitor network conditions via ROS topics /fogros/latency and /fogros/throughput on both the edge computer and the cloud. These interfaces do not introduce additional overhead unless an active subscriber subscribes to them. Users can also inspect and interact with ROS topics with standard ROS tools such as rostopic. In addition, FogROS provides the same fault tolerance as ROS running locally, where ROS nodes can re-join the pub/sub communication after network interruption.
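One plausible way to derive latency and throughput figures from message timestamps is sketched below. How FogROS actually computes the values it publishes is not specified here, so the `link_stats` helper, the sample format, and the assumption of synchronized clocks on both ends are all illustrative.

```python
def link_stats(samples):
    """Compute (mean latency in s, throughput in bytes/s).

    samples: list of (send_time_s, receive_time_s, size_bytes) tuples,
    assuming sender and receiver clocks are synchronized.
    """
    latencies = [received - sent for sent, received, _ in samples]
    mean_latency = sum(latencies) / len(latencies)
    # Throughput over the window from first send to last receive.
    elapsed = max(r for _, r, _ in samples) - min(s for s, _, _ in samples)
    throughput = sum(size for _, _, size in samples) / elapsed
    return mean_latency, throughput


# Hypothetical samples: three messages, each delivered 0.5 s after sending.
mean_latency, throughput = link_stats(
    [(0.0, 0.5, 1000), (1.0, 1.5, 1000), (2.0, 2.5, 2000)]
)
print(mean_latency, throughput)  # 0.5 1600.0
```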
IV-F Pre-Built ROS Nodes
FogROS also supports launching containerized ROS nodes with a similar interface to the FogROS launch script extension described in Section IV-A. While an increasing number of ROS developers are using pre-built docker images to host ROS nodes, this functionality is not natively supported by ROS. With FogROS, users can specify the name of a publicly-available image on DockerHub as well as the destination machine on which they want to launch it. FogROS then uses a template environment setup script to pull and run the image on the specified machine. It analyzes the machine type and configures the docker run command to match the hardware (e.g., GPU) available on the computer.
Listing 3 shows an example launch script for a Dex-Net grasp planning node in a Docker image. FogROS provisions and starts a cloud computer (g4dn.xlarge) with a GPU, pulls the dexnet:gpu image from DockerHub [docker], attaches the Docker container to the GPU, and runs it.
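Matching the `docker run` command to the machine's hardware can be sketched as follows. This is an illustrative assumption about how such a template might work, not FogROS's code: the `docker_run_command` helper and the `GPU_FAMILIES` list are hypothetical, while `--rm`, `--network host`, and `--gpus all` are standard Docker CLI options.

```python
# Assumed prefixes of GPU-equipped AWS instance families (illustrative, not exhaustive).
GPU_FAMILIES = ("g4dn", "p3")


def docker_run_command(image, machine_type):
    """Build a docker run argument list, enabling GPUs only on GPU instance types."""
    family = machine_type.split(".")[0]
    command = ["docker", "run", "--rm", "--network", "host"]
    if family in GPU_FAMILIES:
        # Expose all GPUs on the host to the container.
        command += ["--gpus", "all"]
    command.append(image)
    return command


print(" ".join(docker_run_command("dexnet:gpu", "g4dn.xlarge")))
print(" ".join(docker_run_command("dexnet:gpu", "c5.24xlarge")))
```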
Here we present three example applications on FogROS: (A) visual SLAM, (B) Dex-Net grasp planning and (C) multi-core motion planning. The nodes, topics, and split between a single-core edge computer with 2GB RAM and the cloud are shown in Fig. 4. In addition to showing the network latency and performance with FogROS, we highlight the simplicity and minimal configuration of deploying these applications.
V-A Visual SLAM Service
Table I (header row): | Scenario | Edge: FPS | Edge: Create (s) | Cloud: FPS | Cloud: Create (s) | FogROS Network: VPC (s) | FogROS Network: Proxy (s) |
ORB-SLAM2 [mur2017orb] is a visual simultaneous localization and mapping system that uses monocular video input. In this experiment, a Camera Node publishes a resolution video with each frame 48 KiB on average to the cloud (Fig. 4(a)). On the cloud, an ORB-SLAM2 node subscribes to the video feed [sturm12iros] and computes a pointcloud map along with the current estimated location within the map, which are sent back to the robot. For more details on the ORB-SLAM2 algorithm, we refer readers to the paper and open-source code available from mur2017orb.
To configure FogROS to work with ORB-SLAM2, we build ROS Docker images and push them to DockerHub [docker]. We wrote a bash script to pull and run the Docker image and included its path as in Listing 2; FogROS then runs the script when configuring the environment. After initialization, FogROS sets up and secures communication between the robot and the cloud SLAM server.
To evaluate the performance of FogROS when deploying ORB-SLAM2, we compare the cloud-deployed performance to an edge-computer-only implementation. We select a 36-core cloud computer (AWS c4.8xlarge) for the ORB-SLAM2 node, and compare it with ORB-SLAM2 running on a one-core edge computer. We report frames per second (FPS) and the latency to create the first map (in seconds) [nardi2015introducing]. Table I suggests that cloud-based SLAM can achieve higher FPS, meaning that it can aggregate more data and produce higher-quality maps in a real-time setting. Cloud-based SLAM also has lower latency in generating a new map.
V-B Dex-Net Grasping Service
Grasp analysis computes the contact point(s) for a robot gripper that maximize grasp reliability, i.e., the likelihood of successfully lifting the object given those contact points. To plan grasps on rigid objects in industrial bins using an overhead depth camera, we use an open-source implementation of the fully-convolutional grasp-quality convolutional neural network (FC-GQ-CNN) [satish2019policy, mahler2019learning] from Dex-Net [mahler2017dex]. We wrap FC-GQ-CNN in a ROS node and deploy it to the cloud along with pretrained neural-network weights as a Docker image. We refer the reader to satish2019policy and mahler2019learning for details and code for the neural network and grasping environment.
This node subscribes to 3 input topics containing a scene depth image and mask for objects to be grasped, and a message of type sensor_msgs/CameraInfo containing camera intrinsics. Internally, the node feeds this to FC-GQ-CNN, which outputs a grasp pose and associated estimate of grasp quality. The node wraps these outputs, along with the gripper type and coordinates in image space, into a gqcnn_ros/GQCNNGrasp message, and publishes it.
While the node can be run either locally or in the cloud, using cloud GPU instances as opposed to a CPU for neural-network inference can greatly reduce computation time. In either case, the node is wrapped inside a Docker container, reducing the need to resolve dependency issues between deep-learning libraries, CUDA, OS, and ROS versions. The pretrained models in the image are intended for a setup similar to that shown in Figure 5; variations in camera pose, camera intrinsics, or gripper type may require retraining the underlying model for accurate predictions.
We run FogROS with the Dex-Net docker image using the launch file in Listing 3. We compare grasp planning times across 10 trials using both the CPU onboard the edge computer and FogROS with the Docker images on the cloud. We also show compute times when using a compressed depth image format to transfer images instead of transferring raw images to the cloud directly. For the latter case, images are compressed and decompressed using the republish node from the image_transport ROS package [ROS_image_transport]. Table II shows the results for both compressed and uncompressed image transport between the nodes.
Table II (header row): | Edge | Cloud | FogROS VPC | FogROS Proxy |
V-C Multi-Core Motion Planning
Table III (header row): | Edge | Cloud | FogROS VPC | FogROS Proxy |
Motion planning computes a collision-free motion for a robot to get from one configuration to another. Sampling-based motion planners randomly sample configurations and connect them together into a graph, rejecting samples and motions that are in collision. These planners can be scaled with additional computing cores.
Using FogROS, we deploy a multi-core sampling-based motion planner [Ichnowski2014_TRO, ichnowski2019mpt] to a 96-core computer in the cloud to solve motion planning problems from the Open Motion Planning Library (OMPL) [OMPL] (see Fig. 6). This planner node subscribes to topics for the collision model of the environment and motion plan requests (Fig. 4(c)). When the planner node receives a message on any of these topics, it computes a motion plan, and then publishes it to a separate topic. For more details on the multi-core motion planner, we refer the reader to the paper and the open-source code by ichnowski2019mpt. To configure FogROS to work with the multi-core motion planner, we record the steps we use to set up the dependencies (e.g., FCL [pan2012fcl] and Nigh [ichnowski2018concurrent]) in a script. By providing the script, we configure FogROS in a manner similar to Listing 2.
We measure planning time as the difference between publishing a motion plan request message and receiving the plan result message, and show the results in Table III. The same motion planning problem is solved in a fraction of the time on the cloud when compared to using the edge computer. If the motion planner is asymptotically optimal (finds shorter/better plans the longer it runs and with more CPU cores), then one could potentially run the motion planner for the same amount of time but get a better path using the cloud. anand2021serverless explored and showed the benefit of exploiting the tradeoff between more cores and the resulting motion plan optimality.
We present FogROS, a user-friendly and adaptive extension to ROS that allows developers to rapidly deploy portions of their ROS system to computers in the cloud. FogROS sets up a secure network channel transparent to the program code, allowing applications to be split between the edge and the cloud with little to no modification. In experiments, we show that the added latency associated with pushing software components to the cloud is small when compared to the time gained from using high-end computers with many cores and GPUs in the cloud. However, for some simple tasks, high-end cloud servers may yield only marginal benefits and can be overkill.
In future work, we will address the interactions of multiple hardware systems with different ROS masters, and handle the decentralized communication efficiently and securely. We will also support real-time compression on the proxy connection between edge computer and cloud to help reduce latency especially on low-bandwidth connections.
This research was performed at the AUTOLAB at UC Berkeley in affiliation with the Berkeley AI Research (BAIR) Lab, and the CITRIS “People and Robots” (CPAR) Initiative. The authors were supported in part by donations from Google, Siemens, Toyota Research Institute, Autodesk, Honda, Intel, Hewlett-Packard, and VMWare. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE 1752814 and NSF/VMware Partnership on Edge Computing Data Infrastructure (ECDI), NSF award 1838833. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the sponsors. We thank our colleagues who provided helpful feedback and suggestions, in particular Joseph M. Hellerstein and Alvin Cheung.