Analyzing network topologies and communication graphs plays a crucial ro...
Some faults in data center networks require hours to days to repair beca...
We show that communication schedulers recently proposed for ML collective...
Routing is, arguably, the most fundamental task in computer networking, ...
Optical interconnects are already the dominant technology in large-scale...
Resource allocation problems in many computer systems can be formulated ...
Continuously monitoring a wide variety of performance and fault metrics ...
Random uniform sampling has been studied in various statistical tasks bu...
Many big-data clusters store data in large partitions that support acces...
Bulk transfers from one to multiple datacenters can have many different...
Several organizations have built multiple datacenters connected via dedi...
Large inter-datacenter transfers are crucial for cloud service efficienc...
Using multiple datacenters allows for higher availability, load balancin...
We present a scheduler that improves cluster utilization and job complet...