Enabling Microsoft OneDrive Integration with HTCondor
Accessing data from distributed computing resources is essential in many workflows, but it can be complicated for users of cyberinfrastructure. They must perform multiple steps with unfamiliar tools to make data available to distributed computing. Further, most research on data distribution has focused on the efficiency of providing data to computing resources rather than on the ease of use of distributing data. An easy-to-use data distribution method can reduce the time researchers spend learning cyberinfrastructure and increase its usefulness. Microsoft OneDrive is an online storage solution providing both file storage and sharing. OneDrive provides many different clients for accessing data stored in the service, and it offers features that users of cyberinfrastructure could find useful, such as automatic synchronization with desktop clients. A barrier to using services such as OneDrive is the credential management necessary to access the service. Recent innovations in HTCondor allow the scheduler to manage OAuth credentials on the user's behalf. The user no longer has to copy credentials along with the job; HTCondor handles the acquisition, renewal, and secure transfer of credentials. In this paper, I will focus on providing an easy-to-use data distribution method utilizing Microsoft OneDrive. Because ease of use is difficult to measure, I will describe the features and advantages of using OneDrive. Additionally, I will compare it to measurements of data distribution methods currently used on a national cyberinfrastructure, the Open Science Grid.