WSEmail: A Retrospective on a System for Secure Internet Messaging Based on Web Services

Web services offer an opportunity to redesign a variety of older systems to exploit the advantages of a flexible, extensible, secure set of standards. In this work we revisit WSEmail, a system proposed over ten years ago to improve email by redesigning it as a family of web services. WSEmail offers an alternative vision of how IM and email services could have evolved, offering security, extensibility, and openness in a distributed environment instead of the hardened walled gardens that today's rich messaging systems have become. We demonstrate the flexibility of WSEmail using three business use cases: secure channel IM, business workflows with routed forms, and on-demand attachments. Since increased flexibility often militates against security and performance, we designed WSEmail with security in mind and formally proved the security of one of its core protocols (on-demand attachments) using the TulaFale and ProVerif automated proof tools. We also provide performance measures for the basic WSEmail functions in a prototype we implemented using .NET. Our experiments show a latency of about a quarter of a second per transaction under load.


1 Introduction

Web services are a mature technology nearing their twentieth birthday. They have created the foundation for highly interoperable distributed systems to communicate over the Internet using standardized protocols (e.g. SOAP, JSON) and security mechanisms (e.g. OAuth, XMLDSIG). Legacy systems and protocols must be reevaluated to see how they can benefit from modern architectures, standards, and tools. As a case study of such an analysis and redesign, we present an expanded study of WSEmail [17], electronic mail redesigned as a family of web services which we first implemented and presented in 2005.

Such a return is warranted by the way internet messaging technologies have evolved in the past decade and a half. When we first implemented WSEmail, email and instant messaging (IM) services were strictly disjoint. Instant messaging solutions (e.g. AIM, ICQ) were server centric, offered little to no security, had weak authentication, and worked only when both sides were online. Email services were more mature with endpoint security and authentication options, but in order to support uniformity across a vast installed base, the resulting system had shortcomings in the areas of flexibility, security, and integration with other messaging systems. For instance, problems of remote authentication and extensibility plagued attempts to reduce spam, while poor integration with browsers and operating systems made it a vector for the propagation of viruses, malware, and ransomware. In the intervening years, IM and email have evolved separately.

Email has become hardened with the standardization of spam blocking (e.g. real time black hole lists, policy block lists), DomainKeys Identified Mail (DKIM), Domain Name System Security Extensions (DNSSEC), and encryption by default between Mail Transfer Agents (MTAs). Together, they made email more secure in transit and reduced the quantity of received spam. Push notifications changed the speed at which users see email, but the fundamental message format (7-bit ASCII with MIME) has remained unchanged. Importantly, it has remained an open system with distributed management and no central point of control.

In parallel, IM underwent fundamental changes as the new generation of tools (e.g. Facebook Messenger, WhatsApp, Skype, Slack, WeChat) introduced stronger authentication, end-to-end message security, and rich communications features such as bots, mini-applications, and video chat. IM apps cache messages sent or received while offline and can show proof of delivery. In contrast to email services, IM networks have become “closed gardens” with little to no interoperability. With few exceptions, access is available only through dedicated clients that connect to a central point of control. Support for offline sending and receiving blurs the conceptual boundaries between IM and email, but IM security protocols (e.g. Signal Protocol [10]) are centralized, preventing distributed management and customizations. No integration with email is possible.

It didn’t have to be this way.

We built WSEmail to replace the legacy protocols with protocols based on SOAP, WSDL, XMLDSIG and other XML-based formats. The protocols are proven technologically (they have been standardized for over 15 years) and they are inherently extensible, give stronger guarantees for message authentication, and are amenable to formal modeling and proof.

WSEmail is designed to perform the functions of ordinary email but also enable additional security functions and more flexibility. The primary strategy is to import these virtues from the standards and development platforms for web services. Our exploration of WSEmail is based on a prototype architecture and implementation. WSEmail messages are SOAP messages that use web service security features to support integrity, authentication, and access control for both end-to-end and hop-by-hop message transmissions. The WSEmail platform supports the dynamic updating of messaging protocols on both client Mail User Agents (MUAs) and server MTAs to enable custom communications. This flexibility supports the introduction of new security protocols, richer message routing (such as routing based on the semantics of a message), and close integration with diverse forms of communication such as IM.

The benefits of flexibility can be validated by showing diverse applications. We show the flexibility of WSEmail by detailing three applications we have implemented based on its framework: secure instant messaging, secure business workflow messaging, and “on-demand attachments,” in which email with an attachment leaves the attachment on the sender’s server rather than placing it on the servers of the recipients. We achieve all of this while avoiding becoming a “closed garden” by explicitly considering extensibility and on-demand download of client extensions. This allows endpoint servers to design their own rich messaging extensions and deploy them locally while maintaining interoperability with external systems.

Flexibility, however, often has a high cost for security and performance. We therefore develop techniques to measure and mitigate these costs for WSEmail. WSEmail’s first contribution was a case study of a formal analysis of on-demand attachments. The challenge was to design the associated security for the attachment based on emerging federated identity systems. Due to space restrictions, the proof is not reproduced here, but can be found in Lux et al. [17] and in an online appendix (http://www2.kinneret.ac.il/mjmay/wsemail/). In this work, we first detail the architecture of WSEmail and three of its applications. They show the flexibility that we can achieve using our architecture while still providing strong security guarantees. Second, we carry out a set of experiments intended to determine the efficiency of our base system, including its security operations. Since email systems need to also have good performance on older hardware, we show our experiments on a testbed built from older hardware. Both of these studies demonstrate promise for security and performance for web services in general and WSEmail in particular.

The paper is organized as follows. First, we sketch the architecture of WSEmail focusing on its security assumptions and then continue with a discussion about how plug-ins function. In section 3 we discuss interface details of the WSEmail base architecture, including detailed descriptions of how messages are sent and received. Section 4 discusses applications we have explored with WSEmail, including instant messaging (IM), semantic based routing for business workflows, and on-demand attachments. In section 5 we discuss our implementation and its performance. Section 6 discusses related work and compares WSEmail to similar messaging systems. Section 7 concludes. Interested readers can find more information on our project web page at http://www2.kinneret.ac.il/mjmay/wsemail/.

2 Base Architecture

The base protocols for WSEmail are illustrated in Figure 1. In the common case, similar to SMTP, an MUA Sender Client makes a call on its MTA Sender Server to send a message. This and other calls are SOAP calls over TCP; the message is in the body of the SOAP message and the SOAP header contains information like the type of call and security parameters. The message is structured as a collection of XML elements, including, for instance, a subject header. A sample trace of WSEmail messages can be found at http://www2.kinneret.ac.il/mjmay/wsemail/. After receiving the call from the Sender Client, the Sender Server makes a call on the Receiver Server to deliver the mail from the Sender Domain to the Receiver Domain. The Receiver Client makes calls to the Receiver Server to inquire about new messages or download message bodies. In particular, the Receiver Client makes a call to the Receiver Server to obtain message headers and can then request a specific message.

Figure 1: Messaging architecture

Our design is based on a three-tier authentication system combined with an extensible system of federated identities. The first tier provides user (MUA) authentication based on passwords, public keys, or federated identity tokens. The second tier provides server (MTA) authentication based on public keys with certificates similar to those used for TLS. The third tier uses root certificates similar to the ones in browsers. Overall, this addresses interdomain authentication in a practical way at the cost of full end-to-end confidentiality. Confidentiality is preserved between hops by TLS or another tunnel protocol. In a basic instance, the message from to will be given an XMLDSIG signature by that is checked by both and .

The novel aspects of WSEmail’s architecture are in the integration and flexibility of the MUA authentication and the ability of both MUAs and MTAs to add new security functions dynamically. To illustrate a variation in the base protocol, consider our design for IM. Referring again to Figure 1, an instant message is dispatched from a client to while is outside its home domain . In this case contacts to obtain a security token T that will be recognized by . Once this is obtained, sends authenticated with this credential to and indicates (in a SOAP header) that it should be treated as an instant message by and . Instant messages are posted directly to the client, with the client now viewed as a server that accepts the instant message call. and are able to apply access control for this function based on the security token from . This token is recognized because of a prior arrangement between and .

Figure 2: Client components

The WSEmail MUA and MTA are based on a plug-in architecture capable of dynamic extensions. Security for such extensions is provided through a policy for trusted sources and the enforcement mechanisms provided by web services. On-demand attachments are an example of such a plug-in, as are a variety of kinds of attachments with special semantics. A party that sends a message with such an attachment automatically includes information for the receiver on where to obtain the software necessary to process the attachment. The client provides hooks for plug-ins to access security tokens, after first performing an access control check on the plug-in. Figure 2 illustrates the MUA (client) components. Screen shots of the GUI can be seen in Figure 8 (detailed below) and at http://www2.kinneret.ac.il/mjmay/wsemail/. The server (MTA) components are shown in Figure 3.

Figure 3: Server Components

3 Interfaces and Plug-ins

To explain how the WSEmail architecture works, we outline its code and plug-in architecture and interfaces. The interfaces describe how much access the application has to the mail server or client software, in addition to where the plug-in is activated for message delivery or reading. Plug-ins are used on both the server and the client. Their interactions and communication paths are the source of WSEmail’s flexibility and extensibility.

3.1 Server-side Plug-ins

Server plug-ins are libraries of code that conform to certain interfaces known to the server. When the server is initialized, it goes through a list of plug-ins to load from a configuration file. For each plug-in, a specific object class is listed along with the name of a library from which it can be loaded. The server looks for the library and tries to instantiate the object. If it successfully loads the plug-in, the server will request further configuration data from the plug-in and use it to place the plug-in in the appropriate execution queue. The process of loading or unloading plug-ins is dynamic, so they can be loaded or unloaded at any time during execution. In addition, the execution queues can be reprioritized or disabled while the server is running.
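The loading procedure described above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the prototype's .NET code: the `PluginHost` class, the `describe()` method, and the `queue`/`priority` configuration keys are hypothetical names chosen for the example.

```python
import importlib
from collections import defaultdict


class PluginHost:
    """Hypothetical host that loads plug-ins named in a configuration."""

    def __init__(self):
        # queue name -> list of (priority, plug-in instance), kept sorted
        self.queues = defaultdict(list)

    def load(self, library, class_name):
        """Look up the library, instantiate the class, and enqueue the plug-in."""
        try:
            module = importlib.import_module(library)
            plugin = getattr(module, class_name)()
        except (ImportError, AttributeError, TypeError):
            return None  # a plug-in that fails to load is skipped
        # Ask the plug-in for further configuration to choose its queue.
        config = plugin.describe()
        entries = self.queues[config["queue"]]
        entries.append((config["priority"], plugin))
        entries.sort(key=lambda entry: entry[0])
        return plugin

    def unload(self, plugin):
        """Remove a plug-in from every queue while the host keeps running."""
        for name in self.queues:
            self.queues[name] = [e for e in self.queues[name] if e[1] is not plugin]
```

Keeping each queue as a sorted list of (priority, plug-in) pairs is one simple way to allow reprioritizing at run time; the source does not say how the prototype represents its queues.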

All server plug-ins implement the IServerPlugin interface, which allows the server to understand the purpose of the plug-in and add the plug-in to the appropriate processing queues. There are two main classes of plug-ins in WSEmail: message dependent and RPC-like (or message-independent). Some example plug-ins and their classifications are shown in Figure 4.

Figure 4: Example server plug-ins and their classifications

Message dependent plug-ins depend on a message to be present to execute. They can be inserted in various places in the delivery cycle, including the initial receipt of a message (ISendingProcessor) or the final destination of a message (IDeliveryProcessor). Plug-ins that implement ISendingProcessor perform processing similar to that done by sendmail (such as verifying relay permissions, stripping oversized attachments, etc.) in regular mail systems. Plug-ins that implement IDeliveryProcessor perform actions similar to those of user-space programs such as procmail or vacation messaging scripts in regular mail systems. A diagram depicting the interactions and data flow of incoming WSEmail and extension requests is shown in Figure 5.

Figure 5: An overview of server side plug-ins

RPC-like plug-ins implement a generic “catch-all” interface (IExtensionProcessor). On initializing, the plug-ins provide the server with an “extension identifier”. The extension identifier is used by the server to route incoming requests to the appropriate plug-in. There is no required or defined structure for the requests. Most plug-ins view requests as XML documents or fragments. This provides flexibility in terms of the data an application can process.
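Routing by extension identifier can be illustrated with a small dispatcher. The sketch below is in Python rather than the prototype's .NET, and the names `ExtensionRouter`, `EchoSubjectPlugin`, and the identifier string are assumptions made for the example. The plug-in chooses to parse its request as an XML fragment, mirroring the observation that most plug-ins treat requests as XML.

```python
import xml.etree.ElementTree as ET


class ExtensionRouter:
    """Hypothetical server-side table mapping extension identifiers to plug-ins."""

    def __init__(self):
        self._handlers = {}

    def register(self, plugin):
        # On initialization each plug-in supplies its extension identifier.
        self._handlers[plugin.extension_id] = plugin

    def route(self, extension_id, raw_request):
        plugin = self._handlers.get(extension_id)
        if plugin is None:
            raise KeyError("no plug-in registered for " + extension_id)
        # The router imposes no structure on the request; the plug-in interprets it.
        return plugin.process(raw_request)


class EchoSubjectPlugin:
    """Example plug-in (hypothetical) that reads a subject from an XML request."""

    extension_id = "urn:example:echo-subject"

    def process(self, raw_request):
        root = ET.fromstring(raw_request)
        return root.findtext("subject", default="")


router = ExtensionRouter()
router.register(EchoSubjectPlugin())
```

A call such as `router.route("urn:example:echo-subject", "<req><subject>hi</subject></req>")` hands the raw string to the plug-in, which extracts the subject text.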

The plug-ins in the system are executed after the core server has performed message authentication. The server authenticates a message by examining its attached security tokens such as X.509 certificates, username signatures, or federated identity certificates. If the security tokens are valid, the message is entered into a queue to be processed by the appropriate plug-ins. In the case of IExtensionProcessor and IDeliveryProcessor plug-ins, an “environment” object is created and passed to the plug-in to allow access to authentication tokens and raw XML streams directly from the server. Other plug-ins are only given the message that triggered their execution.

Plug-ins can implement more than one interface, which allows increased functionality in one piece of code. Some plug-ins implement both the IExtensionProcessor and the IDeliveryProcessor (or any variant) interfaces. This allows them to interact with messages as they are being delivered, but also to be configurable using a protocol that interacts with the IExtensionProcessor interface. Implementing multiple interfaces allows plug-ins to share data that should be accessible through multiple paths. Examples of useful composite plug-ins (plug-ins implementing multiple interfaces) are given in Section 4.

We implemented some server plug-ins in WSEmail that would be expected for enterprise applications: data store access (IDataAccessor), database connection management (IDatabaseManager), message queues (IMailQueue), and local delivery (ILocalMTA).

3.2 Client-side Plug-ins

Client-side plug-ins affect message reading and are designed to be more dynamic than server-side plug-ins. Similar to server-side plug-ins, client-side plug-ins are code libraries, but for ease of implementation they extend an abstract class (DynamicForms.BaseObject) instead of implementing an interface. They also contain more information than their server-side counterparts, including version and network location information. With this additional information, a plug-in can create messages that can be processed by other clients that do not have the plug-in installed. This is accomplished by having a stub that is executed by the receiving client to download the plug-in using its supplied location information. The user is given the chance to approve the download before it is performed. Plug-in code is self-signed using Microsoft Authenticode. The plug-in information (version, name and library location) is saved in a local registry that allows the recipient to use the same plug-in at a later point.

The interface that a plug-in presents to the user is up to the plug-in designer. A plug-in can have no interface at all, a few message boxes, or a rich graphical user interface (GUI). Since WSEmail is based upon the .NET framework, plug-in designers can use .NET’s rich UI elements without the need to transfer libraries for rendering graphics. A sample plug-in interface is shown in Figure 6. Plug-ins also have access to authentication information in the mail client and may petition users for access to their federated token. This allows plug-ins to perform secure web service calls to automatically fill in information or perform other functions.

Figure 6: Sample application (a timesheet)

For convenience and to limit bandwidth used, multiple plug-ins can be contained within one library file. Each plug-in will be enumerated (via .NET’s reflection libraries) and registered with the client application. This allows system administrators to deploy one library with updated plug-ins instead of deploying each one separately.

3.3 Sending a message

When Alice wants to send a message to Bob that takes advantage of a plug-in, she attaches a “form” from her registry to her message (Figure 8). The form presents Alice with a user interface (UI) that lets her fill in required information. After she fills out the form, the information in it is serialized to an XML document that is attached to the message. The plug-in notifies the original message about where it should be sent next to be appropriately processed; in this case, Bob’s inbox. The plug-in is unloaded and a flag is set on the message header that indicates the presence of a form. The flag is displayed in Alice’s list of sent messages and in Bob’s inbox when he receives it, allowing them to see which messages contain a form by viewing the header information, without downloading all message content. Figure 7 illustrates the process.
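The serialize-and-flag step can be sketched as below. This is an illustrative Python example under assumptions, not the prototype's wire format: the `Message` class, the `form`/`field` element names, and the `has-form` header are all hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Message:
    """Hypothetical in-memory message with headers and attachments."""
    recipient: str
    body: str
    headers: dict = field(default_factory=dict)
    attachments: list = field(default_factory=list)


def attach_form(message, plugin_name, plugin_version, plugin_location, fields):
    """Serialize form fields to XML, attach them, and flag the header."""
    form = ET.Element("form", {
        "plugin": plugin_name,
        "version": plugin_version,
        "location": plugin_location,  # where a recipient could fetch the plug-in
    })
    for name, value in fields.items():
        ET.SubElement(form, "field", {"name": name}).text = str(value)
    message.attachments.append(ET.tostring(form, encoding="unicode"))
    # The flag lets clients list messages with forms from headers alone.
    message.headers["has-form"] = "true"
    return message
```

Carrying the plug-in name, version, and location inside the serialized form is what would let a recipient without the plug-in locate and download it.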

Figure 7: Client-side plug-ins from the sender’s perspective
Figure 8: New message screen

3.4 Receiving a message

When Bob’s server receives Alice’s message, he sees the new message appear in his inbox with an annotation indicating that the message contains an attached form. He can view the message normally, but can also view the attached form. When Bob tries to view the form, the mail client attempts to load the appropriate plug-in or obtain it if it is not present on the system using information contained in the recipient’s plug-in registry and information contained within the form. After the plug-in is loaded, the XML document containing the payload of the form is pushed to the plug-in which deserializes the data and loads necessary state. The plug-in then takes control, displaying a graphical interface that includes the information Alice sent. Figure 9 demonstrates the typical flow for the recipient of a message with a form attached.

Figure 9: Client-side plug-ins from the recipient’s perspective

4 Applications

WSEmail offers the possibility to have rich XML formats, extensible semantics on clients and routers, and a range of security tokens. Since there are substantial development platforms for these features from major software vendors, it is easy to use WSEmail as a foundation for a suite of integrated applications that share common code, routing, security, and other features. As an illustration, we sketch three applications that we implemented with our prototype system.

4.1 Instant Messaging

Instant messaging is similar to email but is intended for communicating short text messages synchronously. An overview of a standard instant messaging architecture is shown in Figure 10. Instant messaging systems are typically disjoint from email systems using different clients, servers, routing, and security. This is unfortunate since the two messaging systems have many things in common. We experimented with a form of integration for the two by allowing WSEmails to be marked as instant messages (see the New Message screen in Figure 8). Such messages are posted directly to a window on the recipient client by the client server, subject to an access control decision. Our implementation uses the same client, server, software, and security as the email functions. There is an option that allows multiple parties to use SSL tunnels to a single server.

Figure 10: Instant Messaging Overview

WSEmail’s instant messaging plug-in is a composite plug-in, made up of an IDeliveryProcessor and an IExtensionProcessor implementation. When the instant messaging client program is started, the user automatically registers her location on the server using the IExtensionProcessor interface. The plug-in records the location information in an internal table. Later, when the server receives a message flagged as an instant message, it passes delivery control of the message to the instant messaging plug-in using the IDeliveryProcessor interface. The plug-in consults its table of user locations and, if it finds a match, sends the message directly to the client. If a match is not found, the plug-in relinquishes control of the message, passing it back to the server. The server can then attempt delivery using a different matching plug-in.
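The composite behavior described above (register a location through one interface, attempt delivery through the other, and relinquish control on a miss) might look like the following Python sketch. The class and method names are hypothetical stand-ins for the IExtensionProcessor and IDeliveryProcessor roles, and the final fallback to a stored mailbox is an assumption made for the example.

```python
class InstantMessagingPlugin:
    """Hypothetical composite plug-in: location registry plus delivery hook."""

    def __init__(self, push):
        self._locations = {}  # user -> network location
        self._push = push     # callable(location, message), supplied by the host

    def process_extension(self, user, location):
        """Extension-style call: a client registers where it can be reached."""
        self._locations[user] = location
        return True

    def process_delivery(self, recipient, message):
        """Delivery-style call: return True if delivered, False to relinquish."""
        location = self._locations.get(recipient)
        if location is None:
            return False
        self._push(location, message)
        return True


def deliver(recipient, message, plugins, mailbox):
    """Try each delivery plug-in in turn; store the message if none accepts it."""
    for plugin in plugins:
        if plugin.process_delivery(recipient, message):
            return "instant"
    mailbox.append((recipient, message))
    return "stored"
```

Because both entry points share the same `_locations` table, the plug-in illustrates why implementing multiple interfaces in one object is convenient.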

Instant messages are sent to clients using Microsoft .NET Remoting. Similar to Java’s RMI, Remoting allows an object to be distributed across a network. Clients remote their instant message queues and have a separate thread watch it. As the server pushes messages into the queue, the client watcher thread pulls them out and coordinates their display into a conversation-like interface.
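The queue-and-watcher pattern can be shown with an in-process analogue. The Python sketch below uses a local thread-safe queue in place of a remoted object, so the network distribution that .NET Remoting provides is deliberately omitted; `start_watcher` and the `None` stop signal are assumptions made for the example.

```python
import queue
import threading


def start_watcher(inbox, display, stop_token=None):
    """Start a thread that pulls messages off the queue and displays them."""

    def watch():
        while True:
            item = inbox.get()       # blocks until the server side pushes a message
            if item is stop_token:   # hypothetical shutdown signal
                break
            display(item)

    thread = threading.Thread(target=watch, daemon=True)
    thread.start()
    return thread
```

In this analogue the "server" is whatever code calls `inbox.put(message)`, and `display` stands in for the conversation-style interface.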

After a conversation has been established, users can choose to change to a synchronous channel. An additional server-side plug-in coordinates the shift from asynchronous WSEmail messages to a “party line” secured with Transport Layer Security (TLS). At the user’s request, the plug-in allocates a TCP port running TLS and notifies all participants in the conversation of the available channel. The TLS ports do not have to be opened on the mail server itself, so it is possible for the mail server to act as a broker and pass the connection request on to a secure chat server farm. The recipients of the invitations are given a choice to accept the channel conversion. When connecting to the secure chat server, the clients are presented with the server’s X.509 certificate and are asked to present their own certificates for authentication purposes. Clients who do not provide a certificate are not able to join the new secured chat session.

The secure channel instant message brokering allows users to create secure channels with arbitrary groups of people. Since the messages flow over the TLS channel, they do not incur the lag of composing and forwarding a WSEmail message. If the users have an existing X.509 architecture, they can easily authenticate to each other. If they lack a shared X.509 architecture, everyone (including the server) will need to set up certificate trust relationships.

4.2 Business Workflows

Many organizations are working to carry out more of their management of workflows (i.e. forms) using web forms or other web techniques. In implementing such a system, there is a choice between a centralized system where a single web server is used by all parties, versus a decentralized system where information is routed by email. Email systems tend to work better with loosely described workflows and loosely coupled participants, such as ad hoc collaborations between enterprises where neither organization is willing or able to carry out all functions on a web server managed by the other party.

WSEmail supports the management of workflows using routed forms, attachments that are sent to particular people or roles in a specified order. Routed forms use the client-side plug-ins to create a rich user interface. The forms are designed to look similar to their paper counterparts. We created prototypes for time sheets (Figure 6) and requisition forms, including an interface to enter the required data and rules that specify which roles in the organization must “sign off” on the form to have it approved. A sample workflow scenario of a form which must be passed through a chain of recipients to receive final approval is shown in Figure 11.

The sender follows the steps described above in the message sending section. In our prototype the forms are much smarter than their paper counterparts. They can, for example, provide basic spreadsheet-like functionality or automatically populate data using the user’s federated token and a secure web service query to a human resources database. To address the security of the workflow, each user has a unique X.509 certificate with the certificate’s common name (CN) as the user’s email address. A person in the workflow signs off on the form by attaching a digital signature to the XML. The message thereby acquires an approval list that can be verified and audited by a third party. In particular, the verifier can use X.509 certificates to check that the data has not been tampered with and can authenticate the approval of each member in the workflow.

As an extension of the workflow model, users can delegate their responsibilities. Delegation is done by a user providing the name or names of people who can sign off on a form instead of him. This adds a powerful automation feature. Using server-side plug-ins, a form can be received by a program which makes decisions about delegation given the current approval list and data contained within the form. A common business process that could use such a feature is a requisition form. For example, a department may be allowed to buy items under a certain fixed price, but if the total price is greater than a certain amount, additional approval by a member of the purchasing department might be required. A requisition workflow program could easily detect this condition and expedite the purchase process by automatically forwarding the form to people who can approve the purchase or to people to whom they have delegated their responsibility.
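The requisition rule in this example can be written as a small decision function. The sketch below is a hypothetical Python rendering, not the prototype's server-side plug-in; the function name, its parameters, and the representation of approvals and delegations as plain names are assumptions.

```python
def forward_targets(total_price, limit, approvals, purchasing_members, delegations):
    """Return the people a requisition should be forwarded to, if any.

    approvals: names that have already signed the form
    delegations: mapping from a person to the people who may sign instead
    """
    if total_price <= limit:
        return []  # the department may approve the purchase on its own
    eligible = set()
    for member in purchasing_members:
        eligible.add(member)
        eligible.update(delegations.get(member, []))
    if eligible & set(approvals):
        return []  # purchasing (or a delegate) has already signed off
    return sorted(eligible)
```

For instance, `forward_targets(500, 100, [], ["pat"], {"pat": ["sam"]})` yields both the purchasing member and her delegate, while the same call with `"sam"` already in the approval list yields nothing further to do.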

Figure 11: Sample Workflow Scenario

A workflow in our system can send its result to another program. The receiving program could then validate the entire form and perform the required operations (e.g. file an order, perform database manipulations). With an increasing number of online retailers exposing their order processes as web services, it becomes possible to automate a larger number of business functions end-to-end within a common application such as WSEmail.

Because WSEmail also functions as a decentralized system, a workflow form can extend across multiple enterprises. All of the enterprises in the workflow would need to have an agreement to trust a common certification authority (CA) or cross-trust each other, but that is the only additional configuration that is needed. With that setup, they can use routed forms in WSEmail to create multi-enterprise processes. There are many applications that could use such a setup such as negotiating prices with a supplier or gathering approvals for press releases from interdependent corporations. Since the data is contained within the email message, the question of who hosts the data and applications for the exchange is eliminated. WSEmail simply sends the data wherever it is required.

4.3 On-Demand Attachments

Attaching files to email has long been a simple and convenient way to send files to a group. However, there are a variety of problems with email attachments that WSEmail sought to improve. Except for certain proprietary email systems, there is no version control system for email attachments, which usually results in many message resends so that each person gets the newest version of a file. Second, the way attachments are bundled in POP3 requires users to download the entire message and attachments, even if the user only wants to read the message (a problem solved in IMAP4 (IETF RFC 3501) and critical to bandwidth and power limited devices such as smartphones). A common solution is to post the attachment on a secured website and just send a link in the message as various cloud providers allow. Unfortunately, this creates an administrative and security headache since attachments are stored on third-party servers and senders must set up access control rules and authentication on an external server or for recipients who are not in their administrative domain.

Figure 12: On Demand Attachments Protocol

WSEmail solves the problem by introducing the concept of “on-demand” attachments. Simply, message attachments are handled as a plug-in to the WSEmail base protocols. The plug-in implements both the IExtensionProcessor and ISendingProcessor interfaces. The client creates a message that contains information about the attachment such as its size, description and a SHA1 hash in addition to the normal fields a message contains. The request to the server contains the message as normal, along with the attachments in DIME format. As the message is received by the server, the request is intercepted by the “on-demand” plug-in using the ISendingProcessor interface. The plug-in can gain access to the DIME attachments in the request. The attachments are stripped from the request and saved to a database along with a list of all the recipients. Globally Unique Identifiers (GUIDs) are generated and injected into the original message such that each GUID relates to one of the DIME attachments. Delivery of the message now continues normally.

A user who receives a copy of the message will see that a file is attached, but will not have a copy of it yet. The user has only the GUID for the file and the location from which it can be obtained. If the user decides to retrieve the attachment, she first acquires a federated identity token. The token is presented, along with the GUID, through the IExtensionProcessor interface to the server that originally stripped the attachment. The server verifies the authenticity of the token and checks that the user it identifies is permitted access to the GUID. If there is a match, the server sends the attachment back to the requestor as a DIME attachment. An overview of the protocol is shown in Figure 12.
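The retrieval check described above can be sketched as follows. This is an illustrative Python outline only: token verification is reduced to a caller-supplied `verify_token` function (a hypothetical stand-in for federated identity validation), and `store` is an assumed in-memory mapping from GUID to attachment bytes and permitted recipients.

```python
import hashlib


class AccessDenied(Exception):
    """Raised when the token is invalid or the user may not read the GUID."""


def fetch_attachment(store: dict, guid: str, token: str, verify_token) -> bytes:
    """Return the attachment only if the token's user is a listed recipient."""
    user = verify_token(token)  # assumed to return a user identity or None
    if user is None:
        raise AccessDenied("token could not be verified")
    entry = store.get(guid)
    if entry is None or user not in entry["recipients"]:
        raise AccessDenied("user is not permitted to access this GUID")
    return entry["data"]


def matches_hash(data: bytes, expected_sha1: str) -> bool:
    """Client side: compare the download against the SHA-1 in the message."""
    return hashlib.sha1(data).hexdigest() == expected_sha1
```

Checking the downloaded bytes against the hash carried in the original message lets the recipient detect an attachment that was altered after the message was sent.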

We specified the on-demand attachments protocol formally using the TulaFale [4] specification language, which has constructs for public key signatures and salted password authentication. The TulaFale script compiles to a script that is verifiable with the ProVerif protocol verifier of Bruno Blanchet (version 1.11) [7]. With this we were able to prove the following correspondence theorem for on-demand attachments: if a receiving client (RC) retrieves an on-demand attachment with SC (sending client) as its return address, then SC sent the attachment. Details of the proof and its construction can be found in Lux et al. [17] and at http://www2.kinneret.ac.il/mjmay/wsemail/.

5 Experiments

Web services are often criticized for being slow based on their design and existing implementation platforms. Security and flexibility also provide a performance challenge. Hence a secure, flexible implementation of messaging based on web services raises concerns about performance. We implemented a prototype for WSEmail as a way to address these concerns at the same time as illustrating the benefits of flexibility. In order to evaluate the efficiency of our messaging system, we built a test bed to stress test our implementation’s application and protocols. In this section we describe the implementation, the test bed, and our experiments.

We simulated a real world email environment where many users share a common email server. Users may exchange messages with other users within the local domain or external domains. Users may also interact with their personal inboxes to view and delete messages. For our test we defined four standard email operations: send, list, retrieve, and delete. These operations are discussed in detail below.

All of our code can be downloaded from https://github.com/lux-k/wsemail.

5.1 Implementation

Our WSEmail prototype runs on Windows server and client systems. Version 1.0 was implemented over the .NET framework version 1.1 and relies on Web Services Enhancements (WSE) 1.0, CAPICOM 2.00, SQL Server 2000 (to store messages for the server), and IIS 5.0. The current version consists of 68 interfaces and 343 classes organized into 30 projects (see Appendix B for a UML model illustrating the design). Most of the software is C# .NET managed code created with Microsoft Visual Studio. Unmanaged code was needed to gain access to the lower-level DNS functions necessary to query for SRV records. Our instant messaging system also uses a TLS package from Mentalis (mentalis.org), since the .NET 1.1 platform does not provide native support for server TLS sockets. In November 2004 we upgraded WSEmail to version 1.1 in order to get WS-Policy support from Microsoft WSE 2.0. This was challenging because primitive functions from WSE 1.0 that we needed for our WSEmail 1.0 implementation were removed from the WSE 2.0 package, forcing us to use both WSE 1.0 and WSE 2.0 to implement WSEmail 1.1.

WSEmail uses DNS SRV records (IETF RFC 2782) to determine routing. This makes it possible to run WSEmail over other protocols without changing the way DNS is queried, and we can exploit the priority and weight attributes in the records. These properties of the SRV record allow for future enhancement and present day configuration that is extremely similar to the way SMTP is deployed now.
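To illustrate how the priority and weight attributes can drive routing, the following Python sketch orders a set of SRV records in the spirit of RFC 2782: lower priority values are tried first, and within one priority the order is drawn at random in proportion to weight. It is a simplified illustration, not the prototype's routing code; the `SrvRecord` type, the host names, and the port are hypothetical.

```python
import random
from collections import namedtuple

SrvRecord = namedtuple("SrvRecord", "priority weight port target")


def order_srv_targets(records, rng=random):
    """Return records in the order a client would try them."""
    ordered = []
    for priority in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == priority]
        while group:
            total = sum(r.weight for r in group)
            if total == 0:
                # All remaining weights are zero: pick uniformly.
                pick = rng.randrange(len(group))
            else:
                # Weighted pick: walk the running sum until it reaches
                # a random threshold in [0, total].
                threshold = rng.uniform(0, total)
                running = 0
                for i, rec in enumerate(group):
                    running += rec.weight
                    if rec.weight > 0 and running >= threshold:
                        pick = i
                        break
            ordered.append(group.pop(pick))
    return ordered
```

With two servers at priority 10 and one at priority 20, the priority-20 server is only tried after both priority-10 servers, which mirrors how MX preferences are commonly used for backup mail servers.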

5.2 Test Bed

Our test bed consisted of a total of four client machines, two mail servers (designated as local and external), one test coordinator and one database/DNS server. The arrangement of the test bed is depicted in Figure 13.

Figure 13: Testbed Architecture

The test clients (labeled T1 through T4) all performed operations by sending requests to the “local” email server. The test client actions were coordinated by the test coordinator. There was also a second server, which acted as both an “external” email server and a load generator for the “local” system. The database/DNS server hosted a message storage database and DNS records for the local and external servers. The clients all had Pentium 4 2.8GHz processors with 512MB of memory and the Windows XP Pro operating system. They performed four different operations during the test execution: send a message to a recipient, list the headers of messages in the client’s inbox, retrieve a particular message, and delete a particular message. We explored various ways to include a mixture of applications with these basic operations but found it difficult to isolate performance issues clearly in doing this, so we restricted our focus to a demonstration of the basic operations.

The test coordinator was responsible for distributing the test specifications, starting the test, and receiving the results from each client. The coordinator broadcast its network address, instructing all clients to connect to it and download the test specifications file. The clients then waited for the coordinator to announce the start of the test, after which the clients executed requests to the local server in compliance with the specifications they had downloaded. After each client finished, the latencies for each request were reported back to the coordinator.

The test specifications document described exactly what each client was to do. It indicated whether the client should authenticate using a username token (user name and password) or X.509 certificate. It also specified how many messages were to be sent from each client, to whom they were to be sent, and the size of the message body. The specification document also indicated the total number of requests that should be sent and the ratios of the four types of requests.

The local server was the focus of our test. It accepted incoming messages from the clients and from the external server. It performed the necessary authentication and forwarded external messages to the appropriate destination after performing DNS resolution. If the destination was local (for example, the recipient is on the local server), then the message was stored in the database. If the destination was external, the message was forwarded to the external server. We allowed the local and external servers to share a database and DNS server since these were not performance bottlenecks in the system.

The external server played two roles in our test bed. First, it imitated the entire external client list, so that all emails directed to any external client were forwarded to it. On reception of a message addressed to one of the clients that it simulated, it did not save the message to the database server. This was done to prevent the local server from experiencing extra latency due to the external server’s database transaction. Rather, it performed the required certificate checking to verify authenticity and then discarded the message. Second, it acted as a load generator and sent one message per second addressed to each of the four clients, T1 through T4. These messages were all received by the local server, authenticated, and stored in the database.

5.3 Procedure and Results

The test coordinator provided a test specification document that instructed each client to run one execution thread sending 2,000 requests to the local server. The clients chose among the send, list, retrieve, and delete operations with a 25% chance each. In cases where the delete operation was to be performed on an unpopulated inbox, it was considered a no-op and not counted towards the results. To avoid this condition, each client’s inbox was primed with about six messages. To get the most out of each send event, each message was addressed to both a randomly chosen local client and an external client. The clients were all instructed to authenticate to the local server using username token authentication. The local and external servers authenticated to each other using X.509 certificate signing. The duration of the test was 1826 seconds.
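The request mix can be illustrated with a small simulation. This Python sketch is not the test client itself: it only models how operations are drawn uniformly from the four types and how a delete on an empty inbox is skipped without being counted. The `arrival_prob` parameter, which models incoming mail between requests, and the seed are assumptions introduced for the example.

```python
import random

OPERATIONS = ("send", "list", "retrieve", "delete")


def simulate_client(total_requests=2000, primed_inbox=6,
                    arrival_prob=0.9, seed=0):
    """Count the operations one client would issue under a uniform mix."""
    rng = random.Random(seed)
    inbox = primed_inbox
    counts = dict.fromkeys(OPERATIONS, 0)
    while sum(counts.values()) < total_requests:
        # Hypothetical model of mail arriving from the load generator
        # and from other clients between requests.
        if rng.random() < arrival_prob:
            inbox += 1
        op = rng.choice(OPERATIONS)  # each operation has a 25% chance
        if op == "delete":
            if inbox == 0:
                continue  # no-op on an empty inbox, not counted
            inbox -= 1
        counts[op] += 1
    return counts
```

Because the draw is random, the per-operation counts land near, but not exactly on, 500 per client, which is consistent with the slightly uneven request counts reported for the experiment.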

In order to get a client-side view of the efficiency of the system, we measured the latency of each request. A timer was started as the client contacted the local server with a request and stopped after the client received the appropriate response (e.g., inbox listing, message received confirmation). The time difference between the client’s request and the server’s complete response was the latency of the operation. The results of this calculation point to an average of 0.284 seconds per request with a variance of 0.1389 seconds. The minimum and the maximum latencies were 46.876 ms and 4.0 seconds respectively. Note that the “Message Received” confirmation does not mean that the message was delivered to the ultimate recipient, just that the message was placed in the delivery queue.
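The client-side measurement can be sketched in a few lines of Python. The helper names `timed_request` and `summarize` are hypothetical, and the use of the population variance is an assumption, since the text does not say which variance estimator was used.

```python
import statistics
import time


def timed_request(send_request):
    """Run one request and return (response, latency in seconds)."""
    start = time.perf_counter()
    response = send_request()  # blocks until the full response arrives
    return response, time.perf_counter() - start


def summarize(latencies):
    """Aggregate per-request latencies into the reported statistics."""
    return {
        "mean": statistics.mean(latencies),
        "variance": statistics.pvariance(latencies),  # assumed estimator
        "min": min(latencies),
        "max": max(latencies),
    }
```

A client would call `timed_request` once per operation, collect the latencies in a list, and report `summarize(latencies)` to the coordinator at the end of the run.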

Table 1 shows the volume of data sent, in MB, broken down by request type (send, list, retrieve, and delete). The total data sent from the clients to the local server is 36.18 MB, and from the local server to all the clients is 369.35 MB.

| Operation | Send | List | Retrieve | Delete |
| # of requests | 1970 | 2024 | 2026 | 1980 |
| % of all requests | 24.6 | 25.3 | 25.4 | 24.7 |
| Client → Server Data (MB) | 10.74 | 8.42 | 8.62 | 8.4 |
| Server → Client Data (MB) | 12.31 | 324.55 | 20.41 | 12.08 |

Table 1: Bytes sent between clients and the local server
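As a quick consistency check, the totals quoted in the text can be recomputed from the per-operation figures in Table 1. The short Python snippet below only re-adds the published numbers; the dictionary layout and key names are chosen for the example.

```python
# Per-operation figures copied from Table 1.
table1 = {
    "send":     {"requests": 1970, "to_server_mb": 10.74, "to_client_mb": 12.31},
    "list":     {"requests": 2024, "to_server_mb": 8.42,  "to_client_mb": 324.55},
    "retrieve": {"requests": 2026, "to_server_mb": 8.62,  "to_client_mb": 20.41},
    "delete":   {"requests": 1980, "to_server_mb": 8.4,   "to_client_mb": 12.08},
}

total_requests = sum(row["requests"] for row in table1.values())
to_server_total = round(sum(row["to_server_mb"] for row in table1.values()), 2)
to_client_total = round(sum(row["to_client_mb"] for row in table1.values()), 2)

print(total_requests, to_server_total, to_client_total)
```

The request total matches four clients issuing 2,000 requests each, and the list operation dominates the server-to-client volume because every listing returns the headers of the whole inbox.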

Since each message is also sent to an external client, each send action also sends a message from the local server to the external server. The data are measured according to the representation in Table 2. The total volume of data exchanged from the local server to the external server is 30.95 MB, and from the external server to the local server is 30.69 MB.

Server Name # messages Sent (MB) Received Confirmations (MB)
1970 2024 2026
24.6 25.3 25.4
Table 2: Bytes sent between the local and external servers

The entire test bed data transfer was recorded using the Ethereal network monitor, which was run on two of the test bed machines. The TCP/IP sessions were reconstructed using tcpflow (https://github.com/simsong/tcpflow) and post-processed with Perl and awk. Since the external server, acting as a load generator, sent one message per second, 1826 messages were sent from the external server to the local server over the duration of the test. The corresponding byte count represents the messages that were sent and the notification messages that were received.

5.4 Analysis

A best-case test of SMTP, with no load on the server or network and no contention for resources, yielded an average latency of 0.170 seconds to send a message of about the same size as the WSEmail messages we sent in our experiment. The average difference in latency between WSEmail (0.284 seconds) and the SMTP test is therefore 0.114 seconds, which accounts for the additional overhead of the XML parsing and cryptography. In that short time span a large number of operations took place: one secret key signature, one private key signature verification, two public key signatures, and one public key signature verification. Since the entire system uses XML, we conclude that performance is not a barrier to secure web services in this type of application. The extra latency would likely be unnoticeable in a typical client/server environment.

XML and XMLDSIG do have a drawback in their verbosity. Our test bed sent 1 KB mail messages which ballooned to 10 KB responses to the retrieve message action in order to make XMLDSIG work. At least 30% of those bytes were the Base64 encoded representations of the certificates used for signing messages. After the certificate size, the WS-Security structures were also a significant amount of overhead, accounting for about 30% of the bytes transferred. WSEmail might need to explore ways to distribute certificates so that they are not replicated excessively. It might also be useful to examine how messages are signed to minimize their verbosity.

Our experiments bode well for web service efficiency, especially for high volume messaging. Extending our experimental results, we find that WSEmail is theoretically capable of handling approximately 1787 messages a minute (combination of incoming and outgoing). We looked for published benchmarks to compare this against and found that the University of Wisconsin-Parkside had a peak usage of 1716 (total of incoming and outgoing) messages per minute over a year, meaning it should be possible for a single WSEmail server similar to our test system to routinely handle the normal load at that institution.

6 Related Work and Similar Systems

Work related to WSEmail can be divided into two general areas: the analysis of web service security and improved Internet messaging systems.

Email Improvements

Improvements to the SMTP messaging system have often been motivated by two, sometimes overlapping, goals: strong message authentication and spam prevention. PGP offers authentication tools that include public/private key signing and encryption. Privacy Enhanced Mail (PEM) (IETF RFCs 1421-1424) has mechanisms for privacy, integrity, source authentication, and nonrepudiation using public and private key encryption and end-to-end encryption techniques. Zhou et al. [21] use formal tools to verify the properties of the PEM system. Abadi et al. [1] use a trusted third party to achieve message and source authentication and formally prove correctness of their protocol.

Changes to the SMTP system aimed at spam reduction include DomainKeys Identified Mail (DKIM) (IETF RFC 6376), which uses public key cryptography and an option for server-signed (rather than client-signed) messages, and Petmail (http://petmail.lothar.com/). Petmail uses the GPG encryption utility for public key encryption and signing of messages. Users are identified by IDRecords, self-signed binary blobs that include public key, identity, and message routing information. Petmail agents can enforce IDRecord whitelists and policies for contact from first-time senders. First-time senders may be forced to obtain tickets from a third-party Ticket Server, which may perform checks to ensure that the sender is a human (using CAPTCHA reverse Turing tests). Messages can be encapsulated and sent using SMTP, Jabber, or some other queuing transport protocol. Patterns and options for sender anonymity are offered as well. Our most recent work on extensions of WSEmail shows how to do several of these things and more based on WS-Policy negotiations and our dynamic plug-in capability.

Web Service Security

Regarding web services security analyses, the Samoa project at Microsoft Research developed important fundamentals, including a formal semantics for proving web services authentication theorems [5] and the TulaFale language for automating web service security protocol proofs [4]. Based on their ideas and others, we have performed followup work based on WSEmail, including adaptive middleware messaging policy systems [2], attribute-based messaging [8], and a secure alert messaging protocol [11].

6.1 Instant Messaging Systems

Several vendor-specific all-in-one internet messaging systems have been developed recently, including Google Talk, Skype, WeChat, Slack, and WhatsApp. As we noted above, all provide security at the expense of openness. Some allow extensions and integrated apps, but only via a centralized service.

GMail and Hangouts

Google’s GMail platform evolved from an email system to include a chatting service called Google Talk in 2005 and a unified platform called Hangouts in 2013. The Talk application has similarities to WSEmail in that it integrates IM with email messages (allowing conversion between the two) and integrates with other chat protocols such as Jabber. Talk differs from WSEmail in that it uses hop-by-hop encryption instead of end-to-end encryption [13]. It is also primarily client-server based (via Google), although it will route calls in a peer-to-peer manner if possible [12]. Talk and Hangouts use proprietary communication protocols, so the platforms are not amenable to third-party extensions.

Skype

Skype integrates voice and chat into one app. Chat messages sent to offline users are sent like emails - stored on the server and delivered to the target at next login. Skype’s security model is similar to WSEmail’s secure chat architecture in that it uses public key encrypted messages and challenges for authentication, establishes a shared key, and then uses the shared key to create a secure end-to-end channel. It uses proprietary protocols and does not offer integration with other chat or voice communication tools [6].

WeChat

WeChat is a popular chat service that provides instant messaging and chatting services. Its communication protocol is proprietary, but forensic analyses and protocol analyses have found that its communication protocols are server based and encrypted using a custom combination of a fixed RSA key and derived AES keys [14, 18]. WeChat allows for integration of miniprograms within its tool via its centralized servers.

Slack

Slack’s security protocols are based on TLS 1.2, SHA2, and AES [19]. The details of the protocols are proprietary, but black box testing and protocol analysis have shown them to be server based with no peer-to-peer or direct connections [15]. In contrast to other closed systems, Slack enables the introduction of bots, software agents which listen to conversations and data and act based on them.

OTR

The Off the Record (OTR) [9] protocol introduces a mechanism for secret, authenticated, low-latency communication which preserves the ability of participants to repudiate their messages later. OTR has been implemented on a number of operating systems and in secure chat tools such as Cryptocat.

WhatsApp

The WhatsApp client authentication protocol has undergone significant changes in the past three years. The pre-2016 WhatsApp client authentication steps (see Karpisek et al. [16] and Anglano [3]) are as follows. At installation time, a shared password was generated for the user account, stored in /data/data/com.whatsapp/files/pw on the device, and transferred to the WhatsApp servers. Login then consisted of the following steps:

  1. At log in, the client sends an initial auth message with its client number and the method it wants to use for authentication. The message isn’t encrypted. The message includes information about the client software version and capabilities.

  2. The server responds with some parameters which include a nonce.

  3. The client and server use the shared password and the nonce to generate four keys using PBKDF2: one each for server encryption, server integrity, client encryption, and client integrity.
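The key derivation in step 3 can be illustrated with Python's standard `hashlib.pbkdf2_hmac`. This is a sketch of the general technique only, not WhatsApp's actual procedure: the hash function, iteration count, key length, the use of the nonce as the PBKDF2 salt, and the order in which the output is split into four keys are all assumptions made for the example.

```python
import hashlib

KEY_NAMES = (
    "server_encryption",
    "server_integrity",
    "client_encryption",
    "client_integrity",
)


def derive_session_keys(password: bytes, nonce: bytes,
                        iterations: int = 16, key_len: int = 20) -> dict:
    """Derive four session keys from the shared password and server nonce."""
    # One PBKDF2 call produces enough output for all four keys;
    # the nonce plays the role of the salt (an assumption).
    material = hashlib.pbkdf2_hmac(
        "sha1", password, nonce, iterations, dklen=len(KEY_NAMES) * key_len
    )
    return {
        name: material[i * key_len:(i + 1) * key_len]
        for i, name in enumerate(KEY_NAMES)
    }
```

Because both sides hold the password and see the same nonce, they arrive at the same four keys without sending any key material over the network.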

Since 2015, WhatsApp has used the Signal protocol, an end-to-end encryption scheme based on an elliptic curve public/private key pair generated at install time [20]. It uses the Signal ratcheting protocol for instant messaging and voice communication [10]. WhatsApp is a closed source system that does not enable extensions or integration with third-party clients.

7 Conclusion

We have explored WSEmail, the development of email functions as a family of web services, by developing a prototype system based on an architecture that emphasizes flexibility, security, and integration. We have shown that WSEmail is amenable to the addition of new protocols and the formal analysis of these protocols. We have also shown that the basic WSEmail functions have satisfactory performance. In ongoing work, we are exploring several directions such as: new applications that exploit improved integration between web-like data retrieval functions and the messaging system; challenges to interoperability with a Java implementation of the MUA; and ways to express and negotiate messaging policies. For widespread use, WSEmail faces substantial problems with standardization and interoperability with SMTP, which may be mitigated by writing more plug-ins like our SMTP-compatible relay agent. However, it is well-suited to some high-security applications even now, offers ideas in exploring the general design space for Internet messaging, and can rely on the standardization advantages of XML as an aid to addressing interoperability challenges. We also aim to support WSEmail on diverse platforms. A project of Heo, Patel, and Shah was partially successful in doing this for a Java WSEmail client based on Sun’s JWSDP 1.4 with X.509 security.

Acknowledgements

This work was supported by a gift from Microsoft University Relations, NSF grants CCR02-08996 and EIA00-88028, ONR grant N000014-02-1-0715, and ARO grant DAAD-19-01-1-0473. We are grateful for discussions of WSEmail that we had with Martin Abadi, Raja Afandi, Noam Arzt, Karthikeyan Bhargavan, Luca Cardelli, Dan Fay, Eric Freudenthal, Cedric Fournet, Andy Gordon, Ari Hershl Gordon-Schlosberg, Munawar Hafiz, Jin Heo, Himanshu Khurana, Ralph Johnson, Bjorn Knutsson, Jay Patel, Neelay Shah, Kaijun Tan, and Jianqing Zhang. We are also grateful to Bruno Blanchet for technical support.

References

  • [1] M. Abadi, N. Glew, B. Horne, and B. Pinkas (2002) Certified email with a light on-line trusted third party: design and implementation. In Proceedings of the Eleventh International Conference on World Wide Web, pp. 387–395. ISBN 1-58113-449-5.
  • [2] R. Afandi, J. Zhang, and C. A. Gunter (2006) AMPol-Q: adaptive middleware policy to support QoS. In Proceedings of the 4th International Conference on Service-Oriented Computing, ICSOC’06, Berlin, Heidelberg, pp. 165–178. ISBN 3-540-68147-7, 978-3-540-68147-2.
  • [3] C. Anglano (2014) Forensic analysis of WhatsApp messenger on Android smartphones. Digital Investigation 11 (3), pp. 201–213. Special Issue: Embedded Forensics. ISSN 1742-2876.
  • [4] K. Bhargavan, C. Fournet, A. D. Gordon, and R. Pucella (2003) TulaFale: a security tool for web services. In International Symposium on Formal Methods for Components and Objects (FMCO’03), LNCS.
  • [5] K. Bhargavan, C. Fournet, and A. Gordon (2004) A semantics for web services authentication. In Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, New York, NY, pp. 198–209. ISBN 1-58113-729-X.
  • [6] P. Biondi and F. Desclaux (2006-03) Silver needle in the Skype. In BlackHat Europe 2006.
  • [7] B. Blanchet (2001) An efficient cryptographic protocol verifier based on Prolog rules. In Proceedings of the 14th IEEE Workshop on Computer Security Foundations, pp. 82.
  • [8] R. Bobba, O. Fatemieh, F. Khan, C. A. Gunter, and H. Khurana (2006-12) Using attribute-based access control to enable attribute-based messaging. In 2006 22nd Annual Computer Security Applications Conference (ACSAC’06), pp. 403–413. ISSN 1063-9527.
  • [9] N. Borisov, I. Goldberg, and E. Brewer (2004) Off-the-record communication, or, why not to use PGP. In Proceedings of the 2004 ACM Workshop on Privacy in the Electronic Society, pp. 77–84. ISBN 1-58113-968-3.
  • [10] K. Cohn-Gordon, C. Cremers, B. Dowling, L. Garratt, and D. Stebila (2016) A formal security analysis of the Signal messaging protocol. Cryptology ePrint Archive, Report 2016/1013. eprint.iacr.org/2016/1013
  • [11] F. Gioachin, R. Shankesi, M. J. May, C. A. Gunter, and W. Shin (2007-07) Emergency alerts as RSS feeds with interdomain authorization. In Second International Conference on Internet Monitoring and Protection (ICIMP 2007), pp. 13–13.
  • [12] Google (2018) Peer-to-peer calling in Hangouts. Hangouts Help [Online], last accessed 22 Nov 2018. https://support.google.com/hangouts/answer/6334301?hl=en
  • [13] GoogleTalkGuide (21 Nov 2006) Can my gtalk discussion be tracked?. Google Talk Help Discussion Archive [Online]. groups.google.com/group/Calls-Chats-and-Voicemail/browse_thread/thread/431d561bf7d6f7d6/e49343f783a06a1e
  • [14] Q. Huang, P. P. C. Lee, C. He, J. Qian, and C. He (2015-06) Fine-grained dissection of WeChat in cellular networks. In 2015 IEEE 23rd International Symposium on Quality of Service (IWQoS), pp. 309–318.
  • [15] Y. Iwase (24 Mar 2016) Is Slack’s WebRTC really slacking?. webrtcH4cKS. https://webrtchacks.com/slack-webrtc-slacking/
  • [16] F. Karpisek, I. Baggili, and F. Breitinger (2015) WhatsApp network forensics: decrypting and understanding the WhatsApp call signaling messages. Digital Investigation 15, pp. 110–118. Special Issue: Big Data and Intelligent Data Analysis. ISSN 1742-2876.
  • [17] K. D. Lux, M. J. May, N. L. Bhattad, and C. A. Gunter (2005-07) WSEmail: secure internet messaging based on web services. In 2005 IEEE International Conference on Web Services (ICWS), Orlando, FL, USA.
  • [18] R. Paleari (17 Sept 2013) A look at WeChat security. [Online] Emaze S.p.A. blog.emaze.net/2013/09/a-look-at-wechat-security.html
  • [19] Slack (31 Jan 2017) Security white paper: Slack’s approach to security.
  • [20] WhatsApp (16 Nov 2016) WhatsApp encryption overview. Technical White Paper, WhatsApp. www.whatsapp.com/security/WhatsApp-Security-Whitepaper.pdf
  • [21] D. Zhou, J. C. Kuo, S. Older, and S. K. Chin (1999-01) Formal development of secure email. In Proceedings of the 32nd Annual Hawaii International Conference on Systems Sciences.

Appendix A Client Screen Shots

Figures 14 and 15 are screen shots from the WSEmail client program.

Figure 14: Inbox screen
Figure 15: Instant Messaging screen

Appendix B UML Class Diagrams

Figures 17 and 16 show the UML class diagrams for the server-side implementation and the plug-in architecture, respectively.

Figure 16: Server architecture with plugins
Figure 17: Server implementation class diagram