1 Related Work
The presentation of multiple flows on a map is a classic problem in cartography and geographical visualisation. Here, we discuss the three broad visualisation approaches outlined in our introduction.
Flow Map Approaches. The earliest known flow map was created by Henry Drury Harness in 1837 to show rail usage. Shortly after, Charles Joseph Minard popularised their use with sophisticated depictions of emigration and trade. In 1981 Tobler produced and tested the first computer-generated node-link flow maps. Each flow was presented as a straight-line arrow connecting its origin and destination, with arrow thickness proportional to its quantity. Unfortunately, visual clutter and line crossings are inevitable even in small datasets. The term “flow map” has also been used in a very literal sense to depict flows in rivers (with a single source and single destination) on geographic maps.
tests the limits of scalability of traditional flow maps with straight-line arrows, trying to make aggregate flow information visible through overlaid density maps. Whilst these show overall trends, aggregation and vector fields lose potentially important information about individual flows.
Another approach is to bundle links together. Several elegant and sophisticated ‘bundling’ strategies have been proposed for flows from a single source where a simple hierarchy is possible, and algorithms have been developed for single-source flows that achieve aesthetic branching properties [21, 33]. While these bundling strategies are well suited to maps presenting flows from one location to many, their application to many-to-many flows is limited. To create readable node-link flow maps for our study we adapted a bundling method originally intended for network visualisation that is capable of handling many-to-many flows, see Section 3.1.
Interaction is another way to overcome such visual clutter. A recent system described by van den Elzen and van Wijk provided interactive filtering and aggregation to restrict the set of origins and destinations to something manageable with an otherwise conventional flow-map representation. Obviously, in printed or public displays such interaction is unavailable, yet even with interaction each individual view should ideally be as informative and unambiguous as possible with respect to the underlying data. Thus, our primary focus in this paper is on the design of flow representations that are as readable as possible from a single view. However, even the best possible design has limits to its scalability and so we consider interaction for novel flow representations in Section 5.
OD Matrix Based Approaches. Adjacency matrix representations of flow networks are called OD matrices. These present flow using a table where rows and columns represent origin and destination locations and each cell indicates the quantity of movement from one location to another. The original OD matrix dates from 1955. More generally, adjacency matrices have long been useful for presenting a network of relationships in a compact and structured format where reordering of rows or columns can reveal patterns. A user study by Ghoniem found that adjacency matrices perform better than node-link diagrams for quickly reading adjacencies. Whilst the original OD matrices were purely numerical, colour shading using a heatmap approach is often used to encode the size of the flow.
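As an illustration of the table structure described above, the following minimal sketch builds an OD matrix from a list of flow records and derives total in/out flows from its rows and columns. The function names and the toy data are hypothetical and are not taken from the study.

```python
def build_od_matrix(flows, locations):
    """Return matrix[origin][destination] = summed quantity of movement."""
    matrix = {o: {d: 0 for d in locations} for o in locations}
    for origin, destination, quantity in flows:
        matrix[origin][destination] += quantity
    return matrix


def total_outflow(matrix, origin):
    """Sum of one row: everything leaving `origin`."""
    return sum(matrix[origin].values())


def total_inflow(matrix, destination):
    """Sum of one column: everything arriving at `destination`."""
    return sum(row[destination] for row in matrix.values())


# Hypothetical toy data (not from the study).
locations = ["A", "B", "C"]
flows = [("A", "B", 10), ("A", "C", 5), ("B", "A", 7), ("C", "A", 2)]
od = build_od_matrix(flows, locations)
```

Row sums give total outflow and column sums give total inflow, which are the two quantities the visualisations in this paper encode alongside the individual cells.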
One major drawback of the classic OD matrix is that it lacks a mapping from OD locations to geographical positions. The identification of geographically related rows, columns or cells can be difficult and so spatial patterns in the dataset can be hard to determine. Marble et al. attempt to preserve the spatial properties of the OD locations by reordering columns and rows by approximate spatial position, but only limited spatial information is retained due to dimension reduction.
OD maps attempt to overcome this limitation through a nested small-multiples design, see Fig. 1. They provide schematic geographical information by dividing the canvas into a regular grid based on the actual geographical locations on the map using a spatial treemap structure. A second level of spatial treemaps is embedded within the first to present the OD information using colour shading. To aid readability, some cells of the grid may be left blank to indicate the outline shape of the country, e.g. Fig. 1(b). Like the OD matrix, spatial locations in OD Maps are presented as squares. As all locations have similarly sized cells, this allows data for small, highly populated areas to be seen at the same level of detail as more sparsely populated larger regions, e.g. in Fig. 1 compare Berlin (BE) to Brandenburg (BB).
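The nesting of one grid inside another can be sketched as a simple index computation. This is an illustrative assumption rather than the OD map authors' implementation: it takes every location to occupy one cell of a regular grid, uses the outer grid for origins and the inner grid for destinations, and the function name `od_map_cell` is hypothetical.

```python
def od_map_cell(origin_cell, destination_cell, grid_rows, grid_cols):
    """Locate the (origin, destination) flow on the nested canvas.

    Each cell of the outer grid_rows x grid_cols grid holds a complete
    copy of the same grid, so the canvas has grid_rows**2 rows and
    grid_cols**2 columns of small cells.
    """
    o_row, o_col = origin_cell
    d_row, d_col = destination_cell
    return (o_row * grid_rows + d_row, o_col * grid_cols + d_col)


# Hypothetical 2 x 3 grid: origin in outer cell (1, 2), destination in (0, 1).
cell = od_map_cell((1, 2), (0, 1), grid_rows=2, grid_cols=3)
```

Swapping the roles of the two arguments would give the reverse layout, in which the outer grid indexes destinations.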
Whilst spatial treemaps are currently being tested for their performance in a number of tasks, OD maps have yet to be evaluated in a quantitative user study. Less formal studies show they are useful for presenting geographical commodity flows to data experts [18, 38], but OD maps have not yet been tested on a wider audience or compared with other visualisations. We would expect OD maps to best suit countries with similar width and height such as Ireland, Germany or Australia, while they may be less suitable for countries with elongated proportions such as Japan or New Zealand where the distortion of map location to grid location may cause cognitive difficulties.
Other Approaches. There have been a number of recent examples which combine other visualisations with maps to present flow data. VIS-STAMP, for instance, presents a matrix of small multiple maps in their approximate spatial location with linked views including parallel plots.
Flowstrates connects a temporal heatmap with two maps presenting the geographical locations of origin and destination and shows how flow changes over time. The resulting visual representation is superficially similar to the MapTrix visualisation presented in Section 2. However, while Flowstrates does present OD data, it is not designed to present a complete OD matrix as we do in MapTrix. In Flowstrates each row corresponds to a single flow as it uses columns for the temporal scale. In contrast, a single MapTrix cell corresponds to a single flow. Thus to show all flows between $M$ sources and $N$ destinations Flowstrates requires $O(MN)$ leader lines but MapTrix only $O(M+N)$ leader lines. Furthermore, it is possible to avoid leader line crossings with MapTrix but not with Flowstrates for larger multi-way flows.
2 Design and Implementation of MapTrix
Our novel flow visualisation, MapTrix, is intended to show quantitative multi-source flow data together with its associated geographical information. It has three main components: an origin map, a destination map, and an OD matrix with a single line connecting each origin and destination to the corresponding matrix row or column.
2.1 Design of the Visual Representation
Our first attempt to connect the OD matrix to the two maps ordered rows and columns by their map locations’ $y$- and $x$-coordinate, respectively, and used straight-line leaders to connect map locations to their corresponding matrix row or column. Unfortunately, this resulted in many leader line crossings, making it very difficult to link rows and columns with locations.
Our second attempt led to the design shown in Fig. 2(a) which ensures that the connection between maps and matrix is clear, easy to track and unambiguous. To achieve this we solved a so-called boundary labelling problem which finds an ordering for the matrix rows and columns that permits leaders to connect map locations without crossings. There are various models for aesthetic boundary labelling for different situations [2, 3, 4]. Our design uses a one-sided boundary labelling model to generate crossing-free connections with a horizontal and a diagonal segment between points in the figure and labels at one side of the figure. We introduce a novel leader adjustment algorithm (Sec. 2.2) to space the leader lines more evenly.
Our design uses colour shading (“YlOrRd” continuous scale from colorbrewer) to show magnitude of flow between states. Geographical locations’ total in/out flows are indicated by proportional-sized circles in the map, Fig. 2. Choropleth maps were also investigated, but as the scale of the total and single flows could be very different, multiple colour schemes would be needed. In addition to the proportional circles, bar charts were added to help the reader to follow the line (e.g. from large circle to large bar) between map and matrix and to emphasise total in and out flows.
The design is also well suited to showing flow between different countries, as shown in Fig. 2(b). However, for showing flow within a single country the asymmetrical ordering of rows and columns in the OD matrix can be confusing. A consistent ordering for the rows and columns is critical for revealing patterns.
Our final design—shown in Fig. 3—permits the same ordering to be used for rows and columns by rotating the OD matrix. The destination map is placed under the origin map and the OD matrix is rotated to allow both symmetric ordering and crossing-free leader lines to the maps. An additional advantage of the rotated matrix is that the labels are easier to read. To utilise the additional space and aid leader line connection, the bar charts showing total in/out flow are centred on the leader-lines. We also add total inflow and outflow to both bar charts, differentiated by colour, to allow net flow to be easily determined. Instead of the large arrow, we use the darker bar charts to indicate direction of flow.
2.2 Algorithm for Leader Line Placement
When connecting sites in the maps with rows and columns in the OD matrix we would like: (1) connection lines to be crossing-free; (2) adjacent connection lines to be clearly separated; and (3) clear separation between lines and map locations (sites) to avoid ambiguity.
Figure: leader line problems. (a) Line through site. (b) Lines too close.
We associate penalties with close leader-line segments and displacement of connection points from their initial position. We define hard linear constraints to preserve the ordering of leaders and keep the connection points inside their state boundaries.
Input to the Bekos et al. one-sided boundary labelling is a connection site for each map location, typically at the centre of a region. The output is a label ordering permitting crossing-free connection to map locations. That is, for $n$ sites we have an ordering $s_1, \dots, s_n$ such that the leaders for sites $s_i$ and $s_{i+1}$ are adjacent and crossing free. There are two types of leaders: those with diagonals pointing upward from the sites and those with downward diagonals.
The quadratic program to reposition connection sites to achieve good leader separation is as follows. Let $(x_i, y_i)$ be variables for leader connection coordinates, with initial position $(x_i^0, y_i^0)$. The first set of goal terms penalise displacement of connection sites from their initial position:
$$P_{\mathrm{disp}} = \sum_{i=1}^{n} \left[ (x_i - x_i^0)^2 + (y_i - y_i^0)^2 \right]$$
Inside each state boundary we find a rectangle in which the connection site can be safely positioned. Ideally, in order to maximise freedom in placing the connection site, this should be a rectangle with maximal width and height centered around the initial site position. We use a simple heuristic to find such a rectangle. We start from the initial
site to cross another leader line and introduce a crossing as shown in the “Line through site” panel. Thus, we prune the rectangle to ensure the site remains a minimum distance from all other leader lines ().
Constraints keep the leader connections inside the (pruned) rectangle boundaries:
$$l_i \le x_i \le r_i, \qquad b_i \le y_i \le t_i \qquad (1)$$
where $[l_i, r_i] \times [b_i, t_i]$ is the pruned rectangle for site $s_i$. The diagonal of each leader is identified by its intercept:
$$c_i = y_i - g_i x_i \qquad (2)$$
where $g_i = +g$ for upward and $g_i = -g$ for downward diagonals, and $g$ is the gradient of the leader diagonals. Since $g$ is constant the relationship is linear. We introduce another variable $c_i$ to our quadratic program for each $i$ and the above relation between $x_i$, $y_i$ and $c_i$ is added as a hard constraint. The constraint:
$$c_i + m \le c_{i+1} \qquad (3)$$
with $m$ a minimum separation, preserves the ordering of parallel leader lines ensuring they remain crossing free. A final set of penalty terms encourages equal separation between adjacent leader lines:
$$P_{\mathrm{sep}} = \sum_{i=1}^{n-1} \left( c_{i+1} - c_i - d \right)^2$$
where $d$ is the maximum initial separation between adjacent leader diagonals output by the Bekos et al. algorithm.
The full quadratic goal is $P_{\mathrm{disp}} + k\,P_{\mathrm{sep}}$ where the weight $k$ can be varied to trade off displacement of connection sites and equal separation of leader diagonals. To obtain connection sites with good separation we minimise this goal subject to the linear constraints of Equ. 1, 2 and 3. Since the number of variables and constraints is linear in the number of input regions, solving this quadratic program with a standard solver is very fast. Placement of hundreds of connection sites takes a fraction of a second on a standard computer.
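The structure of this quadratic program can be illustrated with a small sketch that evaluates the goal and checks the hard constraints for a candidate placement. It is not a solver and not the authors' implementation. All names (`intercept`, `goal`, `feasible`, `g`, `m`, `d`, `k`) are assumptions made for the example, all diagonals are taken to be parallel, and the difference between the intercepts $y - g x$ of adjacent diagonals is used as a proxy for their separation.

```python
def intercept(site, g):
    """Intercept y - g*x of the diagonal of gradient g through `site`."""
    x, y = site
    return y - g * x


def goal(sites, initial, g, d, k):
    """Displacement penalty plus k times the equal-separation penalty."""
    displacement = sum((x - x0) ** 2 + (y - y0) ** 2
                       for (x, y), (x0, y0) in zip(sites, initial))
    c = [intercept(s, g) for s in sites]
    separation = sum((c[i + 1] - c[i] - d) ** 2 for i in range(len(c) - 1))
    return displacement + k * separation


def feasible(sites, rects, g, m):
    """Check the rectangle bounds and the ordering of adjacent diagonals."""
    inside = all(xmin <= x <= xmax and ymin <= y <= ymax
                 for (x, y), (xmin, ymin, xmax, ymax) in zip(sites, rects))
    c = [intercept(s, g) for s in sites]
    ordered = all(c[i] + m <= c[i + 1] for i in range(len(c) - 1))
    return inside and ordered


# Hypothetical example: three sites, diagonals of gradient 1.
initial = [(0.0, 0.0), (0.0, 1.0), (0.0, 4.0)]
moved = [(0.0, 0.0), (0.0, 2.0), (0.0, 4.0)]
rects = [(-1.0, -1.0, 1.0, 1.0), (-1.0, 0.0, 1.0, 3.0), (-1.0, 3.0, 1.0, 5.0)]
before = goal(initial, initial, g=1.0, d=3.0, k=1.0)
after = goal(moved, initial, g=1.0, d=3.0, k=1.0)
```

In this toy case moving the middle site by one unit lowers the goal because the gain in even spacing outweighs the displacement penalty. Any general-purpose quadratic programming solver could then be used to minimise `goal` subject to `feasible`.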
3 Study 1
We conducted an on-line user study to evaluate MapTrix and to compare it with two alternative visualisation methods: a flow map using bundling and the OD map by Wood et al. We chose these methods because flow maps are the most common visualisation for showing flow while OD maps are an alternative approach to enhance the OD matrix with a geographic embedding.
We aimed to test the usability of the three methods with respect to various tasks as described in Sec. 3.3. We consider user preferences as well as task performance in terms of response time and accuracy. This first study considers only static representations. We begin to consider basic interactions in Study 2, Sec. 4.
3.1 Bundled Flow Map Design & Implementation
We searched for a flow map design solution which could minimise data occlusion by reducing overlap without removing individual flows such as through flow aggregation. There are a couple of recent edge bundling methods that are able to neatly offset individual edges within bundles [6, 24]. We adopt the method by Pupyrev et al. which groups edges on shared paths that are centred between obstacles. It then neatly offsets the curves so that all are visible and uses a heuristic to minimise crossings as lines join and leave the bundles. To demonstrate the reduction of line overlap, Fig. 4 shows straight and bundled arrows.
Table 1. Task categories, definitions and example questions.
| Group | Task | Description | Example question |
|---|---|---|---|
| Total Flow | TFI | Identify two total in/out flows for two named locations and compare their magnitude. | Comparing the two locations QLD and TAS, which has the greater total inflow? |
| | TFS | Search for the largest/smallest total in/out flow. | Which state has the largest total outflow? |
| Single Flow | SFI | Identify two single flows between named locations and compare their magnitude. | For the two flows from WA to ACT and TAS to SA, which is greater? |
| | SFSo | Search for the greatest single in/out flow for one named location. | ACT receives the largest single flow from which state? |
| | SFSm | Search for the largest single flow across all (many) locations. | Which is the largest single flow? |
| Regional Flow | RF | Identify if the flow is predominantly within the regions or among the regions. | Using the regions A and B defined in the above map, is the flow predominantly within A or B? |
We investigated the use of colour together with arrow size to indicate magnitude of flow. We found that due to line occlusion around arrowheads the flow direction was often difficult to determine. Following Holten et al., we therefore decided to encode line direction using a colour gradient. The darker section of the line shows inflow direction, while the lighter section depicts outflow as shown in Fig. 4(c). The continuous blues colour scheme from colorbrewer is used. We used a different colour scheme to the MapTrix flow data to limit confusion. The key aspect of using a continuous gradient from source to target is that the directionality of the line can be understood at any part of the line, so the reader does not need to follow the line to find an arrowhead. Note that, since we use line width to encode flow magnitude, the tapered line representation advanced by Holten et al. would not work in this situation.
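One simple way to realise such a source-to-target gradient is to cut each line into short segments and colour every segment by linear interpolation between a light and a dark colour. The sketch below assumes straight lines and plain RGB interpolation. The function names and the two colour values are placeholders and do not reproduce the colorbrewer scheme or the rendering used for the study.

```python
def lerp_colour(start_rgb, end_rgb, t):
    """Linearly interpolate between two RGB triples for t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(start_rgb, end_rgb))


def gradient_segments(origin, destination, light_rgb, dark_rgb, n_segments):
    """Split a line into segments coloured light (origin) to dark (destination)."""
    (x0, y0), (x1, y1) = origin, destination
    segments = []
    for i in range(n_segments):
        t0, t1 = i / n_segments, (i + 1) / n_segments
        start = (x0 + (x1 - x0) * t0, y0 + (y1 - y0) * t0)
        end = (x0 + (x1 - x0) * t1, y0 + (y1 - y0) * t1)
        # Colour each segment by the gradient value at its midpoint.
        colour = lerp_colour(light_rgb, dark_rgb, (t0 + t1) / 2)
        segments.append((start, end, colour))
    return segments


# Placeholder light and dark blues (assumed values).
LIGHT, DARK = (200, 200, 255), (0, 0, 100)
segs = gradient_segments((0.0, 0.0), (4.0, 0.0), LIGHT, DARK, n_segments=4)
```

Because every segment carries its own shade, a reader looking at any part of the line can tell which end is the destination without tracing it to an endpoint.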
To embed the information of total in/out flows we use proportional circle sizes. Unlike MapTrix where two maps are available, the bundled map has only one location point. We therefore replaced the solid black circle for each location as shown in Fig. 4(b) with two half circles as shown in Fig. 4(c). The left half circle in black indicates total inflow, while the right half circle in grey shows total outflow.
3.2 OD Map Design & Implementation
OD Maps preserve the geographical aspects of OD matrices without including lines or arrows and introducing occlusion. Having discussed OD Map implementation with the authors [18, 36, 37] we manually created grid layouts for the necessary countries to ensure the grid structure was as intuitive and as similar to the country shape as possible, as shown for Germany in Fig. 1. We used the same colour scheme as shown in the MapTrix matrix for the flow data and slightly modified the Wood et al. OD map design  to include a proportional circle at the associated origin or destination cell of the small multiple to show the total in/out flow for each location. We also show both the OD map for outflows and the reverse ‘DO’ map for inflows, to allow for two way comparison. Fig. Many-to-Many Geographically-Embedded Flow Visualisation: An Evaluation shows the dual OD/DO Map visualisation shown in the study for Australia.
3.3 Apparatus & Materials
In order to test how the visualisations perform for different numbers of locations we investigated their use for different countries. We decided to use real rather than fictional countries and locations to implicitly emphasise the use of such visualisations for common commodity flows such as population migration. This also allowed us to explore the possible impact of prior knowledge of geography on performance.
Tasks We identified a variety of tasks that commodity flow visualisations should support by reviewing the geographical visualisation literature [1, 26, 31]. For single and total flows we are mainly interested in flows from given target location(s), or in identifying which location(s) correspond to a given characteristic. These tend to be lookup or comparison tasks which may refer to identifying and comparing total flow (TF) values of 1, 2 or many locations, or single flows (SF) between 2 or many locations. A further important task involves determining the geographical or regional distribution of the flow (RF). This involves identifying if flow is predominantly within a certain area on the map or between two different areas. We organised the questions of the study into the following six task categories: TFI, TFS, SFI, SFSo, SFSm and RF. These are defined along with examples of exact questions in Table 1.
Countries and Datasets To represent actual commodity flow data, we created synthetic datasets based on real internal population migration data. The first country we chose was Australia (AU) as it has a large spacious country shape with relatively few federal states (and territories). With 8 states there are only $8 \times 7 = 56$ between-state flows to present. The original dataset for AU is based on 2013-14 internal migration for AU (http://stat.abs.gov.au//Index.aspx?QueryId=1233).
To investigate a larger number of flows, Germany (DE) was chosen as a comparison as it again has a large and spacious shape but double the number of federal states and therefore many more individual flows. For DE, between-state migration data was not openly available so we allocated data from USA internal migration data from 2009-10 (https://www.census.gov/hhes/migration/data/acs/state-to-state.html).
A third country, New Zealand (NZ), was chosen to allow us to investigate the effect of country shape. It has the same number of national states as DE but is more elongated. The original NZ dataset is based on regional migration from 2001–06 (http://www.stats.govt.nz/browse_for_stats/population/Migration).
During our pilot sessions we also investigated countries with larger number of locations, including the United States of America (US) with flows. This number of flows was found to be too confusing and difficult for users, particularly for the bundled flow map design. We therefore removed US from the first study (but used it in the second study Sec. 4).
In order to train the participants we introduced the problem and explained the visualisations using the United Kingdom. It has a distinguishable country shape and only four national states (in this case countries) and therefore only $4 \times 3 = 12$ flows. The training is provided as supplementary material.
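The number of flows each visualisation has to show grows quadratically with the number of locations. As a small worked example, and assuming that flows are directed and that flows from a location to itself are not counted, $n$ locations give $n(n-1)$ flows. The function name is hypothetical.

```python
def between_location_flows(n_locations):
    """Number of directed flows between distinct locations."""
    return n_locations * (n_locations - 1)


# 4, 8 and 16 locations, e.g. four UK countries or eight Australian states.
counts = {n: between_location_flows(n) for n in (4, 8, 16)}
```

Doubling the number of locations from 8 to 16 therefore more than quadruples the number of flows, which is why the larger datasets are so much harder to read with line-based representations.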
For each case we minimise the effect of the data on the study results by randomising the source and destination of the original dataset in order to ensure each question has different data and participants must read the data for every task. For Task RF (see Table 1) we ensured that data was different but that the spatial pattern remained. We therefore ensured that the data for analysing flow between/within regions had a definite answer. Sometimes there was a near second best answer. We experimented with treating these Almost Correct responses as correct and as incorrect: this had virtually no impact on the analysis results. In the analysis presented here we give them half points.
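The two ideas in this paragraph can be sketched as follows. The first function assumes that randomising means applying one random permutation to the location labels of the OD matrix, which is only one possible reading of the procedure. The second scores responses with half a point for an almost correct answer. All names and the toy data are hypothetical.

```python
import random


def shuffle_locations(matrix, locations, seed):
    """Relabel origins and destinations with one random permutation."""
    rng = random.Random(seed)
    shuffled = locations[:]
    rng.shuffle(shuffled)
    relabel = dict(zip(locations, shuffled))
    return {relabel[o]: {relabel[d]: matrix[o][d] for d in locations}
            for o in locations}


# Half points for an almost correct response; anything else scores zero.
SCORES = {"correct": 1.0, "almost correct": 0.5}


def accuracy(responses):
    """Mean score over a list of response categories."""
    return sum(SCORES.get(r, 0.0) for r in responses) / len(responses)


# Hypothetical toy data (not from the study).
toy = {"A": {"A": 0, "B": 1}, "B": {"A": 2, "B": 0}}
randomised = shuffle_locations(toy, ["A", "B"], seed=1)
score = accuracy(["correct", "almost correct", "incorrect", "correct"])
```

Using the same permutation for rows and columns keeps the set of flow values unchanged while changing which named locations they belong to.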
Procedure The structure of the study was slightly amended following the pilot study as the study took too long with all three countries. As all tasks were shown to be important to the analysis we chose to split the countries so each participant was asked questions about only one pair of countries (AU-NZ; AU-DE; NZ-DE). The choice of country pair was counterbalanced. After receiving information about the study through the explanatory statement and agreeing to the consent form the study took the following structure:
Background knowledge: participants were asked about their prior experience using maps – rarely use, navigation only or often use maps to read statistical information – and their knowledge of the administrative structure of their pair of countries;
Training: participants were given an overview of the problem and an explanation of each of the visualisation methods. Upon finishing the training for each method the participant was shown two sample questions with the answers and explanation. They were then asked to answer another two questions to verify that they understood the method. The training order was counterbalanced.
Tasks: participants were asked to answer 36 questions: one for each kind of task for each of the three visualisation methods and each of the two countries. Question order was randomised.
Ranking and Feedback: participants were asked to rank the three visualisation methods in terms of visual design and in terms of effectiveness of reading information for each of the two countries that had been shown. They also had the opportunity to comment on the strengths and weaknesses of each visualisation method.
Participants To attract a range of skill-levels amongst participants the study was advertised at Monash University (Australia) using a university-wide bulletin and through email lists at Microsoft Research (USA), HafenCity University (Germany) and two international map visualisation lists of GeoVis and CogVis. Three $50 gift cards were offered as an incentive, where participants could optionally provide their contact details and be placed in the prize draw.
In total we had 62 complete responses, with an equal split of country pairs – 20 AU-DE, 21 AU-NZ and 21 NZ-DE. Of these 2 participants were excluded from the final analysis due to the exceedingly quick completion time of 5m and an average task time of 8s. Upon analysis we also trimmed 1% of response times – those over 300s/5m – this removed large outliers ranging from 305s to 3352s. On average the 60 participants spent 39s per task and the entire online study took an average of 51m:52s to complete.
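The trimming step described above amounts to a simple cutoff filter. A minimal sketch, with hypothetical function names and made-up response times:

```python
def trim_response_times(times_s, cutoff_s=300):
    """Keep response times that do not exceed the cutoff (in seconds)."""
    return [t for t in times_s if t <= cutoff_s]


def trimmed_fraction(times_s, cutoff_s=300):
    """Fraction of responses removed by the cutoff."""
    removed = len(times_s) - len(trim_response_times(times_s, cutoff_s))
    return removed / len(times_s)


# Hypothetical response times in seconds (not study data).
times = [12.0, 39.0, 300.0, 305.0, 3352.0]
kept = trim_response_times(times)
```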
Statistical Analysis Methods We consider response time and accuracy for each question. We investigate the effect of the three conditions of visualisation (Vis) (these are abbreviated to BD for Bundled Flow Map, OD for OD Map, MT for MapTrix in this section), country and task, and to what degree these conditions differ significantly.
In our analysis we treat all conditions as being independent. Although question order was randomised, we validated task independence by plotting results against question order. No clear pattern was evident.
To compare error rates between different conditions we use standard non-parametric statistics: for multiple (more than 2) conditions, we use Friedman’s ANOVA to check for significance and apply post hoc tests with Bonferroni correction to compare groups, while for two conditions we use the Wilcoxon signed-rank test. Both tests require the same participants in all conditions, so when comparing across countries (3 conditions) we could not directly use Friedman’s ANOVA as participants only completed the study for two countries. Instead we split the results into 3 groups, one for each pair of countries, and used a Wilcoxon signed-rank test for each group.
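For readers unfamiliar with the paired test mentioned here, the sketch below computes the Wilcoxon signed-rank statistic for two matched samples and a Bonferroni-adjusted significance level. It is a textbook illustration, not the analysis code used for the study: zero differences are dropped, ties receive average ranks, and no p-value is computed. In practice a statistics package would be used.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving ties the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_statistic(sample_a, sample_b):
    """Smaller of the positive and negative signed-rank sums."""
    diffs = [a - b for a, b in zip(sample_a, sample_b) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus)


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical per-participant scores under two conditions.
cond_a = [1.0, 0.5, 1.0, 0.0, 1.0]
cond_b = [0.5, 0.5, 0.0, 1.0, 0.5]
w = wilcoxon_statistic(cond_a, cond_b)
```

With three visualisation methods there are three pairwise comparisons, so a nominal level of 0.05 becomes `bonferroni_alpha(0.05, 3)` per comparison.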
To compare response time we consider only times for correct and almost correct responses. To test for significance we use a multilevel model for analysing mixed design experiments. Here we break down the analysis by each condition and their interactions (Vis/Country, Vis/Task, Country/Task and Vis/Country/Task).
For the user preference results we again use Friedman’s ANOVA and post hoc tests to test for significance.
3.4 Results
Error Rate Responses were in four categories of accuracy: , , and , Fig. 5. We see notable differences in the performance of BD compared to OD and MT, in particular for the SF tasks for the two larger datasets (DE and NZ). All vis methods perform well in the TF tasks, especially TFI. The RF task also shows a far lower accuracy across all vis methods (A in Fig. 5).
Our smallest dataset (AU) consistently out-performs DE and NZ in almost all tasks for all vis methods. There is one notable exception (see highlight B: Fig. 5): BD performs far worse for SFSm with 13% correct + 23% almost correct, compared to 90%+5% for OD and 85%+5% for MT. Statistical significance is shown between BD:OD and between BD:MT (both ). No statistical significance is evident between OD:MT.
The other two countries DE and NZ have the same number of flows. There are some similarities and notable differences when comparing the two sets of results. Most notably, BD is less accurate for SF tasks, see C, D, E in Fig. 5. For all SF tasks using DE and NZ, Wilcoxon signed-rank tests show statistical significance between BD:OD (SFI: , both SFSo and SFSm ), and between BD:MT ( values: SFI , SFSo and SFSm ).
For SFS(o/m), compared to BD not only does response rate improve using OD and MT (all for OD:BD & MT:BD in SFS(o/m) ), but the ability to differentiate the dominant answer (i.e. correct rather than almost correct) is far higher, see D and E in Fig. 5.
For TF tasks we observe more similarity between methods. All vis perform well particularly for TFI, with BD performing slightly worse for DE. For TFS we see some differences between vis methods with OD and MT performing better than BD, but no statistical significance is found.
For RF tasks, not only do we see a difference in performance between all tasks for all vis, but BD performed notably worse than OD and MT (See Fig. 5 A). Friedman’s ANOVA (details in Sec. 3.3) for DE reveals a statistical significance between BD:OD () and between BD:MT (), the same for NZ; between BD:OD () and between BD:MT (). Similar percentages are reported for both DE and NZ, with 34 and 35% for BD, 56 and 63% for OD, and 61 and 63% for MT. No statistical significance is again found between OD and MT.
Response Time We extract the results of all Correct and Almost Correct responses (1689 timed responses) from all 2160 responses and plot these for all conditions, as shown in Fig. 6. These box plots, together with a multilevel model analysis, reveal:
For DE, task SFI takes increasingly longer from BD to OD to MT (i.e. MT > OD > BD; see F in Fig. 6). This is shown to be statistically significant;
For DE, the trend for SFSm is the opposite (i.e. MT < OD < BD); see G in Fig. 6. Again, this is statistically significant;
Although their accuracy is higher, OD and MT took notably more time on RF than BD, with MT taking longer than OD. Correct responses have a wider range for NZ. No statistical significance is found.
Finally, as the size of the dataset increases from AU to DE/NZ we see increasing response time for all tasks, especially for SF tasks (multilevel comparison: DE > AU and NZ > AU; see Fig. 6 H). NZ often takes longer than DE.
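The selection of timed responses described above can be sketched as follows. This is a minimal illustration under assumed names: the record fields (`country`, `task`, `vis`, `outcome`, `seconds`) and the outcome labels are hypothetical, as are the sample records, and the median is used only as a simple per-condition summary in place of the box plots.

```python
from collections import defaultdict
from statistics import median

# Outcomes whose response times are analysed (labels are assumptions).
TIMED_OUTCOMES = {"correct", "almost correct"}


def timed_responses(responses):
    """Keep only the responses that count towards the response time analysis."""
    return [r for r in responses if r["outcome"] in TIMED_OUTCOMES]


def median_time_by_condition(responses):
    """Median response time per (country, task, vis) condition."""
    groups = defaultdict(list)
    for r in timed_responses(responses):
        groups[(r["country"], r["task"], r["vis"])].append(r["seconds"])
    return {condition: median(times) for condition, times in groups.items()}


# Hypothetical records, not study data.
sample = [
    {"country": "DE", "task": "SFI", "vis": "BD", "outcome": "correct", "seconds": 20.0},
    {"country": "DE", "task": "SFI", "vis": "BD", "outcome": "wrong", "seconds": 5.0},
    {"country": "DE", "task": "SFI", "vis": "BD", "outcome": "almost correct", "seconds": 30.0},
    {"country": "DE", "task": "SFI", "vis": "MT", "outcome": "correct", "seconds": 41.0},
    {"country": "DE", "task": "SFI", "vis": "MT", "outcome": "too difficult", "seconds": 60.0},
]

medians = median_time_by_condition(sample)
print(len(timed_responses(sample)), medians)
```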
BD was intuitive and familiar: “it is good for anyone with geographic knowledge and spatial cognition”. However, participants noted that arrows overlap and are too long, that the visualisation does not help when there are many locations, and that it was hard to use for the RF task: “Too many locations means many arrows, they occlude, it’s hard to see which is which” and “Hard to follow arrows over long distances or through intersections… impossible to answer the between or within regions questions”.
OD was easy to comprehend and participants often liked the geographical layout. Others found it good for comparison and easy to read the flows. Some also commented on the novelty: “It is creative and clear”. Yet, it was also seen as the most unfamiliar and sometimes difficult to comprehend: “Arrows are missing, I was confused to identify inflow and outflow”. The small square sizes were also frustrating: “the visualisation (grids) can become rather small and more difficult to interpret”.
MT was visually attractive, easy for larger flows and intuitive: “clear, it is easy to quantify the flows”. Yet, some found it confusing or unfamiliar. A few commented that it looked complex: “It may look complicated but it is the best visualization for information extraction”. Others found it difficult to follow the lines or read the labels in the matrix, especially with more locations: “When dataset is large, it becomes difficult to follow the flow”. Some also commented on there being too much information and there being redundant visual elements (e.g. bars and circles).
Summary These results reveal that:
AU is the fastest and best performing of all countries. All vis methods are suitable for such small datasets, with the exception of BD for the SFSm task;
Error rate worsens with scaling data from AU to NZ/DE, especially for BD for all the SF tasks, where OD and MT out-perform BD with statistical significance;
SFI takes the longest of the SF tasks and on average has the highest error rate;
The RF task takes the longest and has the highest error rate compared to all other tasks. All vis methods performed poorly;
OD and MT show no significant differences in performance across all conditions;
Participants prefer MT for design and readability of information.
A central design goal of OD and MT is to overcome the problem of occlusion of flows as data increases. For the larger datasets (DE and NZ) both OD and MT were significantly better than BD, but there is no significant difference between the two for any condition. User ranking indicates a preference for MT. The fact that BD performs worse is unsurprising given the known problem of overlapping flows; however, the remarkably similar performance of OD and MT is unexpected. We now examine to what extent these methods scale and whether the similarities in task performance continue with increasing scale.
4 Redesign and Study 2
In this section we concentrate on the scalability of MT and OD. Our second study followed the same structure and participant recruitment method as the first, but we changed the countries investigated and improved the tasks and visual designs.
Data To investigate larger countries than NZ and DE we wanted to use the United States of America (US), as our previous pilot revealed that participants got frustrated with BD for US, but less so with OD or MT. We also chose China (CN), as its number of flows is almost half way between the two. For CN the original data set is available for internal migration from 2005-10 (http://www.stats.gov.cn/tjsj/pcsj/rkpc/6rp/indexch.htm). For US we use 2009-10 internal migration data (https://www.census.gov/hhes/migration/data/acs/state-to-state.html). Again we randomised the locations of the data for each question.
Tasks In the first study, the RF task was found to be extremely difficult. However, when considering flow in a geographical context it is important to be able to easily compare multiple groups of locations. In the description of tasks below, we define a region as being a collection of locations on the map that are geographically contiguous (adjacent). Due to the design choices of both OD and MT the marks corresponding to flows for such regions in the map may not be adjacent in the visualisation.
For more detailed comparison, we divide the RF task from Study 1 into subtasks related to the adjacency of regions in the visualisation and whether the flow is occurring “between” regions or “within” a region. The six subtasks are labelled: RFBb, RFBw, RFBn, RFWb, RFWw, RFWn. These codes are explained as follows (examples are provided as supplementary material):
Assume two regions A & B, each consisting of multiple contiguous locations. Are the flows predominantly
Between: from A to B or from B to A?
Within: within A or within B?
Different adjacency conditions for the visuals of regions and locations:
between: the locations within each region are adjacent in vis, and so are the two regions;
within: only the locations within each region are adjacent in vis;
none: neither the locations nor the regions are adjacent in vis.
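The subtask codes and the quantities a participant has to compare can be sketched as below. This is an illustration only: the helper names, the dictionary-based flow representation and the example flows are assumptions made here, not part of the study materials.

```python
from itertools import product

# First letter: kind of regional flow; second letter: adjacency condition.
FLOW_KINDS = {"B": "between regions", "W": "within a region"}
ADJACENCY = {"b": "locations and regions adjacent in vis",
             "w": "only locations within each region adjacent in vis",
             "n": "nothing adjacent in vis"}


def rf_subtask_codes():
    """Enumerate the six RF subtask codes in the order used in the text."""
    return ["RF" + kind + adj for kind, adj in product(FLOW_KINDS, ADJACENCY)]


def regional_totals(flows, region_a, region_b):
    """Sum flows from A to B, from B to A, within A and within B.

    `flows` maps (origin, destination) pairs to flow magnitudes.
    """
    a, b = set(region_a), set(region_b)
    totals = {"A->B": 0, "B->A": 0, "within A": 0, "within B": 0}
    for (origin, dest), value in flows.items():
        if origin == dest:
            continue  # ignore any self-flows
        if origin in a and dest in b:
            totals["A->B"] += value
        elif origin in b and dest in a:
            totals["B->A"] += value
        elif origin in a and dest in a:
            totals["within A"] += value
        elif origin in b and dest in b:
            totals["within B"] += value
    return totals


# Hypothetical flows between four locations; A = {p, q}, B = {r, s}.
example_flows = {("p", "q"): 5, ("q", "p"): 3, ("p", "r"): 10, ("q", "s"): 4,
                 ("r", "p"): 2, ("s", "q"): 1, ("r", "s"): 7, ("s", "r"): 6}

codes = rf_subtask_codes()
totals = regional_totals(example_flows, ["p", "q"], ["r", "s"])
print(codes)
print(totals)
```

For a "Between" question the participant compares the first two totals; for a "Within" question, the last two. The adjacency condition does not change these totals, only how far apart the corresponding marks sit in the visualisation.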
For each question we manually identified appropriate regions for the task for each visualisation method to ensure comparability. These were combined with the same tasks as the first study. In total, participants were asked to answer (Country) questions.
Visualisation Redesign Feedback and suggestions from the first study, together with modifications to better support large datasets, led to design improvements for both methods, as explained below.
MT, shown in Fig. 7:
Removed TF barcharts to allow more lines, improve tracking and reduce redundancy of information;
Scaled line thickness, and the grey shading of lines and labels, in proportion to TF circle size to aid line tracking;
Added separation lines within matrix every 5 rows / columns to aid user tracking;
Minimised overlapping of circles and labels in maps;
Removed full names in matrix. All labels refer to abbreviations;
Removed the arrows in the destination maps to give more space for labels and circles; instead, we used a destination icon next to the map label.
OD (examples are provided as supplementary material):
Removed white space to increase grid square size;
Moved and enlarged legend to improve lookups and to allow more space for grid;
Extra care with grid layout to ensure neighbouring regions were adjacent and limited white space – the downside being that the country is more abstract;
Increased text size and added label shading to relate to TF and match to the proportional labels in MT;
Added destination icons to indicate direction to match new icons in MT, with multiple arrows in/out compared to only one for MT.
Pilot Test and Highlighting The first study indicated that the RF task was the most difficult and time consuming across all vis techniques. Our redesign of the RF question to investigate adjacency was intended to investigate this task in more detail; however, pilot testing revealed difficulties.
The RF tasks, although now possible to answer, still took considerable time and were a particular cause of frustration. One participant took over 1h:30m to complete the pilot, with the majority of this time spent manually connecting flows or identifying the squares for the regional tasks.
To continue to investigate scalability and to allow us to determine whether one visualisation out-performs the other for the aggregation of flows we opted to aid the users in finding the right locations by highlighting them on the OD Map or MapTrix. Our assumption is that such simple highlighting is easily made available with interaction. We eventually implemented this, see Section 5. Subsequent pilots revealed much more satisfied users and much faster completion time.
To encourage participants to think over their answers, instead of showing the “Too difficult” option straight away we revealed it after 1 minute for every question.
The study had 46 valid responses from an original 49 (3 with impossibly quick responses were excluded). On average, individual task completion time was 31.74s and the entire study took 45m:12s. We present the results for error rate and response time in Fig. 8 and Fig. 9. For the response time analysis, we took the 1861 correct and almost correct responses from 2024 total responses.
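The reported counts can be cross-checked with a few lines of arithmetic. The participant and response counts come from the paragraph above; the factorisation into visualisation methods, countries and tasks is an assumption that happens to be consistent with those counts, not a design stated explicitly in the text.

```python
# Figures reported for Study 2.
valid_participants = 46
total_responses = 2024
timed_responses = 1861  # correct and almost correct responses

questions_each, remainder = divmod(total_responses, valid_participants)
timed_share = round(100 * timed_responses / total_responses, 1)

# Assumed factorisation of the questions per participant:
vis_methods = 2   # OD and MT
countries = 2     # CN and US
tasks = 5 + 6     # TFI, TFS, SFI, SFSo, SFSm plus the six RF subtasks
assumed_questions = vis_methods * countries * tasks

print(questions_each, remainder, timed_share, assumed_questions)
```

Under these assumptions each participant answered 44 questions, and roughly 92% of all responses entered the response time analysis.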
Error Rate Fig. 8 shows remarkably similar results across all conditions. No differences are evident in the RF tasks, which all performed very well. Some differences are evident in Fig. 8 between the vis methods for the SF and TFS tasks, but these are not consistent between countries and are not statistically significant. Considering the increase in data flows, it is surprising to see that the results often show an improvement for US over CN. To investigate whether task performance improved with country knowledge, we compare the results for those who claimed good knowledge of the states of US (12 participants) or CN (11) to those who claimed little to no knowledge of the states. As expected, identification tasks (TFI and SFI) increase in speed as well as accuracy for those with knowledge of US; however, only SFI shows an increase for CN. Perhaps the US map is better known than participants realise, or perhaps it is better suited to these designs. Feedback from one of the pilot participants suggested that the block shapes of US states helped identification.
For the SF tasks, OD (82%) outperformed MT (62%) in SFI for CN, with a slight improvement for both methods for US. For CN, OD (82%) also outperformed MT (65%) in SFSm, with improvements for both for US. This is the only task where one method outperforms the other for both data sets. For CN, MT (91%) slightly outperformed OD (85%) in SFSo, but for US accuracy decreases for MT and increases for OD.
The final notable difference is for TFS, where for US OD (98%) outperformed MT (74%), but for CN results are relatively similar for both vis methods.
Response Time In our response time analysis, Fig. 9 shows all conditions. Some notable and statistically significant differences are evident:
In CN for SFI, MT took longer than OD (see I in Fig. 9), reflecting the increased difficulty indicated through the higher error rate;
In CN for SFSm, OD took longer than MT (see J in Fig. 9), despite a lower error rate;
With OD, SFSm took longer than SFSo;
In general, SFI took longer than all other tasks. This is statistically significant for OD in US and for MT in both US and CN.
RF is significantly quicker than the rest.
RFB[bwn] take significantly longer than RFW[bwn]; see K in Fig. 9.
MT is quicker than OD for RF, but not significantly.
User Preferences and Feedback
Removing the 6 participants who had participated in the first study from the results we see the rankings for readability of OD increase to 65%, whilst design remains the same.
The qualitative analysis of the feedback quotes again reveals quite conflicting preferences:
OD was seen as easy to link locations with visualisations. Some participants found it easy to compare single flows: “OD is easy to find the flow from one location to another without losing your place” and to find the location names. Many found the visual elements (grids, cells, circles, labels) far too small. Some disliked the abstract geography: “losing some geographical reference make locations confusing, given prior map knowledge”, whilst others recognised that although it can be difficult at first, you can learn the representation: “you would quickly learn their locations”.
MT was found to be familiar because it has real maps and relates to the geographical locations. For some the matrix display was also familiar: “Closer to familiar matrix display. The way of connecting the maps on the left with the rows and columns of the matrix works well.” Many participants commented on the difficulty of finding locations, e.g. “It was too dense with the labels too small to identify the place.” Some commented that it was difficult to trace the leader lines and that there was a need for a marker: “Sometimes I had to use a ruler to find the intersection.”
In general, many participants requested interaction, such as highlighting and selecting. Some noted that locations need to be easier to find in MT, i.e. through reordering the matrix or by allowing text based searching. A few participants also commented that the RF task would be near impossible without the highlighting.
Summary The results are consistent with the previous study in that both OD and MT perform similarly. We demonstrate that both methods can scale to data sets as large as US (51 locations). However, the identification of individual and in particular regional flows became much more difficult and time-consuming. SFI takes considerably longer to complete.
RF took too long, and therefore we aided users with highlighting to simulate possible interaction. Our results using highlighting for all RF tasks are promising for both methods. No clear differences are evident when the adjacency of the regions differ and although the RFB tasks did take longer than the RFW tasks, the accuracy remains very good and response time is relatively low for all RF tasks.
We do see tasks decreasing in accuracy between the two studies, but in this study, despite both countries being round rather than elongated, we did not find a consistent increase in time or error rate with increased numbers of flows: US outperformed CN for some tasks.
5 Interaction
We learned from our second study that, while MapTrix makes it possible to read a single flow value between a given source and destination in larger datasets, doing so did become more difficult. For comparing clusters of locations, our pilots revealed that highlighting of paths (from origins via leaders and matrix cells to destinations) was essential. We have constructed a prototype interactive system which allows users to create these highlights through various selection mechanisms. In particular, the following interactions directly support the indicated tasks from Table 1:
SFSo–Highlighting of the associated row/column on mouse-hover over a map region, cell, label or leader line.
TFI, SFI, SFSo–Mouse-click makes such cell highlighting persist such that multiple flows can be compared simultaneously.
TFS, SFSm–The colour key beside the MapTrix is an interactive widget allowing for filtering the MapTrix to a particular range of flow values, Fig. 11.
RF–Aggregate selection for Regional Flow comparison tasks, Fig. 10.
The last two interactions both reduce the number of regions shown in the MapTrix and induce a re-layout of the MapTrix and leader lines. Such re-layout is fast to compute; for the US with 51 locations it is in the order of a few milliseconds. This dynamic rearrangement, together with the smooth transition animations we use, is demonstrated in our accompanying video.
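How filtering and aggregate selection reduce the matrix before re-layout can be sketched as follows. This is a minimal illustration, not the prototype's implementation: the function names are hypothetical, the OD matrix is a plain list of rows (origins) by columns (destinations), and the rule that a location stays visible when at least one of its incoming or outgoing flows lies in the selected range is an assumption.

```python
def filter_by_flow_range(labels, matrix, low, high):
    """Keep locations that have at least one in- or out-flow in [low, high]."""
    n = len(labels)
    keep = [
        i for i in range(n)
        if any(i != j and (low <= matrix[i][j] <= high or low <= matrix[j][i] <= high)
               for j in range(n))
    ]
    kept_labels = [labels[i] for i in keep]
    kept_matrix = [[matrix[i][j] for j in keep] for i in keep]
    return kept_labels, kept_matrix


def aggregate_region(labels, matrix, members, name):
    """Merge the locations in `members` into one row and column called `name`."""
    # Every non-member keeps its own group; all members share the last group.
    groups = [[i] for i, label in enumerate(labels) if label not in members]
    groups.append([i for i, label in enumerate(labels) if label in members])
    new_labels = [labels[g[0]] for g in groups[:-1]] + [name]
    new_matrix = [
        [sum(matrix[i][j] for i in rows for j in cols) for cols in groups]
        for rows in groups
    ]
    return new_labels, new_matrix


# Hypothetical four-location OD matrix (not study data).
labels = ["a", "b", "c", "d"]
matrix = [[0, 5, 1, 0],
          [2, 0, 9, 0],
          [1, 8, 0, 0],
          [0, 0, 3, 0]]

filtered = filter_by_flow_range(labels, matrix, 5, 9)
aggregated = aggregate_region(labels, matrix, {"b", "c"}, "b+c")
print(filtered)
print(aggregated)
```

In the aggregated matrix the diagonal cell of the merged region holds the flows within that region, while its row and column hold the flows between the region and the remaining locations, which are the quantities compared in the RF tasks.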
We have introduced a new method, MapTrix, for visualising many-to-many flows by connecting an OD matrix with origin and destination maps. We have provided a detailed analysis of the design alternatives and have given an algorithm for computing an arrangement with crossing-free leader lines.
We conducted two user studies of visual representations of many-to-many flow. In our first study we compared MapTrix with a flow map with bundled edges and with OD Maps for different country maps. All three visualisations performed well for the smallest data set (AU - 8 locations), but MapTrix and OD Maps were far better for DE and NZ (16 locations). There was no statistically significant difference between MapTrix and OD Maps on data sets of this size. Surprisingly, we did not find that country shape affected performance: in particular we had expected this to affect OD Maps.
In our second study we compared MapTrix and OD Maps on two larger data sets (CN - 34 locations and US - 51 locations). Both performed relatively well for all tasks and we did not find that one method outperformed the other even for individual tasks. We did find in the pilot that analysing flow between or within regions for data sets of this size was extremely difficult with both methods, though slightly easier with OD Maps. Thus, in the study we used highlighting to help with analysis of regional flow.
In the first study users ranked MapTrix highest in terms of design and readability while in the second study MapTrix is preferred for design but OD Maps for readability.
The designs presented in this paper and our user study concentrate on static visual representations of dense many-to-many flows. However, in our second user study we did explore the usefulness of highlighting for analysis of regional flow. The results of our studies led us to implement several types of interaction, not only highlighting but also filtering and region zooming. We plan to evaluate these in our future work. One limitation of our studies is that participants were predominantly students or researchers: we plan further evaluations with domain experts.
Acknowledgements. Data61, CSIRO (formerly NICTA) is funded by the Australian Government through the Department of Communications and the Australian Research Council through the ICT Centre for Excellence Program. We would like to thank Dr Aidan Slingsby and other members of the giCentre, City University London for their work and discussions on OD Maps. We thank Dr Haohui (Caron) Chen and all of our user study participants for their time and feedback.