Mosaic Matrix: High-Dimensional Data Visualization Using a Color Representation of Features

Hiroaki Kobayashi 1,a), Kazuo Misue 2, Jiro Tanaka 2

1 College of Information Science, School of Informatics, University of Tsukuba
2 Faculty of Engineering, Information and Systems, University of Tsukuba
a) hiroaki@iplab.cs.tsukuba.ac.jp

Abstract: Because displays are limited in size, it is difficult to obtain an overview of high-dimensional data within the screen area that shows the visualization result. In this paper, we aim to obtain an overview of high-dimensional data in a limited area of the screen. To this end, we developed Colored Mosaic Matrix, a visualization method for high-dimensional categorical data that uses a color representation of features. By representing quantitative data in units of categories, it enables the visualization of high-dimensional data with a large number of records. An experimental investigation of readability showed that our method is useful for obtaining an overview of high-dimensional data.

1.
Scatterplot Matrix [1]; Scatterplot; Sips et al. [2]; VisBricks [3].

© 2013 Information Processing Society of Japan
RadViz [4], [5]; HD (1920 × 1080).

2.

2.1 Scatterplot Matrix
Scatterplot Matrix [1]; Scatterplot; SCATTERDICE [6].

2.2 Parallel Coordinates Plot
Parallel Coordinates Plot (PCP) [7], [8], [9]; Scatterplot Matrix.

2.3
Fua et al. [10]: PCP. Feng et al. [11]: PCP, Scatterplot. Two-Tone Pseudo Coloring [12].

3.

3.1
[13]
Fig. 1 Generation process of Colored Mosaic Plot
Fig. 2 Examples of the coloring technique focused on category

3.2 Mosaic Matrix
Fig. 1(a), (b); Mosaic Plot [14], [15]; Fig. 1(c); Fig. 1(d); Mosaic Matrix.

3.3

3.3.1
Mosaic Plot; Fig. 2(a); Fig. 2(b); X, Y.

3.3.2
Mosaic Plot; (1) Fig. 3.
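The Mosaic Plot [14], [15] that this section builds on can be sketched as follows. This is only a rough illustration under stated assumptions, not the authors' implementation: quantitative values are first grouped into categories (equal-width binning is assumed here), then each X category receives a column whose width is proportional to its record count, and each column is split into tiles whose heights are proportional to the Y categories within it. The names `bin_values` and `mosaic_tiles`, the unit-square layout, and the toy records are hypothetical, and the coloring rules of this paper are not reproduced.

```python
from collections import Counter

def bin_values(values, n_bins):
    """Map quantitative values to category indices 0..n_bins-1.

    Equal-width binning is an assumption made for this sketch.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall into bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def mosaic_tiles(xs, ys):
    """Return {(x_cat, y_cat): (left, bottom, width, height)} in the unit square."""
    total = len(xs)
    x_counts = Counter(xs)
    pair_counts = Counter(zip(xs, ys))
    tiles = {}
    left = 0.0
    for x in sorted(x_counts):
        # Column width: share of all records that fall into X category x.
        width = x_counts[x] / total
        bottom = 0.0
        for y in sorted({yc for (xc, yc) in pair_counts if xc == x}):
            # Tile height: share of this column's records in Y category y.
            height = pair_counts[(x, y)] / x_counts[x]
            tiles[(x, y)] = (left, bottom, width, height)
            bottom += height
        left += width
    return tiles

# Hypothetical toy records, not data from the paper.
x_raw = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y_raw = [1.0, 1.5, 2.0, 6.0, 2.5, 7.0, 7.5, 8.0]
tiles = mosaic_tiles(bin_values(x_raw, 2), bin_values(y_raw, 2))
```

With this layout the area of each tile equals the fraction of records in that (X, Y) category pair, so the tiles always fill the unit square regardless of the number of records.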
Fig. 3 Examples of the coloring technique focused on the correlation (pattern 1)
Fig. 4 Examples of the coloring technique focused on the correlation (pattern 2)

Y; Mosaic Plot; (2) Fig. 4.

3.4
Mosaic Plot.

3.4.1

3.4.2

3.4.3
p; Mosaic Matrix.

4.

4.1
Matrix View; Detail View.

4.2

4.2.1 Matrix View
Matrix View; Mosaic Matrix; Mosaic Plot.

4.2.2 Detail View
Detail View; Mosaic Plot; X, Y.
Fig. 5 Screenshot of the analytical tool
Fig. 6 Example of displaying the correlation record selection
Fig. 7 Changes due to the coefficient

4.2.3 Area Graph
Detail View; Area Graph.

4.3
Fig. 5: Matrix View, Detail View, Mosaic Plot. Fig. 6: Matrix View, Detail View, Mosaic Plot. Fig. 7: Detail View.

5.
Mosaic Plot; Mosaic Matrix.

5.1
Mosaic Plot; Scatterplot.

5.1.1
7

5.1.2
(1) (2) (3) (4)

5.2

5.2.1
30,000; 16.
Fig. 8 Screenshot of the experiment tool
Fig. 9 The average percentage of correct answers in task parameters for each condition

5.2.2
Fig. 8; Mosaic Plot; Scatterplot. ColorC; X: ColorX; Y: ColorY.

5.2.3
Area Graph.

(1)  700 pixel: with Area Graph, without Area Graph
     24 pixel: without Area Graph
     12 pixel: without Area Graph
     6 pixel: without Area Graph

(2)  P(16), P(8), P(4): Mosaic Plot
     Scatterplot

20; 6; 120.

5.3
Fig. 9; Area Graph (AG); P(n).

5.4
Table 1; t-test, 5%; ν = 6.

5.4.1
Mosaic Plot; P(n); µ(n); 24, 12, 6. µ(16), µ(4) (p = 0.2894); µ(8), µ(4) (p = 0.0167).

5.4.2
Area Graph; t-test.
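The tests in this section are t-tests at the 5% level with ν = 6 degrees of freedom. A minimal sketch of such a test follows, assuming a paired design over seven observations (which would give ν = 7 - 1 = 6; the pairing is an assumption, since the test design is not recoverable from the text here). The function name `paired_t`, the variable names, and the sample scores are hypothetical and are not data from this paper; 2.447 is the standard two-sided 5% critical value of Student's t for ν = 6.

```python
import math

# Two-sided 5% critical value of Student's t distribution for nu = 6.
T_CRIT_5PCT_DF6 = 2.447

def paired_t(a, b):
    """Return (t, nu) for a paired t-test on equal-length samples a and b.

    Assumes the paired differences are not all identical (non-zero variance).
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Unbiased sample variance of the paired differences.
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n), n - 1

# Hypothetical per-participant correct-answer rates (NOT data from the paper).
method_a = [1.0, 0.8, 0.9, 0.7, 1.0, 0.9, 0.8]
method_b = [0.5, 0.6, 0.4, 0.5, 0.7, 0.6, 0.3]

t, nu = paired_t(method_a, method_b)
# The difference is judged significant at the 5% level when |t| exceeds the critical value.
significant = abs(t) > T_CRIT_5PCT_DF6
```

Comparing |t| against a tabulated critical value avoids needing the t distribution's cumulative function; a statistics library would be needed to obtain exact p-values such as those reported in this section.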
Table 1 The average percentage of correct answers in each visualization method and the results of t-test

n          700      24       12       6
µ_C(n)     92.86%   80.95%   88.10%   80.95%
µ_S(n)     95.24%   83.33%   54.76%   38.10%
p          0.6036   0.8588   0.0177   0.0057

Fig. 10 Example visualizations of the weather data with this tool

Area Graph.

5.4.3
ColorC < ColorX < ColorY. ColorC; ColorX; ColorY.

5.4.4
Scatterplot; Area Graph; Mosaic Plot. Table 1: n; µ_C(n); µ_S(n); t-test; p. n = 700, n = 24: 5%. n = 12, n = 6: µ_C(n) > µ_S(n). Mosaic Plot; Scatterplot; 12; Mosaic Matrix.

6.

6.1
*1; 37; [2011/9/1-2012/8/31]; 366; 1830.

6.2
5; Matrix View; Y.

*1 http://www.data.jma.go.jp/obd/stats/etrn/index.php
Fig. 10(a): Mosaic Plot; X, Y. Fig. 10(b): Y; Detail View; X, Y; Mosaic Plot. Fig. 10(c): Detail View; X, Y. Fig. 10(d): Detail View.

7.
Mosaic Matrix; Mosaic Plot.

References

[1] D. B. Carr, R. J. Littlefield, W. L. Nicholson and J. S. Littlefield. Scatterplot Matrix Techniques for Large N. In JASA 87, Vol. 82, No. 398, pp. 424-436, 1987.
[2] M. Sips, B. Neubert, J. P. Lewis and P. Hanrahan. Selecting good views of high-dimensional data using class consistency. In IEEE-VGTC Symposium on Visualization, Vol. 28, No. 3, pp. 831-838, 2009.
[3] A. Lex, H.-J. Schulz, M. Streit, C. Partl and D. Schmalstieg. VisBricks: Multiform Visualization of Large, Inhomogeneous Data. In TVCG 11, Vol. 17, No. 12, pp. 2291-2300, 2011.
[4] L. Nováková and O. Štěpánková. Multidimensional clusters in RadViz. In SMO 06, pp. 470-475, 2006.
[5] J. Sharko, G. Grinstein and K. A. Marx. Vectorized Radviz and Its Application to Multiple Cluster Datasets. In TVCG 08, Vol. 14, No. 6, pp. 1444-1451, 2008.
[6] N. Elmqvist, P. Dragicevic and J.-D. Fekete. Rolling the Dice: Multidimensional Visual Exploration using Scatterplot Matrix Navigation. In TVCG 08, Vol. 14, No. 6, pp. 1141-1148, 2008.
[7] A. Inselberg and B. Dimsdale. The plane with parallel coordinates. The Visual Computer, Vol. 1, No. 4, pp. 69-91, 1985.
[8] F. Bendix, R. Kosara and H. Hauser. Parallel Sets: Visual Analysis of Categorical Data. In InfoVis 05, pp. 133-140, 2005.
[9] Z. Geng, Z. Peng, R. S. Laramee, R. Walker and J. C. Roberts. Angular Histograms: Frequency-Based Visualizations for Large, High Dimensional Data. In TVCG 11, Vol. 17, No. 12, pp. 2572-2580, 2011.
[10] Y.-H. Fua, M. Ward and E. Rundensteiner. Hierarchical Parallel Coordinates for Exploration of Large Datasets. In VIS 99, pp. 43-50, 1999.
[11] D. Feng, L. Kwock, Y. Lee and R. M. Taylor. Matching Visual Saliency to Confidence in Plots of Uncertain Data. In TVCG 10, Vol. 16, No. 6, pp. 980-989, 2010.
[12] T. Saito, H. N. Miyamura, M. Yamamoto, H. Saito, Y. Hoshiya and T. Kaseda. Two-tone pseudo coloring: compact visualization for one-dimensional data. In InfoVis 05, pp. 173-180, 2005.
[13] B. Shneiderman. The eyes have it: A task by data-type taxonomy for information visualizations. In Proceedings of the Symposium on Visual Languages, pp. 336-343, 1996.
[14] M. Friendly. Mosaic Displays for Multi-Way Contingency Tables. In JASA 94, Vol. 89, No. 425, pp. 190-200, 1994.
[15] M. Friendly. Extending Mosaic Displays: Marginal, Conditional, and Partial Views of Categorical Data. In JCGS 99, Vol. 8, No. 3, pp. 373-395, 1999.