{"id":312,"date":"2017-06-07T10:11:25","date_gmt":"2017-06-07T14:11:25","guid":{"rendered":"https:\/\/blogs.library.unt.edu\/digital-humanities\/?p=312"},"modified":"2017-06-07T10:11:25","modified_gmt":"2017-06-07T14:11:25","slug":"raw-data-visualization-hexagonal-binning","status":"publish","type":"post","link":"https:\/\/blogs.library.unt.edu\/digital-scholarship\/2017\/06\/07\/raw-data-visualization-hexagonal-binning\/","title":{"rendered":"RAW Data Visualization: Hexagonal Binning"},"content":{"rendered":"<a href=\"http:\/\/rawgraphs.io\/\">RAWGraphs<\/a> offers hexagonal binning as an option for representing dispersions in datasets with an exceptionally large number of data points. This visualization groups the points into hexagonal cells on a gridded surface and assigns each cell a color based on the number of points it contains.\r\n\r\nThis example uses a public dataset from <a href=\"http:\/\/www.kaggle.com\/\">Kaggle<\/a> covering <a href=\"http:\/\/www.kaggle.com\/deepmatrix\/imdb-5000-movie-dataset\">5000+ movies on IMDb<\/a>. The x-axis shows IMDb movie ratings, and the y-axis displays gross revenue. Because there are so many data points in this set, it may be difficult to visualize the data in a clear and coherent way. 
However, hexagonal binning simplifies the data by clustering and color-coding it.\r\n\r\nAfter setting up the visualization, this is what RAW gave me:\r\n\r\n&nbsp;\r\n\r\n<img loading=\"lazy\" decoding=\"async\" class=\"wp-image-324 size-full aligncenter\" src=\"https:\/\/blogs.library.unt.edu\/digital-humanities\/wp-content\/uploads\/sites\/20\/2017\/05\/hb_1.jpg\" alt=\"\" width=\"861\" height=\"519\" srcset=\"https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-content\/uploads\/sites\/20\/2017\/05\/hb_1.jpg 861w, https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-content\/uploads\/sites\/20\/2017\/05\/hb_1-300x181.jpg 300w, https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-content\/uploads\/sites\/20\/2017\/05\/hb_1-768x463.jpg 768w\" sizes=\"auto, (max-width: 861px) 100vw, 861px\" \/>\r\n\r\n<!--more-->\r\n\r\nBecause RAW assigns colors to the clusters at random, this visualization does not mean much to someone seeing it for the first time, especially without a lengthy key explaining how many data points each hexagon color represents. 
RAW also offers the option of displaying the visualization using a linear (numeric) color scale, in which case I get something that looks like this:\r\n\r\n&nbsp;\r\n\r\n<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-314 size-full\" src=\"https:\/\/blogs.library.unt.edu\/digital-humanities\/wp-content\/uploads\/sites\/20\/2017\/05\/hb_2-e1495041627617.png\" alt=\"\" width=\"844\" height=\"510\" srcset=\"https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-content\/uploads\/sites\/20\/2017\/05\/hb_2-e1495041627617.png 844w, https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-content\/uploads\/sites\/20\/2017\/05\/hb_2-e1495041627617-300x181.png 300w, https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-content\/uploads\/sites\/20\/2017\/05\/hb_2-e1495041627617-768x464.png 768w\" sizes=\"auto, (max-width: 844px) 100vw, 844px\" \/>\r\n\r\n&nbsp;\r\n\r\nThis looks better, but it still lacks detail. I can see where the most populated areas are, but beyond those everything looks much the same. So, I thought this was the perfect opportunity to use a color palette tool!\r\n\r\nI used my favorite tool, <a href=\"http:\/\/tools.medialab.sciences-po.fr\/iwanthue\/index.php\">iWantHue<\/a>, to select a color palette based on a single color. The tool provided me with 40 different HEX codes that I could sort by brightness, starting with the darkest colors and ending with the lightest. 
Then I plugged the codes into the key in RAW and got the following visualization:\r\n\r\n&nbsp;\r\n\r\n<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-315 size-full\" src=\"https:\/\/blogs.library.unt.edu\/digital-humanities\/wp-content\/uploads\/sites\/20\/2017\/05\/hb_3-e1495041656299.png\" alt=\"\" width=\"861\" height=\"509\" srcset=\"https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-content\/uploads\/sites\/20\/2017\/05\/hb_3-e1495041656299.png 861w, https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-content\/uploads\/sites\/20\/2017\/05\/hb_3-e1495041656299-300x177.png 300w, https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-content\/uploads\/sites\/20\/2017\/05\/hb_3-e1495041656299-768x454.png 768w\" sizes=\"auto, (max-width: 861px) 100vw, 861px\" \/>\r\n\r\n&nbsp;\r\n\r\nIn this case, color makes a big difference! I can now see that the darker regions contain more data points and the lighter regions contain fewer. This concept is much easier to explain to an audience than showing them a huge legend and asking them to interpret it.\r\n\r\nThis case provides an excellent example of how different open source tools can be combined to create unique and meaningful visualizations of data. There are so many tools available and countless ways to use them. Get out there and give it a try!","protected":false},"excerpt":{"rendered":"RAWGraphs offers hexagonal binning as an option for representing dispersions in datasets with an exceptionally large number of data points. This visualization visually clusters the most populated areas on a gridded surface\u00a0and assigns a color based on the number of points in the region. 
This example uses a public data set from Kaggle of data&#8230;  <a href=\"https:\/\/blogs.library.unt.edu\/digital-scholarship\/2017\/06\/07\/raw-data-visualization-hexagonal-binning\/\" class=\"more-link\" title=\"Read RAW Data Visualization: Hexagonal Binning\">Read more &raquo;<\/a>","protected":false},"author":69,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[9],"tags":[10,11,97,99,12,98],"class_list":["post-312","post","type-post","status-publish","format-standard","hentry","category-tools-and-toys","tag-data-viz","tag-free-tools","tag-iwanthue","tag-kaggle","tag-open-data","tag-rawgraphs"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p8keRV-52","_links":{"self":[{"href":"https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-json\/wp\/v2\/posts\/312","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-json\/wp\/v2\/users\/69"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-json\/wp\/v2\/comments?post=312"}],"version-history":[{"count":5,"href":"htt
ps:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-json\/wp\/v2\/posts\/312\/revisions"}],"predecessor-version":[{"id":387,"href":"https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-json\/wp\/v2\/posts\/312\/revisions\/387"}],"wp:attachment":[{"href":"https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-json\/wp\/v2\/media?parent=312"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-json\/wp\/v2\/categories?post=312"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.library.unt.edu\/digital-scholarship\/wp-json\/wp\/v2\/tags?post=312"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}