Check this article!
I'm just back on my TechBlog after a *busy* week, which got me thinking: “Hmmmm… how busy are other people in the US?”
I found some interesting articles on how busy Americans are and how the world spends its time online. Have a look! =)
source: How Busy are Americans?
Another one: How The World Spends Its Time Online
They also provide Census Data APIs that let developers access the data by sending query requests and receiving responses, with JSON supported as well.
Their web pages are well established, and the data sets are comparatively well presented in the public domain. For instance, the captured image above is from Coastline County Population, which lets end users move the slider at the bottom to show the change over time (in years) and the accumulated population of coastline counties in the USA.
Most of the visualizations here are, however, just plain 2D graphs of 1D or 2D data arrays. It would be much better and more effective to have some fast, lightweight 3D interactive visualizations that give end users more interactive controls for in-depth data analysis and exploration. Still, the data sets here are very well presented in my opinion as a Data Visualization Expert, Senior Software Engineer, and Assistant Professor in Computer Science.
This small tool, Single Variable View (svVu), was developed for in-depth data analysis of a single feature (attribute) variable, as part of a research collaboration with the research team in the School of Computer Science, Edith Cowan University, Western Australia.
Functionalities of this tool:
1. Loading N-dimensional data files,
2. Selecting one of the N-dimension features,
3. Selecting one of the data distribution function types (Gaussian Normal Distribution by default),
4. (‘View’ button) Plotting distribution functions for each selected feature w.r.t. all the output decision classes,
5. Showing the “membership values (closeness / belonging degrees)” of a new unknown input value within [min max] range (‘Analyze’ button),
6. Displaying statistics of the selected feature variable,
7. Calculating overlap degrees of distribution of the selected feature variable w.r.t. its output classes,
8. Calculating SNR (Signal-to-Noise Ratio) for the selected feature to show how much its output groups are separated from each other,
9. Calculating membership degrees for unknown ‘test’ N-dimensional data points and generating a ranked list for that unknown test data set.
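To make items 3 to 9 a bit more concrete, here is a minimal Python sketch of the kind of calculation involved. It is not the svVu source code: the function names are hypothetical, the membership value is assumed to be an unnormalized Gaussian (so it equals 1.0 at the class mean), the overlap is assumed to be the shared area under two fitted curves, and the SNR is assumed to be the common two-class form |mu_a - mu_b| / (sd_a + sd_b). The tool itself may define these differently.

```python
import math
from statistics import mean, stdev

def fit_gaussians(values, labels):
    """Fit one Gaussian per output class for a single feature column.
    Returns {class_label: (mean, std)}. Each class needs at least two
    values, and a class with zero spread would break the formulas below."""
    groups = {}
    for v, c in zip(values, labels):
        groups.setdefault(c, []).append(v)
    return {c: (mean(vs), stdev(vs)) for c, vs in groups.items()}

def membership(x, mu, sigma):
    """Assumed membership degree: Gaussian scaled so that x == mu gives 1.0."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)

def gaussian_pdf(x, mu, sigma):
    """Normal probability density (area under the curve is 1)."""
    return membership(x, mu, sigma) / (sigma * math.sqrt(2 * math.pi))

def overlap(params_a, params_b, steps=2000):
    """Assumed overlap degree: shared area under two fitted densities,
    approximated with the midpoint rule (near 0 = separated, near 1 = identical)."""
    (mu_a, sd_a), (mu_b, sd_b) = params_a, params_b
    lo = min(mu_a - 6 * sd_a, mu_b - 6 * sd_b)
    hi = max(mu_a + 6 * sd_a, mu_b + 6 * sd_b)
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += min(gaussian_pdf(x, mu_a, sd_a), gaussian_pdf(x, mu_b, sd_b))
    return total * dx

def snr(params_a, params_b):
    """Assumed two-class SNR: distance between means over summed spreads."""
    (mu_a, sd_a), (mu_b, sd_b) = params_a, params_b
    return abs(mu_a - mu_b) / (sd_a + sd_b)

def rank_classes(x, params):
    """Rank the output classes for an unknown value, best match first."""
    scores = [(c, membership(x, mu, sd)) for c, (mu, sd) in params.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Made-up illustration values, not taken from any real data set.
feature = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
classes = ["a", "a", "a", "b", "b", "b"]
fitted = fit_gaussians(feature, classes)
print(rank_classes(2.5, fitted))
print(snr(fitted["a"], fitted["b"]), overlap(fitted["a"], fitted["b"]))
```

A feature with a high SNR and a low overlap separates its output groups well, which is the sort of thing the statistics in the tool are meant to reveal for one feature at a time.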
Even though this is a small tool, it has carried a significant part of our research on in-depth data analysis.
This is an ongoing project that I started and have supervised as one of the Visualization projects at the School of Computer Science, Edith Cowan University, Western Australia.
The ultimate purpose of data visualization is to provide the following to data analysts, scientists, statisticians, engineers and business stakeholders:
1. Intuitive visual displays for data points, clusters, distribution, and structures,
2. In-depth data analysis at a glance,
3. Trends in past and current data, and predictions of possible future data patterns,
4. Specific data examination / investigation in a particular domain.
The interactive data visualizer v.0.15 shown above is the first interactive data visualization tool (a Java applet) that I developed, together with two students, as a first prototype. It loads data files (there are 10 example data files retrieved from the UCI Machine Learning Repository) and shows data points, data distributions as Gaussian-basis distribution functions, and statistics using box plots for each attribute/feature.
I have also been working on other data visualization applications for the last two years, and still am at the moment. Once they are ready, I will post them here to show the prototypes and the details.
HeatMap is another data visualization tool that has been widely used for examining correlations between data rows or columns.
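For readers curious about what sits behind a box plot for one attribute, here is a small Python sketch. It is only an illustration, not the applet's Java code: the function name `box_plot_stats` is hypothetical, and the whisker rule (1.5 times the interquartile range, the usual Tukey convention) is an assumption, since the applet may use a different one.

```python
from statistics import quantiles

def box_plot_stats(values):
    """Return the numbers a box plot needs for one attribute column.
    Needs at least two values."""
    data = sorted(values)
    # Quartiles by linear interpolation between the sorted data points.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr   # assumed Tukey fences
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],    # smallest value inside the fences
        "whisker_high": inside[-1],  # largest value inside the fences
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# Made-up attribute values with one obvious outlier.
print(box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

Running the same function once per attribute gives one box per feature, which is how a per-attribute statistics panel could be populated.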
All three pictures above show correlations between features (column data values), that is, how strongly they are correlated with each other. For instance, if a correlation is very high (i.e., the features are very similar), then the color of the corresponding small square will be 100% black in the first BLACK-WHITE map, and 100% red in the second RED_GREEN and third RED_BLUE maps. The data set used here is one of the Alzheimer Disease data sets that I used for collaboration work between CSJL solutions and my research team at Edith Cowan University.
This is one way of revealing relationships between data points in a given data set. Obviously, it is much easier to pick up some facts by looking at these pictures than by staring at a bunch of high-precision numbers.
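As a rough sketch of how such a heat map could be computed, the Python below builds a Pearson correlation matrix between feature columns and turns each value into a color. The function names are hypothetical, and so is the color scale: I assume a linear ramp in which a correlation of +1 gives full black (or full red) and -1 gives white, green, or blue, depending on the map. The pictures above may have been produced with a different correlation measure or scale.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length columns.
    A constant column has zero variance and would divide by zero."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Square matrix of pairwise correlations between feature columns."""
    return [[pearson(a, b) for b in columns] for a in columns]

def to_rgb(r, scheme="black_white"):
    """Map a correlation in [-1, 1] to an (R, G, B) tuple.
    Assumed linear ramp: +1 is the 'hot' end, -1 the 'cold' end."""
    t = min(1.0, max(0.0, (r + 1.0) / 2.0))  # clamp rounding noise
    level = round(255 * t)
    if scheme == "black_white":
        return (255 - level, 255 - level, 255 - level)
    if scheme == "red_green":
        return (level, 255 - level, 0)
    if scheme == "red_blue":
        return (level, 0, 255 - level)
    raise ValueError(f"unknown scheme: {scheme}")

# Made-up feature columns, only to show the shape of the result.
features = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
for row in correlation_matrix(features):
    print([to_rgb(r, "red_blue") for r in row])
```

Each tuple corresponds to one small square in the map, so drawing the heat map is then just a matter of painting a grid with these colors.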