Francisco Gajardo

Play Store in Sketches

Starting from the data available in the Play Store dataset (9,660 data points), we explore some ideas for visualizing the information it contains. For those unfamiliar with this dataset, it was scraped from the Play Store in August 2018; more details are available on Kaggle, where its author gives a short introduction (https://bit.ly/2PSvQRh). For our purposes, the dimensions available include the name of the app, category, rating, number of reviews, file size (MB), number of installs and date of last update, among others.
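Before any of these dimensions can be plotted they need to be turned into numbers and dates. The snippet below is a minimal sketch of that cleaning step, not the code behind the sketches. The column names ("Installs", "Size", "Last Updated", etc.), the value formats ("10,000+", "19M", "January 7, 2018") and the sample row are assumptions about how the Kaggle CSV is laid out, so check them against the actual file.

```python
from datetime import datetime


def parse_installs(text):
    """Turn an install bucket such as '10,000+' into the integer 10000."""
    return int(text.replace(",", "").rstrip("+"))


def parse_size_mb(text):
    """Turn '19M' into 19.0 and '512k' into 0.5 (MB); None for anything else."""
    if text.endswith("M"):
        return float(text[:-1])
    if text.endswith("k"):
        return float(text[:-1]) / 1024
    return None  # assumed placeholder values such as 'Varies with device'


def parse_last_updated(text):
    """Parse a date written like 'January 7, 2018' (English month names)."""
    return datetime.strptime(text, "%B %d, %Y").date()


def clean_row(row):
    """Convert one raw CSV row (dict of strings) into typed values."""
    return {
        "app": row["App"],
        "category": row["Category"],
        "rating": float(row["Rating"]),
        "reviews": int(row["Reviews"]),
        "size_mb": parse_size_mb(row["Size"]),
        "installs": parse_installs(row["Installs"]),
        "last_updated": parse_last_updated(row["Last Updated"]),
    }


# Hypothetical row, made up for illustration only.
sample = {
    "App": "Example App", "Category": "TOOLS", "Rating": "4.5",
    "Reviews": "1200", "Size": "19M", "Installs": "10,000+",
    "Last Updated": "January 7, 2018",
}
cleaned = clean_row(sample)
```

With the real file, each row from csv.DictReader could be passed through clean_row, skipping rows whose rating or size is missing.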


With this information we propose six initial sketches as a way to understand the data and to relate it over time to the evolution of Android itself and to the evolution of the apps (size, popularity by category, etc.).


Sketch 1: "Neural network"

My first sketch was inspired by the "Neural Network" type of plot, but it is actually a mix of several techniques. This is the one I liked the most in terms of how it relates time to the other variables. Time runs along the circumference, from 2008 to 2019, and shows the date of the last update for every app as well as the total number of apps available for a given version of Android (encircled). The sketch also encodes other variables, such as downloads/reviews (bubble size), rating (hue) and categories (clusters).
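As a rough illustration of the layout, a last-update date can be mapped to an angle on the circle and a download count to a bubble radius. This is only a sketch under my own assumptions: the span is taken to run from January 1, 2008 to January 1, 2019, the circle starts at 12 o'clock and runs clockwise, and bubble area (not radius) is made proportional to the value. All function names are made up for the example.

```python
import math
from datetime import date

START = date(2008, 1, 1)  # assumed start of the time circle
END = date(2019, 1, 1)    # assumed end of the time circle


def date_to_angle(d):
    """Map a date to radians: START gives 0 and END gives a full turn."""
    fraction = (d - START).days / (END - START).days
    return 2 * math.pi * fraction


def polar_to_xy(angle, radius=1.0):
    """Place a point on the circle, starting at 12 o'clock, going clockwise."""
    return radius * math.sin(angle), radius * math.cos(angle)


def bubble_radius(value, max_value, max_radius):
    """Scale the radius so that bubble area is proportional to the value."""
    return max_radius * math.sqrt(value / max_value)


# Position of an app last updated in mid 2018 (hypothetical example).
x, y = polar_to_xy(date_to_angle(date(2018, 7, 1)))
```

Scaling the area instead of the radius keeps an app with four times the downloads from looking sixteen times bigger.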



Sketch 2: "Parallel Edge Splatline"

For Sketch 2, I took as inspiration the "Parallel Edge Splatline" models we have seen in class, adapting them to the information available in my dataset. Here, the colors show how the number of updates in each category evolves as Android releases new OS versions. Each colored line represents an app, and we can follow the average size, rating or number of reviews over time (ideally, the viewer should be able to select this field).


Sketch 3: "Sankey Diagram"

A third option (Sketch 3) is a "Sankey Diagram" type of plot, a kind of flow chart used in many different contexts. In my case, I thought it could be useful for charting the evolution of Android and its versions, and how the apps available in August 2018 (the date of the dataset) relate to these versions: How many apps are updated for each one of them? What is the composition in terms of categories? The height of the bars is also supposed to carry information, in this case the number of downloads/reviews.
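The flows of a diagram like this are usually described as a list of links, each with a source, a target and a value. The following is a minimal sketch of how those links could be built, assuming each app has already been reduced to an Android version, a category and an install count. The field names and the four sample apps are hypothetical.

```python
from collections import defaultdict


def sankey_links(apps):
    """Sum installs for every (Android version, category) pair."""
    totals = defaultdict(int)
    for app in apps:
        totals[(app["android_ver"], app["category"])] += app["installs"]
    # One link per pair; the value sets the thickness of the flow.
    return [
        {"source": version, "target": category, "value": installs}
        for (version, category), installs in sorted(totals.items())
    ]


# Hypothetical apps, made up for illustration only.
apps = [
    {"android_ver": "4.1 and up", "category": "TOOLS", "installs": 10000},
    {"android_ver": "4.1 and up", "category": "TOOLS", "installs": 5000},
    {"android_ver": "4.1 and up", "category": "GAME", "installs": 1000},
    {"android_ver": "5.0 and up", "category": "GAME", "installs": 500},
]
links = sankey_links(apps)
```

Replacing the install count with 1 per app would answer the first question instead (how many apps are updated for each version).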


Finally, three other options are sketches 4, 5 and 6 (in the gallery below). The first of them explores the "Sorted Stream Graph" technique and aims to show the evolution of every category (hue) in terms of the number of apps for every Android version (width of the lines), together with other variables such as size or rating (by selecting a field). Another option is the "Hive plot" in sketch 5, whose limitation is the small number of dimensions we could add to it. Lastly, I added a "Pictorial chart", which is a friendly way to illustrate the data; I would consider it an extra for visualizing some specific data, but not the main plot for this dataset.


