How to explore Facebook data at the command-line? (Part I — Data preview)

Published in

codeburst

3 min read · Aug 26, 2017

Analysing Facebook likes, shares and comments at the command line. Image adapted from unsplash.com.

Now that Facebook has more than a billion active users, it has become a hub for personal, product and corporate branding. Companies want to understand what people think about topics related to their business, so that they can make their products and marketing more relevant to their customers. One way to achieve this is to analyse the company’s Facebook pages, which helps marketers make their content more relevant.

Before we go any further, let’s set up our working environment by creating a folder on the Desktop. Assuming a Linux-based OS (e.g., Ubuntu), open a terminal, create the analysis folder and move into it:

cd ~/Desktop
mkdir FBdata
cd FBdata

This creates a folder named FBdata on your Desktop and makes it the working directory. Next, we download the data. In this project, we’re going to mine a data set generated by running a simple Facebook scraper on a particular (undisclosed) Facebook page.

The goal of this experiment is to find the most vibrant status message on that page, with just one Bash command. Download the data with the command below; wget saves it as facebookdata.csv in the current folder:

wget https://www.scientificprogramming.io/datasets/facebookdata.csv
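Before analysing anything, it is worth confirming that the download produced a real CSV and not an empty file or an HTML error page. The following is a minimal sketch; the function name check_download is hypothetical and not part of the original tutorial, and the HTML test is only a rough heuristic.

```shell
# check_download FILE
# Hypothetical helper: succeeds if FILE exists, is non-empty and does not
# look like an HTML page (a common result of a failed download).
check_download() {
  local file=$1
  if [ ! -s "$file" ]; then
    echo "missing or empty: $file" >&2
    return 1
  fi
  # A CSV header should not start with an HTML/XML tag
  if head -n 1 "$file" | grep -q '^[[:space:]]*<'; then
    echo "looks like HTML, not CSV: $file" >&2
    return 1
  fi
  echo "ok: $(wc -c < "$file") bytes, header: $(head -n 1 "$file")"
}

# Usage, from inside the FBdata folder:
#   check_download facebookdata.csv
```

If the check fails, re-run the wget command and look at its output for an HTTP error.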

Learning objectives

By completing this tutorial, you will learn to use the following Bash commands:

head – output the first part of files
tail – output the last part of files
cat – concatenate and print files
sort – sort lines of text files
grep – search the input files for lines containing a match to a given pattern list
uniq – report or omit repeated adjacent lines
awk – pattern scanning and text processing language
Bash functions
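To get a feel for these commands before touching the real data, here is a small sketch that runs each of them on a tiny, made-up CSV. The file contents, the variable names and the col function are illustrative assumptions, not taken from the Facebook data set.

```shell
# Build a tiny, made-up CSV (NOT the real Facebook data) to experiment on.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.csv"
cat > "$sample" <<'EOF'
status_id,status_type,num_reactions
1,photo,10
2,video,25
3,photo,7
4,link,25
5,video,3
EOF

# head: the first two lines (header plus first record)
head -n 2 "$sample"

# tail: everything from line 2 onwards, i.e. the records without the header
tail -n +2 "$sample"

# grep: count the records whose type is "photo"
photo_count=$(grep -c ',photo,' "$sample")

# awk + sort + uniq: each distinct status type with its number of occurrences
# (uniq only merges adjacent duplicates, hence the sort before it)
type_counts=$(tail -n +2 "$sample" | awk -F, '{ print $2 }' | sort | uniq -c)

# A small Bash function: print column N of a simple comma-separated file
col() { awk -F, -v n="$1" '{ print $n }' "$2"; }

# Largest value in column 3 (header skipped)
max_reactions=$(col 3 "$sample" | tail -n +2 | sort -n | tail -n 1)

echo "photo rows: $photo_count"
echo "$type_counts"
echo "max reactions: $max_reactions"
```

Splitting on commas with awk -F, only works while no field contains a quoted comma, which is why the tutorial relies on csvkit for the real file.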

Preview

This dataset is small (a toy example), and we could in principle open it in a text editor or in Excel. However, real-world datasets are often larger and too cumbersome to open in their entirety. Instead, let’s get a sneak peek at the data using the csvstat command from the csvkit tool (pip install csvkit).

Stats

Listing the column names:

$ csvstat -n facebookdata.csv

Output:

1: status_id
2: status_message
3: link_name
4: status_type
5: status_link
6: status_published
7: num_reactions
8: num_comments
9: num_shares
10: num_likes
11: num_loves
12: num_wows
13: num_hahas
14: num_sads
15: num_angrys
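If csvkit is not installed, a similar listing can be approximated with standard tools. This is a sketch that assumes the header row contains no quoted commas; the function names list_columns and count_columns are hypothetical, and the demo runs on a header-only stand-in file built from the column names listed above rather than on the real data.

```shell
# Print the header fields of a CSV, one per line, numbered from 1.
# Assumption: the header row contains no quoted commas.
list_columns() {
  head -n 1 "$1" | tr ',' '\n' | nl -w2 -s': '
}

# Count the header fields under the same assumption.
count_columns() {
  head -n 1 "$1" | awk -F, '{ print NF }'
}

# Demo on a header-only stand-in file (not the real data set)
demo=$(mktemp)
echo 'status_id,status_message,link_name,status_type,status_link,status_published,num_reactions,num_comments,num_shares,num_likes,num_loves,num_wows,num_hahas,num_sads,num_angrys' > "$demo"

list_columns "$demo"
count_columns "$demo"   # prints 15
```

On the real file the calls would be list_columns facebookdata.csv and count_columns facebookdata.csv.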

Counting the rows:

$ csvstat --count facebookdata.csv

Output:

Row count: 3222

The dataset has a total of 15 columns and 3222 rows.
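A tempting shortcut for counting rows is wc -l, but it counts lines, not CSV records: a quoted text field that contains a line break spans several lines. Whether the real file has such fields is not stated here, so treat the following as a sketch on made-up data. The count_records function is a hypothetical helper that tracks double-quote parity in awk; it assumes standard CSV quoting, where embedded quotes are doubled.

```shell
# Count CSV records (excluding the header) by tracking double-quote parity:
# a record is complete on a line where the running quote count is even.
count_records() {
  awk '
    {
      quotes += gsub(/"/, "&")          # gsub returns the number of quotes on this line
      if (quotes % 2 == 0) records++    # even parity: we are outside any quoted field
    }
    END { print records - 1 }           # subtract the header row
  ' "$1"
}

# Made-up sample with a multi-line message and an escaped quote
demo=$(mktemp)
cat > "$demo" <<'EOF'
status_id,status_message,num_reactions
1,"Hello
world",10
2,"She said ""hi""",5
3,plain,7
EOF

echo "naive line count: $(tail -n +2 "$demo" | wc -l)"   # 4 lines
echo "record count:     $(count_records "$demo")"        # 3 records
```

If the two numbers differ on a real file, line-oriented tools such as head, grep and sort will split some records, and a CSV-aware tool such as csvkit is the safer choice.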

Data Preview

Previewing is often the first thing to do when you get your hands on new data: it gives you a sense of what the data contains, how it is organized, and whether it makes sense in the first place. To preview the data, we can combine the commands csvcut, csvlook and head:

$ csvcut -c 1,4,7-11 facebookdata.csv | csvlook | head -n 50

The csvcut command cuts (extracts) a given set of columns (here 1, 4 and 7-11), csvlook renders them as a readable table, and head keeps the first 50 lines of the result. Note that we have left out columns 2 and 3 (status_message, link_name), which are wide and wouldn’t fit properly in the preview above. See you soon in Part II ⏰!
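Since Bash functions are among the learning objectives, the preview pipeline above can be wrapped into a small reusable function. This is a sketch: the name preview, its argument order and its defaults are assumptions made for illustration, and it requires csvkit to be installed.

```shell
# preview FILE [COLUMNS] [LINES]
# Hypothetical wrapper around the csvcut | csvlook | head pipeline.
# COLUMNS defaults to 1,4,7-11 and LINES to 50, as in the command above.
preview() {
  local file=$1
  local cols=${2:-1,4,7-11}
  local lines=${3:-50}
  if ! command -v csvcut >/dev/null 2>&1; then
    echo "csvkit not found; install it with: pip install csvkit" >&2
    return 127
  fi
  csvcut -c "$cols" "$file" | csvlook | head -n "$lines"
}

# Usage:
#   preview facebookdata.csv              # same output as the command above
#   preview facebookdata.csv 1,7-9 20     # ids and engagement counts, 20 lines
```

Note that head counts the table’s header and separator lines too, so 50 lines show slightly fewer than 50 records.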

Written by Scientific Programming School
