codeburst

Bursts of code to power through your day. Web Development articles, tutorials, and news.

How to explore Facebook data at the command-line? (Part I — Data preview)

FB like, share, and comment analysis at the command line; image adapted from unsplash.com

Now that Facebook has more than a billion active users, it has become a hub for personal, product, and corporate branding. Companies want to understand what people think about topics related to their business so that they can make their products and marketing more relevant to their customers. One way to achieve this goal is to analyse a company’s Facebook pages, which can help marketers make their content more relevant.

Before we go any further, let’s set up our working environment by creating a folder on the Desktop. Assuming we have a Linux-based OS (e.g., Ubuntu) on our computer, let’s fire up a command line, create the analysis folder, and navigate into it:

cd ~/Desktop
mkdir FBdata
cd FBdata

This will create a folder FBdata on your Desktop. Next, we download the data. In this project, we’re going to mine a data set generated by using a Facebook scraper on a particular Facebook page (undisclosed).

We use a data set generated by using a simple Facebook scraper. Image via unsplash.com

The goal of this experiment is to find the most vibrant status message on that page with just one Bash command. Download the data with the command below, which saves it in the current folder as facebookdata.csv.

wget https://www.scientificprogramming.io/datasets/facebookdata.csv
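If you rerun the tutorial several times, you may not want to download the file again on every run. The following is a minimal sketch of a Bash function that skips the download when a non-empty copy already exists. The name fetch_if_missing is a hypothetical helper introduced here for illustration, not part of the original workflow.

```shell
# fetch_if_missing URL FILE
# Hypothetical helper: download URL to FILE only when FILE is missing or empty.
# Uses wget's -q (quiet) and -O (write to the named file) options.
fetch_if_missing() {
  local url=$1 file=$2
  if [ -s "$file" ]; then
    # -s is true when the file exists and has a size greater than zero
    echo "reusing existing $file"
  else
    wget -q -O "$file" "$url"
  fi
}
```

To use it with the data set from this article, call it as: fetch_if_missing https://www.scientificprogramming.io/datasets/facebookdata.csv facebookdata.csv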

Learning objectives

By completing this tutorial, you will learn to use the following Bash commands:

  • head – output the first part of files
  • tail – output the last part of files
  • cat – concatenate and print files
  • sort – sort file contents
  • grep – search the input files for lines containing a match to a given pattern list
  • uniq – report or omit repeated adjacent lines (usually run after sort)
  • awk – pattern scanning and text processing language
  • Bash functions
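To get a feel for how the commands listed above fit together, here is a small sketch that runs them on a tiny, made-up CSV. The column names and values are invented for illustration and are not taken from facebookdata.csv.

```shell
# Build a tiny sample file (hypothetical data, two columns, no quoted fields).
sample=$(mktemp)
printf '%s\n' \
  'status_type,num_likes' \
  'photo,10' \
  'video,25' \
  'photo,7' \
  'link,3' \
  'video,40' > "$sample"

# head / tail: first two and last two lines of the file
head -n 2 "$sample"
tail -n 2 "$sample"

# cat + grep: print the file and keep only lines that contain "video"
cat "$sample" | grep 'video'

# awk + sort + uniq: count posts per type.
# tail -n +2 skips the header; sort makes equal lines adjacent so uniq -c can count them.
types=$(tail -n +2 "$sample" | awk -F',' '{print $1}' | sort | uniq -c)
echo "$types"

# sort + head: the record with the most likes (numeric, descending sort on field 2)
top=$(tail -n +2 "$sample" | sort -t',' -k2,2nr | head -n 1)
echo "most liked: $top"

# awk: sum the likes column, skipping the header (NR > 1)
total_likes=$(awk -F',' 'NR > 1 {sum += $2} END {print sum}' "$sample")
echo "total likes: $total_likes"

rm -f "$sample"
```

Splitting on commas like this only works because the sample has no quoted fields; for real CSV files the csvkit tools used later in the article are the safer choice.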

Preview

facebookdata.csv stats using csvstat

This dataset is small (a toy example), and we could in principle open it in a text editor or in Excel. However, real-world datasets are often larger and cumbersome to open in their entirety. Instead, let’s get a sneak peek at the data using the csvstat command from the csvkit toolkit (pip install csvkit).

Stats

Listing the column names:

$ csvstat -n facebookdata.csv

Output:

1: status_id
2: status_message
3: link_name
4: status_type
5: status_link
6: status_published
7: num_reactions
8: num_comments
9: num_shares
10: num_likes
11: num_loves
12: num_wows
13: num_hahas
14: num_sads
15: num_angrys

Counting the rows:

$ csvstat --count facebookdata.csv

Output:

Row count: 3222

The dataset has a total of 15 columns and 3222 rows.
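As a rough cross-check without csvkit, the same two numbers can be approximated with head and awk. This is only a sketch: csv_ncols and csv_nrows are hypothetical helper names, the sample rows are made up, and the approach assumes that the header contains no quoted commas and that no field contains an embedded newline.

```shell
# Hypothetical helper: count comma-separated fields in the header line.
csv_ncols() { head -n 1 "$1" | awk -F',' '{print NF}'; }

# Hypothetical helper: count physical lines minus the header.
# This overcounts if any field contains an embedded newline.
csv_nrows() { awk 'END {print NR - 1}' "$1"; }

# Demo file: the header names from the column listing, plus two invented rows.
demo=$(mktemp)
printf '%s\n' \
  'status_id,status_message,link_name,status_type,status_link,status_published,num_reactions,num_comments,num_shares,num_likes,num_loves,num_wows,num_hahas,num_sads,num_angrys' \
  '1,hello,,photo,,2018-01-01,5,1,0,5,0,0,0,0,0' \
  '2,world,,video,,2018-01-02,9,2,1,8,1,0,0,0,0' > "$demo"

ncols=$(csv_ncols "$demo")
nrows=$(csv_nrows "$demo")
echo "columns: $ncols, rows: $nrows"

rm -f "$demo"
```

On free-text data such as status messages the line count may differ from what csvstat reports, because csvstat parses quoted fields properly.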

Data Preview

This is often the first thing to do when you get your hands on new data: previewing it gives you a sense of what it contains, how it is organized, and whether the data makes sense in the first place. To preview the data, we can combine the commands csvcut, csvlook, and head:

$ csvcut -c 1,4,7-11 facebookdata.csv | csvlook | head -n 50
FB data preview
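Since Bash functions are among the learning objectives, a preview pipeline like the one above can be wrapped in a reusable function. The sketch below uses plain cut as a stand-in for csvcut so that it runs without csvkit; preview_cols is a hypothetical name, and the demo data is made up and contains no quoted fields.

```shell
# preview_cols FILE COLS [N]
# Hypothetical helper: show the first N lines (default 10) of the selected columns.
# COLS uses cut's field syntax, e.g. 1,4,7-11.
preview_cols() {
  local file=$1 cols=$2 n=${3:-10}
  cut -d',' -f"$cols" "$file" | head -n "$n"
}

# Demo on a small invented file.
demo=$(mktemp)
printf '%s\n' \
  'id,msg,link,type,likes' \
  '1,hello,a,photo,10' \
  '2,world,b,video,25' \
  '3,again,c,link,3' > "$demo"

# Columns 1, 4 and 5; header plus the first record.
preview=$(preview_cols "$demo" 1,4-5 2)
echo "$preview"

rm -f "$demo"
```

For facebookdata.csv itself, keep csvcut inside the function instead of cut: free-text columns such as status_message may contain commas, which cut would treat as field separators.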

The csvcut command helped us cut (extract) a given set of columns (here 1, 4, and 7-11). Note that we left out columns 2 and 3 (status_message, link_name), which are wide and wouldn’t fit properly into the preview above. See you soon in Part II!
