Insights from data? What to look for when searching for insights?

This post is a revised version of a presentation given at the Hack Santa Monica Meetup on November 17th, 2016. To learn more about the Hack Santa Monica Meetup, please look here.

The Hack Santa Monica Meetup was initiated by the non-profit SixThirty Group (formerly known as Team SixThirty). For more information about SixThirty Group, please check here.


People in the Business Intelligence field often say that the first thing to do is to visualize the data. However, one might be confused: among the various visualization methods, which one will reveal insights? And, more precisely, what counts as an insight?

Insights, as the word literally suggests, are things that are embedded in data and waiting for you to discover. That is a somewhat romantic way to put it, and the process of discovering one takes a lot of patience and caution. Different people might be looking at different things. Accountants may look at ledgers and balance sheets, whereas economists may look at annual labor data or stock market data.

Here I use a data set from the Santa Monica Open Data Portal that describes Santa Monica’s fire report records.

Key Takeaways

  • Look for patterns
  • Look for anomalies
  • Know the context


Data Transformation Using the pandas Library in Python

“Over half of the time, analysts are trying to import or clean the data.”

— By numerous John/Jane Does of data analysts

Data these days can flow in from various sources: the web, databases, local files, user input, etc. Analysts now often have to work with various input formats and make them compatible with each other before analysis. Though sometimes considered a data engineer’s work, data preparation is still an essential skill for all data analysts, especially those who work in small to medium-sized firms (as I do now).

I am going to introduce data reading and manipulation with the pandas library in Python 3. I have recently worked extensively with pandas in Python 3 and have started to realize how powerful the library's components are. In this post, I will introduce the one I use most frequently, groupby().
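As a rough illustration of what groupby() does, here is a minimal sketch. The DataFrame, its column names (incident_type, response_minutes), and its values are all made up for this example and are not taken from any real data set. The idea is to split the rows by a key column and then aggregate each group.

```python
import pandas as pd

# Hypothetical incident records; the column names and values are invented
# for illustration and do not come from any real data set.
records = pd.DataFrame(
    {
        "incident_type": ["fire", "medical", "fire", "medical", "alarm"],
        "response_minutes": [6.0, 4.0, 8.0, 5.0, 3.0],
    }
)

# Split the rows by incident type, then compute the number of rows and the
# average response time within each group.
summary = (
    records.groupby("incident_type")["response_minutes"]
    .agg(["count", "mean"])
    .reset_index()
)

print(summary)
```

By default, groupby() sorts the group keys, so the summary has one row per incident type in alphabetical order. Calling reset_index() turns the group key back into an ordinary column, which is often handier for further joins or plotting.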
