Difference between Anomalies and Outliers

Outlier = legitimate data point that’s far away from the mean or median in a distribution

Anomaly = illegitimate data point that’s generated by a different process than whatever generated the rest of the data
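
To make the distinction concrete, here is a minimal sketch in Python. It flags points that sit far from the median using the modified z-score (median absolute deviation). The function name `flag_far_points`, the 3.5 threshold, and the sample response times are all assumptions chosen for illustration; they do not come from the post being discussed.

```python
from statistics import median


def flag_far_points(values, threshold=3.5):
    """Return indices of points far from the median (modified z-score).

    Hypothetical helper for illustration; the 3.5 cutoff is a common
    rule of thumb, not a value taken from the original post.
    """
    med = median(values)
    # Median absolute deviation: a spread measure that resists extreme values.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median, so the score is undefined; flag nothing.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Made-up response times in ms. Assume 95 is a genuinely slow request and
# -1 is an error code written by a failed logger, not a measurement.
times = [10, 12, 11, 13, 12, 11, 95, -1]
flagged = flag_far_points(times)
print(flagged)
```

Both 95 and -1 get flagged, and the statistic alone cannot tell them apart. Under the assumptions above, 95 is an outlier (legitimate but extreme) while -1 is an anomaly (produced by a different process). Telling them apart takes knowledge of how the data was generated.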

Ravi Parikh has written a very interesting blog post on this topic, "Garbage In, Garbage Out: How Anomalies Can Wreck Your Data". The post goes into more detail about anomalies and how to detect them with proper visualization techniques. He illustrates this with a visualization used to detect election fraud.


Do read the full post!

Interesting Question

Do you have the capability to assess data quality? Or even suggest appropriate analysis visualizations to help distinguish between Anomalies and Outliers? … Vijay Ghei


2 Responses to “Difference between Anomalies and Outliers”
  1. I found this a great post; it gave me a lot of information.

  2. nklata says:

    I have been following Ravi’s blogs and he does make a lot of interesting and valid observations!
