A Step by Step Backpropagation Example


Backpropagation and gradient descent are among the most important topics in neural networks. Because they rest on a mathematical foundation built from derivatives, they can be hard to understand. No worries: this blog post demystifies the concept by breaking the problem down into a step-by-step derivation.
My suggestion while reading: keep a pen and paper handy and work through it yourself. Happy learning.

Matt Mazur

Background

Backpropagation is a common method for training a neural network. There is no shortage of papers online that attempt to explain how backpropagation works, but few that include an example with actual numbers. This post is my attempt to explain how it works with a concrete example that folks can compare their own calculations to in order to ensure they understand backpropagation correctly.

If this kind of thing interests you, you should sign up for my newsletter where I post about AI-related projects that I’m working on.

Backpropagation in Python

You can play around with a Python script that I wrote that implements the backpropagation algorithm in this GitHub repo.

Backpropagation Visualization

For an interactive visualization showing a neural network as it learns, check out my Neural Network visualization.

Additional Resources

If you find this tutorial useful and want to continue learning about neural networks, machine learning, and…


Introduction to Anomaly Detection: Concepts and Techniques


Great article about anomaly detection and its techniques from @srinath_perera.

My views of the World and Systems

Why Anomaly Detection?

Machine Learning has four common classes of applications: classification, predicting the next value, anomaly detection, and discovering structure. Among them, anomaly detection identifies data points that do not fit well with the rest of the data. It has a wide range of applications such as fraud detection, surveillance, diagnosis, data cleanup, and predictive maintenance.

Although it has been studied in detail in academia, applications of anomaly detection have been limited to niche domains like banking, financial institutions, auditing, and medical diagnosis. However, with the advent of IoT, anomaly detection is likely to play a key role in IoT use cases such as monitoring and predictive maintenance.

This post explores what anomaly detection is, surveys different anomaly detection techniques, discusses the key ideas behind them, and wraps up with a discussion of how to make use of the results.

Is it not just Classification?

The answer is yes if…


R in Data Profiling and Cleansing


Data profiling and data cleansing are essential steps in data processing. Poor data quality, and analysis performed on dirty data, is a primary reason business insights fail. Most of a project timeline is spent cleaning the data, checking its quality, standardizing it, and getting it into the right format for use. This process is also collectively called data munging.

  • Basic summary layout of the Data
  • Converting to appropriate data types
  • Correcting the Missing Values
  • Normalizing the Data
  • Use of DataCombine package
  • Reshaping the Data using reshape2 package

ETL developers spend much of their time rectifying errors in source data and applying quality checks before loading it into a database for downstream users. In data insights projects, data analysts do the same kind of data scrubbing before applying any statistical operation. Data cleaning is the process of transforming raw data into consistent data. Common data errors include missing data, datatype mismatches, special characters, stray whitespace, and date format errors.
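The kinds of fixes listed above can be sketched in a few lines of base R. The tiny data frame below is made up for illustration (its column names and values are not from the post's dataset):

```r
# A made-up dirty data frame: stray whitespace, a "NA" string, a typo,
# and inconsistent date formats.
df <- data.frame(
  id     = c("1", "2", "3", "4"),
  amount = c("100", " 250 ", "NA", "3OO"),
  stringsAsFactors = FALSE
)

str(df)       # basic summary layout of the data
summary(df)

# Trim whitespace and convert to the appropriate numeric type;
# unparseable values such as "3OO" become NA (with a warning).
df$amount <- suppressWarnings(as.numeric(trimws(df$amount)))

# Correct the missing values, here by imputing the column mean
df$amount[is.na(df$amount)] <- mean(df$amount, na.rm = TRUE)

# Normalize the column to zero mean and unit variance
df$amount_scaled <- as.numeric(scale(df$amount))
```

This is only one reasonable policy for missing values; depending on the analysis you might instead drop the rows or impute the median.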

Dirty Data

I have modified a clean dataset into a dirty one in order to work through some data cleaning examples. Let's apply some data cleansing techniques using R before performing the statistical analysis.

Please refer to the code snippets and datasets at the GitHub link below.

https://github.com/kannandreams/R_Snippets/blob/master/R_DataProfile.R

Using R in Extract , Transform and Load


Business Intelligence is an umbrella term that includes ETL, data manipulation, business analytics, data mining, and visualization, and it draws on other statistical techniques as well. Let's study the most commonly used techniques in BI and apply them by building a sample BI application. We will use the R language (open-source software for statistical computing and graphics) to build an end-to-end business intelligence application. R has powerful packages for data analysis, regression modeling, and visualization.

Overview of ETL

  • Acquiring the Data
  • Extracting the Data from the CSV
  • Importing the Data from Relational Databases
  • Data Transformation using dplyr Package
  • Exporting the transformed Data

ETL

The main function of data integration in Business Intelligence is to acquire data from various source systems, change or modify its format to match the target, and load it into the target system. These processes are called Extract, Transform, and Load. Different products are available in the market to perform ETL, chosen based on criteria like infrastructure, scalability, and price. We will see how to import data from sources, apply important transformation techniques, and load the data into a target quickly using R.

Acquiring the Data

To start building our Business Intelligence application we need data. Let's use the simple and popular bike sharing dataset. Bike sharing systems are for renting bicycles, where the process of obtaining membership, rental, and bike return is automated via a network of kiosk locations throughout a city. Using these systems, people are able to rent a bike from one location and return it to a different place on an as-needed basis. The dataset comes from a popular data science and analytics competition website:

http://www.kaggle.com/c/bike-sharing-demand/data

The dataset we selected for our development is in CSV format. But you might need to extract data from database tables, websites, etc. instead. For that reason, example code, recommended packages, and links are shared as well.
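Reading the CSV into R is a one-liner with `read.csv`. To keep the sketch self-contained, a tiny stand-in file with bike-sharing-style columns is written first; in practice you would point `read.csv` at the file downloaded from Kaggle (the column names below are illustrative):

```r
# Write a tiny stand-in for the downloaded Kaggle file
sample_csv <- tempfile(fileext = ".csv")
writeLines(c("datetime,season,count",
             "2011-01-01 00:00:00,1,16",
             "2011-01-01 01:00:00,1,40"), sample_csv)

# Extract: load the CSV into a data frame
bike <- read.csv(sample_csv, stringsAsFactors = FALSE)

str(bike)    # column names and types
nrow(bike)   # number of rental records
```

`stringsAsFactors = FALSE` keeps character columns as plain strings, which is usually what you want when the data will be cleaned further.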

Extracting the Data from Source

Before starting, I assume that you have RStudio installed on your platform. You can find plenty of tutorials online to learn R basics, and that basic knowledge will be helpful.

Please refer to the complete code used in this post:

https://github.com/kannandreams/R_Snippets/blob/master/R_ETL.R

Importing the Data from Relational Database

Generally, any relational database can be accessed through an ODBC interface, and R provides the RODBC library to connect and query. An advantage of RODBC is that the function arguments stay the same; only the connection string details change according to your database. There are also database-specific backend libraries, such as RMySQL, ROracle, and RpgSQL, and for performance in data operations the native package is the better choice. Consider a scenario where we want to access the same bike sharing data from one of these databases. Below is an example of reading from an ODBC source, which will be helpful if you are using a database as the source for your BI application.
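A minimal RODBC sketch looks like the following. The DSN name `"BikeDSN"`, the credentials, and the table name `bike_sharing` are all hypothetical placeholders; substitute the details of your own data source (this cannot run as-is without a configured ODBC DSN):

```r
library(RODBC)

# Open a channel to the ODBC data source (placeholder DSN and credentials)
ch <- odbcConnect("BikeDSN", uid = "user", pwd = "secret")

# Pull the table into a data frame with an ordinary SQL query
bike <- sqlQuery(ch, "SELECT * FROM bike_sharing")

# Always release the connection when done
close(ch)
```

Switching to a native package like RMySQL changes only the connection step; the query-to-data-frame pattern stays the same.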

Importing Data from Non-Relational Databases

Non-relational databases, such as document stores and key-value stores, hold unstructured data and differ from the RDBMS model. They are mostly used in big data environments. R provides various packages to pull data from non-relational databases: for example, the RCassandra library connects to Cassandra, and the rmongodb library to MongoDB.

Importing XML and JSON Files

XML and JSON are file formats mostly used in web services to transmit request and response data. The XML and RJSONIO libraries in R can be used to import XML and JSON sources. Please refer to the packages on the CRAN website:

http://cran.r-project.org/web/packages/XML/index.html

http://cran.r-project.org/web/packages/RJSONIO/index.html
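A short sketch of both imports follows. Tiny inline documents stand in for real files so the example is self-contained; in practice you would pass file paths (or URLs) to `xmlParse()` and `fromJSON()`, and the record structure here is invented for illustration:

```r
library(XML)
library(RJSONIO)

xml_txt  <- "<rides><ride><season>1</season><count>16</count></ride>
             <ride><season>1</season><count>40</count></ride></rides>"
json_txt <- '[{"season": 1, "count": 16}, {"season": 1, "count": 40}]'

# XML: parse the document, then flatten repeating records into a data frame
doc <- xmlParse(xml_txt, asText = TRUE)
xdf <- xmlToDataFrame(doc)          # one row per <ride>

# JSON: fromJSON returns a list of records, which we bind into a data frame
lst <- fromJSON(json_txt)
jdf <- do.call(rbind.data.frame, lapply(lst, as.list))
```

`xmlToDataFrame` works well when the XML is a simple list of uniform records; deeply nested documents need custom traversal with the XPath helpers in the XML package.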

Transforming the Data

In the previous section we learned how to bring data from various sources into R. Now we have the raw data in R in the form of a data frame. Data manipulation is a very important part of the ETL process: we apply business rules to produce the data we require and pass it to the target system.

Using dplyr Package

dplyr is a package that provides a set of tools for efficiently manipulating datasets in R. For example, SQL has commonly used data manipulation operations like selecting columns, filtering records, sorting, and grouping. We can do the same in R using dplyr.
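The SQL-to-dplyr correspondence can be sketched on R's built-in `mtcars` dataset, standing in here for the bike sharing data:

```r
library(dplyr)

cars   <- select(mtcars, mpg, cyl, hp)       # SELECT mpg, cyl, hp
fast   <- filter(cars, hp > 100)             # WHERE hp > 100
fast   <- arrange(fast, desc(mpg))           # ORDER BY mpg DESC
by_cyl <- group_by(fast, cyl)                # GROUP BY cyl

# Aggregate per group, like SELECT cyl, AVG(mpg) ... GROUP BY cyl
stats  <- summarise(by_cyl, avg_mpg = mean(mpg))
stats
```

Each verb takes a data frame as its first argument and returns a new one, which is what makes the verbs easy to chain.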

Deriving new Attributes or Metrics

Using the dplyr package, we can even perform multiple operations in a single line by nesting one function inside another, much like writing SQL queries. Let's see an example below with additional options like ordering, and deriving a new column based on existing columns. Creating derived columns from other columns is called mutate.
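Nesting and mutate together look like this on `mtcars`; the derived power-to-weight column is a made-up metric for illustration:

```r
library(dplyr)

# Derive a new column with mutate, then order by it -- all in one
# nested expression, read inside-out like a SQL subquery
result <- arrange(
  mutate(mtcars, power_to_weight = hp / wt),
  desc(power_to_weight)
)

head(result[, c("hp", "wt", "power_to_weight")])
```

Reading nested calls inside-out gets awkward as pipelines grow, which is why dplyr is commonly paired with the `%>%` pipe to write the same steps left to right.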

Exporting the Data to Target

We are now at the end of the ETL process. The manipulated data resides in R as a data frame. Now we need to export that data frame to a target system or to outbound files like CSV, tab-delimited, and text files.
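The file exports are one-liners in base R; the `output.*` file names below are placeholders for wherever your target files live:

```r
# CSV, tab-delimited, and space-delimited text exports of a data frame
write.csv(mtcars, "output.csv", row.names = FALSE)
write.table(mtcars, "output.tsv", sep = "\t", row.names = FALSE)
write.table(mtcars, "output.txt", row.names = FALSE)
```

Setting `row.names = FALSE` drops R's row labels, which downstream systems rarely expect as a data column.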

Exporting the Data to Database

We have seen how to import data from a database using the RODBC library. Other packages on CRAN also provide database interfaces. Let's consider SQLite (an open-source transactional SQL relational database) as our target; please refer to https://sqlite.org/ to learn about its features. We want to load the data frame from R into a target table in SQLite. For this, the RSQLite package provides an interface to SQLite, which uses standard SQL syntax. Working with any database starts with creating the database, defining the table, and loading the data with SQL data manipulation statements, and we can do all of this within R through RSQLite.
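A minimal RSQLite sketch follows; an in-memory database is used so the example is self-contained, and the table name `bike` is a placeholder (in practice you would pass a file path instead of `":memory:"`):

```r
library(DBI)
library(RSQLite)

# Create/connect to the target database
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# dbWriteTable creates the table definition AND loads the rows in one call
dbWriteTable(con, "bike", mtcars)

# Standard SQL works against the loaded table
cnt <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM bike")

dbDisconnect(con)
```

`dbWriteTable` infers the SQLite column types from the data frame, so no explicit `CREATE TABLE` statement is needed for simple loads.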

SlideShare Web Traffic Data Visualization using R


I have extracted my SlideShare web data and did some data wrangling and visualization. You can do the same by downloading the data from your own SlideShare account.

The R code (some unneeded code snippets exist as well, which can be omitted) and file definitions are available at https://github.com/kannandreams/slideshare

Basic data cleansing process:

  • Handling missing values
  • Grouping the various traffic sources into channel groupings
  • Generalizing the various organisations into the simpler groups I needed
  • Date format: grouping to Year-Month


Visualization using the googleVis and ggplot2 packages in R. Merged the charts using gvisMerge.


R Data Visualization Cookbook


Finally it happened: my name in a book! One wish from my bucket list fulfilled, and soon I will be the author of a book 🙂

The ‘R Data Visualization Cookbook‘, which I contributed to as a technical reviewer, has just been published by PACKT Publishing: https://www.packtpub.com/


About the book :

Over 80 recipes to analyse data and create stunning visualizations with R. This book is for new and intermediate data enthusiasts interested in learning about data visualization using R and data exploration techniques. I have attached the table of contents, which lists the visualization graphs and techniques covered in the book.

It’s a great learning experience for me.

What was my contribution to this book:

  • Reviewed each chapter and the code snippets used in the book.
  • Validated the R code, sample datasets, and visualization output, and recommended code changes for better clarity and bug fixes.
  • Suggested chapter improvements.
About me in the reviewer section:

Machu Picchu !!! – Pencil Drawing


After a long time, I enjoyed my pencil work. I did this pencil drawing to display in my office meeting room, which is named "Machu Picchu". I should thank my mentor Raghu for giving me this opportunity 🙂 Machu Picchu (located in Peru) is one of the most important archaeological sites in the world: http://en.wikipedia.org/wiki/Machu_Picchu I was happy when I saw the result after three hours of effort. I have lost touch with drawing; there are flaws in it, but I will do better next time for sure.

Real photograph of Machu Picchu


 

My version of Walk to Remember


I started this blog some years back, moving away from my old blog, which was purely about Oracle.

My intention was to post about my own perspective on life, technology, and entrepreneurship, and to share my knowledge with the world. But, my bad, I still have not posted anything other than technology. Now I feel I should, so here goes.

Let me start by sharing a Tamil (my native language) poem I wrote two years back. I wrote it on a day when I was bored and lonely, and everything around me seemed to be moving faster than me. I was staying in Indira Nagar, Bangalore, near 100 Ft Road. It was a pleasant evening with drizzling rain. I stepped out of my home and started walking aimlessly…

Just a slow walk, with a melodious song running inside me. I felt like writing a poem about that day. "A Walk to Remember" came to my mind, and I looked around me.

I penned these beautiful moments along the way in my poem.

Life is still beautiful, with lots of hopes and dreams to come…
