Is a pantomath and a former entrepreneur. Currently, he is in a harmonious and symbiotic relationship with Data.

‘If You Smell What The Rock Is Cooking!’

This is exactly what came to my mind when I first read about a hierarchical clustering algorithm called ‘ROCK’. The creators of this technique unknowingly drew an analogy between Dwayne Johnson, the versatile American *actor, producer, retired professional wrestler, and former American football and Canadian football player*, and a clustering technique that solves the problem of using data such as his *long list of achievements* as a variable in a clustering exercise.

In other words, **ROCK**, not Mr. …

Fuzzy logic principles can be used to cluster multidimensional data, assigning each point a *membership* in each cluster center ranging from 0 to 100 percent. This can be very powerful compared to traditional hard-threshold clustering, where every point is assigned a crisp, exact label. The algorithm works by assigning each data point a membership in each cluster center based on the distance between the two: the nearer a data point is to a cluster center, the higher its membership in that cluster. …
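The distance-based membership described above can be sketched as follows, assuming the standard fuzzy c-means membership formula (the function name, the fuzzifier `m=2`, and the toy data are illustrative, not from the article):

```python
import numpy as np

def fuzzy_memberships(points, centers, m=2.0):
    """Fuzzy c-means style memberships: each point gets a degree of
    belonging (0..1) to every center, driven by relative distances.
    `m` is the usual fuzzifier exponent; m=2 is a common default."""
    # distances from every point to every center, shape (n_points, n_centers)
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    d = np.fmax(d, 1e-12)                        # guard against division by zero
    inv = d ** (-2.0 / (m - 1.0))                # nearer center => larger weight
    return inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1

pts = np.array([[0.0, 0.0], [9.0, 9.0], [5.0, 5.0]])
ctrs = np.array([[0.0, 0.0], [10.0, 10.0]])
u = fuzzy_memberships(pts, ctrs)
```

A point sitting on a center gets membership near 1 for that center, while the point equidistant from both centers is split 50/50 — the "soft" labels that hard clustering cannot express.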

The exploratory nature of data analysis and data mining makes clustering one of the most common tasks in these kinds of projects. Increasingly, these projects come from application areas such as biology, text analysis, and signal analysis, and involve ever-larger datasets, both in the number of examples and in the number of attributes. The biggest challenge with clustering in real-life scenarios is the volume of the data and the consequent increase in complexity and in the computational power required.

These problems have opened up a search for algorithms able to reduce this data overload. Some solutions…

Clustering is a process of grouping data into several clusters or groups so that the data within one cluster has maximum similarity and the data between clusters has minimum similarity. K-means (Duda & Hart, 1973; Bishop, 1995) has long been the workhorse for metric data. Its attractiveness lies in its simplicity and in its local-minimum convergence properties. In the K-means algorithm, the center of the cluster, or centroid, is the starting point for each group. …
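A minimal sketch of the K-means loop described above (Lloyd's algorithm): pick starting centroids, then alternate between assigning points to their nearest centroid and recomputing centroids until nothing moves. The function name and toy data are illustrative:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain Lloyd's algorithm: random starting centroids, then
    alternate assignment and centroid-update steps until converged."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # move each centroid to the mean of its assigned points
        new = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new, centroids):   # converged (to a local minimum)
            break
        centroids = new
    return labels, centroids

X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
labels, centroids = kmeans(X, k=2)
```

Note the convergence is only to a *local* minimum, as the article says: a different random start can yield a different partition, which is why K-means is usually run several times.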

This article assumes that you have a basic understanding of A/B Testing and statistical tests. Here we will discuss some tips to ensure the success of A/B Tests.

A/B testing (also known as split testing or bucket testing) is a method of comparing two versions of a webpage or app against each other to determine which one performs better. A/B testing is essentially an experiment where two or more variants of a page are shown to users at random, and statistical analysis is used to determine which variation performs better for a given conversion goal.
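As one concrete example of the "statistical analysis" step, conversion rates of two variants are often compared with a two-proportion z-test. This is a sketch under that assumption (the function name and the sample numbers are illustrative):

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test for comparing conversion rates of
    variants A and B under H0: both variants convert equally."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# e.g. 200/2000 conversions on A vs 260/2000 on B
z, p = two_proportion_ztest(conv_a=200, n_a=2000, conv_b=260, n_b=2000)
```

A small p-value (conventionally below 0.05) suggests the difference in conversion rates is unlikely to be random noise; otherwise the test is inconclusive.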

**Variant**: Variant is the term…

These days analytics professionals favor Neural Networks (NN) over SVMs in pursuit of higher accuracy. We can find many papers that prove the superiority of NN over SVM. This is also due to the fact that training an NN that performs better than an SVM is an opportunity to publish a paper, whereas a paper is less likely to be published if the SVM scores over the NN!

In this context, this article explores the superiority of SVM, *the crumbling hero*, over NN.

Before getting down to business, let us first look at the intuitive difference between…

Kernels, or kernel methods (also called kernel functions), are sets of algorithms used for pattern analysis. They are used to solve a non-linear problem with a linear classifier. Kernel methods are employed in SVMs (Support Vector Machines), which are used in classification and regression problems. The SVM uses what is called the “kernel trick”, in which the data is transformed and an optimal boundary is found for the possible outputs.

SVM algorithms use a set of mathematical functions that are defined as the kernel. The function of kernel is to take data as input…
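One such kernel is the RBF (Gaussian) kernel, a minimal sketch of which follows; the function name and gamma value are illustrative. The key idea is that the kernel scores the similarity of two points as if they had been mapped into a much higher-dimensional feature space, without ever computing that mapping explicitly:

```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """RBF (Gaussian) kernel: k(x, y) = exp(-gamma * ||x - y||^2).
    Equivalent to an inner product in an infinite-dimensional feature
    space, which is what lets a linear boundary in that space act as
    a non-linear boundary in the original input space."""
    x, y = np.asarray(x), np.asarray(y)
    return np.exp(-gamma * np.sum((x - y) ** 2))

k_same = rbf_kernel([1.0, 2.0], [1.0, 2.0])   # identical points: similarity 1
k_far  = rbf_kernel([1.0, 2.0], [4.0, 6.0])   # similarity decays with distance
```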

> The name xgboost, though, actually refers to the engineering goal to push the limit of computations resources for boosted tree algorithms. Which is the reason why many people use xgboost. — Tianqi Chen

**XGBoost** or e**X**treme **G**radient **Boost**ing is an optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable. It implements machine learning algorithms under the Gradient Boosting framework. XGBoost provides parallel tree boosting (also known as GBDT or GBM) that solves many data science problems in a fast and accurate way. The same code runs on major distributed environments (Kubernetes, Hadoop, SGE, MPI, Dask) and…

The Exponential distribution is the only memoryless continuous distribution. Its cousin, the Geometric distribution, is also memoryless, but it is a discrete distribution.
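The memoryless claim can be checked directly from the Exponential survival function $P(X > t) = e^{-\lambda t}$: having already waited $s$ units tells you nothing about the remaining wait.

```latex
P(X > s + t \mid X > s)
  = \frac{P(X > s + t)}{P(X > s)}
  = \frac{e^{-\lambda (s+t)}}{e^{-\lambda s}}
  = e^{-\lambda t}
  = P(X > t)
```

The same identity holds for the Geometric distribution with $P(X > n) = (1-p)^n$, but only at integer $s$ and $t$, which is why it is the discrete counterpart.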

I was recently working on a Market Mix Model, in which I had to predict sales from impressions. While working on one aspect of it, I was confronted with the problem of choosing between a Random Forest and XGBoost. This led to the inception of this article.

Before we get down to the arguments in favor of any of the algorithms, let us understand the underlying idea behind the two algorithms in brief.

The term gradient boosting consists of two sub-terms, gradient and boosting. Gradient boosting re-defines boosting as a numerical optimization problem where the objective is to minimize…
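That optimization view can be sketched in a few lines: for squared loss, the negative gradient of the objective is simply the current residual, so each boosting round fits a weak learner (here a decision stump) to the residuals and takes a damped step in that direction. This is a toy illustration of the general idea, not XGBoost's actual implementation; all names and data are illustrative:

```python
import numpy as np

def fit_stump(x, residual):
    """Least-squares best threshold split on x for the residual.
    Returns (threshold, left_value, right_value)."""
    best = None
    for t in np.unique(x):
        left, right = residual[x <= t], residual[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        pred = np.where(x <= t, left.mean(), right.mean())
        err = ((residual - pred) ** 2).sum()
        if best is None or err < best[0]:
            best = (err, t, left.mean(), right.mean())
    return best[1:]

def gradient_boost(x, y, n_rounds=50, lr=0.1):
    """Gradient boosting with squared loss: each round fits a stump
    to the current residuals (the negative gradient of the loss)
    and adds a small step (lr) in that direction."""
    pred = np.full_like(y, y.mean(), dtype=float)  # start from the mean
    for _ in range(n_rounds):
        residual = y - pred            # negative gradient of 0.5*(y - F)^2
        t, lv, rv = fit_stump(x, residual)
        pred += lr * np.where(x <= t, lv, rv)
    return pred

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.0, 1.2, 0.9, 5.0, 5.1, 4.8])
pred = gradient_boost(x, y)
```

After enough rounds the summed stumps drive the residuals, and hence the loss, toward zero, which is exactly the "numerical optimization" framing of boosting.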