The Women-In-Tech (WIT) group at Capital One is a great resource for women associates working in STEM fields. I regularly follow their Slack channel to read about the accomplishments of my fellow women colleagues across the organisation. Reading about them inspires me every day. I have participated in a few events hosted by this group, and it has always been a joy to interact with these strong and inspiring leaders.
Attending this year’s WIT Professional Development Day was special for two reasons. First, because this was my first time attending this much-coveted annual event. Second, because I’m at a stage in my career…
Having lived in Washington DC for the past year, I have come to realize that the WMATA metro is the lifeline of this vibrant city. The metro network is enormous and well-connected throughout the DMV area. When I first moved to the capital city with no car, I often hopped on the metro to get around. I have always loved train journeys, so unsurprisingly the metro became my favorite way to explore this beautiful city. On my travels, I often notice the product placements and advertisements on metro platforms, near escalators/elevators, inside the metro trains, etc. A good…
Random forest is one of the most popular and powerful machine learning algorithms. It can be used for both classification and regression tasks, which is one reason it is so widely used in the machine learning space.
Random Forest is a supervised learning algorithm. So what exactly is ‘Random Forest’? As the name suggests, this algorithm creates a ‘forest’ with a number of trees. The underlying idea is that combining the predictions of many trees generally gives more accurate and stable results than relying on any single tree. …
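To make the ‘forest of trees’ idea concrete, here is a minimal sketch in plain Python. It is a toy and not how a production library implements random forests: every ‘tree’ is a single-split stump, and the function names (`train_stump`, `train_forest`, `predict`) and the tiny dataset are made up for illustration. It only shows the two ingredients that give the algorithm its name: each tree is trained on a random bootstrap sample with a randomly chosen feature, and the forest predicts by majority vote.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature):
    """Fit a one-split 'tree' on one feature by picking the threshold with the fewest errors."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        # If nothing falls to the right of the split, reuse the left label.
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(lab != left_label for lab in left)
        errors += sum(lab != right_label for lab in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap: sample row indices with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        feature = rng.randrange(len(rows[0]))
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        forest.append(train_stump(sample_rows, sample_labels, feature))
    return forest


def predict(forest, row):
    """Classify a row by majority vote across all stumps in the forest."""
    votes = Counter(
        left if row[feature] <= threshold else right
        for feature, threshold, left, right in forest
    )
    return votes.most_common(1)[0][0]


# Made-up toy data: two well-separated classes described by two features.
rows = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.0), (1.2, 1.9),
        (6.0, 6.5), (7.0, 5.5), (6.5, 7.0), (5.8, 6.1)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

forest = train_forest(rows, labels)
print(predict(forest, (0.5, 0.5)))
print(predict(forest, (9.0, 9.0)))
```

For regression, the same structure applies, except that the trees’ numeric predictions are averaged instead of voted on.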
The city of San Francisco is pure magic and I had an amazing time visiting this Golden City a few months back. So when I chanced upon a very interesting data set of movies and series shot across several locations in San Francisco, I couldn’t stop myself from digging into it. Lo and behold, I found 4 interesting insights!
I recently completed Udacity’s course on A/B testing. It offers a high-level understanding of what a typical A/B test entails before diving into the specifics of each stage of the experimentation process. Needless to say, it was a great learning experience! In this post, I am going to summarize my key learnings from the course and explain how A/B testing benefits companies that are focused on improving user experience.
So, let us dive in!
A/B testing is a method of experimentation to understand how user experiences change following any changes/variations made in the way they interact with a…
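As a rough illustration of what comparing two variations can look like in practice, here is a minimal sketch of a two-proportion z-test in Python. This specific test, the function name `two_proportion_z_test`, and the visitor and conversion counts are assumptions chosen for illustration; they are not taken from the course summary above.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (A) and variation (B).

    Returns the z statistic and a two-sided p-value, using the pooled
    standard error under the hypothesis that both rates are equal.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers: 200 of 2000 control users convert vs. 260 of 2000 in the variation.
z, p_value = two_proportion_z_test(200, 2000, 260, 2000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference between the two groups is unlikely to be due to chance alone, though whether the change is worth launching also depends on how large a difference matters to the business.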
I found this interesting challenge posted by Airbnb on Kaggle 3 years ago. But it is never too late to get your hands dirty with a stimulating data challenge! :)
This Kaggle challenge asks us to predict which country will be a new user’s booking destination. For the purpose of this challenge, we will make use of three datasets provided by Airbnb.
Let us understand what a user profile looks like in the training and testing datasets.
Each user is described by 16 features, which are as follows
The ‘rvest’ library in R is the newest tool in my toolbox for scraping websites! To test it out, I decided to scrape the ‘Trustpilot’ website, which is a popular platform for reviewing services and other websites. In this scraping exercise, I plan to extract reviewer ratings and remarks for Spotify. Let us see what users have to say about Spotify music!
I moved to Minneapolis this summer to pursue my graduate studies at the Carlson School of Management. Having spent close to 6 months in the thriving city of Minneapolis, I can vouch for its exciting pub and bar scene. I decided to find the best bars in the Twin Cities and make a ready-reference directory using Python’s web scraping library ‘BeautifulSoup’. This directory would have the 44 best bars in town with their addresses, phone numbers and website addresses.
So let’s get started and put on our coding hats!
We start by importing the necessary libraries. The BeautifulSoup library in Python is…
I am an avid YouTube user and love watching videos on it in my free time. I decided to do some exploratory data analysis on the YouTube videos streamed in the US. I found the dataset on Kaggle at this link
I downloaded the CSV file ‘USvideos.csv’ and the JSON file ‘US_category_id.json’ from among the geography-wise datasets available. I used a Jupyter notebook for this analysis.
Loading the necessary libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from subprocess import check_output
from wordcloud import WordCloud, STOPWORDS
Nice Ride Minnesota is a bikesharing service with 3000+ bikes and 400+ stations across Minneapolis. These bikes are faster than other ways of getting around. They can be conveniently dropped at any nearby Nice Ride station at the end of the ride. A single ride costs $2, a day pass with unlimited 30-minute rides in a 24-hour period costs $6, and an annual membership costs $75.
I decided to explore the bike history data from the 2017 season to see if the data shows any interesting trends or patterns. The dataset for 2017 is freely available on the following link.
I also found the…