Recently, in my data science program, we were given a project where we needed to create a model to predict house prices. The training data set we received included property data such as square footage, bathroom count, and construction grade. The data I was most excited to begin my analysis with was the location data. We were given columns containing the latitude and longitude coordinates, and the first thing I wanted to do was create a heat map showing the distribution of property prices.
Let’s take a peek at our data:
import pandas as pd
import numpy as np
Using grid search on hyperparameters lets data scientists optimize how well their models learn from the data. Before we dive into what hyperparameters are and how grid search applies to them, let’s figure out where they fit into the overall machine learning process. The flowchart below illustrates where both hyperparameters and hyperparameter tuning (i.e., grid search) take place during the machine learning process.
Like many terms and acronyms you come across when entering the world of machine learning, the difference between ‘parameters’ and ‘hyperparameters’ can be difficult to understand. Let’s take a quick moment to break these two terms down.
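To make the distinction concrete, here is a minimal sketch in plain Python rather than any particular library. The toy data, the `fit` and `mse` helpers, and the grid values are all made up for illustration. The slope and intercept are parameters, because the model learns them from the training data. The learning rate and the number of epochs are hyperparameters, because we choose them before training. Grid search simply tries every combination and keeps the one that scores best on held-out data.

```python
from itertools import product

# Toy data, invented for this example: y is roughly 3*x + 2.
train = [(0.0, 2.1), (1.0, 4.9), (2.0, 8.2), (3.0, 10.8), (4.0, 14.1)]
valid = [(0.5, 3.4), (1.5, 6.6), (2.5, 9.4), (3.5, 12.6)]


def fit(data, learning_rate, n_epochs):
    """Learn the parameters (slope, intercept) by gradient descent."""
    slope, intercept = 0.0, 0.0
    n = len(data)
    for _ in range(n_epochs):
        # Gradients of the mean squared error with respect to each parameter.
        grad_slope = sum(2 * (slope * x + intercept - y) * x for x, y in data) / n
        grad_intercept = sum(2 * (slope * x + intercept - y) for x, y in data) / n
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


def mse(data, slope, intercept):
    """Mean squared error of the fitted line on a data set."""
    return sum((slope * x + intercept - y) ** 2 for x, y in data) / len(data)


# Hyperparameter grid: these values are set by us, not learned.
grid = {"learning_rate": [0.001, 0.01, 0.05], "n_epochs": [10, 100, 1000]}

results = []
for lr, epochs in product(grid["learning_rate"], grid["n_epochs"]):
    slope, intercept = fit(train, lr, epochs)
    # Score each combination on the validation set, not the training set.
    results.append((mse(valid, slope, intercept), lr, epochs))

# The combination with the lowest validation error wins.
best_score, best_lr, best_epochs = min(results)
print(f"best: learning_rate={best_lr}, n_epochs={best_epochs}, mse={best_score:.4f}")
```

In practice a library routine would usually replace the hand-written loop and would add cross-validation, but the idea is the same: an outer search over hyperparameters wrapped around an inner fit of the parameters.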
A big part of assessing the performance of logistic regression models is analyzing the evaluation metrics. In contrast to linear regression, where error is determined by how far estimates are from actuals, with classification modeling each prediction is either correct or incorrect. Because of this distinction, it’s critical to look into these evaluation metrics when evaluating your model.
In this post, we’ll touch upon the four major evaluation metrics but mainly focus on precision and recall. We’ll dive into what these metrics calculate, how they can influence other important metrics, and the circumstances in which one could be more important than the other.
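As a quick illustration, the sketch below computes precision and recall from a small set of made-up labels and predictions. The excerpt above names only precision and recall, so treating accuracy and F1 as the other two metrics is an assumption here, and the `confusion_counts` helper is a hypothetical name used just for this example.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


# Made-up labels and predictions, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

# Precision: of everything predicted positive, how much really was positive?
precision = tp / (tp + fp)
# Recall: of everything that really was positive, how much did we catch?
recall = tp / (tp + fn)
# Accuracy: share of all predictions that were correct.
accuracy = (tp + tn) / len(y_true)
# F1: harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.3f} recall={recall:.3f} "
      f"accuracy={accuracy:.3f} f1={f1:.3f}")
```

Here the model flags three cases as positive and two of them are right, so precision is 2/3. It finds only two of the four real positives, so recall is 1/2, even though accuracy looks respectable at 0.7.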
Prior to starting at Flatiron School, I was buried under balance sheets and P&Ls, working as an accountant. Like my accounting brethren, I spent the majority of my time either in an ERP system, like SAP or NetSuite, or hammering away at VLOOKUPs and various other formulas in Excel.
With accounting, there are only so many ways things can be done. For regulatory reasons, publicly traded companies must follow GAAP and the guidelines set by the SEC and other governing bodies. As a result, a lot of the work is relatively procedural. So, you might understand why I was so overwhelmed…
Data Science Student at Flatiron School