Hamilton County Residential Housing Analysis Using R

About this project

In one of my analytics-focused classes, we briefly used local real estate data to explain regression modeling. I thought it would make a good case study for data analysis. I analyzed the data in RStudio using R.

This dataset is rich enough to yield multiple layers of insight, so I expect to write several articles about it.

The first step is to describe the market.

Business Questions

  • How many transactions occurred?
  • How expensive was the most expensive house? The least expensive?
  • How many transactions had a sale price of 0 or NA?
  • What is the average age of the houses?

For data analysis I use the tidyverse packages. The number of transactions is available from a simple row count with the count function. I refer to them as transactions instead of sales because this data includes changes of ownership through legal instruments such as wills and trusts.
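The row count described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `transactions` data frame and its column names (`sale_price`, `year_built`) are hypothetical stand-ins for the county data, which is not reproduced here.

```r
library(dplyr)  # part of the tidyverse

# Hypothetical stand-in for the county transaction data (assumed column names)
transactions <- tibble(
  sale_price = c(250000, 0, NA, 18950000, 120000),
  year_built = c(1925, 1950, 1901, 2005, 1978)
)

# count() with no grouping variables returns a one-row tibble with a column n
n_transactions <- transactions %>%
  count() %>%
  pull(n)

n_transactions
```

On the toy data this gives 5, one per row. `nrow(transactions)` would return the same number; `count()` is used here only because it is the function named in the text.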

I could summarize the prices individually with calls to the max, min, and mean functions. However, a single call to summary returns all of these values at once.
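As a rough sketch of that comparison, the snippet below runs `summary` on a small made-up price vector. The values are illustrative only, and `sale_price` is an assumed name rather than the dataset's real column.

```r
# Hypothetical prices, including a zero and a missing value
sale_price <- c(250000, 0, NA, 18950000, 120000)

# One call reports min, quartiles, median, mean, max, and the NA count
price_summary <- summary(sale_price)
price_summary

# Individual values can be pulled out by name
price_summary[["Min."]]
price_summary[["Max."]]
price_summary[["NA's"]]

# The separate calls need na.rm = TRUE, or they return NA
max(sale_price, na.rm = TRUE)
```

Note that `summary` drops missing values before computing the statistics and reports them in a separate `NA's` entry, which the individual functions do not do by default.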

The summary output also counts the number of NAs in the column. To find the number of transactions with a price of 0, I filtered for transactions with a price of 0 and counted the result. I assigned the total number of transactions, the number with a price of 0, and the number with a price of NA to variables, then used arithmetic to express them as percentages.
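The filtering and percentage arithmetic might look something like this. It is a sketch under assumptions: the data frame, the `sale_price` column, and the variable names are all hypothetical, and the toy values are not taken from the real dataset.

```r
library(dplyr)

# Hypothetical stand-in data: 8 transactions, two priced at 0 and two missing
transactions <- tibble(
  sale_price = c(250000, 0, NA, 18950000, 120000, 0, 95000, NA)
)

n_total <- nrow(transactions)

# filter() keeps only rows where the condition is TRUE, so NA prices drop out
n_zero <- transactions %>%
  filter(sale_price == 0) %>%
  count() %>%
  pull(n)

# NA prices have to be counted separately with is.na()
n_na <- sum(is.na(transactions$sale_price))

# Express each count as a percentage of all transactions
pct_zero <- 100 * n_zero / n_total
pct_na <- 100 * n_na / n_total
pct_zero_or_na <- 100 * (n_zero + n_na) / n_total
```

Counting the two cases separately matters because `sale_price == 0` evaluates to NA, not FALSE, for missing prices, so a single filter would silently miss them.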

To find the average age of the houses, I used mutate to create an age column by subtracting the year built from 2023. Then I used the mean function on the resulting column.
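A minimal sketch of the age calculation follows. The `year_built` column name and the toy values are assumptions, and so is the use of `na.rm = TRUE`: the write-up does not say how missing build years were handled.

```r
library(dplyr)

# Hypothetical stand-in data with one missing build year
transactions <- tibble(
  year_built = c(1925, 1950, 1901, 2005, NA)
)

reference_year <- 2023  # the year used in the text

# Add an age column: reference year minus year built
with_age <- transactions %>%
  mutate(age = reference_year - year_built)

# Assumption: ignore rows with a missing build year when averaging
mean_age <- mean(with_age$age, na.rm = TRUE)
mean_age
```

Without `na.rm = TRUE`, a single missing `year_built` would make the mean NA, so some explicit choice about missing years is needed.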

Answers

  • As of 1/6/23, there have been 295,661 house transactions in Hamilton County since 1900.
  • The most expensive house cost $18,950,000. The least expensive house cost $0.
  • 74,101 (25.06%) of transactions had a price of $0. 41,975 (14.20%) of transactions had a price of NA. 116,076 (39.26%) of transactions had a price of either $0 or NA.
  • The average house is 85.85 years old.

The analysis so far raises a question: why are there so many transactions with no price? The most obvious possibility is that these properties changed hands through trusts and wills. Confirming that would require more analysis.
