Real Estate Price Estimation

Ishan Gupta

Howrah, West Bengal

Exploring the neighborhoods of New York City to extract the correlation between real estate value and surrounding venues. Intended for potential buyers, real estate developers and planners, and home sellers (advertisement optimization).

Project status: Published/In Market

Artificial Intelligence

Intel Technologies
DevCloud

Overview / Usage

The main goal is to explore the neighborhoods of New York City in order to extract the correlation between real estate value and the surrounding venues.

The idea comes from the process of an ordinary family finding a place to live after moving to another city. Owners and agents commonly advertise that their properties are close to certain kinds of venues, such as supermarkets, restaurants, or coffee shops, highlighting the “convenience” of the location in order to raise the house’s value.

So, can the surrounding venues affect the price of a house? If so, which types of venues have the greatest effect, both positive and negative?

This project is developed for:

  • Potential buyers, who can roughly estimate the value of a house based on the surrounding venues and the average price.

  • Real estate developers and planners, who can decide what kinds of venues to place around their properties to maximize the selling price.

  • Home sellers, who can optimize their advertisements.

Methodology / Approach

The assumption is that real estate price depends on the surrounding venues. Thus, regression techniques will be used to analyze the dataset. The regressors will be the occurrence counts of each venue type, and the dependent variable will be the standardized average price.

At the end, a regression model will be obtained, along with a list of coefficients describing how each venue type may be related to an increase or decrease of a neighborhood’s average real estate price around the mean.
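As a rough illustration of this setup, the sketch below fits an ordinary least squares model with scikit-learn and lists one coefficient per venue type. The venue types, counts, and prices are hypothetical toy values invented for the example, not data from the project.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: each row is a neighborhood,
# each column is the occurrence count of one venue type.
venue_types = ["Coffee Shop", "Supermarket", "Bar"]
X = np.array([
    [5, 1, 2],
    [0, 2, 7],
    [3, 0, 1],
    [8, 3, 0],
    [1, 1, 4],
    [6, 2, 3],
], dtype=float)

# Hypothetical average prices in USD, standardized to
# zero mean and unit variance (the dependent variable).
prices = np.array([2.1e6, 0.6e6, 1.4e6, 3.0e6, 0.9e6, 2.2e6])
y = (prices - prices.mean()) / prices.std()

model = LinearRegression().fit(X, y)

# One coefficient per venue type; the sign suggests whether the
# venue type is associated with prices above or below the mean.
coefficients = dict(zip(venue_types, model.coef_))
for name, coef in sorted(coefficients.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} {coef:+.3f}")
```

With so few observations the coefficients are only illustrative; they describe association in the fitted sample, not a causal effect.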

New York City neighborhoods were chosen as the observation target for the following reasons:

  • The availability of real estate prices, though very limited.
  • The diversity of prices between neighborhoods. For example, a 2-bedroom condo in Central Park West, Upper West Side can cost $4.91 million on average, while in Inwood, Upper Manhattan, just 30 minutes away, it costs only $498 thousand.

  • The availability of geo data which can be used to visualize the dataset onto a map.

The type of real estate considered is the 2-bedroom condo, which is a common choice for nuclear families.

The dataset will be composed from two main sources: the CityRealty webpage (average prices per neighborhood) and the FourSquare API (surrounding venues).

The process of collecting and cleaning the data:

  • Scrape the CityRealty webpage for a list of New York City neighborhoods and their corresponding average 2-bedroom condo prices.
  • Find the geographic data of the neighborhoods, both their center coordinates and their borders.
  • For each neighborhood, pass the obtained coordinates to the FourSquare API. The “explore” endpoint will return a list of surrounding venues within a pre-defined radius.
  • Count the occurrences of each venue type in a neighborhood. Then apply one-hot encoding to turn each venue type into a column with its occurrence count as the value.
  • Standardize the average price by removing the mean and scaling to unit variance.
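The last two steps can be sketched with pandas as follows. This is a minimal example under assumptions: the neighborhood names, venue categories, and prices are hypothetical placeholders, and the venue records are assumed to have already been extracted from the API responses into one row per venue.

```python
import pandas as pd

# Hypothetical venue records: one row per venue found near a neighborhood.
venues = pd.DataFrame({
    "neighborhood": ["Neighborhood A", "Neighborhood A", "Neighborhood A",
                     "Neighborhood B", "Neighborhood B"],
    "category": ["Pizza Place", "Park", "Pizza Place",
                 "Art Gallery", "Coffee Shop"],
})

# Hypothetical average 2-bedroom condo prices in USD.
prices = pd.Series(
    {"Neighborhood A": 500_000.0, "Neighborhood B": 2_000_000.0},
    name="avg_price",
)

# One-hot encode the venue category, then sum per neighborhood so that
# each venue type becomes a column holding its occurrence count.
one_hot = pd.get_dummies(venues["category"], dtype=int)
one_hot.insert(0, "neighborhood", venues["neighborhood"])
venue_counts = one_hot.groupby("neighborhood").sum()

# Standardize the price: subtract the mean and divide by the
# population standard deviation (unit variance).
standardized_price = (prices - prices.mean()) / prices.std(ddof=0)

# Final table: venue counts as regressors, standardized price as target.
dataset = venue_counts.join(standardized_price.rename("std_price"))
print(dataset)
```

Using `ddof=0` matches the "unit variance" convention of scikit-learn's `StandardScaler`; pandas defaults to `ddof=1`, which would give slightly different values.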

Workflow:

  1. First insight using visualization
  2. Linear Regression
  3. Principal Component Regression (PCR)
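For the third step, a minimal sketch of Principal Component Regression with scikit-learn is shown below: the venue counts are scaled, projected onto a few principal components, and the price is regressed on those components. The data is randomly generated and the number of components is an arbitrary assumption, so the output says nothing about real neighborhoods.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical data: 40 neighborhoods x 10 venue-type counts,
# plus a hypothetical standardized price per neighborhood.
X = rng.poisson(lam=3.0, size=(40, 10)).astype(float)
y = rng.normal(size=40)

n_components = 3  # assumed value; in practice chosen by validation

# PCR = scale the features, reduce them with PCA, then fit a linear model.
pcr = make_pipeline(
    StandardScaler(),
    PCA(n_components=n_components),
    LinearRegression(),
)
pcr.fit(X, y)

# Map the component coefficients back to the (scaled) venue-type space
# so that each venue type gets one interpretable coefficient.
pca = pcr.named_steps["pca"]
reg = pcr.named_steps["linearregression"]
venue_coefs = pca.components_.T @ reg.coef_
print(venue_coefs.shape)
```

PCR is a common choice when there are many correlated venue-type columns relative to the number of neighborhoods, since regressing on a few components reduces the variance of the estimates.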

Technologies Used

Python data science tools will be used to help analyze the data. Completed code can be found here: https://github.com/ishangupta-ds/Coursera_Capstone/blob/master/Capstone_Analyze.ipynb

Technologies used:

  1. Data Science

  2. Machine Learning

  3. Artificial Intelligence

  4. Web scraping

  5. Geo-spatial Data Analysis

  6. Geographic Information System

Libraries used:

  1. NumPy

  2. pandas

  3. Beautiful Soup

  4. Matplotlib

  5. Folium

  6. scikit-learn (sklearn)

Software:

  1. IPython Notebook

  2. Python Interpreter

  3. FourSquare API

Hardware Requirements:

  1. 4 GB RAM

  2. 4 GB NVIDIA GPU with CUDA support

Repository

https://github.com/ishangupta-ds/Coursera_Capstone
