mascarenhasneil / Boston-Property-Assessment

This is the final capstone project for ALY6040 Data Mining (Fall 2021, CPS), undertaken primarily to learn data analytics, data mining, and Python. Residential and commercial properties in Boston were assessed. The Boston Globe reported in May 2021 that the competitive Boston housing market was driving up costs. As the pandemic continued, people demanded larger homes, and finding a home became more difficult because most property managers and realtors could not show their properties to several people. This post was written to help individuals, realtors, and real estate brokers find a property at a reasonable price. We chose a few basic machine learning concepts to help determine the best selling price for a house based on the number of rooms, location, design, and other characteristics of the bath and kitchen. We focused only on residential property because it was in demand. The study's goal was to improve on the initial EDA work by constructing predictive models that addressed our business concerns and, finally, to optimize the models' performance.
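
As a rough illustration of predicting a selling price from property characteristics, the sketch below fits a ridge regression in plain Python. Everything in it is an assumption made for illustration: the choice of ridge regression, the feature names (`rooms`, `full_baths`), the helper names, and the six synthetic rows, whose prices are generated from a made-up formula rather than taken from the Boston assessment data. The project does not state which models or columns its notebooks use.

```python
# Illustrative sketch only: synthetic rows, hypothetical feature names.

def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so row operations update both sides together.
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_ridge(features, prices, alpha=1.0):
    """Return (intercept, coefficients) minimising squared error + alpha * ||coef||^2."""
    n, p = len(features), len(features[0])
    # Centre the features and the target so the intercept is not penalised.
    means = [sum(row[j] for row in features) / n for j in range(p)]
    price_mean = sum(prices) / n
    xc = [[row[j] - means[j] for j in range(p)] for row in features]
    yc = [y - price_mean for y in prices]
    # Normal equations: (X^T X + alpha * I) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in xc) + (alpha if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(xc, yc)) for i in range(p)]
    coef = solve_linear_system(xtx, xty)
    intercept = price_mean - sum(c * m for c, m in zip(coef, means))
    return intercept, coef


def predict(intercept, coef, row):
    """Predicted price for one property described by its feature row."""
    return intercept + sum(c * x for c, x in zip(coef, row))


# Synthetic (rooms, full_baths) rows; NOT real assessment records.
ROWS = [[3, 1], [4, 1], [5, 2], [6, 2], [7, 3], [8, 2]]
# Prices generated from a made-up linear rule so the fit can be checked.
PRICES = [50_000 + 40_000 * rooms + 25_000 * baths for rooms, baths in ROWS]

if __name__ == "__main__":
    intercept, coef = fit_ridge(ROWS, PRICES, alpha=1.0)
    print("coefficients:", coef)
    print("predicted price for 5 rooms, 2 baths:", predict(intercept, coef, [5, 2]))
```

With `alpha=0.0` the fit reduces to ordinary least squares and recovers the made-up rule exactly. A larger `alpha` shrinks the coefficients, trading some fit for stability when features are correlated.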


Machine Learning Project and Analytics


Boston Residential Property Assessment and Recommendation Engine.



Our Paper Presentation Slides

[Images: presentation slides 1 through 15]

Roadmap

See the open issues for a list of proposed features and known issues.

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the GPL v3 License. See LICENSE for more information.

Contact

Team Members:

  1. Neil Mascarenhas - About me? | Github
  2. Sagar Mordiya - Github
  3. Tejal Ambilwade - Github

Project Link: https://mascarenhasneil.github.io/Boston-Property-Assessment/




Languages

Jupyter Notebook: 100.0%