Once the business problem is clearly understood, the next step is to determine how data analysis can answer the question.
The analytic approach involves:
The refined analytical question is:
This question requires:
This is a Regression Problem
| Question Type | Analytic Approach |
|---|---|
| Predict a numeric value | Regression |
| Yes / No decision | Classification |
| Discover patterns | Clustering |
| Describe trends | Descriptive Analysis |
For house price prediction:
Machine Learning can:
However:
| Problem Approach | Analytic Approach |
|---|---|
| Focuses on goals and value | Focuses on methods |
| Business-driven | Data-driven |
| Defines what success means | Defines how to measure it |
| No models mentioned | Model types identified |
| Business Need | Analytics Interpretation |
|---|---|
| Estimate a numerical house price | Predict a continuous numeric value |
This is Supervised Learning – Regression
| Role | Business Meaning | Data Representation |
|---|---|---|
| Input (X) | Property attributes | Area, location, rooms, year built, etc. |
| Output (y) | House value | Sale price |
Mathematically:
\[\text{Price} = f(\text{Property}, \text{Location}, \text{Market})\]

The model must:
Not:
- Classify houses
- Rank houses
- Recommend houses
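
To make the input/output framing concrete, here is a minimal sketch of a regression model that maps a single input to a continuous price. It is only an illustration under stated assumptions: the four data points are made up, only one feature (area) is used, and the helper names `fit_line` and `predict_price` are hypothetical. A real project would use many more features and a proper modelling library.

```python
# Illustrative data only (not from any real dataset):
# area in square metres (X) and sale price (y).
areas = [50.0, 80.0, 100.0, 120.0]
prices = [150_000.0, 240_000.0, 300_000.0, 360_000.0]


def fit_line(x, y):
    """Ordinary least squares for one feature; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and co-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(areas, prices)


def predict_price(area):
    """Return a continuous price estimate, not a class, rank, or recommendation."""
    return intercept + slope * area


print(predict_price(90.0))
```

The output is a single number on a continuous scale, which is what distinguishes this task from classification, ranking, or recommendation.
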
Business Success Criteria: “Average prediction error below an acceptable threshold”
Mapping:

| Business Concern | Analytics Metric |
|---|---|
| How far off is the estimate? | MAE (Mean Absolute Error) |
| Penalize large mistakes | RMSE (Root Mean Squared Error) |
| Relative performance | R² score |
- Primary metric: MAE
- Secondary: RMSE
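
The three metrics can be computed directly from their standard definitions. The sketch below uses only the Python standard library; the `actual` and `predicted` values are invented for illustration, and in practice a library such as scikit-learn would usually provide these functions.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average size of the miss, in price units."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: squares errors first, so large misses weigh more."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2_score(y_true, y_pred):
    """R²: 1 minus the ratio of residual error to the variation around the mean."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up sale prices and predictions, for illustration only.
actual = [200_000.0, 300_000.0, 400_000.0]
predicted = [210_000.0, 290_000.0, 430_000.0]

print(mae(actual, predicted), rmse(actual, predicted), r2_score(actual, predicted))
```

With these numbers the RMSE comes out larger than the MAE, because the single 30,000 miss is weighted more heavily once errors are squared. That is why MAE answers "how far off is the estimate?" while RMSE is the better check on large mistakes.
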
Business wants to know: “Is this better than current practice?”
Baseline:
| Model | Purpose |
|---|---|
| Simple heuristic | Business benchmark |
| ML model | Value-added comparison |
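
A minimal sketch of the baseline comparison follows. The source does not specify the heuristic, so this example assumes a hypothetical one: always predict the mean of the training prices. The prices and the `model_predictions` list are likewise invented stand-ins for real data and a trained model's output.

```python
def mae(y_true, y_pred):
    """Mean Absolute Error in price units."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values for illustration only.
train_prices = [200_000.0, 300_000.0, 400_000.0]
test_prices = [250_000.0, 350_000.0]
model_predictions = [260_000.0, 340_000.0]  # hypothetical ML model output

# Assumed baseline heuristic: predict the mean training price for every house.
baseline_value = sum(train_prices) / len(train_prices)
baseline_predictions = [baseline_value] * len(test_prices)

baseline_mae = mae(test_prices, baseline_predictions)
model_mae = mae(test_prices, model_predictions)

# A positive improvement means the model beats the benchmark on MAE.
improvement = baseline_mae - model_mae
print(baseline_mae, model_mae, improvement)
```

Both approaches are scored on the same held-out prices with the same metric, so the difference in MAE is a direct answer to "is this better than current practice?".
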
| Field | Description |
|---|---|
| predicted_price | Estimated house value |
| confidence_interval (optional) | Uncertainty |
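
One possible way to represent this output in code is a small record type with an optional interval. The field names come from the table above; the class name `PricePrediction`, the helper `make_prediction`, and the symmetric `margin` parameter are assumptions made for this sketch. A real interval would come from the model's own uncertainty estimate rather than a fixed margin.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PricePrediction:
    predicted_price: float
    # (lower, upper) bounds; None when no uncertainty estimate is available.
    confidence_interval: Optional[Tuple[float, float]] = None


def make_prediction(point: float, margin: Optional[float] = None) -> PricePrediction:
    """Build an output record; `margin` is a hypothetical symmetric half-width."""
    if margin is None:
        return PricePrediction(predicted_price=point)
    return PricePrediction(point, (point - margin, point + margin))


with_interval = make_prediction(300_000.0, margin=15_000.0)
without_interval = make_prediction(300_000.0)
print(with_interval)
print(without_interval)
```

Keeping the interval optional lets downstream consumers rely on `predicted_price` always being present while treating uncertainty as extra information when the model can supply it.
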
This project is a supervised regression problem where historical house transaction data will be used to train a model that predicts continuous house prices from property, location, and market features. The solution will be evaluated primarily using MAE and RMSE against a baseline pricing heuristic.