
Data Science methodology begins by seeking clarification of the business problem in order to achieve a solid business understanding.
This step comes first because a clearly defined question ultimately directs the analytic approach that will be required.
Establishing a clearly defined question starts with understanding the GOAL of the person asking the question.
In the House Price Prediction case:
Goal: Support pricing decisions by providing a reliable house price estimation based on historical data.
Once the goal is clarified, the next step is to identify the objectives that support this goal.
Objectives are measurable and actionable, unlike the high-level goal. Examples include reducing the time required for price estimation and improving consistency across price estimates.
Depending on the problem, different stakeholders must be involved to clarify requirements and constraints.
| Stakeholder | Role | Interest |
|---|---|---|
| Real estate agents | Use price estimates | Speed & usability |
| Company managers | Control pricing strategy | Consistency |
| Customers | View estimated prices | Transparency |
| Data team | Build the system | Data quality |
Real estate agents currently estimate house prices manually based on experience, which leads to inconsistent pricing and longer response times. The company needs a data-driven approach to support faster and more reliable price estimation.
Case Study: Identifying the business requirements
Business Question: What is the best way to estimate house prices accurately and consistently using historical data?
From the problem analysis, the business requires a data-driven approach that supports faster and more reliable price estimation.
Proposed Business-Level Solution (No Models Yet)
Build a data-driven system that estimates house prices using historical transaction data and property features to support pricing decisions.
Note that at this stage:
- No machine learning algorithm is mentioned
- No evaluation metric is discussed
- Focus remains on business value
(Professional Standard Template for Data Science Projects)
(What is happening in the business environment?)
Briefly describe the business domain and the current situation that motivates this project. Focus on the real-world context rather than technical details.
Example (House Price): The real estate market experiences frequent price fluctuations, and property valuation is often performed manually by agents based on experience. This process lacks consistency and scalability as the number of listings grows.
(What specific problem needs to be solved?)
Clearly state the core business problem. Avoid mentioning data science techniques or models.
Example: The organization does not have a standardized and data-driven method to estimate house prices, resulting in inconsistent pricing and longer response times for customers.
(What does the business ultimately want to achieve?)
Define the high-level goal the business aims to accomplish.
Example: To support pricing decisions by providing reliable and consistent house price estimates.
(What measurable objectives support the goal?)
List concrete objectives that help achieve the business goal.
- Reduce the time required for price estimation
- Improve consistency across price estimates
- Provide a data-driven reference price
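To make an objective such as "improve consistency" measurable, one possible approach is to compare how far several estimates for the same property spread around their average. The sketch below uses the coefficient of variation for this; the choice of measure, the function name `coefficient_of_variation`, and the sample figures are assumptions for illustration, not part of the case study.

```python
from statistics import mean, pstdev

def coefficient_of_variation(estimates):
    """Spread of estimates relative to their mean (0 means perfect agreement)."""
    average = mean(estimates)
    if average == 0:
        raise ValueError("mean of estimates must be non-zero")
    return pstdev(estimates) / average

# Hypothetical estimates from three agents for the same house.
agent_estimates = [240_000, 250_000, 260_000]
cv = coefficient_of_variation(agent_estimates)
print(f"Relative spread between agents: {cv:.1%}")
```

A lower value indicates that the estimates agree more closely, so the objective could be phrased as reducing this spread over time.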
(Who is involved or affected?)
Identify key stakeholders and their interests.
| Stakeholder | Role | Key Interest |
|---|---|---|
| Real estate agents | Price estimation | Speed, usability |
| Managers | Pricing strategy | Consistency |
| Customers | Price transparency | Trust |
| Data team | Solution development | Data quality |
(How will success be measured?)
Define clear and measurable criteria linked to business value.
- Average prediction error below an acceptable threshold
- Price estimates generated within an acceptable time
- Results that can be explained to non-technical users
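As a rough illustration of how the first two criteria might be checked once estimates exist, the sketch below computes a mean absolute percentage error and compares it, together with a response time, against thresholds. The function names, the 10% error threshold, and the 2-second time limit are hypothetical placeholders; the actual "acceptable" values would have to be agreed with the stakeholders.

```python
from statistics import mean

def mean_absolute_percentage_error(actual, estimated):
    """Average of |actual - estimated| / actual, returned as a fraction."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(a - e) / a for a, e in zip(actual, estimated))

def meets_success_criteria(actual, estimated, seconds_per_estimate,
                           max_error=0.10, max_seconds=2.0):
    """Compare error and response time against assumed thresholds."""
    error = mean_absolute_percentage_error(actual, estimated)
    return {
        "error": error,
        "error_ok": error <= max_error,
        "time_ok": seconds_per_estimate <= max_seconds,
    }

# Illustrative numbers only, not real transaction data.
sale_prices = [200_000, 300_000, 250_000]
estimates = [210_000, 285_000, 250_000]
report = meets_success_criteria(sale_prices, estimates, seconds_per_estimate=0.5)
print(report)
```

The third criterion, explainability to non-technical users, is qualitative and would need to be assessed with the stakeholders directly instead of with a numeric check.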
(What limitations must be considered?)
- Use of historical data only
- Prototype-level implementation
- No real-time market updates
(What assumptions are being made?)
- Historical transaction data reflects current market trends
- Relevant property features are available and reliable
(What is explicitly excluded?)
- Image-based valuation
- Fully automated pricing without human review
- Commercial deployment
(Why does this project matter?)
Describe the anticipated benefits.
- Improved pricing efficiency
- Increased consistency and transparency
- Better decision support for agents and managers
(One-paragraph executive summary)
This project aims to address inconsistencies in house price estimation by developing a data-driven reference system based on historical transaction data. The solution is intended to support faster, more consistent pricing decisions while maintaining human oversight.