Abstract
As one of the basic industries in China, the real estate industry contributes substantially to the national GDP every year and plays an important role in stimulating the economy. After years of development, the industry has accumulated a large amount of sales data, including customer data and construction data. However, these data are still collected and used in an extensive, coarse-grained way, so the accumulated data are not fully exploited. Therefore, this paper develops a precise marketing management system for real estate enterprises based on data mining technology. Through big data mining, target customers can be accurately segmented and profiled, which enables the matching and prediction of resources. On this basis, precise marketing and multichannel collaborative promotion are implemented, realizing an innovation of the real estate marketing mode in the era of big data.
1. Introduction
With the rapid development of information technology, the use of Internet+, 5G, cloud computing and mobile terminals has brought us into the era of big data [1]. Big data technology has become a hot topic: the arrival of the big data era promotes interaction between people, makes the exchange of information more convenient, and links the Internet economy more closely, so that people can create wealth without leaving home [2–4]. Big data has penetrated all walks of life and has overturned the operating philosophy and marketing mode of many traditional industries. In this context, big data also affects the real estate industry: real estate enterprises can use it to understand consumer demand accurately and formulate precise marketing strategies [5].
As an emerging field, data mining has broad application prospects and has been widely used in all walks of life [6, 7]. The combination of real estate and data mining technology has been an active research topic in recent years. It is therefore meaningful to use data mining technology to study the application of big data in real estate marketing.
2. Overview of Data Mining Technology
Precision marketing is based on the collation and analysis of customers’ data, which makes it possible to grasp customer demand accurately, provide customers with appropriate products and marketing means, and realize the company’s interests. Data mining techniques commonly used in practice include clustering analysis, discriminant analysis and factor analysis [8].
2.1. Definition of Data Mining
Berry and Linoff [9] defined data mining as the process of exploring and analyzing large amounts of data in order to discover meaningful patterns and rules. A large number of data may be partly noise data or fuzzy data. The object of data mining can be a database, a file system, or any other data collection organized together.
The data set in the real estate database includes real estate sales data and feature data of real estate. By randomly sampling part of the data and applying data conversion, analysis and other processing, we can find the key data needed for developing a real estate sales strategy [10, 11].
Data mining is divided into directed and undirected data mining. The purpose of directed data mining is to interpret or classify a specific target domain. The purpose of undirected data mining is to find patterns or similarities in the data without presetting a target domain or class.
2.2. Steps of Data Mining
The specific steps of data mining differ with the industry, the technology used and the situation at hand. Whether the data are complete and whether the practitioners are skilled also affect the process. The industry therefore generally holds that the degree of systematization and standardization of the data mining process is positively correlated with the value of the information obtained. Usually, the main steps of data mining can be broken down as follows [12–14]:
(1) Understand the data and the source of the data
(2) Acquire relevant knowledge and technology
(3) Integrate and check data
(4) Remove erroneous or inconsistent data
(5) Establish models and assumptions
(6) Actual data mining work
(7) Test and verify the mining results
(8) Interpretation and application
The above steps show that a lot of preparatory work precedes the actual data mining. Statistics show that data preprocessing, including data filtering, format conversion, variable integration and data table linking, accounts for more than 80% of the data mining work [15–17], as shown in Figure 1.
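The preprocessing activities named above (data filtering, format conversion, variable integration and table linking) can be illustrated with a minimal Python sketch. The record layout, the field names and the `preprocess` helper are hypothetical and are not taken from the system described in this paper.

```python
# Hypothetical raw records; all field names are illustrative assumptions.
customers = [
    {"customer_id": 1, "age": "34", "income": "12000"},
    {"customer_id": 2, "age": "", "income": "8000"},     # missing age
    {"customer_id": 3, "age": "51", "income": "abc"},    # malformed income
]
contracts = [
    {"customer_id": 1, "area_m2": 120.0},
    {"customer_id": 3, "area_m2": 88.5},
]


def preprocess(customers, contracts):
    """Filter bad rows, convert formats, and link the two tables."""
    area_by_customer = {c["customer_id"]: c["area_m2"] for c in contracts}
    cleaned = []
    for row in customers:
        # Data filtering: drop rows with missing or non-numeric values.
        if not (row["age"].isdigit() and row["income"].isdigit()):
            continue
        # Table linking: keep only customers that have a contract.
        if row["customer_id"] not in area_by_customer:
            continue
        cleaned.append({
            "customer_id": row["customer_id"],
            "age": int(row["age"]),        # format conversion
            "income": int(row["income"]),  # format conversion
            # Variable integration: merge the contract attribute into the row.
            "area_m2": area_by_customer[row["customer_id"]],
        })
    return cleaned


result = preprocess(customers, contracts)
```

Only the first customer survives in this toy input, because the other two rows fail the filtering step.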

2.3. Methods of Data Mining
The purpose of data mining is to extract valuable information. Data mining can be carried out by different methods, and users can choose the appropriate method according to their own needs [18]. Several common methods are described below.
2.3.1. Data Summary
Literally, to “summarize” is to use simple and concise sentences to condense a more complex issue so that people can understand the content in a short time. In the same way, data summary condenses the existing data in certain ways, for example through statistical measures such as the sum and the average, and presents the calculated results in charts; column charts and pie charts are common chart types [19]. Data mining is in fact a process of data summary, but it goes deeper and requires a more comprehensive summary of the data from a deeper level and a wider angle. Generally, the data are analyzed, synthesized, abstracted and summarized in order from low level to high level, so as to find the internal relations among the data and to judge the direction of future data development from these regularities. According to the method used, data summary can be divided into multidimensional data analysis and attribute-oriented induction.
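As a small illustration of summarizing data by sum and average, the following Python sketch groups records by a key and computes simple statistics per group. The districts and prices are made-up sample values, and `summarize` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sales records: (district, price in units of 10,000 yuan).
sales = [
    ("east", 150.0), ("east", 210.0),
    ("west", 95.0), ("west", 105.0), ("west", 130.0),
]


def summarize(records):
    """Return count, sum and mean of the values for each group key."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    return {
        key: {"count": len(values), "sum": sum(values), "mean": mean(values)}
        for key, values in groups.items()
    }


summary = summarize(sales)
```

A result of this shape can be passed directly to a charting tool to draw the column or pie charts mentioned above.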
2.3.2. Classification Mining
Classification mining classifies data in order to mine its potential value. It constructs a classifier that maps each data item to be processed to one of the given classes. There are many ways to construct classifiers; statistical methods, machine learning methods and decision tree methods are commonly used [20, 21].
2.3.3. Cluster
Clustering refers to aggregate classification, that is, data with the same characteristics are grouped together and stored separately. In this way, the relationships among the data become clear: data in the same cluster have the same or similar characteristics, whereas data in different clusters have different characteristics, which is helpful for later work. Statistical methods and machine learning methods [22] are commonly used for clustering.
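To make the idea of grouping similar data concrete, the sketch below applies two-cluster k-means to one-dimensional values. K-means is chosen here only as a familiar example of a clustering method; the paper does not state which clustering algorithm it uses, and the floor areas are made-up values.

```python
def kmeans_1d(values, iterations=10):
    """Two-cluster k-means on scalars, seeded with the min and max values."""
    centers = [min(values), max(values)]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            # Assign each value to the nearest center.
            idx = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[idx].append(v)
        # Recompute each center as the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return clusters


# Hypothetical purchased floor areas in square meters.
areas = [62.0, 70.0, 66.0, 140.0, 150.0, 135.0]
small, large = kmeans_1d(areas)
```

Here the values separate into a group of small units and a group of large units, which is the kind of customer segmentation that clustering supports.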
3. Design of Precise Marketing Management System Based on Data Mining
3.1. Demand Analysis
The purpose of functional requirements analysis is to clarify the functional indicators of the real estate precise marketing management system, that is, the function points that the system should have. In the development of software, there are many tools used to describe users’ requirements. This section will discuss them in detail through use case diagrams.
The overall use case of precision marketing management system is shown in Figure 2. The users of the target system mainly include enterprise leaders, market leaders, salesmen and system administrators. In addition, the business of the system can be roughly divided into real estate management, house management, building management, sales management, customer management, decision support and system management.

The real estate management mainly completes the maintenance of the basic information of the real estate, including the business of adding, deleting, modifying and querying the information of the real estate, and the users mainly include the person in charge of the market and the system administrator.
Building management mainly realizes the maintenance of building basic information, including the business of building information addition, building information deletion, building information modification, building information query and so on. The users involved include market leader and system administrator.
Housing management is the management of room information, which mainly maintains the specific information of the room, including the business of adding housing information, deleting housing information, modifying housing information, querying housing information, etc., and the users involved are market leaders and system administrators.
Sales management mainly maintains the basic information of real estate sales, including sales opportunity management, sales record management, sales performance management, etc., and the users involved are market leaders and salesmen.
Customer management includes customer information addition, customer information deletion, customer information modification, customer information query, etc. The user roles involved include enterprise leader, market leader and salesman. Among them, the enterprise leader can only query customer information.
Decision support includes sales forecasting, performance statistics, weekly and monthly performance reports. The users involved include enterprise leaders, market leaders and salesmen.
System management is mainly for system administrators, providing the maintenance of basic system information, including user management, data backup, data restore, permission setting, etc.
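The role assignments described in this section can be summarized as a permission table. The following Python sketch is one possible encoding; the identifiers, the generic `use` action for non-CRUD modules and the `is_allowed` helper are illustrative assumptions, not the system's actual permission model.

```python
# Actions on the information-maintenance modules.
CRUD = frozenset({"add", "delete", "modify", "query"})
# Assumed generic action for modules whose operations are not plain CRUD.
USE = frozenset({"use"})

# module -> role -> allowed actions, following the demand analysis above.
PERMISSIONS = {
    "real_estate": {"market_leader": CRUD, "system_admin": CRUD},
    "building": {"market_leader": CRUD, "system_admin": CRUD},
    "house": {"market_leader": CRUD, "system_admin": CRUD},
    "sales": {"market_leader": USE, "salesman": USE},
    "customer": {
        "enterprise_leader": frozenset({"query"}),  # query only
        "market_leader": CRUD,
        "salesman": CRUD,
    },
    "decision_support": {
        "enterprise_leader": USE,
        "market_leader": USE,
        "salesman": USE,
    },
    "system": {"system_admin": USE},
}


def is_allowed(role, module, action):
    """Return True if the role may perform the action on the module."""
    return action in PERMISSIONS.get(module, {}).get(role, frozenset())
```

For example, `is_allowed("enterprise_leader", "customer", "delete")` is false, which reflects the query-only restriction on enterprise leaders.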
3.2. Overall Framework of System
The overall design of the system starts with its overall framework. The target system is mainly used for the daily sales management of real estate enterprises. It targets enterprise leaders, market leaders, salesmen and system administrators, and provides different services for these different users. According to the actual needs of the system, the B/S mode, a distributed application architecture based on the Web, is selected as the structure. It can meet the needs of different types of users and is suitable for the development of the precision marketing management system [23].
During execution, all interaction with users is handled on the browser side of the system, that is, the interface layer, while business and data related operations are handed over to the web server, so that the browser and the web server work together to process requests. The architecture of the precision marketing management system is shown in Figure 3.

3.2.1. Interface Layer
The interface layer is usually called the user layer or application layer. The main function of this layer is to complete the interaction with users. On the one hand, it receives the request messages sent by users to the system; on the other hand, it feeds back the results produced by the server to users for browsing. For users, this is the most intuitive layer: their evaluation of the software directly reflects their experience of operating the interface layer. When a user sends an access request, the interface layer receives the request and passes it to the business logic layer and data access layer. The processing results are then fed back to the user’s browser as HTML and displayed to the user.
3.2.2. Business Logic Layer
The business logic layer, located in the middle of the three layers, is responsible for executing the parts of a user request that need business logic judgment and processing. It is the bridge between the client and the database. When the interface layer receives a user’s request, it sends the logical processing part of the request to this layer for execution; this layer in turn sends the data processing part of the request to the data access layer. Within the three layers, the business logic layer therefore plays two roles: it is the callee of the interface layer above it and the caller of the data access layer below it. Because it also involves a large number of business relations, the business logic layer is the core part of the whole architecture.
3.2.3. Data Access Layer
The data access layer is also known as the application data source layer. The main function of this layer is to operate on the database, including data calls and data processing. Through the access layer, database tables can be queried, modified, deleted and updated, and the layer provides data call and execution services for the middle layer. The operation of the system cannot be separated from the database, and the database design of any system is relatively complex and resource-intensive, which places high performance requirements on the database system. The access mode of the database system should therefore be optimized as much as possible in order to improve the overall efficiency of the system.
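The division of responsibilities among the three layers can be sketched as follows. This is a minimal illustration only: it uses Python and an in-memory SQLite database rather than the C# and SQL Server stack of the actual system, and all class, function and table names are hypothetical.

```python
import html
import sqlite3


class CustomerDataAccess:
    """Data access layer: the only code that talks to the database."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS customer "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def insert(self, name):
        cur = self.conn.execute(
            "INSERT INTO customer (name) VALUES (?)", (name,)
        )
        self.conn.commit()
        return cur.lastrowid

    def find_by_id(self, customer_id):
        return self.conn.execute(
            "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()


class CustomerService:
    """Business logic layer: validates requests, then delegates downward."""

    def __init__(self, dao):
        self.dao = dao

    def register(self, name):
        name = name.strip()
        if not name:
            raise ValueError("customer name must not be empty")
        return self.dao.insert(name)

    def lookup(self, customer_id):
        return self.dao.find_by_id(customer_id)


def render_customer_page(service, customer_id):
    """Interface layer: turns the result into HTML for the browser."""
    row = service.lookup(customer_id)
    if row is None:
        return "<p>Customer not found</p>"
    return f"<p>Customer {row[0]}: {html.escape(row[1])}</p>"


service = CustomerService(CustomerDataAccess(sqlite3.connect(":memory:")))
new_id = service.register("  Zhang Wei ")
page = render_customer_page(service, new_id)
```

Because the interface function only calls the service and the service only calls the data access class, the database could be replaced without touching the other two layers.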
3.2.4. Advantages of Three-Tier Architecture
The reasons why the architecture design adopts the three-tier architecture mode are as follows:
(i) The three-tier partition makes the system more flexible, and realizes the maintainability and expansibility of the later functions and performance. Because of the independence of the three layers, it is easy to transplant the database.
(ii) It accords with the design idea of “high cohesion, low coupling” in software engineering.
(iii) The three layers are independent of each other, and the related functions are relatively clear, which facilitates the development of developers, effectively improves the efficiency of system development work, and shortens the cycle.
3.3. Deployment and Functional Structure of System
The precision marketing management system discussed in this paper is based on the .NET environment. The B/S mode is adopted for the system structure, C# is used for foreground programming, SQL Server 2005 is used as the background database, and ADO.NET connects the foreground and the background to achieve interaction. For a system developed in B/S mode, the deployment can be divided into three parts: client, application server, and database server. The request submitted by the client is sent to the application server for processing, and the response to the user’s request is then produced through operations on the database. The deployment of the system is shown in Figure 4.

In system design, the first task is to design the overall architecture; once it is determined, the design has an overall direction. The next task is to design the functions that the target system needs. This is the issue users care about most, because these functions are what help them complete specific tasks. The target system is applied to real estate sales management. Therefore, in the demand analysis stage, the author talked in depth with the staff involved in sales management and defined the functional requirements of the system, which laid a good foundation for the design of functional modules. The requirements of modular design are summarized as follows:
(1) In the process of module division, the system should be divided according to the hierarchy, that is, the system should first be divided from the overall perspective, and then the resulting modules should be further divided, so that each final module can be implemented.
(2) All the modules should be as independent as possible. In other words, apart from special circumstances, there should be no association between modules. This is not absolute, but associations should be avoided as far as possible.
(3) The relationship between the divided modules must be clear. This facilitates the later maintenance of the system: when problems occur, the programmer can easily trace and fix them.
With these requirements in mind, the function modules of the precise marketing management system can be divided. There are eight functional modules: real estate management, building management, house management, sales management, customer management, users management, decision support and system management.
The function module is shown in Figure 5.
(1) Real estate management: to manage all real estate
(i) Information addition: add basic information of real estate
(ii) Information query: query the basic information of the real estate according to the needs
(iii) Information maintenance: to maintain and manage the basic information of real estate
(2) Building management: to manage the building information in each building of the enterprise
(i) Information addition: add the building information, including the name, address, area and unit of the building
(ii) Information query: query the basic information of the building according to the needs
(iii) Information maintenance: include maintenance and management of related information
(3) Sales contract management: manage the sales contract information of the enterprise
(i) Add sales contract: enter sales contract information into the system
(ii) Sales contract maintenance: modify and delete the sales contract information
(iii) Sales contract query: query sales contract information
(4) Customer management: manage all customers of the enterprise
(i) Customer data entry: enter new customer information
(ii) Customer information inquiry: according to the needs of customer information query operation
(iii) Customer data maintenance: maintain and manage the customer’s basic information and data

3.4. Design of Database
The process of conceptual design is the process of data entity design. Based on the previous analysis of the real estate precision marketing management system, part of the data entity design of the system is given below. The main entities of the system include user entity, house entity, building entity, room entity, customer entity, sales contract entity, etc.
The logical design of the database is based on the entities obtained from conceptual design and transforms each entity into an actual physical storage structure. This section gives the relevant data tables of the real estate precision marketing management system, and each data table includes the attribute description of each entity. The data tables of the target system include user information table, house type table, room information table, building information table, property information table, property sales information table, customer information table, sales contract table, etc.
The structure of user information table is shown in Table 1. The data table is used to store the basic user information of the system, including the fields of user ID, user name, user type, user real name, gender, age, etc., in which the user ID is the primary key.
The structure of the house type table is shown in Table 2. The data table is used to store the house type information of the room. The fields are mainly house type ID, house type name and description, in which the house type ID is the primary key.
The structure of the room is shown in Table 3. The data table is used to store the basic information of the room, mainly including room ID, building ID, floor, room number, house type ID, etc. Among them, the room ID is the primary key, and the building ID and house type ID are foreign keys.
The structure of the building type is shown in Table 4, which is used to store the building type information, including the fields of type ID, type name, type description, etc., in which the type ID is the primary key.
The structure of the building is shown in Table 5. The data table is used to store the basic information of the building. The fields include building ID, property ID, unit number, floor area, building type ID, etc. Among them, the building ID is the primary key, and the property ID and building type ID are foreign keys.
The structure of the real estate is shown in Table 6. The data table is used to store the basic information of the real estate. The fields mainly include the property name, developer, floor area, building area, etc., and the real estate ID is the primary key of the data table.
The structure of the real estate sales information table is shown in Table 7. The data table is used to store the sales of real estate. The main fields include sales ID, ID, ID, average price, and number of households sold. Among them, the sales ID is the primary key, and the real estate ID, building ID and room ID are all foreign keys.
The structure of the customer information is shown in Table 8. The data table is used to store the basic information of customers, including the fields of customer ID, customer name, contact number, age, occupation, etc., in which the customer ID is the primary key of the data table.
The structure of the sales contract table is shown in Table 9. The data table is used to store the sales contract information. The fields include sales contract ID, customer name, purchase time, purchase price, payment method, etc. Among them, the sales contract ID is the primary key of the data table.
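The primary key and foreign key relationships described above can be illustrated with a small schema. The sketch below uses Python's `sqlite3` module instead of SQL Server 2005, covers only three of the tables, and assumes column names and types, since Tables 1 to 9 are not reproduced here.

```python
import sqlite3

# Assumed column names and types; only the key structure follows the text.
SCHEMA = """
CREATE TABLE house_type (
    house_type_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    description   TEXT
);
CREATE TABLE building (
    building_id INTEGER PRIMARY KEY,
    unit_number TEXT,
    floor_area  REAL
);
CREATE TABLE room (
    room_id       INTEGER PRIMARY KEY,
    building_id   INTEGER NOT NULL REFERENCES building(building_id),
    floor         INTEGER,
    room_number   TEXT,
    house_type_id INTEGER NOT NULL REFERENCES house_type(house_type_id)
);
"""

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO house_type VALUES (1, 'two-bedroom', NULL)")
conn.execute("INSERT INTO building VALUES (1, 'A', 5200.0)")
conn.execute("INSERT INTO room VALUES (1, 1, 3, '301', 1)")


def insert_orphan_room(connection):
    """Try to insert a room that points at a non-existent building."""
    try:
        connection.execute("INSERT INTO room VALUES (2, 99, 1, '101', 1)")
    except sqlite3.IntegrityError:
        return False
    return True


orphan_accepted = insert_orphan_room(conn)
room_count = conn.execute("SELECT COUNT(*) FROM room").fetchone()[0]
```

The foreign key constraint rejects the room that references a missing building, which is the integrity guarantee the key design is meant to provide.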
3.5. Design of Interface and Operation
3.5.1. Interface Design
The interface design mainly includes the definition of external interface and internal interface. The detailed design is shown in Table 10.
3.5.2. Operation Design
(1) Combination of Running Modules. The system mainly takes a window as a module, and a window generally completes one specific function. The main window connects and combines the functions of different modules by opening sub windows. Each module is relatively independent, and the program has good portability. Moreover, the cooperation and data sharing between modules are realized by passing references to data items.
(2) Operation Control. The user opens the system login window and enters the name and password. Then the system jumps to the corresponding background according to the user type corresponding to the name, so as to realize different operations with diverse permissions and roles.
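The login flow described above can be sketched as follows. The user store, the salt, the URL paths and the `login` helper are hypothetical, and salted PBKDF2 hashing is an assumption of this sketch; the paper does not describe how passwords are stored.

```python
import hashlib
import hmac


def hash_password(password, salt):
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical user store; a real system would read this from the database.
USERS = {
    "li_na": {
        "salt": b"demo-salt",
        "hash": hash_password("s3cret", b"demo-salt"),
        "type": "salesman",
    },
}

# Hypothetical landing page for each user type.
BACKENDS = {
    "enterprise_leader": "/leader",
    "market_leader": "/market",
    "salesman": "/sales",
    "system_admin": "/admin",
}


def login(name, password):
    """Return the landing page for the user's type, or None on failure."""
    user = USERS.get(name)
    if user is None:
        return None
    candidate = hash_password(password, user["salt"])
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(candidate, user["hash"]):
        return None
    return BACKENDS[user["type"]]
```

Routing by user type in one place keeps the permission logic out of the individual windows.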
(3) Running Time. The running time of each module should be controlled within 1–2 seconds (most of which is in response to the user’s action). As the system adopts message driven mode, it will effectively improve the utilization of computer.
3.6. Algorithm of Decision Tree Construction and ID3 Learning
3.6.1. Decision Tree Construction Algorithm
A decision tree is constructed from a training set $T = \{\langle X, C_j \rangle\}$, where $X = (a_1, a_2, \ldots, a_n)$ is a training example with $n$ attributes listed in the attribute table $(A_1, A_2, \ldots, A_n)$, $a_i$ is the value of attribute $A_i$, and $C_j \in C = \{C_1, C_2, \ldots, C_m\}$ is the classification result of $X$ [24–26]. The algorithm is divided into the following steps:
(1) Select an attribute $A_i$ from the attribute table as the classification attribute.
(2) If attribute $A_i$ has $K_i$ values, divide $T$ into $K_i$ subsets $T_1, \ldots, T_{K_i}$, where $T_j = \{\langle X, C \rangle \mid \langle X, C \rangle \in T$ and the value of $A_i$ in $X$ is the $j$-th value$\}$.
(3) Delete the attribute $A_i$ from the attribute table.
(4) For each $T_j$ ($1 \le j \le K_i$), let $T = T_j$.
(5) If the attribute table is not empty, return to step (1); otherwise, output the tree.
At present, mature decision tree methods include ID3, C4.5, CART and SLIQ.
3.6.2. ID3 Learning Algorithm
Information entropy, called the average information quantity in information theory, measures the average amount of information transmitted by a source that consists of a finite number of mutually exclusive and jointly complete events, each occurring with a certain probability. Mathematically [27], if a group of events $x_1, x_2, \ldots, x_n$ appears with given probabilities $p(x_1), p(x_2), \ldots, p(x_n)$, then the mean value $H(X)$ is the information entropy, and its value equals the mathematical expectation of the (self-)information quantity $I(x_i) = -\log_2 p(x_i)$ of each event:
$$H(X) = E[I(x_i)] = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i)$$
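The definition of information entropy translates directly into a short function. The sketch below estimates the probabilities from label frequencies and uses base-2 logarithms, so the result is in bits; the function name and the sample labels are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # p * log2(1 / p) equals -p * log2(p) and avoids a negative zero.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


balanced = entropy(["big", "small"])            # two equally likely classes
pure = entropy(["big", "big", "big", "big"])    # a single class
skewed = entropy(["big", "big", "big", "small"])
```

A set split evenly between two classes has the maximum entropy of one bit, while a set containing a single class has zero entropy.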
In the traditional ID3 algorithm, information entropy is the criterion for attribute selection. The entropy-based information gain of each attribute is computed from the data, the values are compared, and the attribute with the largest information gain is taken as the root node of the decision tree. After the example set is divided into subsets by this attribute, the entropy of the system is minimal, so the average path from a nonleaf node to its descendant leaf nodes is expected to be the shortest and the average depth of the generated tree smaller [28]. The more fuzzy and disordered the training set is with respect to the target classification, the higher its entropy; the clearer and more ordered it is, the lower its entropy. ID3 is based on the principle that “the attribute with greater information gain is more beneficial to the classification of training cases”. In each step, the algorithm selects “the attribute in the table that can best classify the training case set”. The information gain of an attribute is the decrease in system entropy that results from using this attribute to divide the samples. The key operation of ID3 is therefore to calculate and compare the information gain of each attribute [29, 30].
Following the introduction above, the basic strategy of the ID3 algorithm is given here. The algorithm proceeds as follows:
(1) The tree starts as a single root node that represents all the training samples.
(2) If the samples at a node all belong to the same class, the node becomes a leaf node and is labeled with that class.
(3) Otherwise, an entropy-based measure called information gain is used as heuristic information to select the attribute that best separates the samples into classes. This attribute becomes the test or decision attribute of the node.
(4) A branch is created for each value of the test attribute, and the samples are divided accordingly.
(5) The same procedure is applied recursively to the samples on each branch. Once an attribute has been used at a node, it need not be considered again in the descendants of that node.
(6) The recursion stops when one of the following conditions holds:
(i) All the samples at a node belong to the same class.
(ii) No attributes remain on which the samples can be further partitioned.
(iii) A branch has no samples. In this case, a leaf node is created and labeled with the majority class of the training samples.
4. System Implementation and Testing
4.1. Implementation of Data Mining
4.1.1. Decision Tree Generation and ID3 Algorithm
The ID3 algorithm proposed by J. R. Quinlan is one of the earliest and best-known decision tree induction algorithms. Given a set of nonclass attributes C1, C2, …, Cn, a category attribute C and a training set S of records, a decision tree can be constructed by the ID3 algorithm, which is described as follows [31].
// Returns a decision tree
Function ID3(R: a set of nonclass attributes, C: the category attribute, S: a training set)
Begin
  If S is empty, return a single node with the value Failure;
  If all records in S have the same category attribute value, return a single node with that value;
  If R is empty, return a single node whose value is the most frequent category attribute value among the records in S;
  Assign to D the attribute in R with the largest Gain(D, S);
  Assign the values of attribute D to {dj | j = 1, 2, 3, …, m};
  Assign the subsets of S consisting of the records with value dj for D to {Sj | j = 1, 2, 3, …, m};
  Return a tree whose root is labeled D and whose branches are labeled d1, d2, d3, …, dm, leading to the subtrees ID3(R − {D}, C, S1), ID3(R − {D}, C, S2), …, ID3(R − {D}, C, Sm);
End ID3;
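The pseudocode above can be turned into a compact runnable sketch. The version below is an illustration under stated assumptions rather than the implementation used in the system: records are Python dictionaries, the tree is a nested dictionary, and the six sample rows are invented (they are not the data of Table 11).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(
        (n / total) * math.log2(total / n) for n in Counter(labels).values()
    )


def information_gain(rows, attribute, target):
    """Entropy of the target minus its expected entropy after the split."""
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[target] for r in rows]) - remainder


def id3(rows, attributes, target):
    """Return a class label (leaf) or a nested dict {attribute: {value: subtree}}."""
    if not rows:
        return "failure"
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:
        return labels[0]                 # all records share one class
    if not attributes:
        return Counter(labels).most_common(1)[0][0]  # majority class
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    return {
        best: {
            value: id3([r for r in rows if r[best] == value], remaining, target)
            for value in sorted({r[best] for r in rows})
        }
    }


# Invented sample records for illustration only.
rows = [
    {"income": "high", "married": "yes", "area": "big"},
    {"income": "high", "married": "no", "area": "big"},
    {"income": "low", "married": "yes", "area": "small"},
    {"income": "low", "married": "no", "area": "small"},
    {"income": "middle", "married": "yes", "area": "big"},
    {"income": "middle", "married": "no", "area": "small"},
]
tree = id3(rows, ["income", "married"], "area")
```

On this toy data, income has the larger information gain and becomes the root, and only the middle-income branch needs a second split on marital status.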
4.1.2. Application of Data Mining in Marketing Management
Combined with the previous analysis, this section analyzes the implementation of data mining with specific cases. Suppose that the following customer information exists in the database, as shown in Table 11.
Step 1: transformation of data. According to the basic data of customers, the required data are transformed by generalization to higher-level concepts, which are given as follows: data by age are shown in Table 12, statistics by income are shown in Table 13, statistics by purchase area are shown in Table 14, and statistics by marital status are shown in Table 15.
Step 2: get the expected information and information gain. The key to constructing a good decision tree is how to choose a good logical judgment or attribute. It has been found that the smaller the tree is, the stronger its prediction ability is. To construct a decision tree as small as possible, the key is to choose the appropriate logical judgment or attribute. Information gain is used here to select attributes.
The expected information is calculated as follows:
$$I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2 p_i, \qquad p_i = \frac{s_i}{|S|}$$
Here $S$ is the data set, $m$ is the number of classes in $S$, $C_i$ is a class label, $p_i$ is the probability that an arbitrary sample belongs to $C_i$, and $s_i$ is the number of samples in class $C_i$.
(1) Entropy of the partition of $S$ by $A$ into subsets, where $A$ is an attribute with $v$ different values and $s_{ij}$ is the number of samples of class $C_i$ in the $j$-th subset:
$$E(A) = \sum_{j=1}^{v} \frac{s_{1j} + \cdots + s_{mj}}{|S|} \, I(s_{1j}, \ldots, s_{mj})$$
(2) Information gain:
$$\mathrm{Gain}(A) = I(s_1, s_2, \ldots, s_m) - E(A)$$
The split attribute is obtained as follows.
Purchase area: Info(purchase area) = 0.999215879
Age: expected information Info(age) = 0.965202313; information gain Gain = 0.034013566
Gender: expected information Info(gender) = 0.999202348; information gain Gain = 0.000013531
Occupation: expected information Info = 0.717207246; information gain Gain = 0.282008633
Marriage or not: expected information Info = 0.958900714; information gain Gain = 0.040315165
Income: expected information Info = 0.710086325; information gain Gain = 0.289129554
It can be seen that income has the highest information gain among the attributes, so it is selected as the splitting attribute. Node N is labeled with income, a branch is grown for each of its values, and the tuples are partitioned accordingly.
Step 3: generate decision tree and extract rules. According to the above data, a decision tree can be generated, and the classification rules are extracted from it:
R1: if income = middle and occupation = professor, then purchase area = big;
R2: if income = low and marriage = yes, then purchase area = small;
R3: if income = high, then purchase area = big;
R4: if income = middle and occupation = doctor and age = senior, then purchase area = big;
R5: if income = middle and occupation = business owner, then purchase area = big.
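The reported figures and the extracted rules can be checked with a short script. The sketch below recomputes each gain as Info(purchase area) minus the expected information reported for the attribute, and encodes rules R1 to R5 as a function. The numeric values are copied from the text; the function name and the choice to return `None` when no rule applies are assumptions of this sketch.

```python
# Values reported in the text.
BASE_INFO = 0.999215879  # Info(purchase area)
EXPECTED_INFO = {
    "age": 0.965202313,
    "gender": 0.999202348,
    "occupation": 0.717207246,
    "marriage": 0.958900714,
    "income": 0.710086325,
}

# Gain(A) = Info(purchase area) - expected information of A.
gains = {
    attribute: round(BASE_INFO - info, 9)
    for attribute, info in EXPECTED_INFO.items()
}
split_attribute = max(gains, key=gains.get)


def predict_area(customer):
    """Apply rules R1 to R5; return None when no rule matches."""
    income = customer.get("income")
    if income == "high":
        return "big"                                        # R3
    if income == "low" and customer.get("marriage") == "yes":
        return "small"                                      # R2
    if income == "middle":
        occupation = customer.get("occupation")
        if occupation in ("professor", "business owner"):
            return "big"                                    # R1, R5
        if occupation == "doctor" and customer.get("age") == "senior":
            return "big"                                    # R4
    return None
```

The recomputed gains agree with the reported values, and income is the attribute with the largest gain.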
5. Conclusion
With the application of mobile Internet technology, 5G, cloud computing and other network technologies, enterprises have an increasing need for big data, especially in the real estate industry. The era of big data brings real estate marketing not only a challenge, but also an opportunity. Real estate enterprises must seize the business opportunity of big data, adjust their marketing mode in time, and promote the successful transformation and upgrading of real estate enterprises. Based on this background, this paper uses data mining technology to design a precise marketing management system for real estate. Its biggest advantage is to realize sales forecast through mining and analyzing customer data, which provides reference for sales personnel to formulate marketing strategies.
Data Availability
The dataset can be accessed upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by 2021 Philosophy and Social Science Research Project of Hubei Education Department (PX-321937) and 2021 Project of Hubei Association of Higher Education (PX-321937).