Enhancing Person-Job Fit for Talent Recruitment: An Ability-aware Neural Network Approach

by   Chuan Qin, et al.

The widespread use of online recruitment services has led to an information explosion in the job market. As a result, recruiters have to seek intelligent approaches to Person-Job Fit, which is the bridge for matching the right job seekers to the right positions. Existing studies on Person-Job Fit focus on measuring the matching degree between talent qualifications and job requirements, mainly based on manual inspection by human resource experts, despite the subjective, incomplete, and inefficient nature of human judgement. To this end, in this paper, we propose a novel end-to-end Ability-aware Person-Job Fit Neural Network (APJFNN) model, which aims to reduce the dependence on manual labour and can provide better interpretation of the fitting results. The key idea is to exploit the rich information available in abundant historical job application data. Specifically, we propose a word-level semantic representation for both job requirements and job seekers' experiences based on a Recurrent Neural Network. Along this line, four hierarchical ability-aware attention strategies are designed to measure the different importance of job requirements for semantic representation, as well as the different contribution of each job experience to a specific ability requirement. Finally, extensive experiments on a large-scale real-world data set clearly validate the effectiveness and interpretability of the APJFNN framework compared with several baselines.




1. Introduction

The rapid development of online recruitment platforms, such as LinkedIn and Lagou, has enabled a new paradigm for talent recruitment. For instance, in 2017, there were 467 million users and 3 million active job listings on LinkedIn from about 200 countries and territories all over the world (Chaudhary, 2017). While popular online recruitment services provide more convenient channels for both employers and job seekers, they also bring the challenge of Person-Job Fit due to information explosion. According to a report (Management, 2016), recruiters now need about 42 days and $4,000 on average to secure a suitable employee. Clearly, more effective techniques are urgently required for the Person-Job Fit task, which aims at measuring the matching degree between talent qualifications and job requirements.

Figure 1. A motivating example of Person-Job Fit.

Indeed, as a crucial task for job recruitment, Person-Job Fit has been well studied from different perspectives, such as job-oriented skill measuring (Xu et al., 2018), candidate matching (Malinowski et al., 2006) and job recommendations (Lee and Brusilovsky, 2007; Paparrizos et al., 2011; Zhang et al., 2014). Along this line, some related tasks, such as talent sourcing (Xu et al., 2016; Zhu et al., 2016) and job transition (Wang et al., 2013), have also been studied. However, these efforts largely depend on the manual inspection of features or key phrases by domain experts, and thus lead to high cost and inefficient, inaccurate, and subjective judgments.

To this end, in this paper, we propose an end-to-end Ability-aware Person-Job Fit Neural Network (APJFNN) model, which aims to reduce the dependence on human labeling and can provide better interpretation of the fitting results. The key idea of our approach is motivated by the example shown in Figure 1. There are 4 requirements, including 3 technical skill requirements (programming, machine learning and big data processing) and 1 comprehensive quality requirement (communication and team work). Since multiple abilities may fit the same requirement and different candidates may have different abilities, all the abilities should be weighed for a comprehensive score in order to compare the matching degree among different candidates. During this process, traditional methods, which simply rely on keyword/feature matching, may either ignore some abilities of candidates, or mislead recruiters through subjective and incomplete weighing of abilities/experiences by domain experts. Therefore, to develop a more effective and comprehensive Person-Job Fit solution, abilities should be not only represented via the semantic understanding of rich textual information from a large amount of job application data, but also automatically weighed based on the historical recruitment results.

Along this line, all the job postings and resumes should be comprehensively analyzed without relying on human judgement. To be specific, for representing both the job-oriented abilities and experiences of candidates, we first propose a word-level semantic representation based on Recurrent Neural Network (RNN) to learn the latent features of each word in a joint semantic space. Then, two hierarchical ability-aware structures are designed to guide the learning of semantic representation for job requirements as well as the corresponding experiences of candidates. In addition, for measuring the importance of different abilities, as well as the relevance between requirements and experiences, we also design four hierarchical ability-aware attention strategies to highlight those crucial abilities or experience. This scheme will not only improve the performance, but also enhance the interpretability of matching results. Finally, extensive experiments on a large-scale real-world data set clearly validate the effectiveness of our APJFNN framework compared with several baselines.

Overview. The rest of this paper is organized as follows. In Section 2, we briefly introduce some related works of our study. In Section 3, we introduce the preliminaries and formally define the problem of Person-Job Fit. Then, technical details of our Ability-aware Person-Job Fit Neural Network will be introduced in Section 4. Afterwards, we comprehensively evaluate the model performance in Section 5, with some further discussions on the interpretability of results. Finally, in Section 6, we conclude the paper.

Figure 2. An illustration of the proposed Ability-aware Person-Job Fit Neural Network (APJFNN), which can be separated into three components, namely Word-level Representation, Hierarchical Ability-aware Representation and Person-Job Fit Prediction. Meanwhile, two different hierarchical structures are used to learn the ability-aware representation of job requirement and candidate experience respectively.

2. Related Work

The related works of our study can be grouped into two categories, namely Recruitment Analysis and Text Mining with Deep Learning.

2.1. Recruitment Analysis

Recruitment is always a core function of human resource management to support the success of organizations. Recently, the newly available recruitment big data enables researchers to conduct recruitment analysis through more quantitative ways (Zhu et al., 2016; Harris, 2017; Xu et al., 2018; Javed et al., 2017; Lin et al., 2017; Xu et al., 2016). In particular, the study of measuring the matching degree between the talent qualification and the job requirements, namely Person-Job Fit (Sekiguchi, 2004), has become one of the most striking topics.

The early research efforts of Person-Job Fit can be dated back to (Malinowski et al., 2006), where Malinowski et al. built a bilateral person-job recommendation system using the profile information from both candidates and jobs, in order to find a good match between talents and jobs. Then, Lee et al. followed the ideas of recommender systems and proposed a comprehensive job recommender system for job seekers, which is based on a broad range of job preferences and interests (Lee and Brusilovsky, 2007). In (Zhang et al., 2014), Zhang et al. compared a number of user-based collaborative filtering and item-based collaborative filtering algorithms on recommending suitable jobs for job seekers.

Recently, the emergence of various online recruitment services provides a novel perspective for recruitment analysis. For example, in (Zhang et al., 2016b), Zhang et al. proposed generalized linear mixed models (GLMix), a more fine-grained model at the user or item level, in the LinkedIn job recommender system, and generated 20% to 40% more job applications for job seekers. In (Cheng et al., 2013), Cheng et al. collected the job-related information from various social media sources and constructed an inter-company job-hopping network to demonstrate the flow of talents. In (Wang et al., 2013), Wang et al. predicted the job transition of employees by exploiting their career path data. Xu et al. proposed a talent circle detection model based on a job transition network which can help the organizations to find the right talents and deliver career suggestions for job seekers to locate suitable jobs (Xu et al., 2016).

2.2. Text Mining With Deep Learning

Generally, the study of Person-Job Fit based on textual information can be grouped into the tasks of text mining, which is highly related to Natural Language Processing (NLP) technologies, such as text classification (Yang and Pedersen, 1997; Kim, 2014), text similarity (Gomaa and Fahmy, 2013; Severyn and Moschitti, 2015; Kim, 2014), and reading comprehension (Berant et al., 2014; Hermann et al., 2015). Recently, due to the advanced performance and flexibility of deep learning, more and more researchers try to leverage deep learning to solve text mining problems. Compared with traditional methods that largely depend on effective human-designed representations and input features (e.g., word n-grams (Wang and Manning, 2012), parse trees (Cherry and Quirk, 2008) and lexical features (Melville et al., 2009)), the deep learning based approaches can learn effective models for large-scale textual data without labor-intensive feature engineering.

Among various deep learning models, Convolutional Neural Network (CNN) (LeCun et al., 1998) and Recurrent Neural Network (RNN) (Elman, 1990) are two representative and widely-used architectures, which can provide effective ways for NLP problems from different perspectives.

Specifically, CNN is efficient at extracting local semantics and hierarchical relationships in textual data. For instance, as one of the representative works in this field, Kalchbrenner et al. (Kalchbrenner et al., 2014) proposed a Dynamic Convolutional Neural Network (DCNN) for modeling sentences, which obtained remarkable performance in several text classification tasks. Furthermore, Kim has shown the power of CNN on a wide range of NLP tasks, even when using only a single convolutional layer (Kim, 2014). Since then, CNN-based approaches have attracted much more attention on many NLP tasks. For example, in (He et al., 2015), He et al. used CNN to extract semantic features from multiple levels of granularity for measuring sentence similarity. Dong et al. introduced a multi-column CNN for addressing the Question Answering problem (Dong et al., 2015).

Compared with CNNs, RNN-based models are more “natural” for modeling sequential textual data, especially for the tasks of modeling serialization information, and learning the long-span relations or global semantic representation. For example, in (Tang et al., 2015), Tang et al. handled the document level sentiment classification with gated RNN. Zhang et al. designed a novel deep RNN model to perform the keyphrase extraction task (Zhang et al., 2016a). Meanwhile, RNN also shows its effectiveness on several text generation tasks with the Encoder-Decoder framework. For example, in (Cho et al., 2014), Cho et al. first used the framework for Machine Translation. Bahdanau et al. introduced an extension to this framework with the attention mechanism (Bahdanau et al., 2014) and validated the advantages of their model in translating long sentences. Similarly, in (Nallapati et al., 2016), Nallapati et al. adapted the framework for automatic text summarization.

In this paper, we follow some outstanding ideas in the above works according to the properties of Person-Job Fit. And we propose an interpretable end-to-end neural model APJFNN based on RNN with four ability-aware attention mechanisms. Therefore, APJFNN can not only improve the performance of Person-Job Fit, but also enhance the model interpretability in practical scenarios.

3. Problem Formulation

In this paper, we address the problem of Person-Job Fit, which focuses on measuring the matching degree between the job requirements in a job posting and the experiences in a resume.

Specifically, to formulate the problem of Person-Job Fit, we use $J$ to denote a job posting, which contains $p$ pieces of ability requirements, denoted as $J = \{j_1, j_2, \dots, j_p\}$. For instance, there exist 4 requirements in Figure 1, thus $p = 4$ in this case. Generally, we consider two types of ability requirements, i.e., the professional skill requirements (e.g., Data Mining and Natural Language Processing skills), and comprehensive quality requirements (e.g., Team Work, Communication Skill and Sincerity). All the requirements will be analyzed comprehensively without special distinction by different types. Moreover, each $j_l$ is assumed to contain $m_l$ words, i.e., $j_l = \{j_{l,1}, j_{l,2}, \dots, j_{l,m_l}\}$.

Similarly, we use $R$ to represent a resume of a candidate, which includes $q$ pieces of experiences, i.e., $R = \{r_1, r_2, \dots, r_q\}$. In particular, due to the limitation of our real-world data, in this paper we mainly focus on the working experiences of candidates, as well as descriptions of some other achievements, e.g., project experiences, competition awards or research paper publications. Besides, each experience $r_{l'}$ is described by $n_{l'}$ words, i.e., $r_{l'} = \{r_{l',1}, r_{l',2}, \dots, r_{l',n_{l'}}\}$.

Finally, we use $S$ to indicate a job application, i.e., a Person-Job pair. Correspondingly, we have a recruitment result label $y \in \{0, 1\}$ to indicate whether the candidate has passed the interview process, i.e., $y = 1$ means a successful application, while $y = 0$ means a failed one. Note that one candidate is allowed to apply for several jobs simultaneously, and one job position could be applied for by multiple candidates. Thus, the same $J$ may exist in different $S$, and so may $R$. Along this line, we can formally define the problem of Person-Job Fit as follows:

Definition 3.1 (Problem Definition). We are given a set of job applications $\mathcal{S}$, where each application $S \in \mathcal{S}$ contains a job posting $J$ and a resume $R$, as well as the recruitment result label $y$. The target of Person-Job Fit is to learn a predictive model for measuring the matching degree between $J$ and $R$, so that the corresponding result label $y$ can be predicted.
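As a small illustration of the formulation above, the inputs can be sketched as plain Python data structures. The class and field names here are hypothetical and chosen only for this sketch; the paper does not prescribe a concrete data layout.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class JobPosting:
    requirements: List[List[str]]   # each ability requirement is a list of words

@dataclass
class Resume:
    experiences: List[List[str]]    # each experience is a list of words

@dataclass
class Application:
    job: JobPosting
    resume: Resume
    label: int                      # 1 = successful application, 0 = failed one

job = JobPosting([["python", "programming"], ["machine", "learning"]])
resume = Resume([["built", "a", "machine", "learning", "pipeline", "in", "python"]])
# The same posting (or resume) may appear in several applications.
apps = [Application(job, resume, 1),
        Application(job, Resume([["sales", "intern"]]), 0)]
print(len(job.requirements), [a.label for a in apps])  # 2 [1, 0]
```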

In the following section, we will introduce the technical details of our APJFNN model for addressing the above problem.

4. Ability-aware Person-Job Fit Neural Network

As shown in Figure 2, APJFNN mainly consists of three components, namely Word-level Representation, Hierarchical Ability-aware Representation and Person-Job Fit Prediction.

Specifically, in Word-level Representation, we first leverage an RNN to project words of job postings and resumes onto latent representations respectively, along with sequential dependence between words. Then, we feed the word-level representations into Hierarchical Ability-aware Representation, and extract the ability-aware representations for job postings and resumes simultaneously by hierarchical representation structures. To capture the semantic relationships between job postings and resumes and enhance the interpretability of model, we design four attention mechanisms from the perspective of ability to polish their representations at different levels in this component. Finally, the jointly learned representations of job postings and resumes are fed into Person-Job Fit Prediction to evaluate the matching degree between them.

4.1. Word-level Representation

To embed the sequential dependence between words into the corresponding representations, we leverage a special RNN, namely the Bi-directional Long Short-Term Memory network (BiLSTM), on a shared word embedding to generate the word-level representations for job postings and resumes. Compared with the vanilla RNN, LSTM (Hochreiter and Schmidhuber, 1997) can not only store and access a longer range of contextual information in the sequential input, but also handle the vanishing gradient problem. Figure 3 illustrates a single cell in LSTM, which has a cell state $C_t$ and three gates, i.e., input gate $i_t$, forget gate $f_t$ and output gate $o_t$. Formally, the LSTM can be formulated as follows:

$$
\begin{aligned}
i_t &= \sigma(W_i [x_t, h_{t-1}] + b_i), \\
f_t &= \sigma(W_f [x_t, h_{t-1}] + b_f), \\
\tilde{C}_t &= \tanh(W_C [x_t, h_{t-1}] + b_C), \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t, \\
o_t &= \sigma(W_o [x_t, h_{t-1}] + b_o), \\
h_t &= o_t \odot \tanh(C_t),
\end{aligned}
$$

where $X = \{x_1, x_2, \dots, x_m\}$ and $m$ denote the input vectors and the length of $X$ respectively. And $W_f$, $W_i$, $W_C$, $W_o$, $b_f$, $b_i$, $b_C$, $b_o$ are the parameters as weight matrices and biases, $\odot$ represents element-wise multiplication, $\sigma$ is the sigmoid function, and $\{h_1, h_2, \dots, h_m\}$ represents a sequence of semantic features. Furthermore, the above formulas can be represented in short as:

$$h_t = \mathrm{LSTM}(x_t, h_{t-1}).$$

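As shown in Figure 3, the BiLSTM uses the input sequential data and their reverse to train the semantic vectors $\{h_1, h_2, \dots, h_m\}$. The hidden vector $h_t$ is the concatenation of the forward hidden vector $\overrightarrow{h_t}$ and the backward hidden vector $\overleftarrow{h_t}$ at the $t$-th step. Specifically, we have

$$
\begin{aligned}
\overrightarrow{h_t} &= \mathrm{LSTM}(x_t, \overrightarrow{h_{t-1}}), \\
\overleftarrow{h_t} &= \mathrm{LSTM}(x_t, \overleftarrow{h_{t+1}}), \\
h_t &= [\overrightarrow{h_t}; \overleftarrow{h_t}].
\end{aligned}
$$

We can represent the above formulas in short as:

$$h_t = \mathrm{BiLSTM}(x_{1:m}, t), \quad \forall t \in [1, \dots, m],$$

where $x_{1:m}$ denotes the input sequence $\{x_1, \dots, x_m\}$.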
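To make the recurrence concrete, the following is a minimal NumPy sketch of a standard LSTM step and a bidirectional wrapper that concatenates forward and backward hidden states. It is an illustration under assumptions, not the authors' implementation: the helper names (`lstm_step`, `run_lstm`, `bilstm`, `init_params`), the toy dimensions and the random initialization are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step. params maps a gate name to (W, b); W acts on [x_t, h_prev]."""
    z = np.concatenate([x_t, h_prev])
    i = sigmoid(params["i"][0] @ z + params["i"][1])        # input gate
    f = sigmoid(params["f"][0] @ z + params["f"][1])        # forget gate
    o = sigmoid(params["o"][0] @ z + params["o"][1])        # output gate
    c_tilde = np.tanh(params["c"][0] @ z + params["c"][1])  # candidate cell state
    c = f * c_prev + i * c_tilde                            # new cell state
    h = o * np.tanh(c)                                      # new hidden state
    return h, c

def init_params(d_in, d_hid, rng):
    # Toy initialization, for illustration only.
    return {g: (rng.normal(scale=0.1, size=(d_hid, d_in + d_hid)), np.zeros(d_hid))
            for g in "ifoc"}

def run_lstm(xs, params, d_hid):
    h, c = np.zeros(d_hid), np.zeros(d_hid)
    out = []
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, params)
        out.append(h)
    return np.stack(out)

def bilstm(xs, fwd, bwd, d_hid):
    h_f = run_lstm(xs, fwd, d_hid)                 # left-to-right pass
    h_b = run_lstm(xs[::-1], bwd, d_hid)[::-1]     # right-to-left pass, re-aligned
    return np.concatenate([h_f, h_b], axis=1)      # concatenate per time step

rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 3))                       # 5 words, 3-dim embeddings
H = bilstm(xs, init_params(3, 4, rng), init_params(3, 4, rng), 4)
print(H.shape)  # (5, 8)
```

Each row of `H` corresponds to one word and has twice the hidden size, because the forward and backward states are concatenated.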

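Now, we can use BiLSTM to model the word-level representations in job posting $J$ and resume $R$. For the $l$-th job requirement $j_l = \{j_{l,1}, \dots, j_{l,m_l}\}$, we first embed the words to vectors by

$$w^J_{l,t} = W_e \, j_{l,t},$$

where $w^J_{l,t}$ denotes the $d_0$-dimensional word embedding of the $t$-th word in $j_l$. As for $R$, the word embedding $w^R_{l',t'}$ of the $t'$-th word in candidate experience $r_{l'}$ is generated in a similar way. It should be noted that the job postings and resumes share the same matrix $W_e$, which is initialized by a pre-trained word vector matrix and re-trained during the training process.

Then, for each word in the $l$-th job requirement $j_l$ and the $l'$-th candidate experience $r_{l'}$, we can calculate the word-level representations $h^J_{l,t}$ and $h^R_{l',t'}$ by:

$$
\begin{aligned}
h^J_{l,t} &= \mathrm{BiLSTM}(w^J_{l,1:m_l}, t), \quad \forall t \in [1, \dots, m_l], \\
h^R_{l',t'} &= \mathrm{BiLSTM}(w^R_{l',1:n_{l'}}, t'), \quad \forall t' \in [1, \dots, n_{l'}],
\end{aligned}
$$

where $w^J_{l,1:m_l}$ and $w^R_{l',1:n_{l'}}$ denote the word vector input sequences of $j_l$ and $r_{l'}$, respectively. And $h^J_{l,t}$, $h^R_{l',t'}$ are the semantic representations of the $t$-th word in the $l$-th job requirement $j_l$ and the $t'$-th word in the $l'$-th candidate experience $r_{l'}$.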

Figure 3. (a): The architecture of Long Short-Term Memory block with one cell. (b): The architecture of bidirectional recurrent neural network.

4.2. Hierarchical Ability-Aware Representation

After getting the representations of job postings and resumes at the word level, we further extract higher-level representations for them. As for job postings, we consider that each ability requirement refers to a specific need of a job, and the overall needs of a job can further be summarized from all of its requirements. Following this intuition, we design a hierarchical neural network structure to model such hierarchical representation. As for resumes, similar hierarchical relationships also exist between a candidate's experiences and her qualification, thus a similar hierarchical neural network structure is also applied to resumes.

Besides, both job postings and resumes are documents with relatively well-defined formats. For example, most candidates tend to separate their past experiences by work content and order them by time to facilitate understanding. Indeed, such formats can help us to better extract representations. Thus, to improve the performance and interpretability, we follow the above intuitions and design four attention mechanisms to polish the representations extracted by our model at different levels.

Specifically, this component can further be divided into four parts: 1) Single Ability-aware in Job Requirement for getting the semantic representation of each requirement in a job posting; 2) Multiple Ability-aware in Job Requirement for further extracting the entire representation of a job posting; 3) Single Ability-aware in Candidate Experience for highlighting some experiences in resumes by ability requirements; 4) Multiple Ability-aware in Candidate Experience for finally profiling candidates with all previous experiences. In the following, we will introduce the technical details of each part.

Single Ability-aware in Job Requirement.

It is obvious that the meaning of a sentence is dominated by several keywords or phrases. Thus, to better capture the key information for each ability requirement, we use an attention mechanism to estimate the importance of each word in it.

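This attention layer is the weighted sum of the semantic vectors of the words in each ability requirement. Specifically, for the $l$-th ability requirement $j_l$, we first use the word representations $\{h^J_{l,1}, \dots, h^J_{l,m_l}\}$ as input of a fully-connected layer and calculate the similarity with a word-level context vector. Then, we use a softmax function to calculate the attention score $\alpha_{l,t}$, i.e.,

$$
\begin{aligned}
e^J_{l,t} &= v_\alpha^{\top} \tanh(W_\alpha h^J_{l,t} + b_\alpha), \\
\alpha_{l,t} &= \frac{\exp(e^J_{l,t})}{\sum_{z=1}^{m_l} \exp(e^J_{l,z})},
\end{aligned}
$$

where $v_\alpha$, $W_\alpha$ and $b_\alpha$ are the parameters to be learned during the training process. Specifically, $v_\alpha$ denotes the context vector of $j_l$, which is randomly initialized. The attention score $\alpha_{l,t}$ can be seen as the importance of each word in $j_l$. Finally, we calculate the single ability-aware requirement representation $s^J_l$ for $j_l$ by:

$$s^J_l = \sum_{t=1}^{m_l} \alpha_{l,t} \, h^J_{l,t}.$$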
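The word-level attention described here (a fully-connected layer, similarity with a learned context vector, softmax, then a weighted sum) can be sketched in NumPy as follows. The function name `attention_pool`, the shapes and the random parameters are assumptions made for illustration; in the model these parameters would be learned.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())          # subtract the max for numerical stability
    return e / e.sum()

def attention_pool(H, W, b, v):
    """H: (m, d) word representations of one requirement.
    Returns the attention weights (m,) and the pooled vector (d,)."""
    scores = np.tanh(H @ W.T + b) @ v   # similarity of each word with context vector v
    alpha = softmax(scores)             # importance of each word
    return alpha, alpha @ H             # weighted sum of word representations

rng = np.random.default_rng(0)
m, d, d_a = 6, 8, 5                     # words, hidden size, attention size (toy values)
H = rng.normal(size=(m, d))
alpha, s = attention_pool(H, rng.normal(size=(d_a, d)), np.zeros(d_a),
                          rng.normal(size=d_a))
print(alpha.shape, s.shape)  # (6,) (8,)
```

The same pooling form could be reused one level up, over requirement-level hidden states instead of word-level ones.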


Multiple Ability-aware in Job Requirement. In this part, we leverage the representations extracted by Single Ability-aware in Job Requirement to summarize the general needs of jobs. In most jobs, different ability requirements refer to different specific needs, and their importance varies a lot. For example, for recruiting a software engineer, education background is much less important than professional skills. Moreover, the order of ability requirements in the job description will also reflect their importance. With these intuitions, we first use a BiLSTM to model the sequential information of ability requirements. Then we add an attention layer to learn the importance of each ability requirement. Formally, the sequential ability representations $\{s^J_1, \dots, s^J_p\}$, learned in Single Ability-aware in Job Requirement, are used as input of a BiLSTM to generate a sequence of hidden state vectors $\{c^J_1, \dots, c^J_p\}$, i.e.,

$$c^J_t = \mathrm{BiLSTM}(s^J_{1:p}, t), \quad \forall t \in [1, \dots, p].$$

Similar to the first attention layer, we add another attention layer above the LSTMs to learn the importance of each ability requirement. Specifically, we calculate the importance $\beta_t$ of each ability requirement $j_t$ based on the similarity between its hidden state $c^J_t$ and the context vector $v_\beta$ of all the ability requirements, i.e.,

$$
\begin{aligned}
e^J_t &= v_\beta^{\top} \tanh(W_\beta c^J_t + b_\beta), \\
\beta_t &= \frac{\exp(e^J_t)}{\sum_{z=1}^{p} \exp(e^J_z)},
\end{aligned}
$$

where the parameters $W_\beta$, $b_\beta$ and the context vector $v_\beta$ are learned during training. Then, a latent multiple ability-aware job requirement vector $g^J$ will be calculated by the weighted sum of the hidden state vectors of abilities, i.e.,

$$g^J = \sum_{t=1}^{p} \beta_t \, c^J_t.$$

In particular, the attention scores $\beta$ can greatly improve the interpretability of the model. They are helpful for visualizing the importance of each ability requirement in practical recruitment applications.

Single Ability-aware in Experience. Now we turn to the learning of resume representations. Specifically, when a recruiter examines whether a candidate matches a job, she tends to focus on those specific skills related to this job, which can be reflected by the candidate's experiences. As shown in Figure 1, for candidate A, considering the fourth job requirement, we will pay more attention to the highlighted “green” sentences. Meanwhile, we may focus on the “blue” sentences when matching the second requirement. Thus, we design a novel ability-aware attention mechanism to quantify the ability-aware contribution of each word in a candidate experience to a specific ability requirement. Formally, for the $l'$-th candidate experience $r_{l'}$, its word-level semantic representation $\{h^R_{l',1}, \dots, h^R_{l',n_{l'}}\}$ is calculated by a BiLSTM. And we use an attention-based relation score $\gamma_{l,l',t}$ to quantify the ability-aware contribution of each semantic representation $h^R_{l',t}$ to the $l$-th ability requirement $j_l$. It can be calculated by

$$
\begin{aligned}
e^R_{l,l',t} &= v_\gamma^{\top} \tanh(W_\gamma s^J_l + U_\gamma h^R_{l',t}), \\
\gamma_{l,l',t} &= \frac{\exp(e^R_{l,l',t})}{\sum_{z=1}^{n_{l'}} \exp(e^R_{l,l',z})},
\end{aligned}
$$

where $W_\gamma$, $U_\gamma$, $v_\gamma$ are parameters, and $s^J_l$ is the semantic vector of ability requirement $j_l$ which is calculated by Equation 2.

Finally, the single ability-aware candidate experience representation $s^R_{l,l'}$ is calculated by the weighted sum of the word-level semantic representation of $r_{l'}$:

$$s^R_{l,l'} = \sum_{t=1}^{n_{l'}} \gamma_{l,l',t} \, h^R_{l',t}.$$

Here, the attention score $\gamma$ further enhances the interpretability of APJFNN. It enables us to understand whether and why a candidate is qualified for an ability requirement; we will give a deeper analysis in the experiments.
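The ability-aware attention above differs from plain self-attention in that the score of each experience word is conditioned on a requirement vector. A minimal NumPy sketch of that idea follows; the additive scoring form, the function name `ability_aware_attention` and all shapes are assumptions for illustration rather than a verified copy of the model.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def ability_aware_attention(s_req, H_exp, W, U, v):
    """s_req: (d,) vector of one ability requirement.
    H_exp: (n, d) word representations of one candidate experience."""
    scores = np.tanh(W @ s_req + H_exp @ U.T) @ v   # one score per experience word
    gamma = softmax(scores)                         # contribution of each word
    return gamma, gamma @ H_exp                     # requirement-specific summary

rng = np.random.default_rng(1)
d, d_a, n = 8, 5, 7
W, U, v = rng.normal(size=(d_a, d)), rng.normal(size=(d_a, d)), rng.normal(size=d_a)
H_exp = rng.normal(size=(n, d))
req_a, req_b = rng.normal(size=d), rng.normal(size=d)

# The same experience gets different word weights for different requirements.
gamma_a, s_a = ability_aware_attention(req_a, H_exp, W, U, v)
gamma_b, s_b = ability_aware_attention(req_b, H_exp, W, U, v)
print(gamma_a.round(2))
print(gamma_b.round(2))
```

Inspecting `gamma_a` and `gamma_b` side by side is the kind of per-requirement highlighting that the Figure 1 example describes.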

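Multiple Ability-aware in Experience. For a candidate, her ordered experiences can reveal her growth process well, and such temporal information can also benefit the evaluation of her abilities. To capture such temporal relationships between experiences, we leverage another BiLSTM. Specifically, we first add a mean-pooling layer above the single ability-aware candidate experience representations to generate the latent semantic vector $u^R_{l'}$ for the $l'$-th candidate experience $r_{l'}$:

$$u^R_{l'} = \frac{1}{p} \sum_{l=1}^{p} s^R_{l,l'}.$$

Now we get a set of semantic vectors for candidate experiences, that is $\{u^R_1, \dots, u^R_q\}$. Considering that there exist temporal relationships among them, we use a BiLSTM to chain them, i.e.,

$$c^R_t = \mathrm{BiLSTM}(u^R_{1:q}, t), \quad \forall t \in [1, \dots, q].$$

Finally, we use the weighted sum of the hidden states $\{c^R_1, \dots, c^R_q\}$ to generate the multiple ability-aware candidate experience representation $g^R$, i.e.,

$$
\begin{aligned}
e^R_t &= v_\delta^{\top} \tanh(W_\delta g^J + U_\delta c^R_t), \\
\delta_t &= \frac{\exp(e^R_t)}{\sum_{z=1}^{q} \exp(e^R_z)}, \\
g^R &= \sum_{t=1}^{q} \delta_t \, c^R_t.
\end{aligned}
$$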
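The two pooling steps in this part can be sketched as follows. This is a sketch under stated assumptions: the BiLSTM over experiences is replaced by an identity stand-in to keep the example short, and the weights of the final sum are assumed to come from an attention conditioned on a job posting vector. All names and sizes are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(2)
p, q, d, d_a = 4, 3, 8, 5    # requirements, experiences, hidden size, attention size

# S[l, k]: representation of experience k with respect to requirement l.
S = rng.normal(size=(p, q, d))
u = S.mean(axis=0)           # (q, d): mean-pooling over the p requirements

# Stand-in for the BiLSTM over experiences: the hidden states are just u here.
C = u

g_job = rng.normal(size=d)   # hypothetical job posting representation
W, U, v = rng.normal(size=(d_a, d)), rng.normal(size=(d_a, d)), rng.normal(size=d_a)
delta = softmax(np.tanh(W @ g_job + C @ U.T) @ v)   # one weight per experience
g_resume = delta @ C                                 # weighted sum of hidden states
print(u.shape, delta.shape, g_resume.shape)  # (3, 8) (3,) (8,)
```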

4.3. Person-Job Fit Prediction

With the process of Hierarchical Ability-aware Representation, we can jointly learn the representations for both job postings and resumes. To measure the matching degree between them, we finally treat them as input and apply a comparison mechanism based on a fully-connected network to learn the overall Person-Job Fit representation $D$ for predicting the label $y$ by a logistic function. The mathematical definition is as follows.

$$
\begin{aligned}
D &= \tanh(W_d [g^J; g^R; g^J - g^R] + b_d), \\
\hat{y} &= \mathrm{Sigmoid}(W_y D + b_y),
\end{aligned}
$$

where $W_d$, $b_d$, $W_y$, $b_y$ are the parameters to tune the network and $\hat{y} \in [0, 1]$. Meanwhile, we minimize the binary cross entropy to train our model.
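A minimal sketch of the prediction step and the training loss is given below. The comparison features (both vectors plus their difference) are an assumption of this sketch, as are the function names and sizes; the binary cross entropy itself is the standard definition.

```python
import numpy as np

def predict_fit(g_job, g_resume, W_d, b_d, w_y, b_y):
    # Assumed comparison features: both vectors plus their difference.
    feats = np.concatenate([g_job, g_resume, g_job - g_resume])
    hidden = np.tanh(W_d @ feats + b_d)                   # fully-connected layer
    return 1.0 / (1.0 + np.exp(-(w_y @ hidden + b_y)))    # logistic output in (0, 1)

def binary_cross_entropy(y, y_hat, eps=1e-12):
    y_hat = np.clip(y_hat, eps, 1.0 - eps)                # avoid log(0)
    return -(y * np.log(y_hat) + (1 - y) * np.log(1.0 - y_hat))

rng = np.random.default_rng(3)
d, d_h = 8, 6
y_hat = predict_fit(rng.normal(size=d), rng.normal(size=d),
                    rng.normal(size=(d_h, 3 * d)), np.zeros(d_h),
                    rng.normal(size=d_h), 0.0)
print(float(y_hat), float(binary_cross_entropy(1, y_hat)))
```

The loss is small when the predicted probability agrees with the label and grows without bound as the prediction approaches the wrong extreme, which is why the probabilities are clipped before taking logarithms.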

5. Experiments

In this section, we will introduce the experimental results based on a real-world recruitment data set. Meanwhile, some case studies are demonstrated for revealing interesting findings obtained by our model APJFNN.

5.1. Data Description

In this paper, we conducted our validation on a real-world data set, which was provided by a high tech company in China. To protect the privacy of candidates, all the job application records were anonymized by deleting personal information.

The data set consists of 17,766 job postings and 898,914 resumes with a range of several years. Specifically, four categories of job postings, namely Technology, Product, User Interface and Others were collected. Figure 4 summarizes the distribution of job postings and resumes, according to different categories. We find that most of the applications are technology-oriented, and only about 1% applications were accepted, which highlights the difficulty of talent recruitment. To a certain degree, this phenomenon may also explain the practical value of our work, as the results of Person-Job Fit may help both recruiters and job seekers to enhance the success rate.

Along this line, to ensure the quality of experiments, incomplete resumes (e.g., resumes without any experience records) were removed. Correspondingly, job postings without any successful applications were also removed. Finally, 3,652 job postings, 12,796 successful applications and 1,058,547 failed ones were kept in total, which leads to a typical imbalanced situation. Some basic statistics of the pruned data set are summarized in Table 1. Note that it is reasonable to have more applications than resumes, since one candidate could apply for several positions at the same time, as mentioned above.

Statistics                                      Values
# of job postings                               3,652
# of resumes                                    533,069
# of successful applications                    12,796
# of failed applications                        1,058,547
Average job requirements per posting            6.002
Average project/work experiences per resume     4.042
Average words per job requirement               9.151
Average words per project/work experience       65.810

Table 1. The statistics of the dataset

Figure 4. (a): The time distribution of successful job applications. (b): The distribution of different categories w.r.t. job posting and resume respectively. (c): The distribution of job requirements. (d): The words distribution of job requirement. (e): The distribution of candidate experiences. (f): The words distribution of candidate experience.

5.2. Experimental Setup

Here, we introduce the detailed settings of our experiments, including the technique of word embedding, parameters for our APJFNN, as well as the details of training stage.

Word Embedding. First, we explain the embedding layer, which is used to transfer the original “bag of words” input to a dense vector representation. In detail, we first used the Skip-gram Model (Mikolov et al., 2013) to pre-train the word embedding from job requirements and candidates' experiences. Then, we utilized the pre-trained word embedding results to initialize the embedding layer weight $W_e$, which was further fine-tuned during the training process of APJFNN. Specifically, the dimension of word vectors was set to 100.

APJFNN Setting. In the APJFNN model, according to the observations in Figure 4(c) and 4(e), we set the maximum number of job requirements in each job posting to 15, and applied the same limit to the candidate experiences in each resume. Then, the maximum number of words in each requirement and experience was set to 30 and 300, respectively. Along this line, the excess parts were removed. Also, the dimension of the hidden state in BiLSTM was set to 200 to learn the word-level joint representation and the requirement/experience representation. Finally, the dimension of the parameters used to calculate the attention scores $\alpha$ and $\beta$ was set to 200, and to 400 for $\gamma$ and $\delta$.
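The truncation rule described in this setting can be illustrated with a few lines of Python. The helper name `truncate` is hypothetical; the limits are the ones stated above (15 items, 30 words per requirement, 300 words per experience).

```python
def truncate(items, max_items, max_words):
    """Keep at most max_items items and at most max_words words per item."""
    return [words[:max_words] for words in items[:max_items]]

MAX_ITEMS = 15
requirements = [["word"] * 40 for _ in range(20)]   # 20 requirements of 40 words
experiences = [["word"] * 500 for _ in range(3)]    # 3 experiences of 500 words

req = truncate(requirements, MAX_ITEMS, 30)
exp = truncate(experiences, MAX_ITEMS, 300)
print(len(req), len(req[0]), len(exp), len(exp[0]))  # 15 30 3 300
```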

Training Setting. Following the idea in (Glorot and Bengio, 2010), we initialized all the matrix and vector parameters in our APJFNN model with a uniform distribution in $\left[-\sqrt{6/(n_{in} + n_{out})}, \sqrt{6/(n_{in} + n_{out})}\right]$, where $n_{in}$, $n_{out}$ denote the number of input and output units, respectively. Also, models were optimized using the Adam (Kingma and Ba, 2014) algorithm. Moreover, we set the batch size to 64 for training, and further used a dropout layer with probability 0.8 in order to prevent overfitting.
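For reference, the uniform initialization of Glorot and Bengio draws weights from a symmetric range whose width depends on the number of input and output units. A minimal NumPy sketch, with a hypothetical helper name and toy layer sizes, is:

```python
import numpy as np

def glorot_uniform(n_in, n_out, rng):
    """Sample an (n_out, n_in) weight matrix from U(-limit, limit),
    with limit = sqrt(6 / (n_in + n_out))."""
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_out, n_in))

rng = np.random.default_rng(0)
W = glorot_uniform(100, 200, rng)       # e.g. 100 inputs, 200 outputs
limit = np.sqrt(6.0 / 300)
print(W.shape, round(float(limit), 4))  # (200, 100) 0.1414
```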

Figure 5. An illustration of the proposed Basic Person-Job Fit Neural Network (BPJFNN)

5.3. Baseline Methods

To validate the performance of our APJFNN model, several state-of-the-art supervised models were selected as baseline methods, including classic supervised learning methods like Logistic Regression (LR), Decision Tree (DT), Adaboost (AB), Random Forests (RF) and Gradient Boosting Decision Tree (GBDT). For these baselines, we used two kinds of input features to construct the experiments separately.

  • Bag-of-words vectors. We first created the bag-of-words vectors of ability requirements and candidate experiences respectively, where the $i$-th dimension of each vector is the frequency of the $i$-th word in the dictionary. Then, the two vectors were spliced together as input.

  • Mean vector of word embedding. We respectively averaged the pre-trained word vector of the requirements and experiences, and then spliced them as model input.
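The two baseline feature constructions listed above can be sketched as follows. The vocabulary, the toy embeddings and the helper names are made up for illustration; in the experiments the embeddings would be the pre-trained word vectors.

```python
import numpy as np
from collections import Counter

def bag_of_words(words, vocab):
    counts = Counter(words)
    return np.array([counts[w] for w in vocab], dtype=float)  # word frequencies

def mean_embedding(words, emb, dim):
    vecs = [emb[w] for w in words if w in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

vocab = ["python", "java", "team", "learning"]
emb = {w: np.full(3, float(i)) for i, w in enumerate(vocab)}   # toy embeddings
req = ["python", "learning", "learning"]    # all requirement words of a posting
exp = ["java", "team", "python"]            # all experience words of a resume

# Requirement and experience vectors are concatenated ("spliced") as model input.
x_bow = np.concatenate([bag_of_words(req, vocab), bag_of_words(exp, vocab)])
x_emb = np.concatenate([mean_embedding(req, emb, 3), mean_embedding(exp, emb, 3)])
print(x_bow)  # [1. 0. 0. 2. 1. 1. 1. 0.]
print(x_emb)  # [2. 2. 2. 1. 1. 1.]
```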

Besides, we also propose an RNN-based model called Basic Person-Job Fit Neural Network (BPJFNN) as a baseline, which can be treated as a simplified version of our APJFNN model. The structure of the BPJFNN model is shown in Figure 5. Specifically, in this model, two BiLSTMs are used to obtain the semantic representation of each word in the requirements and in the experiences. Note that here we treat all the ability requirements in one job posting as a whole, i.e., as one “long sentence”, instead of as separate requirements, and we treat the experiences in a candidate resume in the same way. Then, we add a mean-pooling layer above each BiLSTM to obtain two semantic vectors, one for the requirements and one for the experiences. Finally, a prediction layer estimates the Person-Job Fit result label from these two vectors, and the parameters of this layer are learned during training.
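To make the pooling and prediction steps of such a baseline concrete, the following sketch mean-pools per-word hidden states and scores the pair with a single logistic unit. It is only an assumption-laden illustration: the hidden states stand in for BiLSTM outputs, and the concatenation, the logistic layer, the weights, and the names `mean_pool` and `predict_fit` are hypothetical rather than the exact prediction equations of BPJFNN.

```python
import math

def mean_pool(hidden_states):
    """Average a sequence of per-word hidden states into one vector."""
    n = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(h[k] for h in hidden_states) / n for k in range(dim)]

def predict_fit(s_job, s_resume, weights, bias):
    """Score a (job, resume) pair with one logistic unit over the concatenated vectors.

    The logistic layer is a stand-in chosen for this sketch.
    """
    features = s_job + s_resume
    logit = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

# Toy hidden states standing in for BiLSTM outputs (hypothetical values).
job_states = [[0.2, 0.4], [0.6, 0.0]]
resume_states = [[0.1, 0.3], [0.3, 0.5], [0.5, 0.1]]

s_job = mean_pool(job_states)
s_resume = mean_pool(resume_states)
prob = predict_fit(s_job, s_resume, weights=[0.5, -0.5, 0.5, -0.5], bias=0.0)
```

Because mean-pooling weights every word equally, this baseline cannot single out the requirement or experience that drives a match, which is the gap the attention strategies of APJFNN are meant to fill.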

Methods              Accuracy  Precision  Recall  F1      AUC
LR                   0.6228    0.6232     0.6261  0.6246  0.6787
AB                   0.6905    0.7028     0.6628  0.6822  0.7642
DT                   0.6831    0.7492     0.5527  0.6361  0.7355
RF                   0.7023    0.7257     0.6526  0.6872  0.7772
GBDT                 0.7281    0.7517     0.6831  0.7157  0.8108
LR (with word2vec)   0.6479    0.6586     0.6175  0.6374  0.6946
AB (with word2vec)   0.6342    0.6491     0.5878  0.6170  0.6823
DT (with word2vec)   0.5837    0.5893     0.5589  0.5737  0.6249
RF (with word2vec)   0.6358    0.6551     0.5769  0.6135  0.7020
GBDT (with word2vec) 0.6389    0.6444     0.6237  0.6339  0.7006
BPJFNN               0.7156    0.7541     0.6417  0.6934  0.7818
APJFNN               0.7559    0.7545     0.7603  0.7574  0.8316
Table 2. The performance of APJFNN and baselines.

5.4. Evaluation Metrics

In the real-world process of talent recruitment, there is usually an implicit “threshold” for picking adequate candidates, which results in a certain “ratio of acceptance”. However, this acceptance rate is hard to determine properly, as it can be a personalized value affected by complicated factors. Thus, to validate the performance comprehensively, we selected the AUC index to measure the performance under different situations. Besides, we also adopted Accuracy, Precision, Recall and F1-measure as evaluation metrics.
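For reference, the five metrics can be computed as in the sketch below, which uses standard textbook definitions on made-up labels and scores. The 0.5 decision threshold and the pairwise formulation of AUC are choices made for this example; AUC itself does not depend on any threshold, which is why it suits the setting where the acceptance ratio is unknown.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = fit, 0 = no fit)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy labels and scores (hypothetical).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

accuracy, precision, recall, f1 = classification_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
```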

5.5. Experimental Result

Overall Results. We conducted the Person-Job Fit task on the real-world data set, using the successful job applications as positive samples and the failed applications as negative samples to train the models. To reduce the impact of class imbalance, we used under-sampling: for each job posting, we randomly selected as many negative instances as there were positive instances. (Footnote 1: Since the number of failed applications was smaller than the number of successful applications for some job postings, we finally obtained 12,762 negative samples. The numbers of training, validation and testing samples are 20,446, 2,556 and 2,556, respectively.) We then randomly selected 80% of the data set as training data, another 10% for tuning the parameters, and the last 10% as test data to validate the performance.
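The under-sampling and splitting procedure described here can be sketched as follows. The record layout (`job_id`, `label`), the helper names and the rounding rule are assumptions made for the illustration. Capping the negatives at the number available per posting mirrors the remark that some postings have fewer failed than successful applications.

```python
import random
from collections import defaultdict

def undersample_per_posting(applications, rng):
    """Keep all positives and, per job posting, at most as many random negatives."""
    by_job = defaultdict(lambda: {1: [], 0: []})
    for app in applications:
        by_job[app["job_id"]][app["label"]].append(app)
    sampled = []
    for groups in by_job.values():
        positives, negatives = groups[1], groups[0]
        k = min(len(positives), len(negatives))  # a posting may have too few negatives
        sampled.extend(positives)
        sampled.extend(rng.sample(negatives, k))
    return sampled

def split_80_10_10(samples, rng):
    """Shuffle and split into train/validation/test (rounding rule is an assumption)."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = (80 * len(shuffled) + 50) // 100
    n_valid = (10 * len(shuffled) + 50) // 100
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

# Toy data: posting A has 2 positives and 5 negatives, posting B has 3 and 1.
apps = ([{"job_id": "A", "label": 1}] * 2 + [{"job_id": "A", "label": 0}] * 5
        + [{"job_id": "B", "label": 1}] * 3 + [{"job_id": "B", "label": 0}] * 1)
rng = random.Random(0)
balanced = undersample_per_posting(apps, rng)
train, valid, test = split_80_10_10(balanced, rng)
```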

The performance is shown in Table 2. According to the results, APJFNN outperforms all the baselines by a clear margin, which indicates that our framework can distinguish adequate candidates for given job postings well. In particular, as APJFNN performs better than BPJFNN, it seems that our attention strategies not only identify the critical abilities and experiences for better explanation, but also improve the performance through better estimation of the matching results.

At the same time, we find that almost all the baselines using Bag-of-Words input features outperform those using pre-trained word vectors as input features (i.e., those with “word2vec” in Table 2). This may indicate that pre-trained word vectors alone are not enough to characterize the semantic features of the recruitment textual data, which is why we use a BiLSTM above the embedding layer to extract word-level semantic representations.

The Robustness on Different Data Split. To observe how our model performs under different train/test splits, we randomly selected 80%, 70%, 60%, 50% and 40% of the dataset as the training set, another 10% for tuning the parameters, and the rest as the testing set. (Footnote 2: The numbers of samples in the training/validation/testing sets were 20,446/2,556/2,556; 17,891/2,556/5,111; 15,335/2,556/7,667; 12,779/2,556/10,223 and 10,223/2,556/12,779, respectively.) The results are shown in Figures 6(a) and 6(b). We observe that the overall performance of our model is relatively stable, and that it improves as the training data increases. Indeed, the improvements of the best performance over the worst one are only 5.44% and 2.99% for the two metrics, respectively. Furthermore, we find that our model trained on 60% of the data already outperforms all the baseline methods, which use 80% of the data for training. These results support the robustness of our model with respect to training set size.
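The sample counts reported for each split can be reproduced from the total of 25,558 samples (20,446 + 2,556 + 2,556). The sketch below does so under the assumption that percentages are rounded to the nearest integer and that the test set takes the remainder. This rounding rule is a guess that happens to match the reported numbers, not a documented detail of the experiments.

```python
TOTAL = 20446 + 2556 + 2556  # total number of samples after under-sampling

def split_sizes(total, train_pct, valid_pct=10):
    """Return (train, validation, test) sizes; the test set takes the remainder."""
    n_train = (train_pct * total + 50) // 100  # round to nearest integer
    n_valid = (valid_pct * total + 50) // 100
    return n_train, n_valid, total - n_train - n_valid

sizes = [split_sizes(TOTAL, pct) for pct in (80, 70, 60, 50, 40)]
for pct, (n_train, n_valid, n_test) in zip((80, 70, 60, 50, 40), sizes):
    print(f"{pct}% train: {n_train}/{n_valid}/{n_test}")
```

Note that the validation set is held at 10% throughout, so shrinking the training set only enlarges the testing set.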

(a) The F1 performance
(b) The AUC performance
Figure 6. The performance of APJFNN at different train/test split.
Figure 7. The training efficiency of APJFNN at different train/test split.
Figure 8. Two examples for demonstrating the advantage of Attention in capturing the informed part of the ability requirement sentence.
Figure 9. An example demonstrating the advantage of Attention in measuring the importance of each ability requirement among all the job needs. The bar charts on the left denote the distribution of attention weights over all requirements.

Computational Efficiency. Here we evaluate the computational efficiency of our APJFNN model. Specifically, all of our experiments were conducted on a server with a 2-core CPU@2.40GHz, 160GB RAM, and a Tesla K40m GPU. First, we present the training time under different data splits. As shown in Figure 7, the training time of our model does not increase dramatically as the training data grows. Although our model is slower than BPJFNN, it achieves better performance, as presented in Table 2. Moreover, after training, the average cost per instance in the testing set is 13.46 ms. This suggests that our model can be used effectively in real-world recruitment analysis systems.

5.6. Case Study

With the proposed attention strategies, we aim not only to improve the matching performance, but also to enhance the interpretability of the matching results. To that end, in this subsection, we illustrate the matching results at three different levels by visualizing the attention results.

 Word-level: Capturing the key phrases from the sentences of job requirement.

Firstly, we would like to evaluate whether our APJFNN model can reveal the word-level key phrases in long job requirement sentences. The corresponding case study is shown in Figure 8, in which some words (in Chinese) are highlighted as key phrases, and their darkness is correlated with the word-level attention value.

According to the results, it is unsurprising that the crucial skills are highlighted compared with common words. Furthermore, within the same requirement, different abilities may have different importance. For instance, in the requirement in line 1, which is technique-related, C/Python/R is more important than Hadoop, which might be due to the different required degrees (“proficient” vs. “familiar”). Similarly, for the product-related requirement in line 2, more specific skills are more important, e.g., data analysis compared with logical thinking.

 Ability-level: Measuring the different importance among all abilities.

Secondly, we would like to evaluate whether APJFNN can highlight the most critical abilities. The corresponding case study is shown in Figure 9, in which the histogram indicates the importance of each ability, i.e., the distribution of the ability-level attention.

From the figure, a striking contrast can be observed among the 6 abilities. The bachelor degree, which has the lowest significance, is usually treated as a basic requirement. In contrast, the ability to conduct business negotiation independently can be quite beneficial in practice, which leads to the highest significance. In other words, the importance of an ability may reflect its scarcity: most candidates have a bachelor degree, but only a few of them can carry out business negotiation independently.

Figure 10. An example for demonstrating the advantage of Attention in capturing the ability-aware informed part from the experience of candidate.

 Matching-level: Understanding the matching between job requirements and candidate experiences.

Lastly, we would like to evaluate how the APJFNN model can guide the matching between requirements and experiences. The corresponding case study is shown in Figure 10, in which darkness is again correlated with the importance of each part of the experience with respect to the different job requirements, i.e., the matching-level attention value.

Indeed, we find that the key phrases that satisfy the requirements are highlighted, e.g., WeChat public platform and focus on social products for the requirement SNS, forums. We also see that the “importance” here indicates the degree to which a requirement is satisfied. For instance, the phrase WeChat public platform (a famous SNS in China) is darker than ordering APP, since the former is strongly related to the SNS requirement, while the latter is only a rough match. Thus, this case study also shows that our APJFNN method can provide good interpretability for the Person-Job Fit task, since the key clues for matching job requirements and candidate experiences are highlighted.

6. Conclusions

In this paper, we proposed a novel end-to-end Ability-aware Person-Job Fit Neural Network (APJFNN) model, which aims to reduce the dependence on manual labour and to provide better interpretation of the fitting results. The key idea is to exploit the rich information in abundant historical job application data. Specifically, we first proposed a word-level semantic representation for both job requirements and job seekers’ experiences based on Recurrent Neural Network (RNN). Then, four hierarchical ability-aware attention strategies were designed to measure the different importance of job requirements for semantic representation, as well as the different contribution of each job experience to a specific ability requirement. Finally, extensive experiments conducted on a large-scale real-world data set validate the effectiveness and interpretability of our APJFNN framework compared with several baselines.

7. Acknowledgments

This work was partially supported by grants from the National Natural Science Foundation of China (Grant Nos. 91746301, U1605251 and 61703386).


Admittedly, while the accuracy of the algorithm is essential, another paramount issue is ensuring the fairness of the algorithm and instilling the correct values in an intelligent recruitment system. In recent years, this issue has received extensive attention from academics and the media (Dastin, 2018). For machine learning based algorithms, avoiding bias in the training data, such as a significant difference in the employment ratio of women to men, is necessary for fairness. Unfortunately, in many existing real-life recruitment practices, such prejudices seem hard to avoid completely. For example, according to a recent report (Test-doctoring, 2018), medicine has long been a male bastion at Tokyo Medical University, which confessed to marking down the test scores of female applicants to keep the ratio of women in each class below 30%.

So, in constructing an intelligent recruitment system, one question that must be answered is: if we already have a dataset with a potential value discrepancy, how can we avoid further misleading the algorithm? Intuitively, if data with gender bias are used to train machine learning models for intelligent recruitment, Gender would be regarded as a dominant feature under common feature engineering practice, since the Chi-squared test, information gain and correlation coefficient would all indicate that it is significantly correlated with the recruitment result. Therefore, the Gender feature is a potential factor affecting the values of the machine learning algorithm. Our conjecture is that the Gender feature should not be used to train the model. To confirm this conjecture, we adjust equation 3 so that the Gender feature is included as an additional input.

We evaluate on semi-synthetic data based on a real-world recruitment system. First of all, we constructed a “balanced dataset” in terms of gender. Specifically, we randomly selected 5,678 successful job applications (positive instances) from the recruitment records of historical job postings, half of which are from female candidates. Then, for each of the job postings, we also randomly selected the same number of failed job applications (negative instances). In particular, the numbers of male and female candidates are equal in both the successful and the failed applications. Next, in the model validation step, we randomly selected 80% of the dataset as training data, another 10% for tuning the parameters, and the last 10% as test data to validate the performance and robustness. At the same time, in order to simulate a possible unfairness scenario in the recruitment system, we randomly relabeled 50% of the successful female applications as negative and 50% of the failed male applications as positive in the training and validation sets. After this manual construction, the success rates of male and female candidates in both the training and validation sets become 75% and 25%, respectively. Note that we did not change the labels in the test set, which keeps the same acceptance ratio for women and men as the “balanced dataset”, so that it reflects the correct values.
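The label-flipping step can be checked with a small simulation. The sketch below builds a toy gender-balanced set and relabels half of the successful female and half of the failed male applications, which yields the 75% and 25% success rates stated above. The record layout and the function names `inject_gender_bias` and `success_rate` are hypothetical.

```python
import random

def inject_gender_bias(samples, rng, flip_ratio=0.5):
    """Relabel a fraction of female positives as negative and male negatives as positive."""
    biased = [dict(s) for s in samples]  # copy so the balanced data stays untouched
    female_pos = [s for s in biased if s["gender"] == "F" and s["label"] == 1]
    male_neg = [s for s in biased if s["gender"] == "M" and s["label"] == 0]
    for s in rng.sample(female_pos, int(len(female_pos) * flip_ratio)):
        s["label"] = 0
    for s in rng.sample(male_neg, int(len(male_neg) * flip_ratio)):
        s["label"] = 1
    return biased

def success_rate(samples, gender):
    """Share of positive labels among candidates of the given gender."""
    group = [s for s in samples if s["gender"] == gender]
    return sum(s["label"] for s in group) / len(group)

# Toy balanced set: 100 samples for each (gender, label) combination.
balanced = [{"gender": g, "label": y}
            for g in ("F", "M") for y in (0, 1) for _ in range(100)]
biased = inject_gender_bias(balanced, random.Random(0))

female_rate = success_rate(biased, "F")
male_rate = success_rate(biased, "M")
```

In the experiment this relabeling is applied only to the training and validation sets, so the unmodified test set measures how far a model has absorbed the injected bias.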

Table 3 shows the performance on the validation set and the testing set of the semi-synthetic data. We observe that, with the Gender feature, each model performs better on the validation set, since the validation set has a similar distribution to the training set. In other words, those models have unfortunately learned the value bias that exists therein. In contrast, all the models perform better on the testing set without the gender information, which demonstrates that, without leveraging the Gender feature, the models can largely avoid inheriting the value deviation from the training data. Therefore, we conclude that when a historical recruitment dataset contains a biased data distribution, such as gender discrimination, we should not use the corresponding features to train the model, so that the algorithm does not reproduce human-like value deviations.

                                     Without gender feature                       With gender feature
Methods              Datasets        Accuracy  Precision  Recall  F1      AUC     Accuracy  Precision  Recall  F1      AUC
LR                   Validation set  0.5122    0.5126     0.4957  0.5040  0.5348  0.6783    0.6758     0.6852  0.6805  0.7063
LR                   Testing set     0.5855    0.5913     0.5913  0.5913  0.6093  0.5203    0.5281     0.5061  0.5169  0.5693
AB                   Validation set  0.5713    0.5724     0.5635  0.5679  0.5847  0.7217    0.7040     0.7652  0.7333  0.7882
AB                   Testing set     0.6402    0.6567     0.6087  0.6318  0.6770  0.5459    0.5549     0.5270  0.5406  0.6274
DT                   Validation set  0.5870    0.6179     0.4557  0.5245  0.5951  0.7261    0.7167     0.7478  0.7319  0.7744
DT                   Testing set     0.6711    0.7349     0.5496  0.6289  0.6807  0.5079    0.5159     0.4800  0.4973  0.5701
RF                   Validation set  0.5991    0.6096     0.5513  0.5790  0.6118  0.7148    0.6939     0.7687  0.7294  0.7531
RF                   Testing set     0.6279    0.6527     0.5687  0.6078  0.6857  0.5141    0.5211     0.5165  0.5188  0.5807
GBDT                 Validation set  0.5913    0.5953     0.5704  0.5826  0.6290  0.7200    0.7030     0.7617  0.7312  0.7945
GBDT                 Testing set     0.6896    0.7069     0.6626  0.6840  0.7436  0.5194    0.5271     0.5078  0.5173  0.6208
LR (with word2vec)   Validation set  0.5652    0.5693     0.5357  0.5520  0.5985  0.7113    0.6989     0.7426  0.7201  0.7625
LR (with word2vec)   Testing set     0.5873    0.6011     0.5530  0.5761  0.6140  0.5079    0.5150     0.5078  0.5114  0.5642
AB (with word2vec)   Validation set  0.5626    0.5655     0.5409  0.5529  0.5780  0.7217    0.7121     0.7443  0.7280  0.7685
AB (with word2vec)   Testing set     0.5540    0.5647     0.5235  0.5433  0.5860  0.5256    0.5322     0.5322  0.5322  0.5565
DT (with word2vec)   Validation set  0.5313    0.5304     0.5461  0.5381  0.5577  0.7243    0.7067     0.7670  0.7356  0.7435
DT (with word2vec)   Testing set     0.5502    0.5534     0.5861  0.5693  0.5853  0.4929    0.5000     0.5009  0.5004  0.5340
RF (with word2vec)   Validation set  0.5565    0.5610     0.5200  0.5397  0.5756  0.6991    0.6856     0.7357  0.7097  0.7332
RF (with word2vec)   Testing set     0.5847    0.6057     0.5183  0.5586  0.6301  0.5212    0.5282     0.5217  0.5249  0.5387
GBDT (with word2vec) Validation set  0.5809    0.5841     0.5617  0.5727  0.5983  0.7157    0.7033     0.7461  0.7241  0.7687
GBDT (with word2vec) Testing set     0.5970    0.6105     0.5670  0.5979  0.6317  0.5088    0.5157     0.5130  0.5144  0.5587
PJFNN-RNN            Validation set  0.5974    0.6261     0.4835  0.5456  0.6443  0.7183    0.6865     0.8035  0.7404  0.7858
PJFNN-RNN            Testing set     0.6296    0.6824     0.5043  0.5800  0.7034  0.5397    0.5425     0.5878  0.5643  0.6178
APJFNN               Validation set  0.6191    0.6179     0.6243  0.6211  0.6681  0.7157    0.6649     0.8696  0.7536  0.8091
APJFNN               Testing set     0.7425    0.7386     0.7617  0.7500  0.8036  0.5917    0.5780     0.7217  0.6419  0.6444
Table 3. The performance of APJFNN and baselines on semi-synthetic data.


  • Bahdanau et al. (2014) Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014).
  • Berant et al. (2014) Jonathan Berant, Vivek Srikumar, Pei-Chun Chen, Abby Vander Linden, Brittany Harding, Brad Huang, Peter Clark, and Christopher D Manning. 2014. Modeling Biological Processes for Reading Comprehension. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 1499–1510.
  • Chaudhary (2017) Meenakshi Chaudhary. 2017. LinkedIn by the numbers: 2017 statistics. https://www.linkedin.com/pulse/linkedin-numbers-2017-statistics-meenakshi-chaudhary/. (2017).
  • Cheng et al. (2013) Yu Cheng, Yusheng Xie, Zhengzhang Chen, Ankit Agrawal, Alok Choudhary, and Songtao Guo. 2013. Jobminer: A real-time system for mining job-related patterns from social media. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1450–1453.
  • Cherry and Quirk (2008) Colin Cherry and Chris Quirk. 2008. Discriminative, syntactic language modeling through latent svms. Proceedings of the 8th Conference of Association for Machine Translation in the America.
  • Cho et al. (2014) Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 1724–1734.
  • Dastin (2018) Jeffrey Dastin. 2018. Amazon scraps secret AI recruiting tool that showed bias against women. https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G. (2018).
  • Dong et al. (2015) Li Dong, Furu Wei, Ming Zhou, and Ke Xu. 2015. Question Answering over Freebase with Multi-Column Convolutional Neural Networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics. 260–269.
  • Elman (1990) Jeffrey L Elman. 1990. Finding structure in time. Cognitive science 14, 2 (1990), 179–211.
  • Glorot and Bengio (2010) Xavier Glorot and Yoshua Bengio. 2010. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics. 249–256.
  • Gomaa and Fahmy (2013) Wael H Gomaa and Aly A Fahmy. 2013. A survey of text similarity approaches. International Journal of Computer Applications 68, 13 (2013).
  • Harris (2017) Christopher G Harris. 2017. Finding the Best Job Applicants for a Job Posting: A Comparison of Human Resources Search Strategies. In Proceedings of the 2017 International Conference on Data Mining Workshops. IEEE, 189–194.
  • He et al. (2015) Hua He, Kevin Gimpel, and Jimmy J Lin. 2015. Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 1576–1586.
  • Hermann et al. (2015) Karl Moritz Hermann, Tomas Kocisky, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. 2015. Teaching machines to read and comprehend. In Proceedings of the 28th International Conference on Neural Information Processing Systems. 1693–1701.
  • Hochreiter and Schmidhuber (1997) Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural computation 9, 8 (1997), 1735–1780.
  • Javed et al. (2017) Faizan Javed, Phuong Hoang, Thomas Mahoney, and Matt McNair. 2017. Large-Scale Occupational Skills Normalization for Online Recruitment. In Proceedings of the 31st AAAI Conference on Artificial Intelligence. 4627–4634.
  • Kalchbrenner et al. (2014) Nal Kalchbrenner, Edward Grefenstette, and Phil Blunsom. 2014. A convolutional neural network for modelling sentences. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics.
  • Kim (2014) Yoon Kim. 2014. Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 (2014).
  • Kingma and Ba (2014) Diederik Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
  • LeCun et al. (1998) Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. 1998. Gradient-based learning applied to document recognition. Proc. IEEE 86, 11 (1998), 2278–2324.
  • Lee and Brusilovsky (2007) Danielle H Lee and Peter Brusilovsky. 2007. Fighting information overflow with personalized comprehensive information access: A proactive job recommender. In Proceedings of the 3rd International Conference on Autonomic and Autonomous Systems. IEEE, 21–21.
  • Lin et al. (2017) Hao Lin, Hengshu Zhu, Yuan Zuo, Chen Zhu, Junjie Wu, and Hui Xiong. 2017. Collaborative Company Profiling: Insights from an Employee’s Perspective. In Proceedings of the 31st AAAI Conference on Artificial Intelligence. 1417–1423.
  • Malinowski et al. (2006) Jochen Malinowski, Tobias Keim, Oliver Wendt, and Tim Weitzel. 2006. Matching people and jobs: A bilateral recommendation approach. In Proceedings of the 39th Annual Hawaii International Conference on System Sciences, Vol. 6. IEEE, 137c–137c.
  • Management (2016) Society For Human Resource Management. 2016. 2016 Human Capital Benchmarking Report. https://www.shrm.org/hr-today/trends-and-forecasting/research-and-surveys/Documents/2016-Human-Capital-Report.pdf. (2016).
  • Melville et al. (2009) Prem Melville, Wojciech Gryc, and Richard D Lawrence. 2009. Sentiment analysis of blogs by combining lexical knowledge with text classification. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1275–1284.
  • Mikolov et al. (2013) Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Proceedings of the 26th International Conference on Neural Information Processing Systems, Vol. 2. 3111–3119.
  • Nallapati et al. (2016) Ramesh Nallapati, Bowen Zhou, Cicero dos Santos, Caglar Gulcehre, and Bing Xiang. 2016. Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond. In Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning. 280–290.
  • Paparrizos et al. (2011) Ioannis Paparrizos, B Barla Cambazoglu, and Aristides Gionis. 2011. Machine learned job recommendation. In Proceedings of the 5th ACM International Conference on Recommender Systems. ACM, 325–328.
  • Sekiguchi (2004) Tomoki Sekiguchi. 2004. Person-organization fit and person-job fit in employee selection: A review of the literature. Osaka keidai ronshu 54, 6 (2004), 179–196.
  • Severyn and Moschitti (2015) Aliaksei Severyn and Alessandro Moschitti. 2015. Learning to rank short text pairs with convolutional deep neural networks. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 373–382.
  • Tang et al. (2015) Duyu Tang, Bing Qin, and Ting Liu. 2015. Document Modeling with Gated Recurrent Neural Network for Sentiment Classification. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 1422–1432.
  • Test-doctoring (2018) Toxic Test-doctoring. 2018. Test-doctoring to keep Japanese women out of medical school. https://www.economist.com/asia/2018/08/09/test-doctoring-to-keep-japanese-women-out-of-medical-school. (2018).
  • Wang et al. (2013) Jian Wang, Yi Zhang, Christian Posse, and Anmol Bhasin. 2013. Is it time for a career switch? In Proceedings of the 22nd International Conference on World Wide Web. ACM, 1377–1388.
  • Wang and Manning (2012) Sida Wang and Christopher D Manning. 2012. Baselines and bigrams: Simple, good sentiment and topic classification. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Vol. 2. 90–94.
  • Xu et al. (2016) Huang Xu, Zhiwen Yu, Jingyuan Yang, Hui Xiong, and Hengshu Zhu. 2016. Talent Circle Detection in Job Transition Networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 655–664.
  • Xu et al. (2018) Tong Xu, Hengshu Zhu, Chen Zhu, Pan Li, and Hui Xiong. 2018. Measuring the Popularity of Job Skills in Recruitment Market: A Multi-Criteria Approach. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence.
  • Yang and Pedersen (1997) Yiming Yang and Jan O Pedersen. 1997. A comparative study on feature selection in text categorization. In Proceedings of the Fourteenth International Conference on Machine Learning. 412–420.
  • Zhang et al. (2016a) Qi Zhang, Yang Wang, Yeyun Gong, and Xuanjing Huang. 2016a. Keyphrase Extraction Using Deep Recurrent Neural Networks on Twitter. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 836–845.
  • Zhang et al. (2016b) XianXing Zhang, Yitong Zhou, Yiming Ma, Bee-Chung Chen, Liang Zhang, and Deepak Agarwal. 2016b. Glmix: Generalized linear mixed models for large-scale response prediction. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 363–372.
  • Zhang et al. (2014) Yingya Zhang, Cheng Yang, and Zhixiang Niu. 2014. A research of job recommendation system based on collaborative filtering. In Proceedings of the 7th International Symposium on Computational Intelligence and Design, Vol. 1. IEEE, 533–538.
  • Zhu et al. (2016) Chen Zhu, Hengshu Zhu, Hui Xiong, Pengliang Ding, and Fang Xie. 2016. Recruitment market trend analysis with sequential latent variable models. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 383–392.