Modified Password Guessing Methods Based on TarGuess-I

Xie, Zhijie; Zhang, Min; Guo, Yuqi; Li, Zhenhan; Wang, Hongjun

doi:https://doi.org/10.1155/2020/8837210

Wireless Communications and Mobile Computing


Research Article | Open Access

Volume 2020 | Article ID 8837210 | https://doi.org/10.1155/2020/8837210


Zhijie Xie,¹ Min Zhang,¹ Yuqi Guo,² Zhenhan Li,¹ and Hongjun Wang¹

Academic Editor: Ding Wang

Received 03 Sept 2020

Revised 29 Sept 2020

Accepted 17 Oct 2020

Published 02 Dec 2020

Abstract

TarGuess-I is a leading online targeted password guessing model that uses users’ personally identifiable information (PII), proposed at ACM CCS 2016 by Wang et al. It has attracted widespread attention in password security owing to its superior guessing performance. Yet, after analyzing users’ vulnerable behaviors of choosing popular passwords and constructing passwords from their PII, we find that this model does not take into account popular passwords, keyboard patterns, and special strings. Special strings are strings that are related to a user but do not appear in the user’s demographic information. Thus, we propose a modified password guessing model with three semantic methods: (1) identifying popular passwords by generating top-300 lists from similar websites, (2) recognizing keyboard patterns by relative position, and (3) catching special strings by extracting continuous characters from user-generated PII. We conduct a series of evaluations on six large-scale real-world leaked password datasets. The experimental results show that our modified model outperforms TarGuess-I by 2.62% within 100 guesses.

1. Introduction

Password-based authentication is still an essential method in cybersecurity [1]. The understanding of password security has gone through several stages, from heuristic methods that lack theoretical foundations to algorithms that conform to strict probability models [2]. Since the emergence of Markov-based [3, 4] and probabilistic context-free grammar- (PCFG-) based [5, 6] password guessing algorithms, trawling password guessing has been intensively studied [7–10]. Recently, several large-scale leaks of personal information databases have caused widespread concern in the security community [11–14]. With the development of related research, targeted password guessing algorithms using users’ personally identifiable information (PII) have emerged [15–17].

Das et al. [15] studied the threat posed by password reuse and proposed the first cross-site cracking algorithm. However, because it does not consider common popular passwords, this algorithm is not optimal. Li et al. [16] studied to what extent a user’s PII affects password security and proposed a semantics-rich model, Personal-PCFG, which adopted length-based PII matching and substitution. But it could not accurately capture how users’ PII is used, which greatly hindered the cracking efficiency. As a seminal work on password guessing, Wang et al. [17] put forward a framework, TarGuess, which systematically characterizes typical targeted guessing scenarios. It contains a type-based PII semantics-aware PCFG and the recognition of password reuse behaviors, both of which significantly outperform the former cracking algorithms. Their work has motivated successive new studies on password security [18–20] and even led to the revision of NIST SP800-63-3 [21, 22].

The TarGuess framework was proposed after an in-depth analysis of users’ vulnerable behaviors. The framework includes four password guessing models, TarGuess-I to TarGuess-IV, for four attacking scenarios #1 to #4. TarGuess-I caters for scenario #1, where the attacker is equipped with the users’ explicit PII, such as name, birthday, and phone number. The users’ explicit PII can be easily obtained from the Internet [23] and can be the building blocks of passwords [17]. The remaining three models require user information such as PII attributes that play an implicit role in passwords (e.g., gender and profession) and/or sister passwords that were leaked from the user’s other accounts. This work mainly focuses on scenario #1. As more users’ PII is being leaked these days, attacking scenario #1 becomes more practical.

Wang et al. [17] showed that their model is more efficient than previous algorithms that use users’ PII to crack passwords, and that it can achieve success rates over 20% with just 100 guesses. Whether the success rate of guessing models can be improved further has become a research hotspot [24]. After analyzing users’ vulnerable behaviors in constructing passwords based on TarGuess-I, we find that some attributes are missing in TarGuess-I. Therefore, based on TarGuess-I, we put forward three modified methods and conduct a series of experiments to examine their feasibility. In the end, we propose a modified model that includes our three improvements. Extensive evaluations show that it outperforms the original model by 2.62% within 100 guesses.

1.1. Our Triple Contributions

This article is an extended version of the paper [25]. In this work, we make the following key contributions.

1.1.1. A Modified Password Guessing Model

After analyzing users’ vulnerable behaviors in constructing passwords on a total of 163,041,192 publicly leaked records based on TarGuess-I, we find that some effective semantic tags have not been tested or employed in TarGuess-I. To fill the gap, we make use of the adaptiveness of PII tags and define three new tags: the popular password tag, the keyboard pattern tag, and the special string tag. This gives rise to a modified variant of TarGuess-I.

1.1.2. An Extensive Evaluation

To demonstrate the feasibility of these incremental tags, we perform a series of experiments on six large-scale real-world leaked datasets. The experimental results show that our single-tag-modified models (TarGuess-I with each of the tags we defined added individually) outperform TarGuess-I by 0.75% at best and 0.33% on average within 100 guesses. In particular, our fully modified model works best among the 10 models we experimented with. Given the same user PII as TarGuess-I, it can crack a target user’s password with an optimal chance of 20.9% within 100 guesses, which outperforms TarGuess-I by 2.62% (the target users come from the four testing sites; see Table 1).

1.1.3. A New Insight

We propose a new method to modify the password guessing model: parsing passwords into a special string tag that covers strings, such as anniversary dates or someone’s name, that do not appear in users’ demographic PII. Such strings can be identified by adding incremental information to the model or by refining the model’s recognition of user-generated PII (such as e-mail addresses and user names). This method gives a new insight into targeted password guessing.

2. Preliminaries

TarGuess-I is a targeted guessing model that uses users’ PII and builds on the PCFG-based algorithm. This section explains which kinds of users’ vulnerable behaviors are considered in this work and gives a brief introduction to the PCFG-based algorithm and TarGuess-I.

2.1. Explication of Users’ Vulnerable Behaviors

Users’ vulnerable behaviors are the key factor influencing password crackability [26]. A series of related studies have been conducted since the pioneering work of Morris and Thompson in 1979 [27]. Some of these studies are based on data analyses, such as [3, 12, 14, 28–31], and the others are based on user surveys, such as [15, 32–35]. In summary, the discovered users’ vulnerable behaviors can be classified into the following three categories.

2.1.1. Popular Passwords

A large number of studies (such as [3, 14, 30]) have shown that users often choose simple words as passwords or apply simple transformations to meet a website’s password policy; for example, “123456a” meets an “alphanumeric” policy. These strings, which are frequently used by users, are called popular passwords. Furthermore, Wang et al. [36] have found that the Zipf distribution is the main cause of the aggregation of popular passwords.

2.1.2. Password Reuse

After a series of interviews investigating how users cope with keeping track of many accounts and passwords, Stobert and Biddle [32] point out that users have more than 20 accounts on average. It is nearly impossible for them to create a unique password for each account, so reusing passwords is a rational approach. At the same time, password reuse is a vulnerable behavior; the key question is how passwords are reused.

2.1.3. Passwords Containing Personal Information

Wang et al. [37] note that Chinese users tend to construct passwords from their pinyin name and relevant digits, such as phone number and birthdate, which is quite different from English-speaking users. They revealed to what extent users’ native languages influence their passwords and to what extent users’ personal information plays a role in their passwords.

Considering that TarGuess-I caters for scenario #1, we only analyze two of these categories of users’ vulnerable behaviors (i.e., popular passwords and passwords containing personal information).

2.2. PCFG-Based Password Guessing Algorithm

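Weir et al.’s PCFG-based algorithm [5] has shown great success in dealing with trawling guessing scenarios [17]. The context-free grammar in [5] is defined as $\mathcal{G} = (\mathcal{V}, \Sigma, \mathcal{S}, \mathcal{R})$, where (i) $\mathcal{V}$ is a finite set of variables; (ii) $\Sigma$ is a finite set disjoint from $\mathcal{V}$ and contains all the terminals of $\mathcal{G}$; (iii) $\mathcal{S} \in \mathcal{V}$ is the start symbol; and (iv) $\mathcal{R}$ is a finite set of productions of the form $\alpha \rightarrow \beta$, where $\alpha$ is a variable in $\mathcal{V}$ and $\beta$ is a string over $\mathcal{V} \cup \Sigma$.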

The core assumption of the algorithm is that the segments of letters, digits, and symbols in a password are independent of each other. Thus, apart from the start symbol $\mathcal{S}$, the set $\mathcal{V}$ contains only the letter, digit, and symbol tag sets $L_n$, $D_n$, and $S_n$, where $n$ represents the segment length; for example, $L_3$ represents three-letter segments, and $D_4$ represents four-digit segments.

There are two phases in the algorithm, the training phase and the guess generation phase, as shown in Figure 1. In the training phase, each password is parsed into segments based on length and type to generate the password base structure (derived from the start symbol $\mathcal{S}$). Then, the algorithm counts the segment frequency table for each tag set and outputs the context-free grammar $\mathcal{G}$. In the guess generation phase, passwords are derived from the grammar and the segment frequency tables. The final guess candidates are ranked by probability, which is the product of the base structure probability and the frequencies of all segments in the password.
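The two phases can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the implementation of [5]: the function names (`parse`, `train`, `generate`) are hypothetical, all candidates are enumerated exhaustively rather than with a priority queue, and only ASCII letters and digits are distinguished from symbols.

```python
import re
import string
from collections import Counter
from itertools import product

SEGMENT = re.compile(r"[A-Za-z]+|[0-9]+|[^A-Za-z0-9]+")

def tag_of(segment):
    """Return a tag such as L3 (three letters) or D4 (four digits)."""
    if segment[0] in string.ascii_letters:
        kind = "L"
    elif segment[0] in string.digits:
        kind = "D"
    else:
        kind = "S"
    return f"{kind}{len(segment)}"

def parse(password):
    """Split a password into (tag, segment) pairs."""
    return [(tag_of(s), s) for s in SEGMENT.findall(password)]

def train(passwords):
    """Training phase: count base structures and per-tag segment frequencies."""
    structures, segments = Counter(), {}
    for pw in passwords:
        parsed = parse(pw)
        structures[tuple(t for t, _ in parsed)] += 1
        for t, s in parsed:
            segments.setdefault(t, Counter())[s] += 1
    return structures, segments

def generate(structures, segments):
    """Guess generation: probability = P(structure) * product of P(segment | tag)."""
    total = sum(structures.values())
    guesses = []
    for structure, count in structures.items():
        tables = [segments[t] for t in structure]
        for combo in product(*(tbl.items() for tbl in tables)):
            prob = count / total
            for (_, c), tbl in zip(combo, tables):
                prob *= c / sum(tbl.values())
            guesses.append(("".join(seg for seg, _ in combo), prob))
    return sorted(guesses, key=lambda g: -g[1])

structures, segments = train(["abc123", "abc123", "xyz456"])
guesses = generate(structures, segments)
```

Note that the generated list contains "xyz123", a password that never occurs in the toy training set: because segments are assumed independent, the grammar recombines them freely.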

2.3. TarGuess-I Model

TarGuess-I is a semantics-aware PCFG model built on type-based PII tags, which were first proposed by Wang et al. Besides the three basic tags of the PCFG-based algorithm, the grammar of TarGuess-I includes six PII tags (name, user name, birthday, phone number, ID card, and e-mail address). For each PII tag, the subscript has a different meaning from that of the $L$, $D$, and $S$ tags: it represents the type of generation rule for this PII rather than a length. For example, $N$ stands for name usage, while $N_1$ stands for the full name, and $N_2$ stands for the abbreviation of the full name (such as “Zhang San” abbreviated as “zs”). See Figure 2 for a specific description. The grammar is highly adaptive. It can be modified simply by adding incremental tags without changing its overall structure, which brings great convenience to our research.

As shown in Figure 3, for each user, the segment frequency table of each PII tag is generated from the user’s PII. In the training phase, the password is first parsed with the PII segments into PII tags, and the rest of the segments are parsed into $L$, $D$, and $S$ tags. The guessing phase is similar to that of the PCFG-based algorithm, but some of the outputs are intermediate candidates that still contain PII tags. These intermediate candidates are instantiated with the segments from the target user’s PII before being added to the final guess candidates.
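The PII-first parsing and the later instantiation of intermediate candidates can be sketched as follows. This is an illustration under stated assumptions rather than the TarGuess-I implementation: only three tags are modeled, the tag name `B1` for a full birthday string is hypothetical, and the helpers `pii_segments`, `to_structure`, and `fill` are invented for this example.

```python
import re
import string

def pii_segments(full_name, birthday):
    """Per-user table of PII segments keyed by tag (tag names are illustrative)."""
    parts = full_name.lower().split()
    return {
        "N1": "".join(parts),                # full name, e.g. "zhangsan"
        "N2": "".join(p[0] for p in parts),  # abbreviation, e.g. "zs"
        "B1": birthday,                      # hypothetical tag: full birthday string
    }

def basic_tags(text):
    """Tag leftover text with length-based L/D/S tags."""
    tags = []
    for seg in re.findall(r"[A-Za-z]+|[0-9]+|[^A-Za-z0-9]+", text):
        kind = "L" if seg[0] in string.ascii_letters else "D" if seg[0] in string.digits else "S"
        tags.append(f"{kind}{len(seg)}")
    return tags

def to_structure(password, table):
    """Training side: match PII segments first (longest first), then tag the rest."""
    by_length = sorted(table.items(), key=lambda kv: -len(kv[1]))
    structure, rest, i = [], "", 0
    while i < len(password):
        for tag, seg in by_length:
            if seg and password.startswith(seg, i):
                structure += basic_tags(rest) + [tag]
                rest = ""
                i += len(seg)
                break
        else:
            rest += password[i]
            i += 1
    return structure + basic_tags(rest)

def fill(structure, table, basics):
    """Guessing side: instantiate PII tags with the target user's own segments."""
    return "".join(table[t] if t in table else basics[t] for t in structure)

trainer = pii_segments("Zhang San", "19900405")
target = pii_segments("Li Si", "19851212")
structure = to_structure("zhangsan123", trainer)
guess = fill(structure, target, {"D3": "123"})
```

The point of the sketch is that a structure learned from one user ("full name followed by three digits") transfers to another user once the PII tag is filled with that user's own name.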

3. Users’ Vulnerable Behaviors in Constructing Passwords

In this section, we analyze users’ vulnerable behaviors based on real-world leaked data to find ways to improve TarGuess-I. Because of the lack of studies on how Chinese users select passwords, we only focus on Chinese users. We dissect 163,041,192 leaked user passwords from six websites (see Table 2). Hereinafter, the bold shorthand notation in the brackets of Table 2 represents the source of each dataset. The datasets were hacked by attackers or leaked by insiders and disclosed publicly on the Internet, and some of them have been used in previous research (as shown in Table 2). Due to the lack of datasets containing users’ PII, we use the unique PII (e.g., e-mail address) in the 12306 dataset to match passwords in the other datasets. The sizes of the matched datasets with PII from each dataset are shown in Table 2.

3.1. Analysis of Popular Passwords

We calculate the top-10 popular passwords in the six filtered datasets by occurrence frequency, together with their proportions; the results are shown in Table 3. They show that 0.51% to 3.40% of users’ passwords can be cracked by just using the top-10 popular passwords. Chinese users prefer simple combinations of numbers (such as “123456,” “111111,” “000000”) and strings with the meaning of love (such as “5201314” and “woaini1314”). There are also some site-specific passwords in the top-10 lists (such as “aptx4869” in Aipai and “7758521” in Youku). These passwords may come from the site’s name or culture, or from a large number of “ghost accounts” held by a particular user of the website. Besides, passwords constructed with QWERTY keyboard patterns (such as “1q2w3e4r” and “1qaz2wsx”) also account for a certain proportion of the popular passwords.
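The statistic behind Table 3 is a simple frequency count. A minimal sketch, assuming the dataset is available as a plain list of passwords (the function name `top_k_coverage` and the toy data are illustrative only):

```python
from collections import Counter

def top_k_coverage(passwords, k=10):
    """Return the k most frequent passwords and the fraction of accounts they cover."""
    if not passwords:
        return [], 0.0
    top = Counter(passwords).most_common(k)
    covered = sum(count for _, count in top)
    return [pw for pw, _ in top], covered / len(passwords)

# Toy data: 10 accounts, 8 of which use one of the two most popular passwords.
toy = ["123456"] * 5 + ["111111"] * 3 + ["woaini1314"] + ["aptx4869"]
top, fraction = top_k_coverage(toy, k=2)
```

The returned fraction is the share of accounts an attacker would crack by trying only the top-k list.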

The composition of the most popular passwords, as analyzed by the PCFG-based algorithm, is shown in Table 4. It illustrates that, though the majority of popular passwords are composed of pure numbers, composite passwords (i.e., passwords that include multiple types of characters) also account for a considerable part, notably 45.81% in 12306 and 49.33% in Dodon.

Since the grammar of TarGuess-I does not contain tags related to popular passwords, its success rate could be lowered if the targeted users are likely to choose composite passwords. This is because TarGuess-I is based on PCFG, which generates passwords from the base structures learned from data and the elements in each tag set. Thus, in the training phase, the model parses a composite password into segments, and it might generate many invalid outputs in the end; an illustration is shown in Figure 1. For instance, “1qaz2wsx,” a password constructed with keyboard patterns, which is the 3rd most popular password in Aipai, will be parsed into $D_1L_3D_1L_3$ by TarGuess-I. Meanwhile, “1” ranks first in the set of the $D_1$ tag, and “qaz” ranks first in $L_3$. Therefore, in the guessing phase, the first password output with the base structure $D_1L_3D_1L_3$ is “1qaz1qaz.” This password occupies a relatively small proportion of the actual password distribution but ranks much higher in the output list, thus reducing the overall password guessing success rate. From this perspective, we propose taking popular passwords and keyboard patterns into consideration in the training phase of TarGuess-I.
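The “1qaz1qaz” effect can be reproduced on a toy training set. The sketch below is illustrative only: the three training passwords are made up so that “1” and “qaz” are the most frequent $D_1$ and $L_3$ segments, and `rank_guesses` is a hypothetical helper, not part of TarGuess-I.

```python
import math
import re
import string
from collections import Counter
from itertools import product

def segments(password):
    """Split into maximal runs of letters, digits, or symbols."""
    return re.findall(r"[A-Za-z]+|[0-9]+|[^A-Za-z0-9]+", password)

def tag(seg):
    kind = "L" if seg[0] in string.ascii_letters else "D" if seg[0] in string.digits else "S"
    return f"{kind}{len(seg)}"

def rank_guesses(training, structure):
    """Rank every recombination for one base structure by the product of segment frequencies."""
    tables = {}
    for pw in training:
        for seg in segments(pw):
            tables.setdefault(tag(seg), Counter())[seg] += 1

    def score(combo):
        return math.prod(tables[t][s] / sum(tables[t].values())
                         for t, s in zip(structure, combo))

    ranked = sorted(product(*(tables[t] for t in structure)), key=score, reverse=True)
    return ["".join(combo) for combo in ranked]

# "1qaz2wsx" is the only password with this structure in the training set,
# yet it is not the first guess generated for that structure.
guesses = rank_guesses(["1qaz2wsx", "1qaz2wsx", "qaz1"], ["D1", "L3", "D1", "L3"])
```

Here the real password “1qaz2wsx” is outranked by “1qaz1qaz,” which no one in the training set actually used.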

By analyzing popular passwords, we find that two attributes are missing from the grammar of TarGuess-I: the popular password tag and the keyboard pattern tag.

3.2. Analysis of Passwords Containing Personal Information

We analyze the datasets with TarGuess-I extended by the popular password tag (populated with a top-$k$ list) and the keyboard pattern tag. The ranks of the top-10 password base structures and the proportions of passwords containing PII tags or the popular password tag are shown in Table 5. They indicate that about half of Chinese users construct passwords using PII or just choose popular ones. We speculate that the top-10 base structures of passwords are related to strings that are easy for users to remember. We also find that some password base structures in the top-10 list are neither relevant to users’ PII nor contain the popular password tag.

Strings that are easy to memorize include conversions of users’ PII, keyboard patterns, and popular passwords. They also include user-created strings that have a special meaning for one user but no such importance to other users; we call them “special strings.” As an example, assume that user A creates the string “080405” as his password; then “080405” is special for him. But for another user B, “080405” is nothing other than a common string, so the probability of B’s password containing this string differs from A’s. Meanwhile, we cannot find the string “080405” in A’s or B’s demographic information (such as name, ID number, and telephone number). A special string cannot be extracted from the user’s demographic information but may appear in user-created strings, such as the prefix of an e-mail address or a user name, or in the user’s passwords on other services.

The parsing of users’ PII in TarGuess-I also covers user-generated strings, such as the e-mail address and the user name. However, the analysis of these two user-generated strings is not sophisticated enough. Only three parse types are proposed: the entire string, the first letter segment, and the first digit segment. As noted above, the probability distribution of special strings differs from user to user. If we use the original model, which does not recognize special strings, most of these segments will be parsed into the basic letter, digit, and symbol tags. This blurs user-specific behavior characteristics and thus hinders the effectiveness of password cracking. Therefore, we consider adding the special string tag to the grammar of TarGuess-I.

We analyze the coverage of consecutive substrings of the e-mail address and user name in the password. The result is shown in Figure 4. It reveals that a significant number of user passwords do overlap user-created strings; thus, it gives us a new hint that when an attacker obtains information about a user that is not public or very useful, they may turn that information into a special string to participate in password guessing. This idea may serve as a new direction for further research.
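One way to measure such overlap is the longest run of consecutive characters shared by a user-generated string and the password. The sketch below assumes that definition (the exact coverage measure behind Figure 4 is not spelled out in the text), and the e-mail address and password are made-up examples.

```python
def longest_common_substring(a, b):
    """Longest run of consecutive characters that appears in both strings."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]

def coverage(user_string, password):
    """Fraction of the password covered by its longest overlap with the user string."""
    if not password:
        return 0.0
    shared = longest_common_substring(user_string.lower(), password.lower())
    return len(shared) / len(password)

email_prefix = "lucky080405@example.com".split("@")[0]
overlap = longest_common_substring(email_prefix, "080405abc")
```

With this definition, a password such as “080405abc” is two-thirds covered by the e-mail prefix “lucky080405,” even though “080405” appears nowhere in the user's demographic data.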

4. Implementations of Modified Methods

After analyzing users’ vulnerable behaviors in constructing passwords in Section 3, we find that TarGuess-I does not take into account three attributes: popular passwords, keyboard patterns, and special strings. Thus, we come up with three ideas for modifying TarGuess-I.

(1) Add the popular password tag to the grammar and apply a popular password list generated from a dataset similar to the target website

(2) Add the keyboard pattern tag to the grammar and identify password segments that follow a physical location sequence on the QWERTY keyboard

(3) Add the special string tag to the grammar and extract continuous characters from the user-generated PII

Figure 5 gives a brief explanation of how we modify TarGuess-I to generate the modified model. In this section, we study the implementations of these modified methods.

4.1. Popular Passwords

We add the popular password tag to the grammar; the element set of this tag is a top-$k$ popular password list based on the data statistics of similar websites. The subscript number in the tag has no meaning other than to conform to the grammar format. The parsing of the tag is shown in Figure 6.

In the training phase, the top-$k$ list is matched against the password data by a regular expression. If a match occurs, the count of the corresponding password in the popular password set is increased by 1. In the guess generation phase, the probabilities of the password structures containing the popular password tag are multiplied by the frequencies of the corresponding passwords in the element set of the tag to give the final probabilities of the output passwords.
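A minimal sketch of this counting and scoring step follows. It makes two assumptions that the text does not state: a password is tagged only when it matches a list entry in full, and everything else is lumped into a single "other" structure. The tag key `"P"` and both function names are illustrative.

```python
import re
from collections import Counter

def train_popular_tag(passwords, top_list):
    """Count how often each top-list entry occurs as a whole password."""
    # Longest entries first so the alternation prefers the most specific match.
    ordered = sorted(top_list, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in ordered))
    structures, p_counts = Counter(), Counter()
    for pw in passwords:
        if top_list and pattern.fullmatch(pw):
            structures["P"] += 1      # base structure made of the popular password tag
            p_counts[pw] += 1         # element frequency inside the tag set
        else:
            structures["other"] += 1  # stand-in for all remaining base structures
    return structures, p_counts

def popular_guess_probs(structures, p_counts):
    """P(guess) = P(structure containing the tag) * frequency of the password in the tag set."""
    total_s, total_p = sum(structures.values()), sum(p_counts.values())
    if total_s == 0 or total_p == 0:
        return {}
    return {pw: structures["P"] / total_s * c / total_p for pw, c in p_counts.items()}

structures, p_counts = train_popular_tag(
    ["123456", "123456", "1qaz2wsx", "hello1"], ["123456", "1qaz2wsx"])
probs = popular_guess_probs(structures, p_counts)
```

Because “1qaz2wsx” is now an element of the popular password set, it is emitted as a whole instead of being recombined from digit and letter segments.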

To find the optimal value of $k$ for the improved guessing model, we conduct a series of evaluations with top-$k$ popular passwords based on the 12306 data. Figure 7 is a contour plot displaying how the top-$k$ popular passwords influence the success rate. It can be seen that, within 100 guesses, the success rate increases slightly as $k$ grows. Figure 8 shows the similarities of the top-$k$ popular passwords between two different services. The similarity fluctuates greatly within the top-100, tends to a stable peak when $k$ is around 300, and then gradually decreases as $k$ continues to grow. With the exception of Aipai and 12306, the top-300 popular password list of each dataset has a similar list from another dataset. The dispersion of the shared password fractions implies that different types of services do have an impact on the top popular passwords. Based on the above experimental results, we set $k = 300$ in the cross-site password guessing scenario.

Figure 8 panels: (a) with 12306; (b) with Youku; (c) with Dodon; (d) with Tinya; (e) with Senda; (f) with Aipai.
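The similarity between two top-$k$ lists can be sketched as the fraction of passwords they share. This is an assumed definition for illustration (the text speaks of “shared password fractions” without giving the formula), and the function names and toy lists are hypothetical.

```python
from collections import Counter

def top_k(passwords, k):
    """The k most frequent passwords of a dataset, most frequent first."""
    return [pw for pw, _ in Counter(passwords).most_common(k)]

def shared_fraction(list_a, list_b):
    """Fraction of entries that two equally long top-k lists have in common."""
    k = min(len(list_a), len(list_b))
    if k == 0:
        return 0.0
    return len(set(list_a[:k]) & set(list_b[:k])) / k

site_a = ["123456", "111111", "aptx4869"]
site_b = ["123456", "7758521", "111111"]
similarity = shared_fraction(site_a, site_b)
```

Sweeping `k` and plotting `shared_fraction` for each pair of sites would give curves of the kind described for Figure 8.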

4.2. Keyboard Patterns

The keyboard pattern process follows the left-hand side principle. First, get the relative position of each character on the keyboard, and then determine whether each character is adjacent to the position of the previous character. If the length of a run of adjacent characters reaches the minimum length threshold, the segment is parsed into the keyboard pattern tag. An illustration of the process is shown in Figure 9.
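A left-to-right adjacency scan of this kind can be sketched as follows. The layout grid, the adjacency rule (neighboring row and column, ignoring the physical stagger of the keys), and the threshold `min_len=4` are all assumptions of this sketch; the paper's own relative-position rule and threshold are not recoverable from the text.

```python
ROWS = ["1234567890", "qwertyuiop", "asdfghjkl;", "zxcvbnm,./"]
POS = {ch: (r, c) for r, row in enumerate(ROWS) for c, ch in enumerate(row)}

def adjacent(a, b):
    """Two distinct keys are adjacent if they differ by at most one row and one column."""
    if a == b or a not in POS or b not in POS:
        return False
    (r1, c1), (r2, c2) = POS[a], POS[b]
    return abs(r1 - r2) <= 1 and abs(c1 - c2) <= 1

def keyboard_segments(password, min_len=4):
    """Scan left to right; return (start, segment) for each adjacent run of at least min_len keys."""
    pw = password.lower()
    runs, start = [], 0
    for i in range(1, len(pw) + 1):
        if i == len(pw) or not adjacent(pw[i - 1], pw[i]):
            if i - start >= min_len:
                runs.append((start, password[start:i]))
            start = i
    return runs

runs = keyboard_segments("1234asdf")
```

Under this simplified rule, “1qaz2wsx” is split into the two runs “1qaz” and “2wsx,” because “z” and “2” are not neighbors in the grid; a rule that treats it as a single 8-character pattern would need a broader notion of relative position.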

See Table 6 for the ranking and proportion of password base structures with keyboard pattern tags in our experimental datasets. A single keyboard pattern tag of length 8 represents a password generated by an 8-character keyboard pattern (such as “1qaz2wsx”). Note that two consecutive keyboard pattern tags of length 4 represent a password composed of two nonadjacent keyboard pattern strings, such as “1234asdf.” Table 6 shows that the proportion of passwords containing keyboard patterns is 0.88% to 1.37% in our experimental datasets.

4.3. The Special Strings

Considering that only two kinds of user-generated PII are used in TarGuess-I, e-mail addresses and user names, and given the limitation of experimental resources, we only generate the elements of the special string tag from these two kinds of PII. Since each user generates special strings in various and different ways, it is difficult to categorize the generation methods uniformly, and doing so may cause sparse data. Therefore, special strings are classified only according to their length and the position where they occur. An illustration of the special string process is shown in Figure 10.
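A length-and-position enumeration of this kind can be sketched as follows. The minimum length `min_len=3`, the choice to keep only the longest match, and both function names are assumptions made for illustration; the sample e-mail address, user name, and password are made up.

```python
def special_candidates(user_string, min_len=3):
    """Every continuous substring of length >= min_len, keyed by (length, start position)."""
    out = {}
    for length in range(min_len, len(user_string) + 1):
        for start in range(len(user_string) - length + 1):
            out[(length, start)] = user_string[start:start + length]
    return out

def match_special(password, email, username, min_len=3):
    """Return (length, start, substring) of the longest candidate found in the password."""
    best = None
    for source in (email.split("@")[0], username):
        for (length, start), sub in special_candidates(source, min_len).items():
            if sub in password and (best is None or length > best[0]):
                best = (length, start, sub)
    return best

hit = match_special("080405abc", "lucky080405@example.com", "tom")
n_candidates = len(special_candidates("lucky080405"))
```

The candidate count grows quadratically with the length of the user-generated string (45 candidates for the 11-character prefix above), so long strings yield many substrings that never occur in any password.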

Figure 11 shows the success rate of the models modified with the special string tag compared to TarGuess-I. Each modified model has a different threshold for the identification length of the special string tag. Unfortunately, we find that no matter how we change the threshold, the model with special string tags does not work well. The reasons for the poor performance of the tag may be as follows:

(1) This semantic tag was originally intended to improve the targeted password guessing model with incremental information, that is, information about the user other than demographic PII (such as work number, home address, and lover’s name). However, due to the limitation of experimental conditions, we cannot obtain more incremental information and can only make a finer segmentation of the user-generated PII (such as e-mail address and user name, which have already been analyzed in TarGuess-I)

(2) Our improved method for special strings is not in line with users’ habits of setting passwords. We divide the user-generated PII according to length and calculate the relative position of substrings (as shown in Figure 10), which is a length-based method. Length-based methods were also confirmed by Wang et al. to be insufficient for the analysis of users’ behavioral characteristics. For a long user-generated PII string, this method generates too many invalid substrings

In a real situation, an attacker who is about to attack a targeted user will do his or her best to obtain the information required for the attack. Therefore, the problem caused by reason 1 does not exist in practice. What we do is highlight the threat that incremental information about users helps targeted password guessing. As for reason 2, we delete the relative positions with low frequency in the training results, leaving the one with the largest proportion. We thus regenerate the implementation of the special string tag, and the experimental result is shown as the success rate of the filtered tag in Figure 11. We will further study the implementation of the special string tag in the future.

5. Experiments

TarGuess-I is mainly used in online guessing scenarios, where the number of guesses allowed is the scarcest resource, while computational power and bandwidth are not essential. Therefore, we mainly evaluate the modified guessing models by their success rates as a function of the guess number.

5.1. Experiment Setup

To make our experiments as scientific as possible, we follow 3 rules.

(1) Training sets and testing sets are strictly separated

(2) The comparison experiments of the two models are based on the same training sets and testing sets

(3) The training sets shall be as large as possible

To abide by rule 1 and rule 3, we choose the largest datasets, 12306 and Youku, as training sets and remove them from the testing list. Notably, users’ passwords in each dataset are highly heterogeneous. The distribution of password semantics may differ greatly between two sites, or even between distinct parts of the same dataset. Thus, the fraction of successfully cracked passwords in each dataset, as evaluated by different password guessing models, may fluctuate greatly. To prevent the heterogeneity of the datasets from hindering our observation of the feasibility of the improved methods, we use the Monte Carlo method to randomly extract data and generate 10 testing sets of equal size from each dataset. We apply these sets in every evaluation.
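The random extraction of testing sets can be sketched as repeated sampling without replacement. The set size is a placeholder, since the value used in the experiments is not recoverable from the text, and the function name and seed are illustrative.

```python
import random

def make_testing_sets(dataset, n_sets=10, set_size=1000, seed=0):
    """Draw n_sets independent random samples (without replacement within each set)."""
    rng = random.Random(seed)              # fixed seed so the sets are reproducible
    size = min(set_size, len(dataset))     # set_size is a hypothetical placeholder value
    return [rng.sample(dataset, size) for _ in range(n_sets)]

records = [f"user{i}" for i in range(100)]
testing_sets = make_testing_sets(records, n_sets=10, set_size=20)
```

Reusing the same seeded sets for every model keeps comparisons between models on identical testing data, in line with rule 2 above.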

Table 7 shows the four-dimensional variables of the experiment setup. We build nine models by adding the three improved tags individually or in combination to TarGuess-I (hereinafter referred to as “TG-I”). The four single-tag-modified models are built to validate our three modified methods (i.e., the popular password tag, the keyboard pattern tag, and the special string tag). To make our evaluations more realistic, we define two kinds of scenarios for the popular password tag. In the optimal scenario (with the P tag), attackers have obtained the target site’s top-300 popular password list for cracking. In the other scenario (with the P' tag), which is more realistic, attackers can only crack the target site with the top-300 list of a similar site, which we choose based on the analysis in Figure 8. The five combined-tag-modified models are built to find the optimal model and to determine whether incremental attributes can improve the efficiency of password guessing. We set a total of 80 attacking scenarios based on these four-dimensional variables and conduct 10 experiments on each one.

Figure 12 shows the average cracking success rate against the guess number for the nine models, trained on two sites and tested on four sites. As shown in the figure, the differences in cracking success rate between the models are not obvious in a guess-number graph like this. Thus, to make the experimental results easier to analyze, we calculate, for each guess number, the relative value between each model and the original TG-I. This value is computed from the success rate of the improved model on the $i$th testing set at that guess number and the corresponding success rate of TG-I.
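The formula itself did not survive in this copy of the text. One plausible reading, sketched below purely as an assumption, is the difference in success rate between the improved model and TG-I at each guess number, averaged over the testing sets. The function name, data layout, and numbers are hypothetical.

```python
def average_gain(improved, baseline):
    """Mean difference in success rate per guess number.

    improved[i][n] and baseline[i][n] hold the success rate on the i-th
    testing set at guess number n (assumed layout, not from the paper).
    """
    n_sets = len(baseline)
    return {
        n: sum(imp[n] - base[n] for imp, base in zip(improved, baseline)) / n_sets
        for n in sorted(baseline[0])
    }

# Made-up success rates for two testing sets at 10 and 100 guesses.
improved = [{10: 0.12, 100: 0.21}, {10: 0.10, 100: 0.19}]
baseline = [{10: 0.11, 100: 0.18}, {10: 0.11, 100: 0.18}]
gain = average_gain(improved, baseline)
```

A positive value at a given guess number means the modified model cracks more passwords than TG-I at that point; a negative value means it cracks fewer.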

5.2. Experiment 1: Validation of the Modified Methods

To demonstrate the effectiveness of our modified methods, we compare the cracking success rates of the four single-tag-modified models with that of TG-I based on the testing data. The relative-value guess-number graphs of the four single-tag-modified models are shown in Figure 13, and their average statistics are shown in Table 8. As shown in Table 8, except for one model trained on Youku, which averages 0.03% lower than TG-I within 100 guesses, the single-tag-modified models outperform TG-I. They outperform TG-I by 0.33% on average within 100 guesses. This demonstrates the effectiveness of our three modified methods.

Figure 13: panels (a) to (d), one for each of the four single-tag-modified models.

Figures 13(a) and 13(b) show that, compared to TG-I, the gains of the models with the keyboard pattern tag and the special string tag grow as the number of guesses increases. At small guess numbers, our modified models perform no better than TG-I, and their cracking success rates are even lower than that of TG-I at some guess numbers. For instance, on the Aipai dataset, one modified model falls below TG-I at 50 guesses (trained on 12306) and at 40 guesses (trained on Youku). This is because passwords containing the keyboard pattern tag or the special string tag make up a relatively small part of the overall password distribution, so adding the corresponding tag only affects the lower-ranked candidate passwords. In addition, for the special string model, it may also be because the implementation of our special string tag is still not consistent with users’ behaviors, which causes the modified model to rank some incorrect candidate passwords too high. Nevertheless, as the number of guesses increases, the advantages of our models emerge: the relative values first increase slightly and then become significantly and stably better than TG-I. At the largest guess number, both models show higher cracking success rates than TG-I. We find that these methods also work for trawling scenarios, because they increase the success rate of the modified models beyond 100 guesses.

As shown in Figures 13(c) and 13(d), the models with the popular password tag significantly outperform TG-I between 100 and 10³ guesses, whether trained on 12306 or on Youku. The reason is that popular passwords rank first in the modified grammar, while the composite-form popular passwords are in the bottom half of the top-300 list. As noted above, composite-form popular passwords cause TG-I to produce invalid output.

In particular, the model with the P' tag has a lower guess success rate than TG-I within 100 guesses. This is because the top entries of the top-300 popular password lists included in the P' tag differ from those of the testing site. As a result, there are a few invalid outputs among the top 100 candidate passwords. No such phenomenon appears in the results of the models with the P tag, which also show improvements within 100 guesses.

Interestingly, there is an outlier curve in each relative-value graph in Figures 13(b)–13(d). Some curves are significantly higher than the others, and some are significantly lower. For example, one model gains noticeably more on the Dodon testing data than on the rest of the testing data, and one model trained on Youku performs worse on the Aipai testing data than on any other testing data. This phenomenon may be due to the different distribution of each password dataset. We find a few “uncleaned” password entries in Aipai’s popular password list and datasets, such as “0a2cb03c4dc29cfc0d56afa46ae8fd2e,” which ranks 20th in the top-300 list. These “uncleaned” popular password entries may reduce the models’ success rates.

5.3. Experiment 2: Comparison and Evaluation of Modified Models

We evaluate each combined-tag-modified model to find the optimal scheme. Table 9 lists the average relative value of each modified model compared with TG-I. TG-I modified with all three of our incremental tags shows the largest improvement (see Figure 14(f)).

Figure 14: panels (a) to (e); panel (f): average of the modified models.

It can be seen in Figure 14(f) that, when models with different numbers of incremental tags are compared, the improvement grows as the number of incremental tags increases. However, we also find that the improvements of the combined-tag-modified models with the popular password tag are not strongly correlated with those of the single-tag-modified models with the keyboard pattern tag or the special string tag. This may be because popular passwords account for a large proportion of the password distribution, far larger than the proportion of passwords containing keyboard patterns or special strings. Therefore, in terms of influence on the guessing success rate, adding the popular password tag matters far more than adding the keyboard pattern tag or the special string tag.

Table 1 shows the guessing performance of evaluated on each testing dataset. It can be seen that, except that the success rate of on the Aipai dataset is weaker than that of within 100 guesses, outperforms on the other three testing datasets. The reason for the poor results on the Aipai dataset is given in the analysis of in Section 5.2. outperforms by (trained from 12306) and (trained from Youku) within 100 guesses.

Table 10 shows the top-10 base structures of candidate passwords generated by and and the proportions of base structures containing incremental tags in candidate passwords. It can be seen that generates nearly 10% more candidate passwords containing incremental tags than does. Meanwhile, passwords with the top-10 base structures in Table 10 are very easily cracked by targeted password guessing models. Therefore, we recommend that users avoid setting similar passwords.
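The two statistics described for Table 10 (the most frequent base structures and the share of candidates whose structure contains an incremental tag) can be computed as in the following sketch. It makes several assumptions that do not come from the paper: each candidate's base structure is represented as a tuple of tag strings, the first character of a tag names its type, and the letters `P`, `K`, and `X` are hypothetical stand-ins for the three incremental tags.

```python
from collections import Counter

# Hypothetical tag letters for popular passwords, keyboard patterns,
# and special strings; the paper's actual symbols may differ.
INCREMENTAL = {"P", "K", "X"}


def summarize(structures, top=10):
    """Return (top base structures with counts, share containing an incremental tag).

    `structures` is an iterable of tuples of tag strings, one per candidate
    password, e.g. ("N1", "D3").
    """
    counts = Counter(structures)
    total = sum(counts.values())
    with_incremental = sum(
        c for s, c in counts.items()
        if any(tag[:1] in INCREMENTAL for tag in s)
    )
    share = with_incremental / total if total else 0.0
    return counts.most_common(top), share
```

Running `summarize` on the candidate lists of two models and comparing the returned shares would give the kind of difference reported in the paragraph above.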

Overall, the modified methods we propose are effective. Our results reiterate the threat posed by users choosing popular passwords and keyboard-pattern passwords and highlight the threat of targeted password guessing. The more information an attacker obtains about a user, the more likely the user’s password is to be cracked. Our work implies that for important applications, a multifactor authentication scheme (e.g., [40–42]) is necessary.

6. Conclusion

Based on the well-known password guessing model TarGuess-I and six real-world leaked password datasets, we conduct an in-depth analysis of users’ vulnerable password-setting behavior and targeted password guessing. We find three missing elements in TarGuess-I and propose an improved model: , which is capable of identifying popular passwords, keyboard patterns, and the special strings. Experimental results show that our improved model outperforms TarGuess-I by 2.62% within guesses. We highlight the threat posed by targeted password guessing. Our idea of the special strings sheds new light on password guessing, but its implementation is not yet optimal; we will study this direction further.

Data Availability

The experimental datasets were disclosed publicly on the Internet (https://breachalarm.com/all-sources). The probabilistic context-free grammar- (PCFG-) based algorithm code can be found at https://github.com/lakiw/pcfg_cracker.

Disclosure

This article is an extended version of the paper [25].

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors are very grateful to the anonymous reviewers for their valuable advice, which improved the completeness of this paper. This work was supported by the National Natural Science Foundation of China under grant no. 61971473.

References

[1] J. Bonneau, C. Herley, P. C. Van Oorschot, and F. Stajano, “Passwords and the evolution of imperfect authentication,” Communications of the ACM, vol. 58, no. 7, pp. 78–87, 2015.
[2] D. Wang, Research on Key Issues in Password Security, [Ph.D. thesis], Peking University, 2017, http://wangdingg.weebly.com/uploads/2/0/3/6/20366987/phd_thesis0103.pdf.
[3] J. Ma, W. Yang, M. Luo, and N. Li, “A study of probabilistic password models,” in 2014 IEEE Symposium on Security and Privacy, pp. 689–704, San Jose, CA, USA, 2014.
[4] A. Narayanan and V. Shmatikov, “Fast dictionary attacks on passwords using time-space tradeoff,” in Proceedings of the 12th ACM conference on Computer and communications security - CCS '05, pp. 364–372, 2005.
[5] M. Weir, S. Aggarwal, B. De Medeiros, and B. Glodek, “Password cracking using probabilistic context-free grammars,” in 2009 30th IEEE Symposium on Security and Privacy, pp. 391–405, Berkeley, CA, USA, 2009.
[6] R. Veras, C. Collins, and J. Thorpe, “On semantic patterns of passwords and their security impact,” in Proceedings 2014 Network and Distributed System Security Symposium, San Diego, CA, USA, 2014.
[7] W. Melicher, B. Ur, S. M. Segreti et al., “Fast, lean, and accurate: modeling password guessability using neural networks,” in 25th USENIX Security Symposium (USENIX Security 16), pp. 175–191, Austin, TX, USA, 2016.
[8] S. Aggarwal, S. Houshmand, and M. Weir, “New technologies in password cracking techniques,” in Cyber Security: Power and Technology, pp. 179–198, Springer, 2018.
[9] E. Tirado, B. Turpin, C. Beltz, P. Roshon, R. Judge, and K. Gagneja, “A new distributed brute-force password cracking technique,” in Future Network Systems and Security. FNSS 2018, pp. 117–127, Springer, 2018.
[10] B. Hitaj, P. Gasti, G. Ateniese, and F. Perez-Cruz, “PassGAN: a deep learning approach for password guessing,” in Applied Cryptography and Network Security. ACNS 2019, pp. 217–237, Springer, 2019.
[11] S. Ji, S. Yang, X. Hu, W. Han, Z. Li, and R. Beyah, “Zero-sum password cracking game: a large-scale empirical study on the crackability, correlation, and security of passwords,” IEEE Transactions on Dependable and Secure Computing, vol. 14, no. 5, pp. 550–564, 2017.
[12] Z. Li, W. Han, and W. Xu, “A large-scale empirical analysis of Chinese web passwords,” in 23rd USENIX Security Symposium (USENIX Security 14), pp. 559–574, San Diego, CA, USA, 2014.
[13] R. V. Yampolskiy, “Analyzing user password selection behavior for reduction of password space,” in Proceedings 40th Annual 2006 International Carnahan Conference on Security Technology, pp. 109–115, Lexington, KY, USA, 2006.
[14] M. K. Liu Gong-Shen, Q. Wei-Dong, and L. Jian-Hua, “Password vulnerability assessment and recovery based on rules mined from large-scale real data,” Chinese Journal of Computers, vol. 39, no. 3, pp. 454–467, 2016.
[15] A. Das, J. Bonneau, M. Caesar, N. Borisov, and X. Wang, “The tangled web of password reuse,” in Proceedings 2014 Network and Distributed System Security Symposium, p. 7, San Diego, CA, USA, 2014.
[16] Y. Li, H. Wang, and K. Sun, “A study of personal information in human-chosen passwords and its security implications,” in IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications, pp. 1–9, San Francisco, CA, USA, 2016.
[17] D. Wang, Z. Zhang, P. Wang, J. Yan, and X. Huang, “Targeted online password guessing: an underestimated threat,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 1242–1254, Vienna, Austria, 2016.
[18] K. C. Wang and M. K. Reiter, “How to end password reuse on the web,” in Proceedings 2019 Network and Distributed System Security Symposium, San Diego, CA, USA, 2019.
[19] B. Lu, X. Zhang, Z. Ling, Y. Zhang, and Z. Lin, “A measurement study of authentication rate-limiting mechanisms of modern websites,” in Proceedings of the 34th Annual Computer Security Applications Conference, pp. 89–100, San Juan, PR, USA, 2018.
[20] B. Pal, T. Daniel, R. Chatterjee, and T. Ristenpart, “Beyond credential stuffing: password similarity models using neural networks,” in 2019 IEEE Symposium on Security and Privacy (SP), pp. 417–434, San Francisco, CA, USA, 2019.
[21] P. A. Grassi, J. L. Fenton, E. Newton et al., “NIST special publication 800-63b: digital identity guidelines,” Enrollment and Identity Proofing Requirements, 2017, https://pages.nist.gov/800-63-3/sp800-63b.html.
[22] A. D. Jaggard and P. Syverson, “Oft target,” in Proceedings of the PET, Barcelona, Spain, 2018.
[23] M. Guri, E. Shemer, D. Shirtz, and Y. Elovici, “Personal information leakage during password recovery of internet services,” in 2016 European Intelligence and Security Informatics Conference (EISIC), pp. 136–139, Uppsala, Sweden, 2016.
[24] C. Wang, S. T. Jan, H. Hu, D. Bossart, and G. Wang, “The next domino to fall: empirical analysis of user passwords across online services,” in Proceedings of the Eighth ACM Conference on Data and Application Security and Privacy, pp. 196–203, New York, NY, USA, 2018.
[25] Z. Xie, M. Zhang, A. Yin, and Z. Li, “A new targeted password guessing model,” in Information Security and Privacy. ACISP 2020, pp. 350–368, Springer, 2020.
[26] A. Adams and M. A. Sasse, “Users are not the enemy,” Communications of the ACM, vol. 42, no. 12, pp. 40–46, 1999.
[27] R. Morris and K. Thompson, “Password security,” Communications of the ACM, vol. 22, no. 11, pp. 594–597, 1979.
[28] J. Bonneau, “The science of guessing: analyzing an anonymized corpus of 70 million passwords,” in 2012 IEEE Symposium on Security and Privacy, pp. 538–552, San Francisco, CA, USA, 2012.
[29] M. L. Mazurek, S. Komanduri, T. Vidas et al., “Measuring password guessability for an entire university,” in Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security - CCS '13, pp. 173–186, Berlin, Germany, 2013.
[30] D. V. Bailey, M. Durmuth, and C. Paar, “Statistics on password re-use and adaptive strength for financial accounts,” in Security and Cryptography for Networks. SCN 2014, pp. 218–235, Springer, 2014.
[31] E. I. Tatlı, “Cracking more password hashes with patterns,” IEEE Transactions on Information Forensics and Security, vol. 10, no. 8, pp. 1656–1665, 2015.
[32] E. Stobert and R. Biddle, “The password life cycle: user behaviour in managing passwords,” in 10th Symposium On Usable Privacy and Security (SOUPS 2014), pp. 243–255, Menlo Park, CA, USA, 2014.
[33] P. G. Kelley, S. Komanduri, M. L. Mazurek et al., “Guess again (and again and again): measuring password strength by simulating password-cracking algorithms,” in 2012 IEEE Symposium on Security and Privacy, pp. 523–537, San Francisco, CA, USA, 2012.
[34] B. Ur, F. Noma, J. Bees et al., ““I added ‘!’ at the end to make it secure”: observing password creation in the lab,” in Eleventh Symposium On Usable Privacy and Security (SOUPS 2015), pp. 123–140, Ottawa, Canada, 2015.
[35] R. Shay, L. Bauer, N. Christin et al., “A spoonful of sugar? The impact of guidance and feedback on password-creation behavior,” in Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems - CHI '15, pp. 2903–2912, Seoul, Republic of Korea, 2015.
[36] D. Wang, H. Cheng, P. Wang, X. Huang, and G. Jian, “Zipf’s law in passwords,” IEEE Transactions on Information Forensics and Security, vol. 12, no. 11, pp. 2776–2791, 2017.
[37] D. Wang, P. Wang, D. He, and Y. Tian, “Birthday, name and bifacial-security: understanding passwords of Chinese web users,” in 28th USENIX Security Symposium (USENIX Security 19), pp. 1537–1555, Santa Clara, CA, USA, 2019.
[38] D. Wang, H. Cheng, Q. Gu, and P. Wang, “Understanding passwords of Chinese users: characteristics, security and implications,” in CACR Report, Presented at ChinaCrypt, Shanghai, China, 2015.
[39] D. Wang, D. He, H. Cheng, and P. Wang, “fuzzyPSM: a new password strength meter using fuzzy probabilistic context-free grammars,” in 2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 595–606, Toulouse, France, 2016.
[40] C. Wang, D. Wang, Y. Tu, G. Xu, and H. Wang, “Understanding node capture attacks in user authentication schemes for wireless sensor networks,” IEEE Transactions on Dependable and Secure Computing, 2020.
[41] Q. Jiang, N. Zhang, J. Ni, J. Ma, X. Ma, and K. K. R. Choo, “Unified biometric privacy preserving three-factor authentication and key agreement for cloud-assisted autonomous vehicles,” IEEE Transactions on Vehicular Technology, vol. 69, no. 9, pp. 9390–9401, 2020.
[42] D. Wang, W. Li, and P. Wang, “Measuring two-factor authentication schemes for real-time data access in industrial wireless sensor networks,” IEEE Transactions on Industrial Informatics, vol. 14, no. 9, pp. 4081–4092, 2018.

Copyright

Copyright © 2020 Zhijie Xie et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
