Volume 6 Number 12 (Dec. 2011)
JSW 2011 Vol.6(12): 2391-2398 ISSN: 1796-217X
doi: 10.4304/jsw.6.12.2391-2398

Multi-pattern Matching with Wildcards

Meng Zhang¹, Yi Zhang², Jijun Tang³, Xiaolong Bai¹
¹ College of Computer Science and Technology, Jilin University, Changchun, China
² Department of Computer Science, Jilin Business and Technology College, Changchun, China
³ Department of Computer Science & Engineering, Univ. of South Carolina, USA


Abstract—Multi-pattern matching with wildcards is the problem of finding all occurrences of a set of patterns containing wildcards in a text. The problem arises in various fields, such as computational biology and network security. However, it has not been studied as extensively as the single-pattern case, and there is no efficient algorithm for it. In this paper, we present efficient algorithms based on the fast Fourier transform (FFT). Let $P = \{p_1, \ldots, p_k\}$ be a set of patterns with wildcards whose total length is $|P|$, and let $t$ be a text of length $n$ over the alphabet $\{a_1, \ldots, a_\sigma\}$. We present three algorithms for this problem in which the patterns are matched simultaneously. The first algorithm finds the matches of a small set of patterns in the text in $O(n \log |P| + occ \log k)$ time, where $occ$ is the total number of occurrences of $P$ in $t$. The words used in the algorithm are of size $k \lceil 2 \lg \sigma \rceil + \sum_{i=1}^{k} \lceil \lg |p_i| \rceil$ bits. The second algorithm is based on a prime number encoding. It runs in $O(n \log m + occ \log k)$ time, where $m$ is the length of the longest pattern in $P$, and uses words of $k \lceil \lg(2m^2 + k^2) \rceil$ bits. The third algorithm finds the occurrences of the patterns in the text in $O(n \log |P| \log \sigma + occ \log k)$ time by computing the Hamming distance between the patterns and the text. It uses words of $\sum_{i=1}^{k} \lceil \lg |p_i| \rceil$ bits. Moreover, we demonstrate an FFT implementation based on modular arithmetic for machines with 64-bit words. Finally, we show that these algorithms can be easily parallelized, and we give the parallelized algorithms as well.
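
To make the FFT-based approach concrete, the following is a minimal sketch of the classic single-pattern technique: a pattern p matches at position i exactly when the sum over j of p_j * t_{i+j} * (p_j - t_{i+j})^2 is zero, with wildcards encoded as 0. It is not the simultaneous multi-pattern algorithms of the paper, and it does not reproduce their word-packing schemes. The convolution uses a number-theoretic transform, a modular-arithmetic analogue of the FFT. The prime 998244353, the root 3, the "?" wildcard symbol, and all function names are assumptions chosen for illustration.

```python
MOD = 998244353  # assumed NTT-friendly prime: 119 * 2**23 + 1
ROOT = 3         # a primitive root modulo MOD


def ntt(a, invert=False):
    """In-place iterative number-theoretic transform; len(a) must be a power of two."""
    n = len(a)
    # Bit-reversal permutation.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # Butterfly passes over blocks of doubling length.
    length = 2
    while length <= n:
        w_len = pow(ROOT, (MOD - 1) // length, MOD)
        if invert:
            w_len = pow(w_len, MOD - 2, MOD)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u = a[k]
                v = a[k + half] * w % MOD
                a[k] = (u + v) % MOD
                a[k + half] = (u - v) % MOD
                w = w * w_len % MOD
        length <<= 1
    if invert:
        n_inv = pow(n, MOD - 2, MOD)
        for i in range(n):
            a[i] = a[i] * n_inv % MOD


def convolve(x, y):
    """Linear convolution of two integer lists, computed modulo MOD."""
    out_len = len(x) + len(y) - 1
    size = 1
    while size < out_len:
        size <<= 1
    fx = x + [0] * (size - len(x))
    fy = y + [0] * (size - len(y))
    ntt(fx)
    ntt(fy)
    fz = [u * v % MOD for u, v in zip(fx, fy)]
    ntt(fz, invert=True)
    return fz[:out_len]


def wildcard_matches(pattern, text, wildcard="?"):
    """Return all start positions where pattern matches text; wildcard matches anything."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Map real characters to 1..sigma; the wildcard maps to 0.
    codes = {c: i + 1 for i, c in enumerate(sorted(set(pattern + text) - {wildcard}))}
    p = [codes.get(c, 0) for c in pattern]
    t = [codes.get(c, 0) for c in text]
    # The true mismatch sum is below m * sigma**4, so it must stay under MOD
    # for "zero modulo MOD" to mean "zero".
    assert m * len(codes) ** 4 < MOD, "alphabet or pattern too large for this modulus"
    rp = p[::-1]  # reversing the pattern turns convolution into correlation
    c1 = convolve([v ** 3 for v in rp], t)
    c2 = convolve([v ** 2 for v in rp], [v ** 2 for v in t])
    c3 = convolve(rp, [v ** 3 for v in t])
    return [
        i for i in range(n - m + 1)
        if (c1[i + m - 1] - 2 * c2[i + m - 1] + c3[i + m - 1]) % MOD == 0
    ]


print(wildcard_matches("a?c", "abcaxcabd"))
```

Expanding p_j * t_{i+j} * (p_j - t_{i+j})^2 gives three correlation terms, which is why three convolutions are needed. Running this sketch once per pattern costs one set of transforms per pattern; the algorithms described in the abstract avoid that by matching all patterns simultaneously.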

Index Terms—Algorithm; Multi-pattern matching; Wildcards; FFT.


Cite: Meng Zhang, Yi Zhang, Jijun Tang, Xiaolong Bai, "Multi-pattern Matching with Wildcards," Journal of Software, vol. 6, no. 12, pp. 2391-2398, 2011.
