Volume 6 Number 12 (Dec. 2011)
Home > Archive > 2011 > Volume 6 Number 12 (Dec. 2011) >
JSW 2011 Vol.6(12): 2391-2398 ISSN: 1796-217X
doi: 10.4304/jsw.6.12.2391-2398

Multi-pattern Matching with Wildcards

Meng Zhang1, Yi Zhang2, Jijun Tang3, Xiaolong Bai1

1College of Computer Science and Technology, Jilin University, Changchun, China
2Department of Computer Science, Jilin Business and Technology College, Changchun, China
3Department of Computer Science & Engineering, Univ. of South Carolina, USA


Abstract—Multi-pattern matching with wildcards is to find all the occurrences of a set of patterns with wildcards in a text. This problem arises in various fields, such as computational biology and network security. But the problem is not extensively studied as the single pattern case and there is no efficient algorithm for this problem. In this paper, we present efficient algorithms based on the fast Fourier transform. Let P = {p1, . . . , pk} be a set of patterns with wildcards where the total length of patterns is |P|, and a text t of length n over alphabet a1, . . . , a. We present three algorithms for this problem where patterns are matched simultaneously. The first algorithm finds the matches of a small set of patterns in the text in O(n log |P| + occ log k) time where occ is the total number of occurrences of P in t. The words used in the algorithm are of size kd2 lg e+Pk i=1dlg |pi|e bits. The second algorithm is based on a prime number encoding. It runs in time O(n logm + occ log k) where m is the length of the longest pattern in P. The algorithm uses words with kdlg(2m2 +k2)e bits. The third one finds the occurrences of patterns in the text in time O(n log |P| log  + occ log k) by computing the Hamming distance between patterns and the text. The algorithm uses words with Pk i=1dlg |pi|e bits. Moreover, we demonstrate an FFT implementation based on the modular arithmetic for machines with 64-bit word. Finally, we show that these algorithms can be easily parallelized, and the parallelized algorithms are given as well.

Index Terms—Algorithm; Multi-pattern matching; Wildcards; FFT.

[PDF]

Cite: Meng Zhang, Yi Zhang, Jijun Tang, Xiaolong Bai, "Multi-pattern Matching with Wildcards," Journal of Software vol. 6, no. 12, pp. 2391-2398, 2011.

General Information

ISSN: 1796-217X (Online)
Frequency:  Quarterly
Editor-in-Chief: Prof. Antanas Verikas
Executive Editor: Ms. Yoyo Y. Zhou
Abstracting/ Indexing: DBLP, EBSCO, CNKIGoogle Scholar, ProQuest, INSPEC(IET), ULRICH's Periodicals Directory, WorldCat, etc
E-mail: jsweditorialoffice@gmail.com
  • Mar 01, 2024 News!

    Vol 19, No 1 has been published with online version    [Click]

  • Apr 26, 2021 News!

    Vol 14, No 4- Vol 14, No 12 has been indexed by IET-(Inspec)     [Click]

  • Nov 18, 2021 News!

    Papers published in JSW Vol 16, No 1- Vol 16, No 6 have been indexed by DBLP   [Click]

  • Jan 04, 2024 News!

    JSW will adopt Article-by-Article Work Flow

  • Nov 02, 2023 News!

    Vol 18, No 4 has been published with online version   [Click]