doi: 10.4304/jsw.7.1.228-236
f-Fractional Bit Minwise Hashing
Abstract—In information retrieval, the minwise hashing algorithm is often used to estimate similarities among documents. b-bit minwise hashing gains substantial advantages in computational efficiency and storage space by storing only the lowest b bits of each (minwise) hashed value (e.g., b = 1 or 2). In this paper, we propose a fractional bit hashing method that extends the existing b-bit minwise hashing. We show theoretically that fractional bit hashing offers a wider range of choices for trading off accuracy against storage space. Theoretical analysis and experimental results demonstrate the effectiveness of this method.
Index Terms—similarity, hashing, fractional bit
Cite: Xinpan YUAN, Jun LONG*, Zuping ZHANG, Yueyi LUO, Hao Zhang, Weihua Gui, "f-Fractional Bit Minwise Hashing," Journal of Software, vol. 7, no. 1, pp. 228-236, 2012.
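To make the b-bit idea in the abstract concrete, the following is a minimal Python sketch of minwise hashing with the lowest b bits kept per hashed value. It is an illustration under stated assumptions, not the paper's implementation: the function names, the use of seeded BLAKE2b as the hash family, and the simplified collision correction (which assumes the low b bits of two different minima agree with probability 2^-b, i.e. sets that are small relative to the universe) are all choices made here. The fractional bit method proposed in the paper is not shown.

```python
import hashlib


def _hash(seed: int, token: str) -> int:
    """Hypothetical hash family: 64-bit BLAKE2b digest of 'seed:token'."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, k):
    """Return k minwise hashed values, one per seeded hash function."""
    return [min(_hash(seed, t) for t in tokens) for seed in range(k)]


def b_bit_signature(signature, b):
    """Keep only the lowest b bits of each minwise hashed value."""
    mask = (1 << b) - 1
    return [h & mask for h in signature]


def match_fraction(sig_a, sig_b):
    """Fraction of positions at which two signatures agree."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def estimate_resemblance_b_bit(bits_a, bits_b, b):
    """Estimate resemblance from b-bit signatures.

    Assumption: non-matching minima collide on their low b bits with
    probability c = 2**-b, so P_match = c + (1 - c) * R. Solving for R
    gives the correction below. This is a simplified form, not the
    exact estimator from the b-bit minwise hashing literature.
    """
    c = 2.0 ** -b
    return (match_fraction(bits_a, bits_b) - c) / (1.0 - c)


if __name__ == "__main__":
    doc_a = {str(i) for i in range(0, 100)}
    doc_b = {str(i) for i in range(50, 150)}  # true resemblance is 1/3
    sig_a = minhash_signature(doc_a, 512)
    sig_b = minhash_signature(doc_b, 512)
    print("full-width estimate:", match_fraction(sig_a, sig_b))
    for b in (1, 2, 4):
        est = estimate_resemblance_b_bit(
            b_bit_signature(sig_a, b), b_bit_signature(sig_b, b), b
        )
        print(f"b={b} estimate:", est)
```

Smaller b reduces storage per hashed value but raises the variance of the estimate, so more hash functions are needed for the same accuracy; this is the accuracy versus storage trade-off the abstract refers to.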