Python port of the R package 'jaccard' - Jaccard/Tanimoto similarity test and estimation methods

ncchung/jaccard-python

Statistical test of similarity between binary data using the Jaccard/Tanimoto coefficients

A minimal Python port of the R package jaccard by Chung et al. (2019), which is available on CRAN with a development version on GitHub.

Computes the Jaccard/Tanimoto similarity coefficient for binary vectors, its expectation under independence, and a bootstrap hypothesis test.
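To make these quantities concrete, here is a minimal NumPy sketch of the coefficient and its expectation under independence. It is an illustration, not this package's code: the helper names `jaccard_sketch` and `expected_jaccard_sketch` are hypothetical, the expectation formula px*py / (px + py - px*py) is assumed from Chung et al. (2019), and returning 0 for two all-zero vectors is an assumed convention.

```python
import numpy as np

def jaccard_sketch(x, y):
    """Jaccard/Tanimoto coefficient: |x AND y| / |x OR y| for binary vectors."""
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y, dtype=bool)
    union = np.logical_or(x, y).sum()
    if union == 0:
        return 0.0  # assumed convention when both vectors contain no 1s
    return np.logical_and(x, y).sum() / union

def expected_jaccard_sketch(px, py):
    """Expected coefficient if x and y are independent (assumed formula).

    px and py are the marginal probabilities of a 1 in x and y.
    """
    return (px * py) / (px + py - px * py)

x = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y = np.array([1, 1, 1, 1, 0, 1, 0, 0])

j = jaccard_sketch(x, y)                          # 4 shared 1s over 5 in the union
ev = expected_jaccard_sketch(x.mean(), y.mean())  # marginals estimated from the data
centered = j - ev                                 # positive: more overlap than expected
print(j, ev, centered)
```

Here the marginal probabilities are estimated as the proportion of 1s in each vector. A centered value near zero suggests the overlap is about what independence would predict.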

Functions

  • jaccard(x, y, center=False, px=None, py=None) — Jaccard similarity, optionally centered by its expectation.
  • jaccard_test(x, y, px=None, py=None, verbose=True, fix="x", B=1000, seed=None) — Bootstrap hypothesis test returning observed statistic, null distribution, p-value, and expectation.
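As a rough illustration of how a bootstrap test of this kind can work, the sketch below keeps `x` fixed, resamples `y` with replacement to break any association, and compares the observed centered coefficient with the resampled values. This is one assumed resampling scheme, not necessarily what `jaccard_test` does internally. The function name `bootstrap_test_sketch`, the dictionary keys, and the two-sided p-value with a +1 correction are hypothetical choices made for the example. It assumes at least one of the two vectors contains a 1.

```python
import numpy as np

def centered_jaccard(x, y, px, py):
    """Jaccard coefficient minus its assumed expectation under independence."""
    union = np.logical_or(x, y).sum()
    j = np.logical_and(x, y).sum() / union if union > 0 else 0.0
    return j - (px * py) / (px + py - px * py)

def bootstrap_test_sketch(x, y, B=1000, seed=None):
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y, dtype=bool)
    px, py = x.mean(), y.mean()  # marginal proportions of 1s
    rng = np.random.default_rng(seed)

    observed = centered_jaccard(x, y, px, py)
    null = np.empty(B)
    for b in range(B):
        # Resample y with replacement while x stays fixed
        y_star = rng.choice(y, size=y.size, replace=True)
        null[b] = centered_jaccard(x, y_star, px, py)

    # Two-sided p-value; the +1 terms keep it strictly above zero
    pvalue = (1 + np.sum(np.abs(null) >= abs(observed))) / (B + 1)
    return {
        "statistic": observed,
        "null": null,
        "pvalue": pvalue,
        "expectation": (px * py) / (px + py - px * py),
    }

x = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0])
y = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0])
result = bootstrap_test_sketch(x, y, B=1000, seed=1)
print(result["statistic"], result["pvalue"])
```

Passing the same `seed` reproduces the same null distribution, which makes reported p-values repeatable.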

Usage

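import numpy as np
from jaccard import jaccard, jaccard_ev, jaccard_test

# Example binary (0/1) vectors of equal length; replace with your own data
x = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y = np.array([1, 1, 0, 1, 0, 0, 0, 1])

jaccard(x, y, center=True)  # similarity centered by its expectation
jaccard_test(x, y, B=1000, seed=1, verbose=True)  # bootstrap test, fixed seed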

Reference

Chung, N.C., Miasojedow, B., Startek, M., and Gambin, A. (2019). "Jaccard/Tanimoto similarity test and estimation methods for biological presence-absence data." BMC Bioinformatics.

License

GPL (>= 2)
