Dimension scores are derived from public data and fields and weighted into the composite score. For reference only.
chemfp is a Python package and command-line toolkit for binary cheminformatics fingerprints. Its scope is deliberately narrow: it helps researchers and developers generate, read, convert, search, and analyze molecular fingerprints efficiently in Python. Its main selling point is high-performance similarity search; it also supports Butina clustering, sphere exclusion, directed sphere exclusion, MaxMin diversity selection, and full similarity matrix generation.
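To make "similarity search" concrete, here is a minimal sketch of a Tanimoto threshold search over binary fingerprints. It uses only the standard library and is not chemfp's API or implementation: the function names `tanimoto` and `threshold_search`, the molecule ids, and the 8-bit fingerprints are all hypothetical, and fingerprints are stored as plain Python integers for simplicity.

```python
def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity of two binary fingerprints stored as Python ints."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0  # assumed convention when both fingerprints are empty
    return bin(a & b).count("1") / union


def threshold_search(query, targets, threshold):
    """Return (id, score) pairs with score >= threshold, best score first."""
    scored = ((name, tanimoto(query, fp)) for name, fp in targets.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical 8-bit fingerprints, for illustration only.
targets = {
    "mol_a": 0b11110000,
    "mol_b": 0b11100000,
    "mol_c": 0b00001111,
}
hits = threshold_search(0b11110000, targets, threshold=0.7)
print(hits)
```

This brute-force loop compares the query against every target. A dedicated search engine would typically avoid most of those comparisons, which is where the performance difference on large datasets comes from.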
In terms of functionality, chemfp covers the full workflow from file formats to algorithms to engineering integration. It supports the FPS text format, the high-performance FPB binary format, and the FPC sparse count fingerprint format. FPB files can be opened quickly via memory mapping, making them suitable for large datasets and scenarios where web services are restarted frequently. At the tooling level, it provides both Unix-style command-line tools and a complete Python API, which can be used in scripts, Django services, Jupyter components, or PyQt desktop applications.
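As an illustration of the FPS text format, the sketch below parses a small in-memory sample with the standard library. It assumes the commonly described layout: header lines that start with `#` (the first being `#FPS1`, followed by `#key=value` metadata), then one record per line holding a hex-encoded fingerprint, a tab, and an identifier. The `read_fps` helper, the `type` value, and the sample records are hypothetical; check the official format specification before relying on any detail here.

```python
import io


def read_fps(lines):
    """Parse FPS-style text into (metadata dict, list of (id, fingerprint bytes))."""
    metadata = {}
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Header lines: keep "#key=value" pairs, skip the "#FPS1" signature.
            if "=" in line:
                key, _, value = line[1:].partition("=")
                metadata[key] = value
            continue
        # Data lines: hex fingerprint, tab, identifier (extra fields ignored).
        hex_fp, _, rest = line.partition("\t")
        record_id = rest.split("\t")[0]
        records.append((record_id, bytes.fromhex(hex_fp)))
    return metadata, records


# Hypothetical sample content, for illustration only.
sample = io.StringIO(
    "#FPS1\n"
    "#num_bits=16\n"
    "#type=example-fingerprint/1\n"
    "f00f\tmol_a\n"
    "0001\tmol_b\n"
)
metadata, records = read_fps(sample)
print(metadata["num_bits"], records)
```

A line-oriented text format like this is easy to inspect and to process with other tools, but every record must be parsed on load. That cost is what a memory-mapped binary format such as FPB avoids.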
Ecosystem compatibility is another major strength. chemfp supports RDKit, OEChem/OEGraphSim, CDK, Open Babel, and jCompoundMapper, and provides a cross-toolkit Toolkit API and Text Toolkit API to unify molecular I/O, format discovery, error handling, and SDF/SMILES text record processing. It also integrates with NumPy, SciPy, and Pandas, and can output SciPy sparse matrices, NumPy arrays, or full matrices for use with scikit-learn.
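To show what a sparse similarity matrix contains, the following sketch builds the three arrays of a compressed sparse row (CSR) layout from thresholded Tanimoto scores, using only the standard library. It does not reproduce chemfp's output functions: the `similarity_csr` helper and the 4-bit fingerprints are hypothetical, and the all-pairs loop is a naive stand-in for an optimized search.

```python
def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity of two binary fingerprints stored as Python ints."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 0.0


def similarity_csr(fps, threshold):
    """Build CSR arrays (data, indices, indptr) of off-diagonal scores >= threshold."""
    data, indices, indptr = [], [], [0]
    for i, fp_i in enumerate(fps):
        for j, fp_j in enumerate(fps):
            if i == j:
                continue  # skip self-similarity
            score = tanimoto(fp_i, fp_j)
            if score >= threshold:
                data.append(score)   # stored value
                indices.append(j)    # column index of that value
        indptr.append(len(data))     # row i ends at this offset into data
    return data, indices, indptr


# Hypothetical 4-bit fingerprints, for illustration only.
fps = [0b1111, 0b0111, 0b1000]
data, indices, indptr = similarity_csr(fps, threshold=0.5)
print(data, indices, indptr)
```

With SciPy installed, these arrays can be passed as `scipy.sparse.csr_matrix((data, indices, indptr), shape=(len(fps), len(fps)))`. The resulting matrix can then be handed to libraries that accept sparse input.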
The current mainline version is not fully open source. The official materials describe an unlimited source license: after purchase, users receive the full source code except for the license manager, plus one year of support, including new version updates during the support period. Support renewals, time-limited licenses, and binary-only licenses are also available. Precompiled Linux packages can be installed, but some features are restricted or disabled unless an evaluation key is provided. The older chemfp 1.6.1 is a free, open-source version, but it only supports Python 2.7 and is better suited to benchmarking or legacy systems.
Its strengths are strong performance, both API and command-line support, extremely detailed documentation, and unified wrappers for multiple chemistry toolkits, which can significantly reduce the cost of working across toolchains. Its drawbacks are that commercial pricing is not disclosed and modern versions have a relatively high licensing barrier. It also cannot generate fingerprints from structure files without third-party chemistry toolkits, and its use cases are concentrated in cheminformatics.
The collected text does not provide information on mainland China access, payment methods, or mirrors, so its access status is rated as unknown. If network access or procurement is constrained, alternatives to evaluate include RDKit’s built-in capabilities, Open Babel, CDK, OEChem/OEGraphSim, or custom workflows built with NumPy/SciPy/scikit-learn. However, large-scale fingerprint search and unified toolkit abstraction may require more engineering effort.
⚠ This review is compiled from public sources and does not constitute a purchase recommendation. Verify all facts on the vendor's official site, chemfp.com.
chemfp.com is listed as a Dev Tools provider (subcategory unknown). TG4G tracks its product information, an overall rating of 7.0/10, and a China-accessibility label of "China direct-connect friendly".