Dimension scores are derived from public data and fields and are weighted into the composite score. For reference only.
Locust is a serverless web discovery and data extraction framework for Node.js. Its core use cases are web indexing, web crawling, and web scraping. It is more than a single-machine crawler script library: it is built around Redis queues, Chrome/Puppeteer page execution, and an extensible launch mechanism, so a crawl can be split into many independent jobs that run in a distributed setup.
Locust uses a configuration-driven approach to define tasks. Developers describe the entry URL, extraction logic, concurrency limits, depth limits, filtering rules, and connection configuration. Data extraction supports CSS selectors and provides hooks such as extract, before, after, beforeAll, start, and filter. One of its highlights is support for SPAs: it can wait for page elements to appear and handle frontend applications built with AngularJS, React, Vue.js, and similar frameworks. At the execution layer, it relies on Redis to maintain states such as queued, processing, and done, and uses Chrome/Puppeteer to make requests and execute client-side JavaScript.
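To make the configuration-driven approach concrete, here is a minimal sketch of what such a job definition might look like. The field names (`url`, `config`, `connection`, `concurrencyLimit`, `depthLimit`, `browserWSEndpoint`) and the hook signatures are assumptions modeled on the description above and should be checked against the Locust documentation. The `missingFields` helper is hypothetical and is not part of Locust.

```javascript
// Sketch of a Locust-style job definition.
// All field names and hook signatures are assumptions; verify them in the docs.
const jobDefinition = {
  url: 'https://example.com/', // entry URL (placeholder)
  config: {
    name: 'example-job', // hypothetical job name
    concurrencyLimit: 2, // maximum number of jobs running at once
    depthLimit: 1, // how many links deep to follow from the entry URL
  },
  connection: {
    redis: { host: 'localhost', port: 6379 },
    chrome: { browserWSEndpoint: 'ws://localhost:3000' },
  },
  // Keep only links on the same host as the entry URL.
  filter: (links) =>
    links.filter((link) => new URL(link).hostname === 'example.com'),
  // Pull data out of the page; the `$` selector helper is an assumed signature.
  extract: async ($) => ({ title: await $('title') }),
  // Decide how the next job is launched; a no-op in this sketch.
  start: () => null,
};

// Hypothetical helper (not part of Locust): list the top-level fields
// this sketch expects that are absent from a job definition.
function missingFields(job) {
  const expected = ['url', 'config', 'connection', 'extract', 'start'];
  return expected.filter((key) => job[key] === undefined);
}

console.log(missingFields(jobDefinition));
```

Keeping the whole job in one plain object means the same definition can be handed to any runtime, which is what makes the distributed and serverless launch options possible.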
Locust can run in local system processes, Node.js processes, AWS Lambda, Google Cloud Functions, and other environments, as long as the start hook defines how new tasks should be launched. For local development, the recommended setup is to use Docker Compose with Redis and browserless/Chrome. It provides a Node.js API, such as execute(jobDefinition), as well as locust-cli, which supports commands including run, start, stop, generate, validate, and info. However, the documentation clearly notes that the CLI is alpha-grade, so it should be evaluated carefully before production use.
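The queued, processing, and done states, together with the concurrency limit, can be illustrated with a toy in-memory model. This is not Locust's implementation, which keeps this state in Redis so that separate processes or functions can share it. The `JobQueue` class and its method names are hypothetical.

```javascript
// Toy in-memory model of the queued -> processing -> done lifecycle.
// Not Locust's implementation: Locust keeps this state in Redis.
class JobQueue {
  constructor(concurrencyLimit) {
    this.concurrencyLimit = concurrencyLimit;
    this.queued = [];
    this.processing = new Set();
    this.done = new Set();
  }

  // Add a URL unless it is already known in any state.
  enqueue(url) {
    const known =
      this.queued.includes(url) ||
      this.processing.has(url) ||
      this.done.has(url);
    if (!known) this.queued.push(url);
    return !known;
  }

  // Move the next URL to processing if the concurrency limit allows it.
  take() {
    const atLimit = this.processing.size >= this.concurrencyLimit;
    if (this.queued.length === 0 || atLimit) return null;
    const url = this.queued.shift();
    this.processing.add(url);
    return url;
  }

  // Mark a URL as done and queue the links discovered on it.
  complete(url, discoveredLinks = []) {
    this.processing.delete(url);
    this.done.add(url);
    discoveredLinks.forEach((link) => this.enqueue(link));
  }
}

const queue = new JobQueue(1);
queue.enqueue('https://example.com/');
const first = queue.take();
// The entry URL is rediscovered here but is skipped because it is already done.
queue.complete(first, ['https://example.com/a', 'https://example.com/']);
console.log(queue.queued, [...queue.done]);
```

In a distributed run, each `take` would correspond to one launched job, so the shared store is what prevents two workers from processing the same URL.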
The main documentation does not mention commercial pricing, nor does it clearly list a license. The project can be installed via npm as @achannarasappa/locust and @achannarasappa/locust-cli, and the documentation references GitHub issues, so it appears to be more of an open-source developer tool. However, whether it can be used commercially still needs to be verified against the repository license.
Its strengths are a clear architecture, distributed scalability, the ability to handle client-rendered pages, and relatively complete documentation for its API, CLI, lifecycle, and architecture. The downsides are that it depends on Redis and Chrome, which increases deployment complexity; it does not include built-in persistence, so collected results must be written to a database manually in the after hook; and the documentation updates are concentrated around 2019–2020, so its maintenance activity should be confirmed. It is best suited to developers familiar with Node.js who need to build custom distributed crawlers or Serverless data collection systems. It is less suitable for non-engineering users who simply want to grab a few static pages quickly.
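Because persistence is left to the developer, a hook along the following lines is one way to store results. This is a minimal sketch that appends each result to a JSON Lines file using only the Node.js standard library. The single `jobResult` argument is an assumption about the `after` hook's signature, and the output path is hypothetical; a real deployment would more likely write to a database.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical output location for this sketch.
const outputFile = path.join(os.tmpdir(), 'locust-results.jsonl');

// Sketch of an `after` hook: append each job result as one JSON line.
// The single `jobResult` argument is an assumption about the hook signature.
function after(jobResult) {
  const record = { savedAt: new Date().toISOString(), ...jobResult };
  fs.appendFileSync(outputFile, JSON.stringify(record) + '\n', 'utf8');
}

// Example call with a made-up result object.
after({ url: 'https://example.com/', data: { title: 'Example' } });
```

Appending one line per result keeps writes independent, so jobs that finish in any order do not need to coordinate with each other.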
The main documentation does not provide information about access from mainland China, mirrors, payment, or hosted services, so this remains unknown. If access to npm or related mirrors is restricted, using an npm mirror source may be an option. Alternatives include Scrapy, Puppeteer, Playwright, Crawlee, and Apify SDK.
⚠ This review is compiled from public sources and does not constitute a purchase recommendation. Verify all facts on the official locust.dev site.
locust.dev is listed as an "Unknown" Dev Tools provider. TG4G tracks its product information, an overall rating of 7.0/10, and a China-accessibility rating of "China direct-connect friendly".