Roark is a San Francisco-based, Y Combinator-backed Voice AI Testing & QA platform, positioned as a safety net for voice AI agents. It covers both post-launch monitoring/evaluation and pre-launch simulation testing, aiming to catch “uh-oh” moments—latency, repetition, script deviation, tool-call failures, payment-processing failures, and more—before customers run into them. The site says it has processed 10 million minutes of calls.
On the monitoring side, Roark can capture voice interactions and provide 40+ built-in metrics and events, including latency, instruction-following, repetition detection, and sentiment. It also allows custom metrics and events. It supports multi-speaker analysis for up to 15 speakers and automatic speaker recognition, and the site mentions emotion models, vocal cues, and fine-tuned ASR. Evaluators can be run on demand in the UI or automated through the SDK/API, with support for dashboards, scheduled reports, threshold alerts, and webhooks.
On the testing side, Roark supports end-to-end simulation of inbound and outbound agents over phone or WebSocket. One particularly useful feature is its ability to turn failed production calls into repeatable tests, while using graphical conversation flows to cover branches and edge cases. Personas are highly configurable, including gender, language, accent, background noise, speaking speed/voice pattern, emotion, clarity of intent, and backstory.
Roark offers one-click integrations with VAPI, Retell, LiveKit Cloud, and Pipecat, as well as Node and Python SDKs. Pricing is usage-based with a minimum monthly spend. All plans include monitoring, simulation, and evaluators, with volume discounts, custom high-usage packages, and no long-term contract requirement. However, the site does not publish unit pricing or the minimum monthly spend amount. On compliance, the site states that SOC 2 and HIPAA compliance are available, but it does not provide details on data retention, use of customer data for training, regional deployment, or related topics.
Roark’s strengths are its vertical focus on Voice AI and its relatively complete workflow across metrics, simulation, alerts, and integrations. It is especially suitable for voice agent teams that need continuous regression testing and production quality monitoring. Its limitations are that pricing is not transparent, and support for Chinese speech, Chinese ASR, a Chinese interface, and Chinese customer service is not clearly stated. The underlying models, evaluation accuracy, and false positive/false negative rates are also not disclosed. Overall, it is better suited to mid-sized and large Voice AI teams with real call volume that are willing to contact sales, rather than individual developers or general-purpose AI application teams.
The site does not provide information on access from mainland China, payment methods, or local deployment, so china_access can only be considered unknown. Before purchasing, China-based teams should verify network connectivity, support for China phone numbers/voice routes, payment options, cross-border data transfer, and compliance requirements. Comparable options include LangSmith, Braintrust, Promptfoo, and Arize Phoenix, as well as call-center quality inspection solutions from Chinese cloud providers.
⚠ This review is compiled from public sources and does not constitute a purchase recommendation. Verify all facts on the official roark.ai site.
roark.ai is a United States-based AI Apps provider. TG4G tracks its product information and lists an overall rating of 8.0/10 and a China-accessibility score of Workable.