The goal of the BenchFlow project is to design the first benchmark for assessing and comparing the performance of workflow management systems (also known as business process execution engines). Given the large number of heterogeneous systems and languages that have been proposed for modeling and executing workflows, and to keep the project feasible, we will initially focus our efforts on a benchmark for standards-compliant WS-BPEL Web service composition engines.

Due to the inherent complexity of workflow engine architectures and the very large number of parameters affecting their performance, benchmarking workflow systems poses scientific research challenges that require investigating a novel set of performance evaluation techniques. These will complement the well-understood techniques used to benchmark database management systems (e.g., TPC) or programming language compilers (e.g., SPEC), which have helped drive very large performance improvements across the database, compiler, and processor architecture industries.

Workflow systems have become the platform for building composite service-oriented applications, whose performance depends on two factors: the performance of the workflow system itself and the performance of the composed services (which may lie outside the control of the workflow system). We plan to use a model-driven, self-/recursive testing approach that eliminates the impact of the external services by implementing them as workflow processes themselves. The processes selected for the benchmark will be synthesized from real-world WS-BPEL processes. Given the very large number of metrics that can be used to observe the performance of a workflow system, we aim to distill a reduced set of performance indicators for comparing different engines, as well as different configurations of the same engine.
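To illustrate what distilling raw metrics into a single comparable indicator might look like, the following is a minimal sketch, not the project's actual method. It normalizes each hypothetical metric (the metric names, reference values, and weights of a baseline engine are all assumptions) against a reference engine and aggregates the ratios with a geometric mean, in the style of SPEC-like composite scores:

```python
import math

# Hypothetical baseline measurements of a reference engine.
REFERENCE = {
    "throughput_wps": 100.0,  # workflow instances completed per second (higher is better)
    "latency_ms": 250.0,      # mean instance completion latency (lower is better)
    "cpu_util": 0.80,         # CPU utilization under load (lower is better)
}

HIGHER_IS_BETTER = {"throughput_wps"}

def composite_indicator(measured: dict) -> float:
    """Geometric mean of per-metric ratios vs. the reference engine.

    A score above 1.0 means the measured engine outperforms the
    reference on balance across the selected metrics.
    """
    ratios = []
    for metric, ref in REFERENCE.items():
        val = measured[metric]
        # Orient every ratio so that larger always means better.
        ratio = val / ref if metric in HIGHER_IS_BETTER else ref / val
        ratios.append(ratio)
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical measurements for an engine under test.
engine_a = {"throughput_wps": 150.0, "latency_ms": 200.0, "cpu_util": 0.75}
print(f"composite score: {composite_indicator(engine_a):.3f}")
```

The geometric mean is a common choice here because it is insensitive to which engine is picked as the reference and prevents one extreme metric from dominating the score; the real benchmark would also have to fix the workload and measurement protocol before any such number is meaningful.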