Composite endpoints are frequently used in clinical trials of investigational treatments for acute heart failure, eg, to increase statistical power and reduce the required sample size. By incorporating multiple and varying types of clinical outcomes, they provide a test of the overall efficacy of the treatment. Our objective is to compare the performance of popular composite endpoints in terms of statistical power and to describe the uncertainty in these power estimates and issues concerning implementation. We consider several composites that incorporate outcomes of varying types (eg, time to event, categorical, and continuous). Data are simulated for 5 outcomes, and the composites are derived and compared. Power is evaluated graphically while varying the size of the treatment effects, thus describing the sensitivity of power to varying circumstances and eventualities such as opposing effects. The average z score offered the most power, although caution should be exercised when opposing effects are anticipated. Results emphasize the importance of an a priori assessment of power and of a scientific basis for constructing the composite, including weights for the individual outcomes derived from data simulations. A composite should be interpreted alongside results from its individual components. Although the average z score offers the most power, its use should be weighed against the research context and its limitations.
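The simulation approach summarized above can be illustrated with a minimal sketch. It is not the study's actual simulation: it assumes 5 independent, normally distributed continuous outcomes (the study also uses time-to-event and categorical outcomes), equal weights, and a two-sided large-sample test on the per-patient average z score. The function name `average_z_power`, its parameters, and the effect sizes are all hypothetical choices made for illustration.

```python
import math
import numpy as np


def average_z_power(effects, n_per_arm=100, n_sims=2000, alpha=0.05, seed=0):
    """Monte Carlo estimate of power for an equally weighted average z score.

    effects: assumed treatment effect on each outcome, in SD units
    (positive = benefit). Outcomes are simulated as independent normals,
    which is a simplifying assumption, not a feature of the source study.
    """
    rng = np.random.default_rng(seed)
    effects = np.asarray(effects, dtype=float)
    k = effects.size
    rejections = 0
    for _ in range(n_sims):
        control = rng.standard_normal((n_per_arm, k))
        treated = rng.standard_normal((n_per_arm, k)) + effects
        # Standardize each outcome against the pooled sample.
        pooled = np.vstack([control, treated])
        z = (pooled - pooled.mean(axis=0)) / pooled.std(axis=0, ddof=1)
        # Composite: per-patient mean of the outcome z scores.
        composite = z.mean(axis=1)
        c, t = composite[:n_per_arm], composite[n_per_arm:]
        se = math.sqrt(c.var(ddof=1) / n_per_arm + t.var(ddof=1) / n_per_arm)
        stat = (t.mean() - c.mean()) / se
        # Two-sided p-value from the normal approximation.
        p_value = math.erfc(abs(stat) / math.sqrt(2))
        rejections += p_value < alpha
    return rejections / n_sims


if __name__ == "__main__":
    # All 5 effects in the same direction versus 2 of 5 opposing.
    print("concordant:", average_z_power([0.2, 0.2, 0.2, 0.2, 0.2]))
    print("opposing:  ", average_z_power([0.2, 0.2, 0.2, -0.2, -0.2]))
```

Under these assumptions, opposing effects partly cancel in the average, so the second call should return a markedly lower power estimate than the first, which is the situation in which caution is advised.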