This article presents parallel cost analysis, a static cost analysis that aims to over-approximate the cost of parallel execution in distributed systems. In contrast to the standard notion of serial cost, parallel cost captures the cost of synchronized tasks executing in parallel by exploiting the true concurrency available in the execution model of distributed processing. True concurrency is challenging for static cost analysis, because the parallelism between tasks needs to be soundly inferred, and the waiting and idle processor times at the different locations need to be accounted for. Parallel cost analysis works in three phases: (1) it performs a block-level analysis to estimate the serial costs of the blocks between synchronization points in the program; (2) it then constructs a distributed flow graph (DFG) to capture the parallelism and the waiting and idle times at the locations of the distributed system; and (3) it finally obtains the parallel cost as the path of maximal cost in the DFG. We prove the correctness of the proposed parallel cost analysis, and use a prototype implementation to evaluate its accuracy and feasibility experimentally.
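To illustrate phase (3), the following is a minimal sketch of a maximal-cost path computation over a toy graph. It rests on simplifying assumptions that are not taken from the article: the graph is acyclic, each node carries a concrete numeric cost standing in for the serial cost of a block, and the names `max_cost_path`, `costs`, `edges`, and the block labels `b1` to `b4` are hypothetical. The actual analysis works on the DFG it constructs and is not limited to this simplified setting.

```python
from functools import lru_cache


def max_cost_path(costs, edges, entry):
    """Return the maximal total node cost over all paths starting at `entry`.

    Assumes the graph is acyclic (an illustrative simplification).
    costs: dict mapping node -> numeric cost of that block
    edges: dict mapping node -> list of successor nodes
    """
    @lru_cache(maxsize=None)
    def best(node):
        successors = edges.get(node, [])
        # Cost of this block plus the most expensive continuation, if any.
        return costs[node] + max((best(s) for s in successors), default=0)

    return best(entry)


# Hypothetical example: b2 and b3 may run in parallel after b1; b4 waits for both.
costs = {"b1": 3, "b2": 5, "b3": 2, "b4": 4}
edges = {"b1": ["b2", "b3"], "b2": ["b4"], "b3": ["b4"]}

parallel_cost = max_cost_path(costs, edges, "b1")  # path b1 -> b2 -> b4
serial_cost = sum(costs.values())                  # all blocks one after another
```

In this toy instance the maximal-cost path is lower than the sum of all block costs, which is the gap between parallel and serial cost that the analysis is meant to capture.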