Find duplicate code with CPD

A copy/paste detector

I’m sure that you know PMD, the famous static code analyzer. But have you ever tried CPD?


CPD ships as part of PMD, so you have to install PMD first.

 $ cd $HOME
 $ wget
 $ unzip
 $ alias pmd="$HOME/tools/pmd-bin-5.8.1/bin/ pmd"
 $ alias cpd="$HOME/tools/pmd-bin-5.8.1/bin/ cpd"

Command line usage

Here are some examples.

 cpd --minimum-tokens 20 --language java --files src > cpd-20.txt
 cpd --minimum-tokens 20 --format csv --files src > cpd-20.csv

--minimum-tokens (the minimum duplicate size) and --files (the source directory) are the only required options. Optionally, you can specify the language and the report format. What I really like is the CSV format: I sort the results by tokens in Excel and attack the top 5 duplicates.
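If you do not have Excel at hand, the same sorting can be sketched in the shell. This is only a minimal sketch. It assumes that the CSV report starts with a header row and that the second column holds the token count (lines,tokens,occurrences,...), so check the first lines of your own report before relying on it. The name top_duplicates is a hypothetical helper of mine, not something that comes with PMD.

```shell
# Print the N largest duplicates from a CPD CSV report (default: 5).
# Assumption: the first line is a header and column 2 is the token count.
top_duplicates() {
    report="$1"
    count="${2:-5}"
    # Skip the header, sort numerically by column 2 (largest first),
    # then keep only the first $count rows.
    tail -n +2 "$report" | sort -t, -k2,2nr | head -n "$count"
}
```

With the report from the example above, a call would look like this: top_duplicates cpd-20.csv 5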


CPD helps you find copied and pasted code. The duplicates it reports show you where the code structure can be improved.