Detect and Refactor JavaScript Copy-Paste Code
February 14, 2018
In this post (and the above 5 minute embedded video) we’ll look at how to detect copy and pasted code inside of your web application using two different node command-line tools.
jsinspect
The first tool we are going to use is a node command-line tool called jsinspect
which understands ES6, JSX, and Flow. There are quite a few CLI options to choose from, but thankfully it’s pretty easy to get started.
Detect Copy-Paste JavaScript
In the screenshot below I'm using the integrated terminal inside of Visual Studio Code. I'm using the npx
package runner (that comes with npm@5.2.0
) to execute the jsinspect
tool against our JavaScript code found in the src
folder.
# detect copy-paste code in src folder
npx jsinspect src
Ignore Paths
If the results of the above command returns more than you bargain for, then you can pass the --ignore
CLI option and tell jsinspect
to ignore one or more paths. The following command ignores the lib
, test
, and config
folders.
# ignore the lib, test, and config folders
npx jsinspect src --ignore "lib|test|config"
Adjust Thresholds
Another feature jsinspect
has is to control the threshold (the number of nodes) it uses to determine if a section of code is structurally similar to another. The default threshold is 30, but you can tweak the value by using the --threshold
CLI option.
# a lower threshold should yield more matches
npx jsinspect src --ignore "lib|test|config" --threshold 10
# a higher threshold should yield less matches
npx jsinspect src --ignore "lib|test|config" --threshold 40
Fix the Duplication
Now, let’s change our focus to actually fixing our copy-paste issues starting in utils.js
. You can probably spot pretty quickly the section of code that is duplicated. Yes, this is completely contrived, so bear with me.
export function getNodeJokes(list) {
const jokes = [];
for (let i = 0; i < list.length; ++i) {
const joke = list[i];
if (joke.tags.indexOf('node') !== -1) {
jokes.push(joke);
}
}
return jokes;
}
export function getJavaScriptJokes(list) {
const jokes = [];
for (let i = 0; i < list.length; ++i) {
const joke = list[i];
if (joke.tags.indexOf('javascript') !== -1) {
jokes.push(joke);
}
}
return jokes;
}
Instead of just taking the code and wrapping it in a function call, this could be an opportunity to rewrite the code to use newer JavaScript features. In this case, we'll use ES5 and ES6 features.
export const filterJokes = (jokes, type) =>
jokes.filter((j) => j.tags.includes(type));
export const getNodeJokes = (list) => filterJokes(list, 'node');
export const getJavaScriptJokes = (list) =>
filterJokes(list, 'javascript');
NOTE: I don't show refactoring the
RandomJokes.js
file in this post. However, if you'd like to see the refactor then feel free to watch the embedded video at the top of this blog post.
Verify Duplication is Gone and Unit Tests Pass
Before we proceed, we should probably verify that all of our unit tests still pass. In our terminal we can run npm test
to execute our Jest tests.
In addition, we should probably also re-run jsinspect
to show that we’ve address all of the copy-paste violations at the default threshold… and sure enough, we did.
Now, we can kick up our development web server and watch our app work. And yes, here are some glorious react puns. oh yeah.
jscpd
The other tool that can be verify handy detecting Copy-Paste is the jscpd
command-line tool. The neat thing about this one is that it supports a wide variety of programming languages.
The CLI options are slightly different than jsinspect
, but it’s also pretty easy to get started. We’ll using -f
to indicate which files to include in our detection… in our case, we’ll recursively look for JavaScript files. And we’ll -e
exclude any files in the lib
folder. Like jsinspect
we can also control the threshold with the -t
option, which stands for the minimum number of tokens to use when determining duplication.
# search js files, exclude lib folder, tokens at 30
npx jscpd -f "src/**/*.js" -e "**/lib/**" -t 30
# exclude multiple folders and adjust tokens to 10
npx jscpd -f "src/**/*.js" -e "**/+(lib|test)/**" -t 10
Detect Copy-Paste in CSS Files
However, as we stated earlier, one of the really cool things about jscpd
is that it understands multiple computer languages. So, we could, for example, switch our detection to search CSS files instead of JavaScript. As you can see in the following screenshot, it found some duplication between our App.css
and CardFilp.css
files.
# changes files to search for css copy-paste
npx jscpd -f "src/**/*.css" -e "**/+(lib|test)/**" -t 10
Conclusion
Thanks for reading this post and/or watching the above embedded video. I hope you find the jsinspect
and jscpd
tools helpful in your projects.
Tweet about this post and have it show up here!