Please enter up to 10 URLs (on separate lines):
Choose your desired shingle-size (number of words): 3 5 10 15 20
This tool spiders each page, extracts the indexable text at the block level, builds 'shingles' (groups of consecutive words), and compares them with the shingles from the other URLs you have specified. The result can help determine how similar different pages are, or how unique the content on a page actually is.
It can make sense to run several URLs from the same site through this tool to determine the amount of in-site duplication. It can also make sense to run several related URLs through it to determine how closely related their indexable content is.
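As a rough illustration of the shingle comparison described above, here is a minimal Python sketch. It is not this tool's actual code: the function names `shingles` and `jaccard`, the simple word tokenizer, and the use of Jaccard similarity as the comparison score are all assumptions made for the example, and the two sample strings stand in for text already extracted from two pages.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (as tuples) found in text.

    Assumption: words are runs of letters/digits, compared case-insensitively.
    """
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return set()
    # Slide a window of k words across the text, one word at a time.
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Shared shingles divided by all distinct shingles (assumed metric)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical extracted text from two pages.
page_a = "the quick brown fox jumps over the lazy dog"
page_b = "the quick brown fox leaps over the lazy dog"

similarity = jaccard(shingles(page_a), shingles(page_b))
print(f"similarity: {similarity:.2f}")
```

With a shingle size of 3, changing a single word alters three shingles, so the two sample sentences share only 4 of 10 distinct shingles. Larger shingle sizes make the comparison stricter, which is one reason to try several of the sizes offered.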
The "k-shingle" algorithm is sometimes referenced in patents issued to search engines.