Showing results for tags 'near-duplicate'.

Found 1 result

  1. Intella does paragraph-level deduplication. What we'd like to propose here is the identification of near-duplicate items (and paragraphs). This could be done using shingles: compute the ratio of shared shingles between two items (the share of item A's shingles that also occur in item B, and vice versa). See also "Jaccard similarity."
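
     To make the suggestion concrete, here is a minimal sketch of the shingle comparison in Python. It is only an illustration of the idea, not how Intella works internally. The function names, the word-level shingles, and the shingle size k=3 are all assumptions chosen for the example.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text.

    Word-level shingles and k=3 are illustrative choices, not Intella settings.
    """
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < k:
        # Text shorter than one window: treat the whole text as one shingle.
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def containment(a, b):
    """Fraction of the shingles in a that also occur in b (direction matters)."""
    return len(a & b) / len(a) if a else 0.0


def jaccard(a, b):
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    union = a | b
    # Two empty items are scored 0.0 here so they are not flagged as duplicates.
    return len(a & b) / len(union) if union else 0.0


item_a = shingles("the quick brown fox jumps over the lazy dog")
item_b = shingles("the quick brown fox leaps over the lazy dog")

print(containment(item_a, item_b))  # share of A's shingles found in B
print(containment(item_b, item_a))  # share of B's shingles found in A
print(jaccard(item_a, item_b))      # symmetric similarity
```

     The two containment values cover the "A in B and vice versa" part: a short item fully quoted inside a long one scores high in one direction and low in the other. Jaccard gives a single symmetric number. Items (or paragraphs) scoring above some chosen threshold would then be treated as near-duplicates.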