This was done as a school assignment (Applied Artificial Intelligence at Blekinge Institute of Technology).
The idea is to load two texts and save their characteristics. You then provide a sample from one of the texts, and the program should be able to decide which of the two texts your sample comes from.
Text | Word count | Name | Word length | Sentence length | Commas | Newlines | Similarity | Semicolons | Words/line | Quote length | Colons | Words t1 | Words t2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Text 1 | 14018 | 0.8469 | 5.1028 | 32.5244 | 0.0573 | 0.3318 | 1549 | 0.109 | 98.028 | 12.9739 | 0 | ||
Text 2 | 4114 | 10.9627 | 4.5352 | 15.3507 | 0.1062 | 7.5 | 399 | 0 | 7.0931 | 78.9286 | 5 | ||
Text 3 | 252 | 1 | 4.9921 | 31.5 | 0.0635 | 0.125 | 1549 | 0.5 | 252 | 26.8 | 0 | 14 | 6 |
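
A few of the characteristics in the table could be computed along the following lines. This is only a sketch: the function name `extract_characteristics`, the regular expressions, and the exact definitions (characters per word, words per sentence, commas per word) are assumptions, not the assignment's actual code. The definitions are at least consistent with the Text 3 row, where 252 words at 31.5 words per sentence gives 8 sentences and 0.0635 commas per word gives 16 commas.

```python
import re


def extract_characteristics(text):
    """Return a few per-text characteristics (definitions are assumed)."""
    # Treat runs of letters and apostrophes as words (an assumption).
    words = re.findall(r"[A-Za-z']+", text)
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    word_count = len(words)
    return {
        "word_count": word_count,
        # Average number of characters per word.
        "word_length": sum(len(w) for w in words) / word_count if word_count else 0.0,
        # Average number of words per sentence.
        "sentence_length": word_count / len(sentences) if sentences else 0.0,
        # Number of commas per word.
        "commas": text.count(",") / word_count if word_count else 0.0,
    }


if __name__ == "__main__":
    print(extract_characteristics("Hello, world. This is a test."))
```

The remaining columns (newlines, semicolons, quote length and so on) would follow the same pattern, but their exact definitions are not stated here, so they are left out.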
Algorithm 1: One point for the closest match.
Algorithm 2: The assigned score is the ratio between the sample's difference to the closest text and its difference to the other text.
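
The two scoring schemes could look roughly like this, using four of the columns from the table as input. The function names are hypothetical, and the ratio formula is only one possible reading of Algorithm 2: here the closest text receives `1 - near / far`, so a clearly closer match scores near 1 and a near tie scores near 0. The original implementation may have used a different formula.

```python
# Values copied from the table above (a subset of the columns).
TEXT_1 = {"word_length": 5.1028, "sentence_length": 32.5244,
          "commas": 0.0573, "quote_length": 12.9739}
TEXT_2 = {"word_length": 4.5352, "sentence_length": 15.3507,
          "commas": 0.1062, "quote_length": 78.9286}
SAMPLE = {"word_length": 4.9921, "sentence_length": 31.5,
          "commas": 0.0635, "quote_length": 26.8}


def score_closest_match(sample, text_1, text_2):
    """Algorithm 1: one point per characteristic to the closest text."""
    points = {"Text 1": 0, "Text 2": 0}
    for feature, value in sample.items():
        d1 = abs(value - text_1[feature])
        d2 = abs(value - text_2[feature])
        if d1 < d2:
            points["Text 1"] += 1
        elif d2 < d1:
            points["Text 2"] += 1
        # Exact ties award no point (an assumption).
    return points


def score_by_ratio(sample, text_1, text_2):
    """Algorithm 2 (assumed reading): weight each point by how much
    closer the winning text is than the other one."""
    scores = {"Text 1": 0.0, "Text 2": 0.0}
    for feature, value in sample.items():
        d1 = abs(value - text_1[feature])
        d2 = abs(value - text_2[feature])
        if d1 == d2:
            continue  # tie: nothing to award
        closest = "Text 1" if d1 < d2 else "Text 2"
        near, far = min(d1, d2), max(d1, d2)
        scores[closest] += 1.0 - near / far
    return scores


if __name__ == "__main__":
    print(score_closest_match(SAMPLE, TEXT_1, TEXT_2))
    print(score_by_ratio(SAMPLE, TEXT_1, TEXT_2))
```

With these four characteristics, the sample (Text 3) is closer to Text 1 on every one of them, so both schemes attribute it to Text 1.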