csv-join-similarity
Dobrica Pavlinusic [Thu, 16 Nov 2023 14:15:55 +0000 (15:15 +0100)]
0.7 is ok, 0.6 is too random

Dobrica Pavlinusic [Thu, 16 Nov 2023 11:55:54 +0000 (12:55 +0100)]
sort kandidates by score, and if same prefer longer one

Dobrica Pavlinusic [Thu, 16 Nov 2023 10:39:20 +0000 (11:39 +0100)]
try all limits from 0.9 in descending order

Dobrica Pavlinusic [Wed, 15 Nov 2023 09:03:17 +0000 (10:03 +0100)]
env LIMIT=0.9 is default

Dobrica Pavlinusic [Tue, 14 Nov 2023 22:45:40 +0000 (23:45 +0100)]
check duplicate before merge

Dobrica Pavlinusic [Tue, 14 Nov 2023 21:20:29 +0000 (22:20 +0100)]
cleanup, merge only non-duplicate val for keys

Dobrica Pavlinusic [Tue, 14 Nov 2023 19:45:39 +0000 (20:45 +0100)]
merge at end

Dobrica Pavlinusic [Tue, 14 Nov 2023 11:04:51 +0000 (12:04 +0100)]
cleanup, collect unique_id

Dobrica Pavlinusic [Tue, 14 Nov 2023 09:43:07 +0000 (10:43 +0100)]
similarity 0.9, merge all suggestions

Dobrica Pavlinusic [Tue, 14 Nov 2023 09:31:36 +0000 (10:31 +0100)]
similarity on all keys

Dobrica Pavlinusic [Tue, 14 Nov 2023 09:00:04 +0000 (10:00 +0100)]
similarity, forward progress only

Dobrica Pavlinusic [Tue, 14 Nov 2023 06:55:19 +0000 (07:55 +0100)]
similarity

Dobrica Pavlinusic [Mon, 13 Nov 2023 21:48:21 +0000 (22:48 +0100)]
first cut