csv-join-similarity
2023-11-27
Dobrica Pavlinusic
duplicate.csv
2023-11-27
Dobrica Pavlinusic
dump more info about duplicate input rows
2023-11-23
Dobrica Pavlinusic
add _val to column names to make them unique
2023-11-22
Dobrica Pavlinusic
cleanup ids correctly (only \w and \d allowed)
2023-11-22
Dobrica Pavlinusic
cleanup after merge, produce valid output
2023-11-22
Dobrica Pavlinusic
dump merged.csv
2023-11-22
Dobrica Pavlinusic
cleanup output, maintain merged $data
2023-11-22
Dobrica Pavlinusic
val
2023-11-21
Dobrica Pavlinusic
collect A_ counts (original data stats) only on first...
2023-11-21
Dobrica Pavlinusic
corrupt razred/skola
2023-11-16
Dobrica Pavlinusic
0.7 is ok, 0.6 is too random
2023-11-16
Dobrica Pavlinusic
sort kandidates by score, and if same prefer longer one
2023-11-16
Dobrica Pavlinusic
try all limits from 0.9 in descending orders
2023-11-15
Dobrica Pavlinusic
env LIMIT=0.9 is default
2023-11-14
Dobrica Pavlinusic
check duplicate before merge
2023-11-14
Dobrica Pavlinusic
cleanup, merge only non-duplicate val for keys
2023-11-14
Dobrica Pavlinusic
merge at end
2023-11-14
Dobrica Pavlinusic
cleanup, collect unique_id
2023-11-14
Dobrica Pavlinusic
similarity 0.9, merge all suggestions
2023-11-14
Dobrica Pavlinusic
similarity on all keys
2023-11-14
Dobrica Pavlinusic
similarity, forward progress only
2023-11-14
Dobrica Pavlinusic
similarity
2023-11-13
Dobrica Pavlinusic
first cut