HUGE word-lists duplicate remover and merge tool


12monkeys

Status: n/a
Joined: Thu, 05 Nov 2015
Posts: 72
Team:
Reputation: 10 Reputation
Offline
Thu, 03 Dec 2015 @ 22:49:35

I'm just saying that I merged 2 big dictionaries and the output file is smaller than either one of them.


write something positive, mad.

blandyuk
Admin / Owner
Status: Trusted
Joined: Tue, 05 Jul 2011
Posts: 2916
Team: HashKiller
Reputation: 3911 Reputation
Offline
Thu, 03 Dec 2015 @ 23:00:33

There are two scenarios for this:

1). Both lists had duplicates and thus resulted in a word-list smaller than either of them.
2). App.Merge.exe uses \n between each word, meaning if your lists originally use \r\n, the output will be smaller by one byte per word.

Hope that explains the possible outcomes.
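
To put a number on scenario 2, here is a small Python sketch. It is not part of App.Merge and the function name is made up for illustration. It assumes every word in the input ends in \r\n and every word in the output ends in \n, so the saving is exactly one byte per word.

```python
def crlf_to_lf_saving(data: bytes) -> int:
    # Bytes saved when every CR LF line ending is rewritten as a bare LF.
    converted = data.replace(b"\r\n", b"\n")
    return len(data) - len(converted)


sample = b"password\r\nletmein\r\n123456\r\n"
print(crlf_to_lf_saving(sample))  # 3 words, so 3 bytes saved
```

On a list with a few billion words that adds up to a few GB, even if no duplicates are removed at all.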


Please read the forum rules | Please read the paid section rules
I accept private hash lists, with forum donations only.
BTC: 15qF9WUeFUD63ishxyAMiEgGqTcYzk4j9b
GPU Power: 7x GeForce GTX 1070 and My Brain

Niko

Status: Elite
Joined: Sat, 16 May 2015
Posts: 810
Team:
Reputation: 2307 Reputation
Online
Thu, 03 Dec 2015 @ 23:01:53

Hey, Thanks for this tool.

It just helped me combine all my word-lists extremely fast!


+rep or a tip is really appreciated if I helped

BTC: 14dBvuigXHaZmCAEsCm4DBbeCJBeQxN6ff
LTC: LKEeVrzivDgjJkSdkGb35vjc4vmLM7Agh8
ETH: 0x994b829c9fBf6f665c17a52517E3d005290C8eb4
BCH: qqngm26gwkcmfvldhyflhfksnd6flnlnrgmsytlkyp

12monkeys
Fri, 04 Dec 2015 @ 01:17:42

Yeah, I understand that.
But...
I merged file1 and file2 and called it outcome1. Then I merged outcome1 with file3 and called it outcome2. Outcome2 is smaller than outcome1.
If I can suggest something, adding a "pause" option would be really awesome. And a hint for other people: don't merge too many files at once. My PC with an i5 2.4 GHz was dying. I'm still in the process of merging all my files, and it is much faster to merge files separately than all at once.



blandyuk
Fri, 04 Dec 2015 @ 08:20:28

How much RAM do you have? Also, how good is your HD? It's not just the CPU that matters.



12monkeys
Fri, 04 Dec 2015 @ 09:26:59

I have 4 GB of DDR3. The HDD is a regular 1 TB Seagate. But it's all good now. I'm almost done merging the dictionaries separately. Only once did I merge all the dictionaries smaller than 1 GB each in a single run.
I like your app; I'm just curious about that merged output2 file, which somehow was smaller than the output1 file. It's kind of strange. I understand that big dictionaries can share many words, but that shouldn't happen when you merge an already merged file with another new file.
And as I already said, your app made my PC super laggy (after 24 hrs) and almost killed oclHashcat, making it run at 0 speed.
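
The reasoning about output2 can be checked with a toy model. The sketch below assumes the merge behaves like a plain set union of unique words. That is an assumption, since App.Merge's internals are not shown in this thread. Under it, a second merge can never yield fewer unique words than the first. Anything that drops or rewrites words (a length limit, a different line ending) would break the assumption.

```python
def merge_unique(*wordlists):
    """Toy merge: union of all words, duplicates removed."""
    merged = set()
    for words in wordlists:
        merged.update(words)
    return merged


file1 = ["password", "letmein", "dragon"]
file2 = ["letmein", "monkey"]
file3 = ["dragon", "qwerty"]

outcome1 = merge_unique(file1, file2)
outcome2 = merge_unique(outcome1, file3)

# outcome2 contains every word of outcome1, so it cannot be smaller.
print(len(outcome1), len(outcome2))  # 4 5
```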



blandyuk
Fri, 04 Dec 2015 @ 10:05:01

OK, I would not do anything else while the merge is running. Needless to say, it uses a lot of resources.

Run the merge on the files individually; you may find they already contain duplicates, which is why the output is smaller (basically, you are assuming they don't contain dups).
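
A quick way to test that assumption without a full merge is to count repeated lines in a single list. This is a minimal Python sketch, not App.Merge itself. It keeps every unique line in memory, so it only suits lists that fit in RAM, and the function name is hypothetical.

```python
import io


def count_duplicates(lines):
    """Count lines that repeat an earlier line (line endings ignored)."""
    seen = set()
    duplicates = 0
    for line in lines:
        word = line.rstrip(b"\r\n")
        if word in seen:
            duplicates += 1
        else:
            seen.add(word)
    return duplicates


# In practice: with open("file1.txt", "rb") as f: print(count_duplicates(f))
demo = io.BytesIO(b"alpha\nbeta\r\nalpha\nbeta\ngamma\n")
print(count_duplicates(demo))  # 2
```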



12monkeys
Sun, 06 Dec 2015 @ 04:03:56

Ehhhh, I think you didn't understand me...



d2

Status: n/a
Joined: Tue, 22 Dec 2015
Posts: 170
Team:
Reputation: 443 Reputation
Offline
Wed, 23 Dec 2015 @ 10:48:17

Hey blandyuk, nice tool indeed. Recently I used it with ~40 GB of wordlists. Here are my results:


Code:

~40GB wordlists merged into one file

Merge complete to: d2_16.txt
Total words  : 3807530006
Words skipped: 18
Duplicates removed: 2157458164
$HEX[...] conversions: 280
Total time: 1 hrs 8 mins 42.560 secs

I'm not sure if the 16 threads were utilised properly. Same goes for memory: although I specified 50 GB, I observed only around 5 GB of usage.


+rep if I helped
jabber: d2@xmpp.is

12monkeys
Mon, 11 Jan 2016 @ 23:59:54

blandyuk
Can you add "remove numerical words" to your app?



cvsi
Moderator
Status: Trusted
Joined: Fri, 23 May 2014
Posts: 2279
Team:
Reputation: 3322 Reputation
Offline
Tue, 12 Jan 2016 @ 00:10:32

You can do that with ULM if you really need it done.


Load up ULM, click on Line tools, select "Remove only if contains", then select numbers.


Please read the forum rules. | Please read the paid section rules.

GTX 1080 Ti , GTX 1080 , 1070 Ti , 2x GTX 1070 Everything watercooled

BTC - 1As13jsySvbN5wjcNJP3AASiazDX9pVdVw
ETH - 0xF35481E80a91ea8aB7D9E1E9c79f55390Cc00744

12monkeys
Tue, 12 Jan 2016 @ 00:30:20

what is ULM?



cvsi
Tue, 12 Jan 2016 @ 00:39:27

Unified List Manager


http://unifiedlm.com/Download



bharod

Status: n/a
Joined: Tue, 06 Jan 2015
Posts: 6
Team:
Reputation: 0 Reputation
Offline
Tue, 12 Jan 2016 @ 14:37:10

Is this tool the same as massivesort?


BTC: 1JgEbYB7HS3hrFGtfrAC2aFVJN39g6U6WD

blandyuk
Tue, 12 Jan 2016 @ 15:45:36

You can also use my App.RegEx app to remove anything based on RegEx.

https://forum.hashkiller.co.uk/topic-view.aspx?t=7645&m=55993#55993
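
For readers who want to see what a RegEx-based filter for the "remove numerical words" request might look like, here is a minimal Python sketch. It is not App.RegEx and does not use its syntax. The pattern and function name are illustrative assumptions, and "numerical" is taken to mean words made up only of ASCII digits.

```python
import re

# Matches a word consisting of ASCII digits only.
DIGITS_ONLY = re.compile(rb"[0-9]+")


def drop_numeric_words(words):
    """Yield only the words that are not purely numeric."""
    for word in words:
        if not DIGITS_ONLY.fullmatch(word):
            yield word


sample = [b"123456", b"password1", b"2016", b"dragon"]
print(list(drop_numeric_words(sample)))  # [b'password1', b'dragon']
```

Words that merely contain digits, such as password1, are kept. Swap fullmatch for search to drop those too.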



DarkFly

Status: n/a
Joined: Thu, 07 Apr 2016
Posts: 3
Team:
Reputation: 0 Reputation
Offline
Sat, 09 Apr 2016 @ 23:52:10

@blandyuk
Such a great tool, man, thank you!

Here's my result for 40 GB + 30 GB:

Merge complete to: ../all2.txt
Total words : 4927543916
Words skipped: 218
Duplicates removed: 1382854157
$HEX[...] conversions: 687427
Total time: 2 hrs 7 mins 44.472 secs


WPA2

Status: Cracker
Joined: Thu, 11 Jun 2015
Posts: 125
Team:
Reputation: 128 Reputation
Offline
Sun, 17 Apr 2016 @ 21:05:42

Very nice program. I finally combined a few wordlists; I expected a few duplicates.


Merge complete to: final.txt
Total words : 398013713
Words skipped: 22
Duplicates removed: 145417253
$HEX[...] conversions: 40901109
Total time: 0 hrs 16 mins 35.879 secs

The second two files were 15 GB and 2.3 GB:

Merge complete to: uniq.txt
Total words : 1454716724
Words skipped: 262
Duplicates removed: 132757234
$HEX[...] conversions: 3534867
Total time: 1 hrs 18 mins 32.511 secs


BItcoin : 1ME8L8zM7qVrLZvWY2Nyr28N6kqEujCGj5
General Forum Rules! | Paid Password Recovery Rules
Submitting WPA Handshakes

Feel free to rep or donate if i helped you along the way.

WPA2
Mon, 18 Apr 2016 @ 07:52:01

Can anyone explain this error to me?

- Words: 854831326 ~ Skipped: 263 ~ Mem: 2679 MB

Unhandled Exception: System.IO.IOException: Data error (cyclic redundancy check).

at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.IO.FileStream.ReadCore(Byte[] buffer, Int32 offset, Int32 count)
at System.IO.FileStream.Read(Byte[] array, Int32 offset, Int32 count)
at App.Merge.Wordlist.ReadChunk(Int32 _size)
at App.Merge.Program.readWordlist(String file)
at App.Merge.Program.Main(String[] args)


Thanks.



zido

Status: n/a
Joined: Wed, 30 Dec 2015
Posts: 211
Team:
Reputation: 55 Reputation
Offline
Mon, 18 Apr 2016 @ 19:04:24

Looks like the program is crashing. How big is your wordlist?


WPA2
Mon, 18 Apr 2016 @ 19:06:28

The total size of the wordlists in the directory is 38.5 GB (41,442,920,514 bytes).




zido
Mon, 18 Apr 2016 @ 19:51:01

Try to split it.
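
"Data error (cyclic redundancy check)" is a Windows error reported when a read from the drive fails, and the trace ends inside FileStream.Read. It may therefore be worth finding out which input file cannot be read in full before splitting anything. A minimal Python sketch, assuming the lists sit in one directory. The function names and chunk size are arbitrary choices, not anything App.Merge uses.

```python
import os


def first_unreadable_offset(path, chunk_size=16 * 1024 * 1024):
    """Read a file in chunks; return the byte offset where a read fails,
    or None if the whole file can be read."""
    offset = 0
    with open(path, "rb") as handle:
        while True:
            try:
                chunk = handle.read(chunk_size)
            except OSError:
                return offset
            if not chunk:
                return None
            offset += len(chunk)


def check_directory(directory):
    """Map each file in a directory to its first unreadable offset (or None)."""
    results = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            results[name] = first_unreadable_offset(path)
    return results
```

A file that reports an offset instead of None is the one to copy again from its source or leave out of the merge.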


payknight

Status: Cracker
Joined: Wed, 13 Apr 2016
Posts: 311
Team: just4fun
Reputation: 117 Reputation
Offline
Tue, 06 Sep 2016 @ 14:08:45

blandyuk said:

=
o=[out-file] - Output file.
t=[threads] - Used to speed sorting up only.
c=[mem] - Used to control how much RAM memory to use in MB. Default is 1024. Capped at 3072.
min=[num] - Minimum word length. Default = 1
max=[num] - Maximum word length. Default = 4096.

Words containing control characters will be converted into the Hashcat HEX format: $HEX[...]
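
As an illustration of that last line, the sketch below shows one way a word could be rewritten in Hashcat's $HEX[...] notation, which is simply the word's bytes written as hexadecimal. Which bytes App.Merge counts as control characters is not stated here, so the rule used (bytes below 0x20, plus 0x7F) is an assumption, as are the function names.

```python
def has_control_chars(word: bytes) -> bool:
    """Assumed rule: any byte below 0x20, or 0x7F (DEL), is a control character."""
    return any(b < 0x20 or b == 0x7F for b in word)


def to_hashcat_form(word: bytes) -> bytes:
    """Return the word unchanged, or as $HEX[...] if it has control characters."""
    if has_control_chars(word):
        return b"$HEX[" + word.hex().encode("ascii") + b"]"
    return word


print(to_hashcat_form(b"password"))    # b'password'
print(to_hashcat_form(b"pass\tword"))  # b'$HEX[7061737309776f7264]'
```

A converted word takes two hex characters per byte plus six characters of wrapper, so it is always longer than the original.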

Two questions:

1. Is it possible to raise the cap above 3 GB of RAM? Most systems today have at least 8 GB, so that would be useful for high-end computers/servers with lots of RAM, and perhaps it would reduce the run time (maybe cap it at 32 GB of RAM instead, or does it not matter?).

2. Will reducing "max=" to less than 4096 reduce the run time?


+rep if i helped
BTC : 1PAyKniGHt7yyCb8HdsziTHBEFX6zkGSHz

edymola

Status: n/a
Joined: Mon, 26 Sep 2016
Posts: 1
Team:
Reputation: 0 Reputation
Offline
Tue, 08 Nov 2016 @ 23:29:15

139gb to 68gb
Merge complete to: pene
Total words : 13405660860
Words skipped: 800
Duplicates removed: 6853395362
$HEX[...] conversions: 54158357
Total time: 6 hrs 40 mins 25.778 secs


AliUnique

Status: n/a
Joined: Sun, 04 Dec 2016
Posts: 4
Team: Ashiyane Digital Security Team
Reputation: 0 Reputation
Offline
Sun, 04 Dec 2016 @ 14:35:08

Hi,
I merged two lists with this app. One of them is 250 MB and the other is 300 MB.
I ran the tool and the merge completed successfully, but even with all the duplicates removed, my output file came to 700 MB.
Why?
I thought removing duplicates would decrease the file size, but it seems I was mistaken.
(Sorry for my bad English.)


Bruno

Status: n/a
Joined: Tue, 27 Dec 2016
Posts: 2
Team: African card
Reputation: 0 Reputation
Offline
Sat, 21 Jan 2017 @ 18:26:25

App.Merge.exe o="output-file.txt" t=6 "1.txt" "2.txt" "3.txt" "4.txt" "5.txt" "6.txt" "7.txt" "8.txt "9.txt" "10.txt" "C:\Users\Desktop\app.merge" c=3072

Any idea why this isn't working?


d2
Sun, 22 Jan 2017 @ 11:48:38

Bruno, put your wordlists into one directory, chdir to it and execute:

app.merge.exe o=output-file.txt t=6 c=3072 .
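
If the merge is launched from a script rather than typed by hand, passing the arguments as a list avoids the quoting mistakes that are easy to make on a long command line. A minimal Python sketch, using only the options that appear in this thread (o=, t=, c=) and assuming App.Merge.exe is on the PATH. The helper name is made up.

```python
import subprocess


def build_merge_command(output, inputs, threads=6, mem_mb=3072):
    """Build the argument list for App.Merge.exe; no shell quoting is needed."""
    return ["App.Merge.exe", f"o={output}", f"t={threads}", f"c={mem_mb}", *inputs]


command = build_merge_command("output-file.txt", ["."])
print(command)

# To actually run it from the wordlist directory (Windows, App.Merge.exe on PATH):
# subprocess.run(command, check=True)
```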




judithxx

Status: n/a
Joined: Sat, 17 Dec 2016
Posts: 177
Team:
Reputation: 67 Reputation
Offline
Sat, 25 Feb 2017 @ 12:20:19

If I do:

App.Merge.exe o="alles.txt" c=3072 "psychopack"

on a folder which includes 70.8 GB of files I get this message when it tries to slice it:


Code:
Slicing: 000.txt

Unhandled Exception: System.IO.DirectoryNotFoundException: Could not find a part of the path 'D:\Downloads\crackstation-1.txt\tmp\000-1.txt'.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options, String msgPath, Boolean bFromProxy)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share)
   at System.IO.File.WriteAllBytes(String path, Byte[] bytes)
   at App.Merge.Program.sliceFile(Int32 v)
   at App.Merge.Program.sliceCheck()
   at App.Merge.Program.Main(String[] args)


Seems it is looking for 000-1.txt but that file doesn't exist?
Anybody got a fix?


EDIT: just noticed, the path should be: D:\Downloads\crackstation.txt\tmp and not crackstation-1.txt. Note that crackstation.txt is a folder and not a file btw. But maybe the .txt has the program confused.
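
If the guess in the EDIT is right, a pre-flight check for folders whose names end in a file extension could catch this before a long merge starts. A minimal Python sketch under that assumption. The function name and the extension list are illustrative, and nothing here reflects how App.Merge actually builds its tmp paths.

```python
import os


def folders_named_like_files(root, extensions=(".txt", ".lst", ".dic")):
    """Return directories under root whose names end in a file-style extension."""
    suspicious = []
    for current, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            if name.lower().endswith(extensions):
                suspicious.append(os.path.join(current, name))
    return sorted(suspicious)
```

Renaming any folder it reports (for example dropping the .txt suffix) and rerunning the merge would show whether the folder name is really the cause.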


If I helped you, feel free to either +rep or donate below :)

1P56z7UjuFfmVypE8DfHUSodv4LVURzHoq

30k

Status: n/a
Joined: Tue, 12 Aug 2014
Posts: 18
Team:
Reputation: 10 Reputation
Offline
Fri, 03 Mar 2017 @ 13:14:57

Still loving this..
146GB to 111GB of uniques



________________________________________
BTC: 1HUMD5LkAgfZh5PfWwZPeJ1Z4ERuX5ogfh

frenchy1

Status: Cracker
Joined: Tue, 28 Jul 2015
Posts: 622
Team:
Reputation: 386 Reputation
Offline
Thu, 30 Mar 2017 @ 00:48:41

Does min=8 max=15 work? I have words of length 7 and below in the finished txt file.

C:\Users\x\Desktop\App.Merge.exe o=newmpa2.txt t=6 c=3072 min=8 max=15 F:\Wordlist\Wpa-Favorites\mywpalist2.txt



Just a hobbyist

frenchy1
Thu, 30 Mar 2017 @ 02:23:38

Confirmed working. Thanks for a great tool.




