Text file duplicate line remover - Limpkinw.Com


  • Text file duplicate line remover

    Well, I need this program and I can't program for ****.
    I wasn't going to post and ask you guys to try to make it, but you are all saying how this place is dying and ****, and that nobody cares about the place, so I figured I would post on a topic everybody here seems to enjoy: programming.

    Anyway, back to the point.

    I need this program, and it's not like I haven't looked for it. In fact, I have found many versions of it.

    I have tried and tested all of them,
    and in some way or another they do not work correctly:
    either they are completely inaccurate (some programs report a completely different number of dupes than other programs),
    or the program doesn't save the full text file after removing all the duplicates (I encountered this with 3 different programs, I think),
    or it just crashes and does nothing.

    Those are the most common errors that occurred.
    The errors occur especially with bigger text files, for example anything 2MB+. When the wordlist is loading it will usually take a while; it can even take around 45 seconds (I don't mind that).
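One possible reason the dupe counts disagree between programs (an assumption; nothing in this thread confirms it) is that each tool decides differently what counts as "the same line": line endings, trailing spaces, and upper/lower case may or may not be ignored. A minimal Python sketch, with a made-up sample list and a hypothetical `count_duplicates` helper, shows how the count shifts with the rule:

```python
def count_duplicates(lines, normalize=lambda s: s):
    """Count lines whose normalized form has already been seen earlier."""
    seen = set()
    dupes = 0
    for line in lines:
        key = normalize(line)
        if key in seen:
            dupes += 1
        else:
            seen.add(key)
    return dupes

# Made-up sample: the "same" word with different line endings, case, and spacing.
sample = ["apple\n", "apple\r\n", "Apple\n", "apple \n", "apple\n"]

exact = count_duplicates(sample)                                   # byte-for-byte match
no_eol = count_duplicates(sample, lambda s: s.rstrip("\r\n"))      # ignore line endings
trimmed = count_duplicates(sample, lambda s: s.strip())            # ignore outer whitespace
folded = count_duplicates(sample, lambda s: s.strip().lower())     # also ignore case

print(exact, no_eol, trimmed, folded)
```

If two tools report different totals for the same file, comparing them against rules like these may show which definition each one uses.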

    I have been told by a few people that this is an easy program to make, but because none of the different programs I have tried works 100%, I put this in the intermediate section.

    I even emailed a programmer asking if he was able to make this kind of program.
    The guy I emailed is a very experienced programmer who has made many programs that work with text files in the past.

    This is what I wrote to the programmer:

    I was wondering if anytime in the near future you had any
    plans to create a text file duplicate line remover program. There
    are many programs which claim to do this that I have tried, but
    they are either not accurate with removing duplicates, save the
    file with lots of the list missing, or seem to crash,
    especially with large text files.

    This is what he replied:

    I did something in the past in that way, but gave it up when things
    kept going wrong. Although everything seemed easy at the start, more
    and more barriers came up that were difficult to overcome/handle -at
    least- in a reliable way. (Not all work was wasted, because I could
    use parts of it to create TABS2spaces :-)

    --
    Best regards,
    David.
    I figured if an experienced programmer like him couldn't do it, not many people could,
    unless he isn't as experienced as I thought.

    But there are lots of great programmers here, so I thought I'd give it a shot and ask.

    Anyway, some of you might not give a crap, but I figured some of you might.
    Some of you might even want to do it as a small team project. Just an idea, anyway.

  • #2
    Well, You Can Try Catcher. It's A Simple Dup Remover I Made. As Far As Holding That Much Info In A Listbox, heh Good Luck!

    + txt Files Won't Hold That Much & Become .doc's

    Then You Encounter Opening Them, Which Causes Lag. For Instance, On My Computer, If The .doc Or .sql Is Over 9 Megs, Neither Wordpad Nor Notepad Will Open It. So I Have To Turn To Dreamweaver, Which Is By Far A RAM Hog.

    Anyways, Point Being, Dup Removers Are Not The Hard Part; Storing That Much Data In A Listbox Is.
    Mayor Of Assholeville



    • #3
      Originally posted by aUsTiN
      .........Anyways Point being, Dup Removers Are Not The Hard Part, Storing That Much Data In A Listbox, Is.
      So, don't use a ListBox or a TextBox. There are other ways to store extremely large amounts of data in your program.



      • #4
        Are these single lines of text? If it's a text file of

        line1
        line2
        line3

        That's not at all hard to make and I could whip you something up fast. But as for removing dupes in another way, you'll have to fill me in.



        • #5
          nscopex, that is how it would be.
          And austin is right, man, opening large text files on my computer is difficult as well.
          I tried Catcher. It is the most accurate dupe remover I have come across so far; nothing beats it. But I can only get it to save 510KB, no matter how big the text file is.



          • #6
            QUOTE:

            "Are these single lines of text? If its a text file of

            line1
            line2
            line3"


            That is 3 lines of text no matter what or where they are and if you add 1 more......

            line1
            line2
            line3
            line4

            .......then it's 4 lines of text.



            • #7
              Well What Would You Suggest baloney?

              You Say There Are Others,

              Listbox
              ComboBox
              Rtb
              Textbox

              I Mean There Are Only So Many & Tellin' Me There Are Others Doesn't Fix The Fact They Won't Hold Huge Amounts Of Data.
              Mayor Of Assholeville



              • #8
                What about making the textbox a lot bigger, like 3 or 4 times bigger? Maybe it would load easier because even the scrollbar wouldn't have to move as fast. Bear in mind I am a noob, so don't flame my lack of knowledge, just admire the fact that I am using my brain to think lol.

                I'm not sure if baloney meant ways of doing it without a text box, but I don't have a clue.



                • #9
                  IM ASKING:

                  Is this a text file with names and passwords? Or just names?

                  Does it go:

                  a
                  b
                  c
                  d
                  a
                  b
                  c
                  d

                  ???

                  If it does, I can have it remove the duplicate a, b, c, and d so that it looks like

                  a
                  b
                  c
                  d

                  No matter how big the text file. Not hard. Is that what you want it to do?
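A minimal Python sketch of the behaviour described above (a, b, c, d, a, b, c, d becomes a, b, c, d). It reads one line at a time instead of loading the file into a ListBox or TextBox, keeps the first copy of each line, and preserves the original order. The function name `dedupe_lines` and the choice to ignore line endings when comparing are assumptions of this sketch, not anything specified in the thread:

```python
import io

def dedupe_lines(src, dst):
    """Copy lines from src to dst, keeping only the first copy of each line.

    Returns (lines_read, lines_written) so the totals can be checked.
    """
    seen = set()
    read = written = 0
    for line in src:                 # streams the input; never holds the whole file
        read += 1
        key = line.rstrip("\r\n")    # assumption: line endings do not matter
        if key not in seen:
            seen.add(key)
            dst.write(key + "\n")
            written += 1
    return read, written

# In-memory demo; real use would pass file objects opened with open().
src = io.StringIO("a\nb\nc\nd\na\nb\nc\nd\n")
dst = io.StringIO()
counts = dedupe_lines(src, dst)
print(counts)
print(dst.getvalue(), end="")
```

Only the set of distinct lines stays in memory, so memory use grows with the number of unique lines rather than with the size of the file. For a list whose unique lines do not fit in memory, a different approach would be needed.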



                  • #10
                    I'm Thinking Random Words / Proxies / W-e In The List.

                    a
                    b
                    d
                    g
                    e
                    h
                    v
                    b
                    e

                    Etc. & It Removes The Duplicates. The Dup Remover Is Not The Hard Part By Any Means. The List Size Is.
                    Mayor Of Assholeville



                    • #11
                      Yes, basically it is for password cracking using combo lists.

                      Like

                      john:password
                      john:eggs
                      johns:sweets
                      john:dynamite
                      john:password

                      So, for example, it would remove the duplicate john:password because it is there 2 times.



                      • #12
                        That's Simple. LoL

                        I'll See If I Can Get Huge Filesize Lists To Load / Save. Since That's All Ya Really Need...
                        Mayor Of Assholeville



                        • #13
                          If you need test files, I've got some.
                          I might make another post about a text file splitter lol.

                          I've got a huge 57MB text file which is so big I cannot even open it. I could probably change it to .doc to get it to open, but it would take me forever to split the file into smaller files manually because I can only copy and paste so much.
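A rough sketch of the kind of splitter mentioned above, in Python. It copies the big file into numbered part files of a fixed number of lines each without ever opening the whole thing in an editor. The function name `split_file`, the `part_0001.txt` naming scheme, and the demo file are all made up for illustration:

```python
import os
import tempfile

def split_file(path, lines_per_part, out_dir):
    """Write consecutive chunks of `path` to numbered part files; return their paths."""
    parts = []
    out = None
    # Binary mode copies the bytes untouched, whatever the text encoding is.
    with open(path, "rb") as src:
        for i, line in enumerate(src):
            if i % lines_per_part == 0:          # time to start a new part file
                if out:
                    out.close()
                part_path = os.path.join(out_dir, f"part_{len(parts) + 1:04d}.txt")
                out = open(part_path, "wb")
                parts.append(part_path)
            out.write(line)
    if out:
        out.close()
    return parts

# Demo on a small made-up file: 10 lines split into parts of 4 lines.
with tempfile.TemporaryDirectory() as tmp:
    big = os.path.join(tmp, "big.txt")
    with open(big, "wb") as f:
        f.writelines(b"line%d\n" % n for n in range(10))
    parts = split_file(big, 4, tmp)
    part_sizes = []
    for p in parts:
        with open(p, "rb") as f:
            part_sizes.append(len(f.readlines()))

print(part_sizes)
```

Splitting by line count rather than by byte count means no line is ever cut in half between two parts.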



                          • #14
                            Originally posted by aUsTiN
                            Well What Would You Suggest baloney?

                            You Say There Are Others,

                            Listbox
                            ComboBox
                            Rtb
                            Textbox

                            I Mean There Are Only So Many & Tellin' Me There Are Others Doesn't Fix The Fact They Won't Hold Huge Amounts Of Data.
                            The problem with Visual Basic programmers is that they tend to think only in terms of ListBoxes, TextBoxes, etc. These are fine for medium scale data manipulation.

                            The ListBox is great because it allows you to add records to it and VB does the sorting for you. Then, you just start from the bottom up, removing all duplicates in the ListBox. Simple.
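The sort-then-remove idea in the paragraph above does not depend on a ListBox. A minimal Python sketch of the same technique: once the records are sorted, duplicates sit next to each other, so each record only has to be compared with the one before it. The helper name `dedupe_sorted` and the sample words are made up, and this version builds a new list rather than deleting from the bottom up:

```python
def dedupe_sorted(records):
    """Sort the records, then keep each one only if it differs from its predecessor."""
    result = []
    for rec in sorted(records):
        if not result or rec != result[-1]:   # duplicates are adjacent after sorting
            result.append(rec)
    return result

words = ["b", "d", "a", "b", "e", "a", "d"]
unique = dedupe_sorted(words)
print(unique)
```

Note that this approach loses the original order of the list, which may or may not matter for a wordlist.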

                            But what do you do when you want to deal with, let's say, 50MB, 60MB, or even 100MB of data? Well, you'd better get VB out of your mind because it just isn't going to work.

                            When I said that there are other ways to store large amounts of data in a program I am not necessarily speaking in terms of using Visual Basic.

                            Think C. Use the 'malloc' function. You can get up to 2GB of memory. Now, I am not going to tell you how to go about writing your dup-removing logic; I am only telling you how to get the memory space to do so. Of course, you will have to write your own sorting algorithm, but if you are serious about a program that will eliminate duplicates in a large-scale data file then you'd better learn something about C and the usage of 'malloc'. If you're not serious and you are content with small to medium scale data, then VB and the ListBox is your best approach.

                            Also, you might want to look into the 'VirtualAlloc' API call that can be used in VB.

                            Here's a method that one of the programmers where I work used to perform duplicate removing.

                            He took each record from the input file and used its contents as the name of a file. He then created a 0 length file of that name in a directory. When the program tried to create another file of the same name, the system rejected the request. So, the end result was a directory of records that contained no duplicates. He then read the directory entries back into his program and used the names of the files as the entries in his new output file.
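For the curious, a rough Python sketch of the file-name trick described above. It relies on exclusive file creation (`os.O_CREAT | os.O_EXCL`), which fails if the name already exists. Two things here are assumptions of this sketch and not part of the original method: each record is hex-encoded with an `r` prefix so that any characters form a legal file name, and the directory listing is sorted when it is read back. The function name `dedupe_via_directory` is made up:

```python
import os
import tempfile

def dedupe_via_directory(records, directory):
    """Create one empty file per record; the OS refuses to create a name twice."""
    rejected = 0
    for rec in records:
        # Assumption of this sketch: hex-encode the record (with an "r" prefix so
        # an empty record still gets a name) instead of using the raw contents.
        name = "r" + rec.encode("utf-8").hex()
        try:
            fd = os.open(os.path.join(directory, name),
                         os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
        except FileExistsError:
            rejected += 1            # this record was already in the directory
    # Read the directory back and decode the file names into records.
    unique = sorted(bytes.fromhex(n[1:]).decode("utf-8")
                    for n in os.listdir(directory))
    return unique, rejected

with tempfile.TemporaryDirectory() as tmp:
    unique, rejected = dedupe_via_directory(["a", "b", "a", "c", "b"], tmp)

print(unique, rejected)
```

File systems limit the length of a name, so long records would not fit, and creating one file per record is likely to be slow on a big list. The sketch is only meant to show why the trick works.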

                            I have no comment on whether this method is reasonable or not. I'm just saying that's how he did it, and it worked.



                            • #15
                              Well,

                              I finally asked my friend ^1ST^. He is a good coder and a nice hacker, and he said he could code this very easily. If you want to get some help from him personally, join his forum or his IRC server:

                              irc.dynm8.net, and join #crack

                              Also, here is his forum's address: www.gsmcracking.com

                              Have fun.

                              Regards
