Quotes DB: useful, funny, interesting






<0> Hi, I was wondering if anyone is here that can assist me.
<0> I'm trying to manipulate strings in an arraylist.
<0> I have some code written, it is used to detect duplicates and remove them
<0> Would anyone be willing to help me out?
<1> adsd: sure
<0> Thanks man
<0> Let me pastebin it - sec
<1> adsd: it would be easier to put them all in a Set, then it will take care of the duplicates automatically
<0> Well, the thing is this:
<0> Only part of the string is a duplicate (the article ID # part)
<0> So I want to find where the article ID # is the same, and toss out the entire string based upon that
<0> http://pastebin.ca/88386
<0> That's the code i'm using now
<0> I dump the text file line-by-line into the array: registry_list
<0> In the first loop, I'm adding all the duplicates ( might be wrong..) to the removal array
<0> Then I iterate through the removal array, removing them from the original



<1> you only add a duplicate if its twin is the next in the array
<1> so foo1,foo2 would be removed, but not foo1,cow1,foo2
<0> well, I think they are sorted in order
<0> let me check
<1> ah
<0> yep, all duplicates are one after another
<0> "2002-01-00880","Gastric bypass surgery","No treatment"
<0> for example..
<0> see how the article ID matches?
<0> I need to remove the last 3..
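[Editor's sketch] The Set idea from earlier in the log can be adapted so that only the article ID decides what counts as a duplicate. The pasted code is not available, so this is a minimal sketch under assumptions: the class and method names are hypothetical, the ID is taken to be the first comma-separated field with its quotes stripped, and the first line for each ID is the one kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DedupeByArticleId {
    // Assumption: the article ID is the first comma-separated field, e.g. "2002-01-00880".
    static String articleId(String line) {
        int comma = line.indexOf(',');
        String first = comma < 0 ? line : line.substring(0, comma);
        return first.replace("\"", "").trim();
    }

    // Keeps the first line seen for each article ID; later lines with the same ID are dropped.
    static List<String> keepFirstPerId(List<String> lines) {
        Map<String, String> firstById = new LinkedHashMap<>(); // preserves first-seen order
        for (String line : lines) {
            firstById.putIfAbsent(articleId(line), line);
        }
        return new ArrayList<>(firstById.values());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "\"2002-01-00880\",\"Gastric bypass surgery\",\"No treatment\"",
            "\"2002-01-00965\",\"Tamoxifen plus surgery\",\"Surgery alone\"",
            "\"2002-01-00965\",\"Adjuvant chemotherapy (AC) plus surgery\",\"Surgery alone\"");
        for (String line : keepFirstPerId(lines)) {
            System.out.println(line);
        }
    }
}
```

Because the map is keyed by ID, this does not depend on duplicates sitting next to each other in the list.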
<0> For some reason what i've written isn't working -- it's removing some that are NOT duplicates -- and it's leaving some duplicates that should be removed
<0> I've been working on this for hours-- no one has been able to come up with a solution
<1> do you have an example of some specific input where it fails?
<0> yeah, sure
<0> one sec
<0> "2003-01-00846","PROPHYLAXIS WITH TRIMETHOPRIM-SULFAMETHOXAZOLE, AZITHROMYCIN, FLUCONAZOLE, AND GANCICLOVIR","PROPHYLAXIS WITH TRIMETHOPRIM-SULFAMETHOXAZOLE, AZITHROMYCIN, AND FLUCONAZOLE"
<0> that's where it shows duplicates that should have been removed
<0> next is:
<0> "2002-01-00965"
<0> it excluded that one..
<0> didnt even show up
<0> even tho I know its a match..
<0> "2002-01-00965","Adjuvant chemotherapy (AC) plus surgery","Surgery alone"
<0> I don't notice any patterns for why it's failing
<1> I tried the first one, and it only spits out one copy. are you sure the removal_list and such is cleared between uses?
<0> What do you mean?
<0> Ah, probably a newbie mistake on my part.
<0> So I should put a clear after the loop?
<1> if the same List is going to be used again, that seems like a fair idea
<0> ok, well i just tried -- not working
<0> removal_list.clear();
<0> right?
<1> yes
<0> not working meaning, there's still duplicates
<0> I clear both of them @ end of code now.
<1> http://rafb.net/paste/results/08iKeL21.html
<1> prints just one.
<0> hm
<0> same as my code?
<0> oh --
<0> must be the matching mechanism
<0> to match with my other text file then
<1> if you use two files, are you sure they're sorted together and not individually?
<0> They are individually sorted
<0> Basically, I have a list of drugs on one list
<0> And then I have the database in another
<0> I need to match the list of drugs with the database
<0> here's that part of the code:
<0> http://pastebin.ca/88395
<0> for (int x=0; x<registry_list.size() - 1;x++) {
<0> it's not doing the last line for some reason
<0> I had to use the -1 because I have the x+1 within the code
<0> so shouldn't it do the last line as well?
<1> I can't see any obvious problems with it, but then again, it's getting late
<1> I still haven't been able to reproduce your bug though, so it's hard to know where to begin
<0> Ok
<0> I think it might be:
<0> the other list has duplicates..
<0> maybe*



<0> So if I dump them into a hashset
<0> how do I convert back to ArrayList?
<1> you iterate through the set and put them in the list. the order will be scrambled though.
<0> ok, order doesnt matter for that one
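[Editor's sketch] Going from a HashSet back to an ArrayList does not need a hand-written loop: the `ArrayList` copy constructor iterates the collection it is given. The sketch below shows that, plus a `LinkedHashSet` variant for cases where first-seen order should survive. The class and method names are hypothetical; the drug names are taken from the sample lines in this log.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SetToList {
    // Removes exact duplicates; HashSet gives no ordering guarantee.
    static List<String> dedupeUnordered(List<String> items) {
        Set<String> unique = new HashSet<>(items);
        return new ArrayList<>(unique); // the copy constructor iterates the set
    }

    // Same idea, but LinkedHashSet keeps the order in which items were first seen.
    static List<String> dedupeKeepingOrder(List<String> items) {
        return new ArrayList<>(new LinkedHashSet<>(items));
    }

    public static void main(String[] args) {
        List<String> drugs = Arrays.asList("Tamoxifen", "Fluconazole", "Tamoxifen");
        System.out.println(dedupeUnordered(drugs).size());
        System.out.println(dedupeKeepingOrder(drugs));
    }
}
```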
<0> Ok, I almost got it man
<0> Now 2 problems
<0> One isnt showing up that should be
<0> And there's still one duplicate left
<1> how big is the entire app?
<0> very small
<0> 99 lines
<0> but there's a few text files
<1> can you put it somewhere so I can try it?
<0> that are dependent
<0> yeah
<0> um
<0> where can I send the 2 text files
<1> dunno, dcc send perhaps
<0> ok
<0> hang on
<0> Ok the last file
<0> Should_have_these_results
<0> Is the actual results we SHOULD see
<0> From the actual results list:
<0> "2002-01-00965"
<0> Is missing on our results
<0> "2003-01-00846","PROPHYLAXIS WITH TRIMETHOPRIM-SULFAMETHOXAZOLE, AZITHROMYCIN, FLUCONAZOLE, AND GANCICLOVIR","PROPHYLAXIS WITH TRIMETHOPRIM-SULFAMETHOXAZOLE, AZITHROMYCIN, AND FLUCONAZOLE"
<0> and that is a duplicate..
<0> That might be explained if a drug appears more than once..
<0> Ok, guess you left =(
<0> thanks for the help man..
<1> not really
<0> oh
<0> cool
<0> ok, so we have mostly solved:
<0> how can we say this: if the match occurs more than once for a specific article
<0> throw it out
<0> matching function is at the end
<1> stop matching after the first one
<1> just break;
<0> in that loop?
<0> ok cool
<0> worked.
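[Editor's sketch] The `break` advice applies to a nested loop in which each article line is checked against every drug name. Without it, a line that mentions two listed drugs is added twice. The actual matching function was only on the pastebin, so the code below is an illustration: the names are hypothetical, and the case-insensitive `contains` test is an assumption made because some sample lines are in upper case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class FirstMatchOnly {
    // Adds each article line at most once, even when several drug names occur in it.
    static List<String> matchArticles(List<String> articles, List<String> drugs) {
        List<String> matches = new ArrayList<>();
        for (String article : articles) {
            String lower = article.toLowerCase(Locale.ROOT);
            for (String drug : drugs) {
                if (lower.contains(drug.toLowerCase(Locale.ROOT))) {
                    matches.add(article);
                    break; // stop after the first matching drug for this article
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> articles = Arrays.asList(
            "\"2003-01-00846\",\"PROPHYLAXIS WITH TRIMETHOPRIM-SULFAMETHOXAZOLE, AZITHROMYCIN, "
                + "FLUCONAZOLE, AND GANCICLOVIR\",\"PROPHYLAXIS WITH TRIMETHOPRIM-SULFAMETHOXAZOLE, "
                + "AZITHROMYCIN, AND FLUCONAZOLE\"");
        List<String> drugs = Arrays.asList("Azithromycin", "Fluconazole");
        System.out.println(matchArticles(articles, drugs).size());
    }
}
```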
<0> almost there:
<0> # 965 isn't showing up for some reason
<0> other than that, its 100%
<1> does it match anything?
<0> yes
<0> i just hand-verified
<0> Tamoxifen
<0> is in formularies.txt
<0> and also in #965
<0> "2002-01-00965","Tamoxifen plus surgery","Surgery alone"
<0> Tamoxifen
<0> its a mtch
<0> match*
<1> it's removed because it's a duplicate
<1> "2002-01-00965","Tamoxifen plus surgery","Surgery alone"
<1> "2002-01-00965","Adjuvant chemotherapy (AC) plus surgery","Tamoxifen plus surgery"
<1> the only one that will be left alone is the last one
<1> "2002-01-00965","Adjuvant chemotherapy (AC) plus surgery","Surgery alone"
<1> which doesn't contain "Tamoxifen" anywhere
<0> Ah
<0> Ok, so how can we solve this?
<0> How do we keep the 1st one
<0> And remove the ones that come after
<1> how about removing duplicates afterwards instead of before
<0> ok
<0> sounds good
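[Editor's sketch] Matching first and removing duplicates afterwards could look roughly like this: a line is kept only if it mentions a listed drug and no earlier matching line had the same article ID. All names are hypothetical, and the ID parsing and case-insensitive matching are assumptions, not the code from the pastebin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class MatchThenDedupe {
    // Assumption: the article ID is the first comma-separated field, quotes stripped.
    static String articleId(String line) {
        int comma = line.indexOf(',');
        String first = comma < 0 ? line : line.substring(0, comma);
        return first.replace("\"", "").trim();
    }

    static boolean mentionsAny(String line, List<String> drugs) {
        String lower = line.toLowerCase(Locale.ROOT);
        for (String drug : drugs) {
            if (lower.contains(drug.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    // Keeps the first matching line per article ID; non-matching lines never claim an ID.
    static List<String> matchThenDedupe(List<String> articles, List<String> drugs) {
        Set<String> seenIds = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String line : articles) {
            // && short-circuits, so the ID is recorded only when the line matched
            if (mentionsAny(line, drugs) && seenIds.add(articleId(line))) {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> articles = Arrays.asList(
            "\"2002-01-00965\",\"Tamoxifen plus surgery\",\"Surgery alone\"",
            "\"2002-01-00965\",\"Adjuvant chemotherapy (AC) plus surgery\",\"Tamoxifen plus surgery\"",
            "\"2002-01-00965\",\"Adjuvant chemotherapy (AC) plus surgery\",\"Surgery alone\"");
        System.out.println(matchThenDedupe(articles, Arrays.asList("Tamoxifen")));
    }
}
```

With the three #965 lines quoted in the log and "Tamoxifen" as the only drug, this keeps the first line, so the article no longer disappears from the results.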
<0> also
<0> for (int x=0; x<registry_list.size() - 1;x++) {
<0> That function isnt finishing the loop
<0> It's stopping right before the last part of registry_phase_3.txt
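[Editor's sketch] One possible reading of the unfinished loop: with the bound `size() - 1`, `x` only reaches the second-to-last index, so the last element is only ever inspected as `x + 1` and is never handled as `x` itself. The real loop body is not in the log, so the first method below is a guess at its shape, and the second is one way to visit every line by comparing with the previous element instead of the next. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AdjacentDuplicates {
    // Hypothetical key: the text before the first comma, quotes stripped.
    static String key(String line) {
        int comma = line.indexOf(',');
        String first = comma < 0 ? line : line.substring(0, comma);
        return first.replace("\"", "").trim();
    }

    // Guessed shape of the loop from the chat: the last element is never emitted,
    // and within a run of equal keys only the final line survives.
    static List<String> compareWithNext(List<String> sorted) {
        List<String> kept = new ArrayList<>();
        for (int x = 0; x < sorted.size() - 1; x++) {
            if (!key(sorted.get(x)).equals(key(sorted.get(x + 1)))) {
                kept.add(sorted.get(x));
            }
        }
        return kept;
    }

    // Alternative: visit every index and compare with the previous element,
    // which keeps the first line of each run and also handles the last line.
    static List<String> compareWithPrevious(List<String> sorted) {
        List<String> kept = new ArrayList<>();
        for (int x = 0; x < sorted.size(); x++) {
            if (x == 0 || !key(sorted.get(x)).equals(key(sorted.get(x - 1)))) {
                kept.add(sorted.get(x));
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> sorted = Arrays.asList("a,1", "a,2", "b,1");
        System.out.println(compareWithNext(sorted));
        System.out.println(compareWithPrevious(sorted));
    }
}
```

For the input `a,1`, `a,2`, `b,1`, the first method returns only `a,2`, while the second returns `a,1` and `b,1`.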

