Comments:
<0> joobie: The limitation is that you're using a regex. :)
<1> sorry hobbs I was wrong: )
<2> m/<a href="([^"]+)">([^<]+)/
<0> joobie: to match something that regexes don't.
<3> hobbs, you are telling me it's not possible with regex
<3> it is.
<0> joobie: Do you know what a regex is?
<3> regex matches a string u give it
<3> it can be html
<3> it can be plain text
<4> joobie: you were told how to do it. take that for what it's worth. no one here WANTS to parse HTML with a regex, so we aren't experts at it and no one's going to tell you.
<3> regardless of the string you are matching, being htmkl or not
<3> the concept is still the same.
<0> joobie: Sorry, wrong.
<5> do you know ANYTHING about language processors?
<0> joobie: Regular expressions match regular languages.
<6> what kind of language parser
<2> jpeg, that's just silly, never parsing html.
<6> processor*
<2> umm
<4> bluebeard_: shaddup
<3> hobbs
<0> joobie
<3> you need to read more about regex
<1> joobie, why write something that someone else has already wrote? (and probably better than you'll ever can write)
<5> no, you do
<3> regex does not just match regular lanugages
<2> html is Content-type: text/*
<3> it matches text
<3> any text u give it
<2> you should be able to parse text/* with perl.
<3> nod
<0> joobie: Who the **** is this "u" person?
<3> thanks bluebeard
<3> u can
<0> bluebeard_: of course. And fortunately for us PERL CONSISTS OF MORE THAN REGEXES
<2> and?
<3> hobbs
<3> the point is
<0> bluebeard_: and that is all :)
<3> you're sitting there saying perl regex can't parse html
<3> that's wrong.
<2> how else would you do that?
<3> perhaps you should saying something like 'use a perl module to parse html rather than straight regex'
<1> joobie, What's the point about writing something that already exists, and you can use easily?
<3> that'd be more supporting your point
<2> umm
<4> no, he's saying there ARE BETTER TOOLS FOR THE JOB
<0> joobie: come back and collect your $10 when you have a working regex
<3> because you are wrong in saying perl regex cant parse html
<3> it can.
<2> what would the perlmodule be doing?
<0> joobie: until then, shut up :)
<3> dood
<2> scanning the text, just like a m// would.
<3> i havent used regex in a while
<0> bluebeard_: handing off to a LALR parser most likely
<3> but i did years ago
<3> and im 100% certain
<1> you can parse HTML using assembly if you want, but I don't understand the point of it
<3> it can parse through html
<3> if you think it cant
<3> u need more skill in it
<3> exactly ofer
<0> joobie: WHO THE **** IS U?
<2> and what would the parser do?
<2> scan the text.
<2> why add a module when you can create a simple regex?
<3> exactly
<7> hmm... using "enter" as the only punctuation. using "u" to mean "you", and blatantly starting arguments...
<7> 1+2+3=troll?
<8> yay the internet.
<2> that's adding unecicarry bloat.
<0> bluebeard_: From a falsehood you can infer anything
<0> bluebeard_: you can't create a simple regex
<1> joobie, do what you want to do, but don't expect people to help you solve problems in a stupid way
<3> ofer
<3> i accept ur point
<1> good
<3> hobbs is saying it's impossible for perl regeex to parse html
<3> that's what im arguing
<5> the only thing you're succeeding in doing by arguing about regexs is wasting time
<1> start with HTML::LinkExtractor as hobbs told you 10 minutes ago
<2> why would you add a module when it's not neccicary?
<0> bluebeard_: Show me a "simple regex" that even tokenizes HTML.
<4> bluebeard_: why reinvent the wheel?
<0> bluebeard_: I know a 20-line one that does XML, but HTML is based on SGML, and more complicated.
<1> 20 line regex?
<1> that's crazy
<1> :)
<3> ur presuming the entire html needs to be parsed
<5> and html is full of garbage from the nonstandard browser dark ages
<2> you want the link url, and the link text?
<3> but if it's only 2% of the entire html
<0> bluebeard_: no
<3> why parse the entire thing with a module like that
<0> bluebeard_: I want tags, attribute, and text
<2> he didn't!
<3> exactly.
<0> joobie: because it's a context-sensitive language
<2> if you need the rest, add the module, if not, don't bloat your code.
<0> joobie: which means that if you pick up in the middle of the file and ignore context (like a regex) THEN YOU DON'T ****ING KNOW WHAT YOU'RE PARSING
<3> i agree bluebeard.
<0> SO
<0> ****ING
<0> STUPID
<0> SO
<0> STUPID
<3> hobbs
<8> perl is serious business.
<0> SO
<0> ****ING
<0> STUPID
<2> lol
<0> SO
<3> who said i was ignoring context?
<4> bluebeard_: why do you think modules are bloat?
<0> ****ING
<0> STUPID
<3> if context varies
<3> that's where the regex adopts to it
<0> joobie: you, when you said you were using a regex
<3> like i said, why go to the trouble of parsing the whole html doc
<3> when it's not even 1% of it that u wnat
<3> just use regex
<2> they take KB.
<2> that is bleat
<2> *bloat
<0> joobie: Because you don't know your context until you've parsed the whole doc up to that point
<3> really?
<3> i do
<2> if they take up uneccicary space or ram, they're bloat.
<3> because this page is fairly static
<3> you're presuming i dont know the page im parsing
<0> joobie: how do you know there wasn't a CDATA block starting on line 1?
<3> which is a presumtion you should not make.
<3> because no-one modifies the site
<4> bluebeard_: are you using a 486 or something? wtf is a few K?
<3> exactly
<0> bluebeard_: There's nothing unnecessary about using a solution that's tested and actually works
<8> didn't bill gates once say 64 KB would be enough for any personal computer?
<2> that's a bad stance, jpeg.
<3> just to get that 1% of html
<3> i have to load the module
<3> waste ram
<3> waste cpu cycles
<3> then grab that 1% i want
<4> bluebeard_: are you kidding?
<3> then keep that crap in ram
<3> why
<5> code in C, you won't have that pesky perl interpreter taking up ram :P
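
Editor's note: the disagreement in the log can be made concrete with a small experiment. The sketch below is not from the channel. It is written in Python rather than Perl only so that it runs with the standard library alone. `LINK_RE` is the pattern quoted in the log, translated from `m/.../` syntax. `LinkCollector`, `links_by_regex`, `links_by_parser`, and `SAMPLE` are hypothetical names and data made up for illustration, and an HTML comment stands in for the "you don't know your context" case that hobbs raises.

```python
import re
from html.parser import HTMLParser

# The pattern quoted in the log, translated from Perl m/.../ syntax.
LINK_RE = re.compile(r'<a href="([^"]+)">([^<]+)')


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs from <a> elements (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while inside an <a> that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def links_by_regex(html):
    """Context-free scan: every textual match counts as a link."""
    return LINK_RE.findall(html)


def links_by_parser(html):
    """Tokenize the document first, then pick out the <a> elements."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up input: a commented-out link, a link with reordered attributes
# and single quotes, and a link whose text contains nested markup.
SAMPLE = '''<!-- <a href="/old">Old link</a> -->
<a class="nav" href='/home'>Home</a>
<a href="/docs">Read the <b>docs</b></a>
'''

if __name__ == "__main__":
    print("regex :", links_by_regex(SAMPLE))
    print("parser:", links_by_parser(SAMPLE))
```

On this sample the regex reports the commented-out link, misses the `/home` link, and truncates the `/docs` link text at the nested tag, while the parser-based version returns the two live links with their full text. That does not settle joobie's point about a page whose markup is known and never changes. It only shows what the pattern assumes about the input.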