Comments:
<0> Does GNU Sed support shy * and + (in regexs)?
<1> shy?
<1> You mean "lazy quantifiers" ? No.
<0> Darn.
<1> Instead, you should say what you mean.
<1> don't use . unless you mean "match any character"
<1> If you don't mean *any* character, don't use .
<0> prec: well in this case I want to match <em>([^<]|<[^/]|</[^e]|</e[^m]|</em[^>])+
<1> Aha. Use an HTML/XML parser instead.
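The pattern <0> describes can be tried out directly. The following is only a sketch, assuming a sed that accepts -E for extended regexes (GNU and BSD sed both do); the function name strip_em is made up for illustration. The group matches any run of text that does not contain "</em>", which is how a lazy quantifier gets emulated here.

```shell
# Emulate a lazy <em>.+?</em> in sed, which has no lazy quantifiers.
# strip_em is a hypothetical helper: it removes the tags and keeps the text.
strip_em() {
    sed -E 's#<em>(([^<]|<[^/]|</[^e]|</e[^m]|</em[^>])+)</em>#\1#g'
}

# Two separate pairs on one line: each pair is matched on its own.
printf '%s\n' 'an <em>k</em>-term and <em>n</em>-ary' | strip_em
# prints: an k-term and n-ary

# Nested pairs: the outer opening tag gets paired with the inner closing tag.
printf '%s\n' '<em><em>k</em>-term</em>' | strip_em
# prints: <em>k-term</em>
```

The second call shows why nesting is the hard part: the pattern stops at the first "</em>" but has no way to count how many "<em>" it has already passed.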
<0> That would be overkill.
<0> Apart from which, this is pre-processing BEFORE I apply an SGML parser, because the code is badly formed.
<1> Overkill? Then this should be fine: <em>.*</em>
<0> That would match <em><em>k</em>-term</em>, which is precisely what I'm trying to avoid.
<1> Well, the problem with something like ((x)-y) is that it is non-regular.
<0> Yeah, you're right.
<1> In general, you can't use a regex to parse a non-regular grammar.
<1> There are exceptions (because regexes support back references) but they don't apply here.
<1> Anyway, check the #regex FAQ; it suggests using XMLStarlet with sed. http://wiki.hypexr.org/wikka.php?wakka=RegexFAQ
<1> preprocessing with Tidy to convert poorly-formed HTML to XHTML.
<0> "converting HTML to XHTML using Tidy and then to PYX using XMLStarlet."
<0> I *am* using tidy; tidy gets it wrong; hence this preprocessing p***.
<1> Well, if you insist on using sed, you can do it. You can't do it in one regex of course, but it's possible.
<1> Since sed is (theoretically) Turing-complete.
<2> nearly anything still in existence is too
<3> hello, i want just print all character between the <table> and </table>, so i use sed -n -e '/<table>.*<\/table>/' to get that, but there are many <table>...</table> pair in my file, this command just can match first token and then exit. how can i let it match all token pair ? thank you.
<3> can anybody understand me ?
<4> how would i go about escaping the character "^@" which, if you can't see, is a caret followed by a commercial '@' symbol. i'm trying to replace it with the unicode for each of those characters.
<4> s/\^@/\@\^/g <num>
<4> .. is what i'm trying.
<5> I think you'll need to escape the ;
<4> goldfish: no i still get the "\^@" not found.
<4> similarly if i don't escape it of course.
<6> echo ^@^@ | sed 's/\^@/\@\^/g'
<6> @^@^
<6> or am I missing something here ?
<4> monsieur: hmm works here too, but it's not in vim.
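The <table> question from <3> gets no answer in the log. One common approach is a sed address range, sketched below. The assumption, which the log does not confirm, is that each <table> and </table> tag starts on a different line; print_tables is a made-up name.

```shell
# Print every <table> ... </table> block, tags included.
# Assumption: an opening tag and its closing tag are never on the same line.
print_tables() {
    sed -n '/<table>/,/<\/table>/p'
}

printf '%s\n' 'intro' '<table>' 'row 1' '</table>' 'middle' \
    '<table>' 'row 2' '</table>' 'end' | print_tables
# prints the six lines of the two table blocks and skips intro, middle, end
```

The range works on whole lines, and sed only starts looking for the closing pattern on the line after the opening one. A table written entirely on one line would therefore open a range that runs on to the next line containing "</table>".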
<5> it's probably not a literal ^@
<4> monsieur: goldfish i appears in the document as a single character.
<4> s/i/it
<4> .. which to me implies that it is a literal character.
<4> monsieur: goldfish here's an example of the character in a text file. less will interpret it as a binary file, but it's not: http://www.selectparks.net/~julian/tmp/test.txt
<1> delire: Are you sure you don't have a NUL character? Vim will display this as ^@
<4> prec: ahah.. ****. that'll be it ;)
<1> od -a input |${PAGER:-more} ## see?
<4> yeah it is a nul char. it's unsubstitutable then isn't it ;/
<1> more or less. You can use tr to change it to something else.
<4> cheers.
<4> prec: i can't see how you'd use tr as it's firstly not possible to tag this NUL char.
<1> tag?
<1> tr '\0' '@'
<4> oh
<1> <infile tr '\0@' '@\0' |sed 's/@/^@/g' |tr '\0@' '@\0' >outfile
<1> assuming your sed is "clean"
<1> :!tr
<1> :)
<4> ;)
<1> You can do this substitution in vim anyway.
<1> type: colon percent s slash C-v C-@ slash caret at slash g enter
<1> :%s/^V^@/^@/g
<4> i'm going to need to break that down. what is the '^V' ?
<4> (i actually want to convert it to unicode chars @^)
<1> A % at the end of your prompt means that you're running *csh, right? Right?!
<1> delire: type these characters: colon percent s slash C-v C-@ slash caret at slash g enter
<1> delire: C-v means hold Control while pressing v
<4> yes i realise
<1> delire: C-@ means hold Control while pressing @ (Control-Shift-2)
<1> delire: Or, some terminals let you type C-space for C-@
<5> nice.
<1> Also, you don't have to hold shift; C-2 will work just fine.
<4> it's expanding into ther characters, i'm going to have to play around.
<4> *other
<1> Escape, C-3, C-[
<1> C-\ and C-4 ; C-] and C-5 ; C-7 and C-g ; C-_ and C-/ ; etc
<4> on this host C-V expands to '^[OC'
<1> delire: hrm.
<1> My terminal won't let me type NUL.
<4> prec: hah works now (on another host)
<4> thanks
<1> I wonder what's wrong with my terminal? :(
<4> hehe
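For the NUL substitution outside vim, here is a sketch built on the same tr and sed idea. Note that in the swap pipeline quoted in the log, the second tr also turns the @ written by sed back into NUL, so the output appears to end up as a caret followed by NUL. The variant below uses a stand-in byte instead of a swap. It assumes the input never contains the byte \002; the name nul_to_caret_at is made up.

```shell
# Replace every NUL byte on stdin with the two visible characters "^@".
# Assumption: the input contains no \002 byte, which serves as a stand-in,
# because a NUL cannot be passed to sed inside a shell argument.
nul_to_caret_at() {
    stx=$(printf '\002')
    tr '\000' '\002' | sed "s/${stx}/^@/g"
}

# A real @ in the input is left alone; only the NUL is rewritten.
printf 'a\000b@c\n' | nul_to_caret_at
# prints: a^@b@c
```

To write something other than "^@" (the log mentions wanting "@^"), only the replacement half of the sed command needs to change.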