Comments:
<0> hi
<0> i have a raw unicode string ("\xd3\x05..."), how can i convert that to a python-compatible unicode string?
<1> Depends on which encoding is used.
<0> hund, localized non-english encoding
<1> That didn't answer my question. Anyway, once you figure out which encoding it is, you can do unicode_string = my_raw_string.decode("whatever encoding it is")
<0> ah, thanks
<0> hund, do u know if its possible to convert that unicode string directly to ansi, like WideCharToMultiByte() does?
<0> s/ansi/ascii
<1> I have no idea what WideCharToMultiByte() is.
<0> its a win32 api...
<1> No wonder, then. :) If you want to encode a unicode string as ASCII, you have my_unicode_string.encode("ascii")
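A minimal sketch of the two calls hund describes, written for Python 3, where the raw string is a `bytes` object and `decode` returns a `str` (the log itself uses Python 2 names). The sample bytes and the choice of ISO-8859-1 are assumptions for illustration only.

```python
# Hypothetical sample: bytes whose encoding is known to be ISO-8859-1.
my_raw_string = b"caf\xe9"

# bytes -> text: decode with the encoding the bytes were written in.
unicode_string = my_raw_string.decode("iso-8859-1")

# text -> bytes: encode to the target encoding. ASCII cannot hold the
# accented character, so request a replacement instead of an exception.
ascii_bytes = unicode_string.encode("ascii", errors="replace")
```

Without `errors="replace"`, the `encode("ascii")` call raises `UnicodeEncodeError` for any character outside the ASCII range.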
<0> ah, cool
<0> thanks :)
<2> hund: You're better at documenting than me, why do you want me to document things? ;)
<1> I am?
<2> Well you mentioned something about a phonecall
<2> And that I needed to write some documentation
<1> Yes, but what makes you think I'm any good at documenting?
<1> And there's a difference between documenting something existing and documenting something unborn. :)
<2> Well the fact that you take the time to type my_unicode_string for the sake of having people understand. I'd say 'do a s.encode("ascii")' and be over with it.
<2> this is why you're good at it.
<1> Yes, but designing something and writing the doc at the same time is hard.
<0> how do i obtain a list of all the codecs python's unicode module supports?
<1> docs.python.org
<0> ok one last time since i cant seem to find a solution - I have a raw unicode string and i want to convert it to ascii, on win32 its a single API call (WideCharToMultiByte) - is there an easy way to do this?
<0> i dont know the string's encoding.
<1> A "raw unicode string" doesn't make any sense. Unicode has several encodings.
<0> hund, WideCharToMultiByte() seems to pick the right one...maybe its the system's default
<1> It can be UTF-8, UTF-16, UCS-4, UCS-2, UTF-7 ...
<1> Well, I never code for Windows.
<0> heh
<0> well what would u do if u didnt know the encoding?
<1> I don't know this "u", nor is he or she in the channel, but I would find out which encoding it is.
<0> hund, how? (im sorry im pissing u ofF)
<3> is there documentation somewhere for WideCharToMultiByte() ?
<1> Again, as long as you're not actually talk to him, I'm sure you're not pissing u off. Where do you get the raw string?
<1> *talking
<0> madewokhe, msdn
<0> hund, a POST request
<1> You're writing a web server?
<0> hund, yeah
<1> I believe the encoding is sent in an HTTP header, then.
<0> hund, let me recheck
<1> And I believe it defaults to ISO-8859-1 if it is omitted.
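One way to read the encoding out of a request header, as hund suggests, is sketched below in Python 3. The helper name `charset_from_content_type` is hypothetical, and the ISO-8859-1 fallback simply mirrors hund's stated belief; the sketch does not verify it.

```python
from email.message import Message


def charset_from_content_type(header_value, default="iso-8859-1"):
    """Return the charset parameter of a Content-Type value, or the default."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the lowercased charset parameter,
    # or the fallback argument when the header carries none.
    return msg.get_content_charset(default)
```

For example, `charset_from_content_type("text/html; charset=UTF-8")` gives `"utf-8"`, while a bare `application/x-www-form-urlencoded` value falls back to the default.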
<0> hund, well its not in the request, but ill try that encoding.
<1> I can't remember a server using plain POST requests off the top of my head. Any suggestions?
<1> That is, a web page.
<1> In HTTP, not HTTPS.
<3> I don't know how http post works, sorry :/
<0> hund, s.decode("iso-8859-1") worked, but encode("ascii") doesnt
<0> it seems to want all characters under 0x80, altho i dont understand why
<3> because that's what ascii is
<3> it only includes characters that you can represent in 7 bits
<0> characters above 0x80 are mapped to localized languages, which is what i need
<3> then you want a different encoding
<0> madewokhe, whats its called?
<3> I don't know, there are a lot of encodings like that :/
<0> damnit why wont this solve :|
<0> i might end up writing a python extension
<3> you shouldn't have to do that
<1> kradius: My browser has no problem sending encoding information in the Content-Type header.
<0> hund, iexplore doesnt send it, i just checked
<1> I just did a test with Firefox, it sent it as application/x-www-form-urlencoded
<0> hund, anyway it doesnt matter right now because iso-8859-1 seems to work, the problem now is converting it to ascii
<1> No, anything will let itself decode into ISO-8859-1, it's 8-bit clean.
<1> That is, from it.
<0> well
<1> So there's no guarantee that's it.
<0> yeh i understand...i think ill just write an extension that calls win32 api
<1> Uh, okaay.
<1> Now there's something for thedailywtf.com. :)
<0> hund, i tend to agree :) but there doesnt seem to be an easier solution
<1> I can't see why IE wouldn't send that header in the request. Do you have a list of headers you could show at http://rafb.net/paste/ ? And how did you retrieve that list?
<1> As for encoding it as ASCII, I wonder if you believe that anything that's not Unicode is ASCII. ASCII is a very old and limited encoding with only 128 characters, of which only 95 are printable.
<0> hund, i need it to become a "multi-byte" string, as in a string thats null-terminated and built of 8-bit fields ranging between 0x00 to 0xFF, but im sure you got what i mean.
<1> No, that's the problem. We don't get what you mean.
<3> the problem is there are lots of encodings like that
<1> Multi-byte sounds like UTF-8.
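The two claims above (ASCII stops at 7 bits, and ISO-8859-1 accepts any byte) can be checked with a short Python 3 sketch. The variable names are illustrative only.

```python
# All 256 possible byte values.
every_byte = bytes(range(256))

# ISO-8859-1 maps each byte value to a character, so decoding never
# fails, whatever the real encoding of the data was.
text = every_byte.decode("iso-8859-1")

# ASCII only defines code points 0x00-0x7F; collect the ones it rejects.
rejected = []
for ch in text:
    try:
        ch.encode("ascii")
    except UnicodeEncodeError:
        rejected.append(ord(ch))
```

Here `rejected` ends up holding exactly the values 0x80 through 0xFF, which is why a successful `decode("iso-8859-1")` says nothing about whether the guess was right.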
<0> anyway, what i got from iexplore is only "שכתוב", which converts to "\xe9\x05\xdb\x05\xea\x05\xd5\x05\xd1\x05\x00\x00"
<0> i need it to become "\xf9\xeb\xfa\xe5\x31" (thats after WideCharToMultiByte)
<0> er thats "\xf9\xeb\xfa\xe5\xe1" - typo
<3> what do you type into IE to make it do that?
<0> madewokhe, a phrase in a foreign language...
<3> I mean
<3> paste it into irc
<0> ?????
<3> :/
<0> madewokhe, lol u see not all is bad in windows land ;)
<3> what client are you using for irc?
<0> xchat
<3> why doesn't xchat support unicode?
<1> xchat supports UTF-8, you just have to tell it in the options.
<0> no idea
<0> hey nevermind im solving this with a small manual-convertion manuever (byte & 0xFF + 0x10), should work for my case
<0> thanks for your help
<3> I wish these things would eventually start to make sense :p
<1> Oh well, I'm sure we could help you do it cleaner if you didn't read selectively.
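For the bytes quoted above, a cleaner route in Python 3 might look like the sketch below. It rests on two assumptions that the log never confirms: that the input is UTF-16 little-endian with a trailing NUL, and that the code page WideCharToMultiByte() picked was Windows-1255 (Hebrew). Both are inferred only from the sample bytes.

```python
# Bytes quoted in the log for the text received from the browser.
wide = b"\xe9\x05\xdb\x05\xea\x05\xd5\x05\xd1\x05\x00\x00"

# Assumption: UTF-16 little-endian, with a trailing NUL terminator.
text = wide.decode("utf-16-le").rstrip("\x00")

# Assumption: the target code page is Windows-1255 (Hebrew).
multibyte = text.encode("cp1255")

# The manual shortcut from the log, with explicit parentheses: take the
# low byte of each UTF-16 unit and add 0x10. This only matches cp1255
# for the Hebrew letters U+05D0 through U+05EA.
manual = bytes((b + 0x10) & 0xFF for b in wide[0:len(text) * 2:2])
```

Both `multibyte` and `manual` come out as the `"\xf9\xeb\xfa\xe5\xe1"` bytes the user wanted. Note that the expression as typed in the log, `byte & 0xFF + 0x10`, parses as `byte & (0xFF + 0x10)` in both C and Python, so the parentheses matter.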