Comments:
<0> Hello.
<1> hell-o
<2> Whet does it mean where you are ****ered?
<2> Sorry this is not about asm... simple question then Ill shut up
<0> ****ered?
<0> I think it means you've been tricked, but I'm not up with the latest slang...
<2> I see, thanks..
<3> opCoder32x: what gave you the idea to ask that question here?
<1> ****ered == removed from being a ****er
<4> wtf the chinese space shuttle? http://www.i-am-bored.com/bored_link.cfm?link_id=18383
<1> what's on your mind
<4> the next hammer, this site is gread. http://www.i-am-bored.com/bored_link.cfm?link_id=18386
<5> geez whoda thunk it lol.. asm's easier to code in than C++ :/
<6> hi
<7> Quick question about two's complement notation: Is this notation 'built-in' to the machine, or does the programmer have to implement it herself?
<7> I'm looking for examples in the tutorial section covering it, but can't seem to find any.
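A note on the two's complement question: the usual point is that a two's complement pattern is just an unsigned bit pattern, and ordinary binary addition that drops the carry gives the right signed answer. The sketch below models a 32-bit register in Python; the width and the helper names (`to_twos`, `from_twos`, `add_wrapping`) are assumptions made for illustration, not anything from the channel.

```python
BITS = 32                      # assumed register width for this illustration
MASK = (1 << BITS) - 1


def to_twos(value: int) -> int:
    """Encode a signed integer as an unsigned BITS-wide two's complement pattern."""
    return value & MASK


def from_twos(pattern: int) -> int:
    """Read an unsigned BITS-wide pattern back as a signed integer."""
    if pattern & (1 << (BITS - 1)):      # top bit set means negative
        return pattern - (1 << BITS)
    return pattern


def add_wrapping(a: int, b: int) -> int:
    """Plain unsigned binary addition that drops the carry out of the top bit."""
    return (a + b) & MASK


# -10 + 5, done purely on bit patterns
result = add_wrapping(to_twos(-10), to_twos(5))
print(hex(to_twos(-10)), hex(result), from_twos(result))
```

The adder never looks at the sign; only the final interpretation of the pattern does.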
<8> depends on what you want to do...
<7> Just, say, differentiating between negative and positive numbers
<9> this notion isn't implemented *anywhere*.
<9> it just works.
<9> it makes use of nice tricks on binary system
<9> which means it works anywhere where there is binary addition/substraction
<9> some other issues are big multiply, division, and compares
<7> so I won't have to worry about it if I just want the assembler to do something like -10+5?
<9> and all of them are built-in to machine, at least on x86 [and on loads of other arches too]
<7> ah
<9> right.
<7> cool
<7> thanks
<10> btw, is there any [assembler-written] standalone dlfcn functionality implementation (at least for linux)?
<8> that'd be cool
<10> so, no chance to find it? :-)
<9> Yurik: why not use libc?
<8> never looked for it
<8> id' use libc
<9> if you're one of those ****ers who think they're too cool to use libc c0z they're doing something in assembler, then just roll your own dynlinker. come on, it isn't that hard.
<10> markos_64, of course I can. I'm just playing with bootstrapping frame programming language and exploring how I can use shared libraries' cdecl functions from it, and just thinking about each option. and yes, that language loader currently not yet linked with libc
<9> so make some hack to call cdecl functions
<9> it can come useful anyway
<9> and no danger of polluting your namespace with C **** if your compiler prefixes all external ELF names with, say, FLANG<functionname>
<10> my language compiler isn't ELF based. it's image-based
<9> alternatively, if that's problematic [ie. because of weird stack structure], you could just write your own ELF dynlinker. contrary to popular belief, it isn't that hard
<10> i was just looking for already written simplistic ELF dynlinker. if there is no such thing available, probably I'd better link with libc
<9> but weird stack structure can cause problems in other things...
one i can already predict is messed up signal handling
<9> another issue is that linking with dynamic libc isn't as simple as ld -lc... what's your final linking command?
<10> i have no linking for image loader now at all
<10> since it's very simplistic
<10> http://verb.org.ua/gitweb/?p=xi.git;a=blob;h=8d563c45178eb74147beecab89d1c23dfdc940c1;hb=b180fa9595373c2b392e96073e669a9f2499bab3;f=mkxiro/ldxiro-linux-x86.asm
<10> and yes, I should warn: my assembler knowledge is still pretty weak, since it isn't my primary need, and only loader and compiler required me to learn asm a bit
<8> mov [esp], edx
<8> could be replaces by "push"
<8> but in reverse order.
<9> pireau: pushes ****.
<8> push edx, eax, 1, 7, eax, edx
<8> why ?
<8> oh, it moves the pointer
<9> pireau: they complicate stack management in high-level languages. or any complicated asm functions.
<8> you have to add/pop them
<9> pireau: it's much better to sub something, %esp at beginning of function, and then just execute mov's.
<10> actually, the most important linkes in ldxiro-linux-x86.asm are 26-29
<10> s/linkes/lines/
<9> pireau: also, push is SomePath" instruction and mov is sOmEoThErPath", which means basically that push counts as two insns.
<9> [substitute thoe names with appropriate trademarks, depending on whether you're talking about intel or AMD CPUs]
<9> but the most optimisation-defeating thing here is probably pusha/popa
<9> pusha/popa should be avoided at all costs... well, ok, maybe in context-switching code when optimising for size, but other than that, it just plain ****s
<9> y'know, those instructions are slow because compilers don't usually use them, which means intel won't care about speeding them up
<8> ah
<10> thanks for a pusha/popa note.
my compiler actually uses pushad/popad pair :) i should rethink it, probably
<10> at least in stage2 compiler
<9> your best bet is to save only registers that you need to save
<10> yeah, I'll probably replace pusha/popa with more efficient code in stage2
<9> or maybe better, use some calling convention when some registers are defined to be saved by functions, and some other are not. this means called function gets to use 3 registers without needing to save them. and calling function can keep its local variables in other 3 registers, without danger of overwriting them.
<10> found one more markos_64 note in my notes
<9> C functions do this
<10> about SIB
<9> with %eax, %ecx, %edx being non-saved, %ebx, %esi, %edi, %ebp being saved, just FYI.
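The register split <9> lists can be turned into a small helper that a code generator might use to decide what a function prologue has to save, instead of a blanket pusha/popa. This is only a sketch: the function name `registers_to_save` and the idea of passing in a set of used registers are assumptions for illustration.

```python
# Register classes as listed in the log for C functions on 32-bit x86.
CALLER_SAVED = ("eax", "ecx", "edx")          # callee may clobber these freely
CALLEE_SAVED = ("ebx", "esi", "edi", "ebp")   # callee must restore these


def registers_to_save(used):
    """Return the registers a function must save, given the ones it writes.

    Only callee-saved registers that the function actually uses need a
    save/restore pair; caller-saved ones cost nothing in the prologue.
    """
    used = set(used)
    return [reg for reg in CALLEE_SAVED if reg in used]


print(registers_to_save({"eax", "ecx", "edx"}))         # nothing to save
print(registers_to_save({"eax", "ebx", "edi"}))         # only ebx and edi
```

A function that stays within the three caller-saved registers needs no save code at all, which is the point <9> makes about only paying when more than three registers are used.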
<9> hehe, this one
<10> "... there is SIB if r/m field of r/m byte is 100 and mod isn't 11"
<10> "if mod is 11 or r/m field isn't 100, there is no SIB"
<10> :)
<9> anyway, the save-half-of-registers thingie works nice, since you only need to save regs if you want to use more than 3
<10> it was important for me, since I do not use any ready assembler and assemble instructions on my own
<10> actually, i still not implemented it, hehe
<9> anyway, off for ~2 hours. cya.
<11> morning'
<9> morinin
<12> hello
<9> hi
<13> there is any way to write a "intel virtual machine tech" code without have cpu that support it? ( like on some kind of cpu emulator ? )
<9> wtf is 'intel virtual machine tech'?
<9> ah, wait, this thing
<9> well, it'd be theoretically possible to write software CPU emulator that supports virtualisation thingy
<13> "Intel Virtualization Technology:
<13> the offical name...
<9> but i don't know if anything like that exists
<13> intel dont provide simulator for their cpu right?
<9> you could check out bochs and/org qemu code if you want to write one... those guys already have base CPU emulation done, which is the hardest part IMHO
<9> dunno. never heard of one.
<9> so probably not
<10> is it a good idea to use mmx and xmm register to pass some values (including 32 bit ones) over calls?
<9> Yurik: it's better to use GPRs
<14> is there a way to force gcc to use PC relative 64 bit calls instead of 32 bit PC relative calls?
<9> Yurik: though it's theoretically possible to abandon normal ALU at all and use only SSE for calculations... then it's more convenient to use xmm's
<10> movd isn't that fast?
<10> or is it fast enough?
<9> Yurik: dunno, but since GPRs and XMMs are in two separate units, i suspect it's about the same performance as moving to/from memory
<10> markos_64, so, no real difference between passing data around using XMMs and memory?
<9> Yurik: OTOH, you could do half calculations in GPRs and half in XMMs, then use both for parampassing.
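The SIB rule quoted from <10>'s notes earlier in this stretch of the log maps directly onto a bit test of the ModR/M byte. The sketch below assumes 32-bit addressing (16-bit addressing has no SIB byte); `split_modrm` and `has_sib` are hypothetical helper names, not part of anyone's assembler in the channel.

```python
def split_modrm(modrm: int):
    """Split a ModR/M byte into its (mod, reg, rm) fields."""
    mod = (modrm >> 6) & 0b11    # bits 7-6
    reg = (modrm >> 3) & 0b111   # bits 5-3
    rm = modrm & 0b111           # bits 2-0
    return mod, reg, rm


def has_sib(modrm: int) -> bool:
    """True if a SIB byte follows: r/m field is 100 and mod isn't 11."""
    mod, _reg, rm = split_modrm(modrm)
    return rm == 0b100 and mod != 0b11


for byte in (0x04, 0x44, 0xC4, 0x45):
    print(f"{byte:#04x} -> mod/reg/rm={split_modrm(byte)} sib={has_sib(byte)}")
```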
<10> markos_64, and it could be basically faster, than passing everything via stack, right?
<9> Yurik: i don't know how fast is it. but i suspect it could be slower than GPR mov's... you'd better try it and check
<9> Yurik: yes, passing params in registers is ALWAYS faster than in any memory. you should only use memory when you run out of regs.
<9> [this is the single psABI-i386 thingie that annoys me the most... it passes everything on stack]
<9> bill[1]: -mcmodel=large
<10> markos_64, I plan to use stack as less as possible in my frame language, in fact, and that is why I'm using mm/xmm registers now for passing params
<9> bill[1]: besides, there is no reason you'd like to use 64-bit PC-relatives. it should be avoided at all costs. and is just plain retarded. specifically, i'll personally go kill you if you ever make a program that has >4GB code segment compiled.
<9> bill[1]: oh, and FYI, -mcmodel=large isn't implemented in current versions of gcc. you can go bother those poor folks at #gcc if you really want it. but i'd expect them to just show you the middle finger.
<9> Yurik: good boy. *pat*
<14> markos_64: I'm generating code on the fly. I can't guarentee that memmaped memory will be within 2 gigs of each other in the virtual memory space.
<14> err mmaped
<9> bill[1]: and you're referring to on-the-fly-generated code from main program? wouldn't that, like, cause undefined reference?
<14> Not if I built in a linker.
<9> Yurik: by dumping recurency and reentrancy [and thread support], you could use static data... but this seems to be dumb idea
<9> bill[1]: i assume you've read psABI-x86_64?
<9> bill[1]: specifically, part about PLT and GOT?
<10> markos_64, yeah, not very nice. I want to eliminate all pushes in my current code, that is used to save registers
<14> Negative.
<14> link?
<9> http://x-os.homeip.net/x-os/hdir/pdf/sysv/psABI-x86_64.pdf
<9> bill[1]: basically, gcc doesn't care and always generates 32-bit PC-relatives.
<9> bill[1]: but it is nevertheless able to go anywhere in 64-bit space.
<9> bill[1]: by a trick called Procedure Linkage Table
<9> bill[1]: when linker sees call to a function outside main executable, it generates Procedure Linkage Table entry
<9> bill[1]: its task is to figure out address of that function and jump there
<9> bill[1]: and the PLT resides in main executable, which is probably within 2GB reach.
<9> bill[1]: also, this is very optimised sequence. this PLT entry only executes one instruction in average case, which is indirect jump.
<14> PLT uses 64 bit jumps?
<9> bill[1]: this is accomplished by having so-called Global Offset Table, which has 64-bit entry for each external function referenced by executable
<9> bill[1]: PLT accesses this GOT by pc-relative addressing and takes full 64-bit address from GOT directly, then jumps to it. all in one instruction.
<14> That seems alot more complicated then just using 64 bit jumps in the first place.
<9> bill[1]: GOT is inside executable too, so pc-relatives work here too.
<9> bill[1]: yeah, it's a bit more complicated. there is just one problem with 64-bit jumps, though. namely, they don't exist.
<14> or 64 bit indirect jumps rather
<9> bill[1]: besides, it's been all carefully planned. it's optimised for code size.
<14> hm I see.
<9> bill[1]: besides, using PLT allows for lazy linking.
<9> bill[1]: the first time PLT stub is called, it's possible that GOT doesn't contain entry for this function yet.
<9> bill[1]: PLT realises that and calls dynamic linker to figure out its address, which is then written straight to GOT, so that next calls go directly
<9> bill[1]: read the psABI. it describes all code sequences involved, reasons for them, and the like.
<14> whats the advantage of that over just hardcoding a mov instruction at runtime?
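The lazy-linking flow <9> describes (the PLT stub jumps through the GOT; on first use the dynamic linker fills the GOT slot) can be modelled in a few lines. This is a toy model, not the real ELF mechanism: the class `LazyLinker`, its method names, and the use of a dictionary with a missing key to mean "not yet resolved" are all assumptions for illustration. In a real binary the unresolved GOT slot points back into the PLT stub rather than being empty.

```python
class LazyLinker:
    """Toy model of PLT/GOT lazy binding."""

    def __init__(self, symbols):
        self.symbols = symbols      # name -> callable, stands in for a shared library
        self.got = {}               # stands in for the Global Offset Table
        self.resolutions = 0        # how often the "dynamic linker" had to run

    def resolve(self, name):
        """Slow path: look the symbol up and write its address into the GOT."""
        self.resolutions += 1
        self.got[name] = self.symbols[name]
        return self.got[name]

    def plt_call(self, name, *args):
        """PLT stub: jump through the GOT, resolving only on first use."""
        target = self.got.get(name)
        if target is None:          # first call, GOT slot not filled yet
            target = self.resolve(name)
        return target(*args)


linker = LazyLinker({"add": lambda a, b: a + b})
print(linker.plt_call("add", 2, 3), linker.resolutions)   # resolves, then calls
print(linker.plt_call("add", 4, 5), linker.resolutions)   # goes straight through the GOT
```

After the first call the stub costs one table lookup and one indirect jump, which is the "one instruction in average case" property mentioned in the log.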