@# Quotes DB     useful, funny, interesting





Google
 
Web www.quotesdb.info
Undernet  |  EFnet  |  Quakenet  |  Freenode  |  Dalnet  |  Ircnet  |  Galaxynet
Page: 1 2 3 4 5 6 7 8 9 10



Comments:

<0> is there a way to index a longtext field that will allow me to do LIKE '%foo%' operations really fast?
<0> I tried fulltext and MATCH, but that does full word matches and is not useful to what I am doing
<1> No, that sort of ridiculous pattern requires scanning all values
<1> It's as efficient as it can be
<2> Setien: give a list of keywords and it would be really fast
<0> domas, how do you mean?
<2> if that keyword list is known before building the index ;-)
<0> domas, it's not
<2> pity then
<0> yes :)
<2> you totally miss the concept of indexing
<2> :((
<0> you mean I don't understand it?
<2> that's right.
<2> I wonder which mysql version optimzied %blah% lookups quite a lot
<2> maybe it was 4.1 or 4.0



<0> no, I understand it alright, but I was hoping the fine folks at mysql had some clever trickery up their sleeve that would at least allow them to do it faster
<2> Setien: yeah. guessing.
<3> it is possible to index all the words in mysql text fields
<2> !m Sneaky_Bastard fulltext
<4> Sneaky_Bastard: (Full-Text Search Functions) : http://dev.mysql.com/doc/mysql/en/Fulltext_Search.html
<3> yeah
<2> here.
<2> sure it is!
<0> I know it wouldn't be an index as such, but I am sure it would be possible to make some kind of preprocessing that would make partial matches faster
<3> fulltext fields can be indexed, too
<0> domas, I am using mysql 4.1 by the way
<3> to speed up the searches
<2> Setien: in theory - maybe.
<0> Sneaky_Bastard, but not with partial matches, right?
<2> say... split all text into fixed width lexemes
<0> I seem to remember seeing something about being able to use wildcards in the MATCH, but I can't find it
<2> Setien: you can use prefix wildcard matching in fulltext
<0> domas, that would be one possibility
<3> Setien: there is no environmenton earth, there there are signficant speedups for regular expression searches
<0> domas, only prefix?
<3> environment on earth
<0> Sneaky_Bastard, I didn't get that
<3> "partial" match is a regular expression
<0> Sneaky_Bastard, you can't seriously be saying that a '%foo%' match uses the full overhead of a regular expression match?
<5> no, it doesn't
<3> I said exactly and precisely what I said
<5> but it does do a table scan
<2> if all keywords are longer than X, then there's a possibility to split the text into X/2 lexemes and use shifting of lexeme for key lookups..
<2> ergh
<2> that's terribly inefficient but.. works in some situations
<5> '%foo%' doesn't use regular expressions
<3> domas, the question is what does MySQL support internally
<2> and still.. is search.
<2> Sneaky_Bastard: prefix keyword matching for fulltext.
<2> and... custom parsers. in 5.1
<3> if you index the fulltext
<3> then the search on it's index is doing to have to find all keys that match the partial match
<2> usually the magic of fulltext indexing lies in how you tokenize your text.
<2> if you tokenize into keywords you use, it will work.
<3> is going
<2> if not, it won't.
<3> I mean, that's the mechanics
<3> of how it *has* to be done internally
<2> haha, it's not about 'internally', it's about 'done', ha ha ha
<3> <phrase>* match still requires running through sorted keys starting with first match
<2> yes.
<3> sorry, but the question was wether MySQL had an efficient internal method for doing this
<2> for <phrase>*? yes.
<3> my response was that there is not
<3> you do realize that a regular expression gets compiled into a little state machine for doing the compare ?
<3> and that <pattern>* is just a relatively easy one to test against ?
<6> it's an index.
<6> you can walk in indexes
<7> is the question will mysql use an index for a regular expression?
<6> sometimes. if ordered.
<3> walking the index STILL REQUIRES TESTING EACH ENTRY FOR MATCH
<3> binary search will work
<7> or is someone just trolling
<3> but you still have to do the compare
<3> index is ordered, by definition!!!
<3> WTF?



<6> well, index can be unordered.
<6> like... in... hash ;-)
<0> so what I am going to do
<3> ok, you're right
<6> or... rd-tree
<6> or...
<3> a hash index has to be randomly searched
<3> sequentially searched
<0> nm
<3> but then, a database designer who is not a moron, doesn't use hash-based indexes for keys that have to searched for partial matches :p
<6> :-)
<6> well, you can always scan a hash! :)
<3> yeah, sure. easy to say
<3> no fast internal implementation that doesn't involve a regular expression in some form or other, even simplified
<6> why do you call simple string matching 'regular expression'?
<3> even if you don't think it looks like a regular expression.
<3> partial string match
<3> is not a simple string match
<3> that's why
<6> boy, look at etymology of 'regular expression'
<7> :-D
<6> partial string match might be cheaper than 'simple string match';
<3> don't call me boy, child
<3> >:(
<6> boooo
<7> now now children
<6> look at mysys/
<6> it has some of optimized functions for string matching
<6> or wait, string/
<6> none of these things resemble 'regular expressions'
<6> compiled or not.
<7> I'm still trying to figure out what you two are arguing about
<6> firewire: I don't know
<7> hahaha
<6> firewire: something about regexps
<3> I suggest that you study "Introduction to Automata Theory, Language and Computation", if you think a partial string match does not involve a regular expression
<7> partial string match involes ranges not regular expressions in the PCRE sense
<3> even if it is possible code that particular cl*** of expression match highly efficiently.
<3> I never said it did
<6> introduction sounds a bit too difficult for me. I need basics!
<3> we were discussing internals
<7> col like 'foo%' is exapnded to col >= 'foo' and col < 'fop'
<3> uh, no.\
<7> are you sure?
<7> because that's how mysql does it :)
<3> that is not descriptive of how the code that does the match operates
<6> 'A regular expression (abbreviated as regexp, regex, or regxp, with plural forms regexps, regexes, or regexen) is a string that describes or matches a set of strings, according to certain syntax rules. '
<7> like 'foo%' has NOTHING to do with PCRE
<6> 'foo%' by itself is regexp
<6> though, internally it is not handled as such
<3> I see we are having basic Comp Sci terminology problem here.
<7> yes
<6> yes! :)
<7> why don't you define regular expression for us
<3> so I'm just going to drop it as it is apparently not possible to communicate what I mean
<6> 'expression'. mind this word.
<6> and don't invent terminology ;-)
<3> I referenced the definitive text
<3> not my problem if you have not studied it.
<3> have fun
<7> heh
<7> I didn't realize we had a definitive text
<3> you do, like just about everything in Comp. Sci.
<7> regular expression in my world refers to PCRE or what grep uses for "pattern matching"
<6> somewhat in my world too.
<3> if I had meant that, I would have specified PCRE
<3> which I did not
<6> so... unix regexps are not pcre, what are they then?
<6> posix ones that is
<6> or is it for another PCRE?
<6> as in posix-compatible instead of perl-compatible?
<7> pcre is expanded posix compatible expressions
<3> languages for representing "regular expressions" is what they are
<3> there being more than one.


Name:

Comments:

Please enter the result of the sum 63 + 46 (to avoid spam):






Return to #mysql
or
Go to some related logs:

pcap_tPtr
download MzBot
gaim trebuchet MS
ubuntu dvd::rip encrypted
python hex range(*
#web
#linux
deauthenticated due to local deauth request
#perl
tuxtheslacker



Home  |  disclaimer  |  contact  |  submit quotes