|
First, what you described is exactly what you want to do. A database tries to offer good performance and scalability, and, to do so, it needs optimization, like the type you describe.
Now, that said, the query engine shouldn't need much help (or "tricking") on your part to find those optimizations. The query you typed should NOT read in and then evaluate every row in the table - only the ones from 1955. You can verify that it's actually doing this in the query execution plan.
If it's doing a table scan, you're dead in the water - as far as performance goes. If that's what you see, the way to fix it is with a data structure called an index. You would want one on the date column, so that even given the whole table, your database will know exactly where all of the 1955 entries are, and not even have to touch any of the others. On the other hand, if it was "rainy" so few times over the course of the century that you have fewer of these than days in 1955, an index here would be even more beneficial. Here's the cool part - if you build both indexes, the query optimizer will decide when you run the query which one is better, and use it as appropriate.
And this might be the first time I've disagreed with Tripy, but in fact, at least in SQL Server, the query he provided will be slower. It forces a table scan, which means (potentially) massive disc IO. When you ask for a function to be applied to column data, as opposed to column data to match the results of a function, the database has to load the data and apply the function, to make sure it matches (or doesn't) to satisfy your query. (Newer db engines can optimize this away if the function is deterministic.)
Last edited by Learning Newbie; 12-15-2008 at 02:03 PM..
|