|
Like others have said, 2.5 m rows is nothing to worry about. I've worked with a quarter billion rows, broken vertically instead of horizontally ( ie so you pull them back together with a join instead of a union ) on a mid range Dell server.
The main question you need to find an answer to is how will the data be used? Divide it up in a way that will let you get at what you need, but only what you need.
Don't have too few indexes, but don't have too many, either. They can make it thousands of times faster to find data, and take a great deal of load off a server by letting it read in the specific records it needs, instead of doing a table scan, which will dramatically slow any other transactions that are running at the time. But nothing is free: indexes make it slower to add, edit, or delete data. When you change anything, the index needs to be kept up to date so it can be trusted to find the data.
A good rule of thumb is that any column that's used in where or join clauses ought to be indexed, unless it's used, for example as select * from table where column1 like '%something%'. They're best at exact match lookups, good at range lookups, and in many db systems they're good at where column1 like 'something%' ... but if you need to find records based on text in the middle of a column, you're forcing an index scan instead of an index seek ... in some situations the database will choose to ignore the index and go straight to the tables.
Also use compound indexes when they're appropriate. Older databases could only use one index per query; that's not the case any more, but some are better than others at using five indexes to speed up a particular query. If you're always running queries that find or join based on a group of columns, say doctype and docnumber, make one index with both columns. In that example you'd put them in the opposite order; you can exploit a compound index if you only want its first column. Finally, if all the columns you want are covered by a compound index, the database can run your query without ever going to the base table. But again there's no free lunch, and these indexes are less efficient in other ways, like disc space and maintenance time when you change the underlying data.
|