In addition to raw disk space you will want to consider number of physical disks, SAN vs. local disks (direct attached storage), RAID level, number of controllers, and number of arrays. Along with that is the decision on how to spread your data, indexes, and logs over the various arrays. This reply will concentrate purely on the overall disk size question. However, I do want to emphasize that it is VERY important to properly layout your files over multiple disks on the correct type of RAID and these things will affect the amount of disk space you need.
The size of your server will potentially include space for:
· Each user database (data and log files)
o Space used on initial population.
o Estimate for data growth over the life of the hardware (often a 1-3 year plan, will vary depending on your needs).
o Index fill factors.
o Variable length columns.
o Internal fragmentation over time.
o Free space in the database (little free space means inserts/updates take longer as they search for free space, no room to rebuild/add indexes).
o Free space for maintenance (rebuilding indexes, adding indexes, creating indexed views, etc.).
o Each index will take more space. The type, size, number of indexes almost always changes between the initial estimate and optimized production so add some space for additional/changed indexes.
o Transaction logs may be larger when you are using transactional replication.
o Will you archive data?
o Will you use encryption and if so will you add columns to help with performance related to the encryption?
· TempDB
· MSDB (agent/replication/maintenance jobs, history, etc.)
· Distribution database (optional - used for replication)
· For SQL Server 2008, you may need to plan for space for the new features such as the performance warehouse or change data capture features.
· Potentially extra copies of each database (to restore a single table, to do a load, to do a DBCC, warm standbys, history, etc.).
· Backups (transaction, full, differential) – you may keep 0, 1, or many copies of each on the local (or SAN) disks with or without the SQL Server 2008 compression.
· Free space on disks (including external fragmentation issues).
· Overhead for RAID 10, overhead of separating data from each log.
· Mirrored databases.
· Files used to load data (performance is generally faster if local to the instance).
· The size of the OS and SQL Server binaries plus the master database.
· Other applications, files, etc.
· Unplanned growth, additions, changes, etc.
References